# A General, Practicable Deﬁnition of Perfect Bayesian Equilibrium


Joel Watson∗

February 2017

Abstract

This paper develops a general definition of perfect Bayesian equilibrium (PBE) for extensive-form games. It is based on a new consistency condition for the players' beliefs, called plain consistency, that requires proper conditional-probability updating on independent dimensions of the strategy space. The condition is familiar and convenient for applications because it constrains only how a player's belief is updated on consecutive information sets. The PBE concept is defined for infinite games, implies subgame perfection, and captures the notion of "no signaling what you don't know." A key element of the approach taken herein is to express a player's belief at an information set as a probability distribution over strategy profiles.

1 Introduction

Standard solution concepts for dynamic games are based on the notion of sequential rationality, which requires players to maximize their expected payoffs not just in the ex ante sense (strategy selection before the game is played) but at all contingencies within the game where they are called upon to take actions. Trembling-hand perfect equilibrium (Selten 1975) and sequential equilibrium (Kreps and Wilson 1982) ensure that the rationality test is applied to all information sets in an extensive-form game, because these concepts are deﬁned relative to convergent sequences of fully mixed behavior strategies.

Trembling-hand perfect equilibrium and sequential equilibrium aren’t always the best choice for applications, for the following reasons. First, constructing sequences of fully mixed strategies with the desired properties can be difﬁcult in complex games. Second, for some applications, more permissive concepts—allowing for a greater range of beliefs at information sets—may be desired. Third, while many applications are conveniently formulated with inﬁnite action spaces, trembling-hand perfect equilibrium and sequential equilibrium are not

∗UC San Diego; http://econ.ucsd.edu/∼jwatson/. The author thanks the NSF for ﬁnancial support (SES1227527) and the following people for their insightful input: Nageeb Ali, Pierpaolo Battigalli, Jesse Bull, Jack Fanning, Simone Galperti, Keri Hu, David Miller, Phil Reny, Karl Schlag, Joel Sobel, and two anonymous referees. This paper owes a great deal to Pierpaolo Battigalli and his work identifying independence conditions central to major solution concepts.


deﬁned for such games.1 All three of these factors are relevant for many modern applications, including some models of private contracting on networks, dynamic contracting, sequential evidence production, and repeated games on networks.

Practitioners therefore often turn to the perfect Bayesian equilibrium (PBE) concept, which is usually described in the same way as is sequential equilibrium—a behavior strategy proﬁle and a system of assessments that give the players’ beliefs at information sets as probability distributions over nodes—but puts structure on the assessments with consistency conditions that are formulated without reference to strategy trembles. At the heart of PBE is the idea that the players’ beliefs should be consistent with proper conditional probability updating (Bayes’ rule) where applicable. To state a formal deﬁnition of PBE, one must make “where applicable” precise.

Fudenberg and Tirole (1991) provide the leading formal deﬁnition of PBE in the literature, but this deﬁnition applies only to the class of “ﬁnite multi-period games with observed actions and independent types” whereas applications of PBE are increasingly outside this class. A more general deﬁnition of PBE based on Battigalli’s (1996) independence property for conditional probability systems has not been utilized in applications because it lacks a practicable formulation and because verifying the independence condition appears tantamount to ﬁnding a suitable sequence of fully mixed behavior strategies in a sequential-equilibrium construction.2 Further, an inﬁnite-game extension has not been worked out.

Although applications of "perfect Bayesian equilibrium" are widespread in the literature, a measure of ambiguity persists regarding the technical conditions that practitioners are actually utilizing in individual modeling exercises. In some articles, PBE is the stated solution concept but there is no reference to a formal definition. Thus, for games outside the class of finite multi-period games with observed actions and independent types, it is not always clear what researchers have in mind. There is a range of possible consistency assumptions, and the assumptions matter in applications. Questions linger about what "Bayes' rule where applicable" should mean.

For the analysis of complex games, researchers sometimes retreat to the concept of weak PBE because of its simple structure and flexibility, even though it was advanced as a pedagogical stepping stone.3 Weak PBE imposes no constraints on beliefs off the equilibrium path. It does not imply subgame perfection.

This paper endeavors to support wider application of PBE by providing a general deﬁnition of perfect Bayesian equilibrium that meets several goals. First, it constrains only how individual players update beliefs on consecutive information sets—that is, from one information set to the next one that arises for the same player—thus lending itself to straightforward application in a way familiar to practitioners. Second, it applies to all ﬁnite games as well as to inﬁnite games with the appropriate measurability structure. Third, it is a reﬁnement

1Myerson and Reny (2015) discuss technical problems with extending sequential equilibrium to inﬁnite games and propose a new concept called open sequential equilibrium.

2On the unwieldy point, working toward a PBE deﬁnition using Fudenberg and Tirole’s (1991) and Battigalli’s (1996) framework would amount to the following: Postulate a conditional probability system over strategy proﬁles, ensure that it has the independence property, calculate an implied strategy proﬁle and a conditional conjecture system over terminal nodes, derive from it the assessments at information sets, and verify sequential rationality.

3See Myerson (1991) and Mas-Colell, Whinston, and Green (1995). Some stronger deﬁnitions that imply subgame perfection are discussed in Section 5.


of subgame perfection. And fourth, it captures the notion of “no signaling what you don’t know,” implying Fudenberg and Tirole’s (1991) reasonableness condition for multi-period games with observed actions and independent types.4

The PBE deﬁnition proposed herein is based on a new consistency notion called plain consistency, which emulates some of the structure on beliefs inherent in sequential equilibrium. Central to the deﬁnitions of trembling-hand perfect equilibrium and sequential equilibrium is the assumption that choices made at different information sets are independent of one another, as represented by behavior strategies. This assumption puts structure on the players’ beliefs, including at off-path information sets. Likewise, plain consistency imposes some independence in the operation of conditional-probability updating. For ease of use, the condition is limited to updating on consecutive information sets, but it is also strong enough to deliver the other desired properties stated above.5

To get the basic idea in the abstract, consider a setting in which values x ∈ {a, b, c} and y ∈ {d, e, f} will be realized. Suppose the prior belief of a Bayesian decision maker has x and y independently distributed, with positive probability on all outcomes. Imagine that the decision maker then learns that (x, y) ∈ E = {a, b}×{d, e}. Note that E, as a product set, is a conjunctive event: “x ∈ {a, b} AND y ∈ {d, e}.” Because the prior satisﬁes independence and E is conjunctive, updating about x and y can be done separately. The decision maker’s posterior belief will be given by the product of the conditional marginal probabilities,

Prob [(x, y) | {a, b}×{d, e}] = Prob [x | {a, b}] · Prob [y | {d, e}],   (1)

with the marginal conditional probabilities defined by the conditional-probability formula. The key idea is that we want the conditional probabilities on the right side of Equation 1 to be defined, and the equation to hold, even if the prior belief puts zero probability on x ∈ {a, b} or on y ∈ {d, e}. For instance, if the prior satisfies Prob [{a, b}] > 0 then the marginal conditional probability for x should be defined by the conditional-probability formula, so that Prob [{a} | {a, b}] = Prob [{a}]/Prob [{a, b}]. If Prob [{a, b}] = 0 then the marginal conditional probability for x is arbitrary but we still require it to be defined, and we still require Equation 1 to hold. It is easy to verify that the conditional probabilities would be defined and Equation 1 would always be satisfied if the decision maker's prior and posterior beliefs were given by the limit of a sequence of fully mixed joint distributions with x and y independent.
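To make the mechanics concrete, here is a short numerical sketch of Equation 1 in Python. The prior marginals for x and y are hypothetical, chosen only to illustrate updating each dimension separately on a conjunctive event.

```python
from itertools import product

# Hypothetical independent prior over x in {a,b,c} and y in {d,e,f}.
px = {"a": 0.5, "b": 0.3, "c": 0.2}   # prior marginal on x (assumed)
py = {"d": 0.1, "e": 0.3, "f": 0.6}   # prior marginal on y (assumed)

def condition(marginal, event):
    """Bayes' rule on one dimension. If the event has prior probability
    zero, the conditional is not pinned down; here we simply flag it."""
    total = sum(marginal[v] for v in event)
    if total == 0:
        raise ValueError("zero-probability event: conditional is arbitrary")
    return {v: marginal[v] / total for v in event}

# Conjunctive event E = {a, b} x {d, e}: update x and y separately and
# take the product of the conditional marginals, as in Equation 1.
post_x = condition(px, ["a", "b"])
post_y = condition(py, ["d", "e"])
posterior = {(x, y): post_x[x] * post_y[y]
             for x, y in product(post_x, post_y)}
```

When the prior puts zero probability on one dimension's event, the one-dimensional conditional is not determined by the formula; the requirement in the text is that it nevertheless be defined and that Equation 1 continue to hold.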

Shifting our attention back to games, components x and y in the abstract example now represent different dimensions of the strategy proﬁle, and E now represents information a player receives about the strategy proﬁle by virtue of arriving at an information set. The independence condition is imposed on updating from this player’s previous information set if E is a conjunctive event.

A key element of the approach taken herein is to describe a player’s belief at an information set as a probability distribution over the strategy proﬁle, which I call an appraisal; it captures the player’s conjecture about both how the information set was reached and what will happen from this point in the game. Each player is assumed to have a conjecture system that maps the player’s information sets into appraisals. I specify that every player has a (possibly artiﬁcial)

4The idea is that a player i would not use the observation of a surprise choice by a player j to change i’s belief about a move of some other player k (which could be nature) that player j did not observe.

5Battigalli’s (1996) independence condition would be the strongest condition along these lines.


[Figure omitted: a game-tree fragment with player 1's two information sets (actions a/b and c/d, with belief probabilities 1 on a and c, 0 on b and d), player 2's information set h, and a table of player 1's action combinations.]

Figure 1: Independence across information sets of the same player.

information set representing the beginning of the game. Thus, a player starts the game with an appraisal and then updates it from one information set to the next as play proceeds.

The following examples illustrate the building blocks of the theory. Consider first the game fragment shown in Figure 1 above. Just part of the extensive form is pictured; the rest is inconsequential to the discussion here. Also pictured is a table indicating the possible combinations of actions for player 1 at his two information sets in the tree. Suppose that backward induction identifies strategy ac for player 1 so that, at the beginning of the game, player 2 believes that player 1 will select action a at his first information set and would select action c at his second information set. This is indicated in the figure by the probability 1 on actions a and c, and probability 0 on actions b and d.

How should player 2 update her belief in the event that her information set h is reached? Backward-induction reasoning provides an answer: Player 2 has observed that player 1 selected b at his ﬁrst information set—which is a surprise given player 2’s initial belief—but this does not cause player 2 to change her belief that player 1 would select action c at his second information set.

We can dissect the logic as follows. At the beginning of the game, player 2 initially treats the actions at player 1’s information sets as independent. Further, from the structure of the game, arriving at h provides player 2 with information that we can describe as a conjunctive event: “Player 1’s action at his ﬁrst information set is in {b} AND player 1’s action at his second (as yet unreached) information set is in {c, d}.” Thus, the combination of player 1’s actions that are consistent with information set h being reached (the shaded region of the table) forms a product set, {b} × {c, d}. And in these circumstances, player 2 updates her belief about the action at player 1’s second information set based on only what she has learned from the structure of the game about the action at player 1’s second information set—which is, of course, nothing. Thus, she maintains the belief that player 1 would select c. Importantly, this logic applies even though reaching h is a surprise for player 2 in that her initial belief puts zero probability on h being reached.

Now let us apply the same logic to a player’s belief about the actions of two other players, which also illustrates the idea of “no signaling what you don’t know.” Consider the game fragment shown in Figure 2. At the beginning of the game, player 3 believes that player 1 will select b for sure and that player 2 will choose c, d, and e with probabilities 0.2, 0.2, and 0.6 respectively. At information set h, player 3 has learned that player 1 chose a and that player 2 did not select c. That is, the set of action proﬁles that reach h is the product set {a} × {d, e}, and so the information at h is a conjunctive event. Player 3 then updates her belief about player 2’s action on the basis of only what she has observed about player 2’s action. Player 3 does not use the surprise regarding player 1’s choice as an excuse to take liberties with the


[Figure omitted: a game-tree fragment in which player 1 chooses between a (probability 0) and b (probability 1), player 2 chooses among c, d, and e (probabilities .2, .2, .6), and player 3 moves at information set h; a table shows the action profiles.]

Figure 2: Independence and signaling.

probabilities of c, d, and e. Thus, player 3’s updated belief puts probability 0.25 on d and probability 0.75 on e, as Bayes’ rule requires for the marginal distribution over player 2’s action.
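The update in this example is a one-dimensional application of the conditional-probability formula, and it is easy to check with the probabilities given in the text:

```python
# Player 3's initial belief about player 2's action (from the example).
prior_2 = {"c": 0.2, "d": 0.2, "e": 0.6}

# At h, player 3 learns only that player 2's action lies in {d, e}; the
# surprise about player 1's choice plays no role in this marginal update.
event = ["d", "e"]
total = sum(prior_2[a] for a in event)
posterior_2 = {a: prior_2[a] / total for a in event}
# posterior puts 0.25 on d and 0.75 on e, as stated in the text
```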

The next example, shown in Figure 3 below, demonstrates that the logic of independence is embedded in the concept of subgame perfection and that weak PBE does not imply subgame perfection. The table in the figure shows the strategy profiles, with rows representing player 2's actions and columns representing profiles of actions taken at the information sets of players 1 and 3. The shaded region denotes the strategy profiles that reach player 3's information set, a conjunctive event. This game has a single subgame-perfect equilibrium, (w, a, x). In the proper subgame, player 3 selects x in response to player 2's equilibrium action a. There is also a weak PBE in which (z, a, y) is the strategy profile and, at information set h, player 3 believes that player 2 selected action b. Weak PBE allows for this belief because it imposes no restrictions on how beliefs are updated off the equilibrium path. Player 3 changes her belief about player 2 based on player 1's surprise selection of action w, contrary to the independence notion. Clearly, (z, a, y) is not a subgame-perfect equilibrium.

In the formal deﬁnition of plain consistency, the foregoing logic is applied to the extent possible for belief updating on consecutive information sets. For instance, consider a player i who is on the move at information set h in a game, and let L denote any subset of the other players’ information sets (not necessarily a proper subset and possibly including information sets of nature). At h, player i will have a belief about the strategies that the other players are using, including the actions they take at the information sets in L. Suppose that this belief exhibits independence between the behavior at the information sets in L and the behavior

[Figure omitted: a game tree in which player 1 chooses between w and z, player 2 chooses among a, b, and c, and player 3 chooses between x and y at information set h, together with the strategy-profile table (columns wx, wy, zx, zy; rows a, b, c) with the profiles reaching h shaded.]

Figure 3: Weak PBE and subgame perfection.


at other information sets, in that player i views behavior in L as uncorrelated with behavior outside of L.

Suppose that the next information set encountered by player i is h′, and suppose that the set of strategy profiles consistent with reaching h′ is a product set with respect to L. Then plain consistency requires that player i's belief at h′ must be the product of marginals with respect to L, and the marginal distribution on the information sets in L should be consistent with the conditional-probability formula restricted to these information sets. That is, player i's updated belief about actions taken at information sets in L is conditioned on only what player i has observed about these particular actions.

The definition of plain consistency goes a bit further by applying the same idea to subsets of strategy profiles. That is, we can impose the same logic on any subset Z of strategy profiles on which player i puts positive probability at both information sets h and h′. If Z is a product set with respect to L and the subset of Z that reaches h′ is also a product set, then what player i learns by arriving at h′ allows for the L dimension to be separated from its complement. Assuming that player i's belief at h, when restricted to the set Z, exhibits independence between the behavior at the information sets in L and the behavior at other information sets, then his belief at h′ ought to exhibit similar independence and updating should obey Bayes' rule on each dimension (as applicable).

Plain consistency puts no other restrictions on how players update their beliefs. The PBE definition combines plain consistency with the assumptions that the players' beliefs at the beginning of the game are concentrated on the actual strategy profile and that each player's strategy is sequentially rational.

Before launching into the deﬁnitions, let me elaborate on why it is helpful to express beliefs in terms of appraisals rather than assessments. To describe whether beliefs exhibit independence regarding actions taken at different information sets, one must keep track of these various actions as separate components, as a strategy proﬁle does. To accomplish this with assessments, one must put structure on the nodes at every information set so that each node x describes the actions taken on the path to x. But such a structure is equivalent to keeping track of the strategy proﬁle, at least its restriction to information sets that came before the current information set. Further, even if we imagine adding this structure and then using the equilibrium strategy proﬁle for “future” information sets when calculating expected payoffs, another complication arises: At a given information set, the other information sets in the game generally cannot be neatly classiﬁed as coming either before or after the current one.6 Thus, I suggest that the most straightforward approach is to focus on strategy proﬁles and account for beliefs as appraisals.

The next section lays out the basic notation and deﬁnitions. Section 3 develops the notion of plain consistency and the equilibrium concept. Section 4 provides details on how to apply the PBE deﬁnition to inﬁnite games and extends the deﬁnition of plain consistency accordingly. Section 5 compares plain PBE with other equilibrium deﬁnitions. The Appendix contains additional deﬁnitions related to the one-deviation property and a proof.

6See Kreps and Ramey (1987) for an example in which the player on the move does not know whether a particular information set for another player was already reached. These complications presumably led Fudenberg and Tirole (1991) and Battigalli (1996) to describe equilibrium in general games as a combination of assessments and conditional probability systems on terminal nodes.


2 Basic Concepts

Information Sets, Strategies, and Payoffs

Consider any extensive-form game of perfect recall, with n players and nature taking the role of “player 0.” For convenience, in this and the next section the deﬁnitions and results are put in a form that applies to ﬁnite games; Section 4 extends the deﬁnitions to games with inﬁnite strategy spaces. The deﬁnitions also apply to other dynamic representations (substitute “personal history” or “contingency” for “information set”).

Deﬁne N ≡ {0, 1, . . . , n} and let N+ ≡ N \{0} denote the set of strategic players. Let H be the set of information sets. It is partitioned into sets H0, H1, . . . , Hn, where Hi denotes the set of information sets for player i. (As is standard, information sets are distinctly labeled so that the players’ individual sets of information sets are disjoint.) Let H+ ≡ ∪i∈N+Hi be the set of information sets for the strategic players. Denote by S the space of pure strategy proﬁles, including nature’s strategy. Let ∆S denote the space of probability distributions over S, which we call the mixed strategy proﬁles. For any subset T ⊂ S, let us take “∆T ” to mean the subset of ∆S with support in T . Note that I use the symbol “⊂” to mean “subset,” not “proper subset.” Finally, let u : S → Rn be the payoff function and extend it to the space of mixed strategies by the usual expected payoff calculation.

We will be dealing essentially with the agent form of the game, in that beliefs and choice are analyzed at individual information sets. A key element is that a player's choice at one information set is independent of his choice at another information set. For example, if player i has two different information sets, h and h′, then we think of player i as being separated into two agents whose names are the information sets themselves: Agent h takes the action at information set h, agent h′ takes the action at information set h′, and these are independent choices. The agents in Hi all share the payoff function ui.

Note that a strategy proﬁle s ∈ S maps H to the space of actions and, for each information set, speciﬁes an action that is feasible at this information set. For any subset of information sets L ⊂ H, let sL denote the restriction of s to the subdomain L. That is, sL gives the proﬁle of actions that strategy s speciﬁes for the information sets in L. For any L ⊂ H, deﬁne −L ≡ H \L. Note that we can then write s = sLs−L.

For X ⊂ S, deﬁne XL ≡ {sL | s ∈ X}. In the case of L = {h} for a single h ∈ H, we simplify notation by dropping the brackets; so, for instance, we write Xh and sh instead of X{h} and s{h}. Note that Sh is the set of actions available at information set h. Also, for a given player i, the subscript “i” refers to the information sets Hi. For example, si means the same thing as sHi. Likewise, “−i” refers to H−i. Thus, subscripts “i” and “−i” have their usual meaning of identifying the strategies of player i and the other players.

Deﬁnition 1: For a given set L ⊂ H, say that a set X ⊂ S is a product set (relative to L) if X = XL × X−L.

The next deﬁnition identiﬁes whether a mixture of strategy proﬁles treats a speciﬁc set of information sets L ⊂ H independently of the rest, meaning that it can be expressed as the product of the marginal distribution on L and the marginal distribution on −L.


Definition 2: Given L ⊂ H and a product set Y = YL × Y−L ⊂ S, say that a distribution p ∈ ∆S exhibits independence on Y relative to L if for every product set X = XL × X−L ⊂ Y, we have p(X)p(Y) = p(XL × Y−L) · p(YL × X−L). In the case of L = {h} for a single h ∈ H, let us drop the brackets and say ". . . relative to h." Say that p exhibits complete independence if, for every h ∈ H, p exhibits independence relative to h on S.

Independence on Y means that the distribution conditional on Y , which is given by the standard conditional probability formula, exhibits independence across L and −L. That is, the conditional probability of a product set X = XL ×X−L ⊂ Y is the product of the conditional marginal probabilities:

Prob [X | Y] = p(X)/p(Y) = [p(XL × Y−L)/p(Y)] · [p(X−L × YL)/p(Y)] = Prob [XL × Y−L | Y] · Prob [X−L × YL | Y].

In the expression above, the second equality is due to independence on Y relative to L. Note that independence relative to L on S implies the same on every product set Y ⊂ S.
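For small finite games, Definition 2 can be verified mechanically. The sketch below represents a strategy profile as a tuple of actions indexed by information-set position; by additivity of p, it suffices to check the defining identity on singleton coordinate pairs. The two-coordinate distribution at the end is a hypothetical illustration of a completely independent mixture.

```python
from itertools import product

def proj(s, idx):
    """Restrict profile s (a tuple of actions) to the coordinates in idx."""
    return tuple(s[i] for i in idx)

def exhibits_independence(p, Y, L, n):
    """Definition 2 check: p(X)p(Y) == p(X_L x Y_-L) * p(Y_L x X_-L) for
    every product subset X of the product set Y. By additivity of p it is
    enough to test singleton coordinate pairs."""
    mL = [i for i in range(n) if i not in L]
    pr = lambda X: sum(q for s, q in p.items() if s in X)
    pY = pr(Y)
    YL = {proj(s, L) for s in Y}
    YmL = {proj(s, mL) for s in Y}
    for sL, smL in product(YL, YmL):
        X = {s for s in Y if proj(s, L) == sL and proj(s, mL) == smL}
        XL_x_YmL = {s for s in Y if proj(s, L) == sL}
        YL_x_XmL = {s for s in Y if proj(s, mL) == smL}
        if abs(pr(X) * pY - pr(XL_x_YmL) * pr(YL_x_XmL)) > 1e-12:
            return False
    return True

# A behavior-strategy-style mixture over two information sets: coordinate 0
# is drawn from (0.7, 0.3) over {a, b} and, independently, coordinate 1
# from (0.4, 0.6) over {c, d}. Such a mixture is completely independent.
S = set(product(["a", "b"], ["c", "d"]))
marg0 = {"a": 0.7, "b": 0.3}
marg1 = {"c": 0.4, "d": 0.6}
p = {s: marg0[s[0]] * marg1[s[1]] for s in S}

ok = exhibits_independence(p, S, [0], n=2)
```

A perfectly correlated mixture, by contrast, fails the check, which is what separates behavior-strategy-style beliefs from general mixtures over profiles.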

Let ∆S be the set of mixed strategy proﬁles that exhibit complete independence. Note that a mixture in ∆S is equivalent to a behavior strategy proﬁle: It speciﬁes, for every information set h, a probability distribution over the actions available at h, and the speciﬁcation is independent across information sets. We typically write such a mixture as σ = (σh)h∈H, where σh denotes the mixed action choice at information set h. Nature’s mixed strategy is taken as exogenous and we assume that it exhibits independence relative to all of nature’s information sets.

It will be useful to think about information sets in terms of subsets of strategy proﬁles. For each h ∈ H and s ∈ S, let us say that s reaches h if the path of strategy proﬁle s includes a node in h. Denote by S(h) the set of strategy proﬁles that reach h. Note that, for any L ⊂ H, S(h)L is the set of action proﬁles for the information sets in L that are consistent with h being reached.7

Because of perfect recall, the information sets for an individual player have a particular product structure and precedence relation. For every pair of information sets h, h′ ∈ Hi for player i, it is the case that S(h) is a product set relative to h′. Further, for h, h′ ∈ Hi with h ≠ h′, either h is a successor of h′, in which case S(h) ⊂ S(h′); or h is a predecessor of h′, in which case S(h′) ⊂ S(h); or neither, in which case S(h) ∩ S(h′) = ∅. If h′ is a successor of h then every path through h′ also passes through h. We call h′ ∈ Hi an immediate successor of h ∈ Hi for player i if h′ is a successor of h and there is no other information set for player i between the two; that is, there is no g ∈ Hi such that g is a successor of h and h′ is a successor of g.

7Expressing extensive-form information sets as subsets of strategy proﬁles is standard. Mailath, Samuelson, and Swinkels (1993) formulate solution concepts on the basis of “normal form information sets,” where there is no reference to an extensive form, and Shimoji and Watson (1998) take such “restrictions” as given (whether or not they are derived from extensive-form information sets). Note that, here, I am taking the conventional approach of examining standard extensive-form information sets but simply represent them as subsets of the strategy space.


[Figure omitted: the three-player game tree and the strategy-profile table described in the text, with rows for player 1's actions a, b, c and columns for the other players' action profiles.]

Figure 4: Example to illustrate theoretical components.

An Example

To review some of the deﬁnitions just described, consider the three-player game shown in Figure 4. The space of strategy proﬁles is S = {a, b, c}×{x, y}×{w, z}×{d, e}, which is depicted by the table on the right side of the picture. Note that the rows of the table are the actions feasible at player 1’s information set, whereas the columns are the proﬁles of actions for the other players’ information sets. For the information set h identiﬁed in the picture (player 3’s information set), we have

Sh = {d, e} and S−h = {a, b, c}×{x, y}×{w, z}.

The subset of strategy proﬁles that are consistent with reaching h is

S(h) = S(h)−h ×S(h)h = {(a, y, w), (a, y, z), (b, x, z), (b, y, z), (c, x, z), (c, y, z)}×{d, e}.

This set corresponds to the shaded region of the table in the figure. Clearly S(h) is a product set relative to h but it is not a product set relative to the information set of player 1. That is, letting h′ denote player 1's information set, we have

S(h)h′ = {a, b, c} and S(h)−h′ = {(x, z), (y, w), (y, z)}×{d, e},

and S(h) does not equal the Cartesian product of S(h)h′ and S(h)−h′.
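The product-set claims in this example can be checked directly from the listed profiles. The sketch below encodes S(h) with coordinates ordered (player 1's action, x/y, w/z, d/e) and applies Definition 1 by comparing cardinalities.

```python
# Profiles consistent with reaching player 3's information set h, as
# listed in the text: S(h) = S(h)_{-h} x {d, e}.
reach_minus_h = [("a", "y", "w"), ("a", "y", "z"), ("b", "x", "z"),
                 ("b", "y", "z"), ("c", "x", "z"), ("c", "y", "z")]
S_h = {(a1, a2, a3, a4) for (a1, a2, a3) in reach_minus_h
       for a4 in ("d", "e")}

def proj(X, idx):
    """Project a set of profiles onto the coordinates in idx."""
    return {tuple(s[i] for i in idx) for s in X}

def is_product(X, L, n=4):
    """Definition 1: X = X_L x X_{-L}. Since X always sits inside the
    product of its projections, equality of cardinalities suffices."""
    mL = [i for i in range(n) if i not in L]
    return len(X) == len(proj(X, L)) * len(proj(X, mL))

product_rel_h = is_product(S_h, [3])    # relative to h (the d/e coordinate)
product_rel_h1 = is_product(S_h, [0])   # relative to player 1's information set
```

Here |S(h)| = 12 = 6 · 2 when split at the d/e coordinate, but splitting at player 1's coordinate would require 3 · 6 = 18 profiles, so the product property fails there.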

Appraisal Systems

We must consider the beliefs that the strategic players hold at their various information sets. It will be useful to think of each player as having an information set that refers to "before the game begins." For this purpose, define initial information sets h1, h2, . . . , hn and extended sets H̄1, H̄2, . . . , H̄n for the strategic players as follows. For each strategic player i:


• If there exists ĥ ∈ Hi such that S(ĥ) = S then define hi ≡ ĥ and let H̄i ≡ Hi.

• Otherwise let hi be defined as an artificial information set with the property S(hi) ≡ S and let H̄i ≡ Hi ∪ {hi}.

Assume the players' artificial information sets are distinctly labelled so that H̄1, H̄2, . . . , H̄n are disjoint sets, and let H̄+ ≡ ∪i∈N+ H̄i.

Definition 3: For any strategic player i and h ∈ H̄i, call a distribution ph ∈ ∆S an appraisal at h if ph ∈ ∆S(h) and if ph exhibits independence on S relative to g for every g ∈ Hi. An appraisal system is a collection of appraisals, one for each information set of the strategic players, written P = (ph)h∈H̄+.

An appraisal contains two things: The marginal on Si gives player i's own strategy and the marginal on S−i gives player i's belief about the strategy profile of the other players. In terms of player-agents, an appraisal at information set h describes agent h's belief about the other agents' behavior (this is the marginal on S−h) as well as agent h's planned behavior (the marginal on Sh). The condition ph ∈ ∆S(h) means that the appraisal at h puts probability one on reaching h. The independence condition means that, at any information set h ∈ H̄+, player i views his strategy as independent of the other players' strategy profile, and player i's strategy is represented as a behavior strategy. Conditions on the relation between appraisals at different information sets are developed in the following sections.

Sequential Best Responses

We can test whether an appraisal at h speciﬁes rational behavior for the player on the move, meaning that the actions given positive probability at information set h maximize the player’s expected payoff.

Deﬁnition 4: For a given information set h ∈ H+ and two appraisals ph and pˆh, say that pˆh is an h-deviation from ph if ph and pˆh are identical on all other information sets; that is, ph(X−h ×Sh) = pˆh(X−h ×Sh), for all X−h ⊂ S−h.

Definition 5: For a strategic player i and an information set h ∈ H̄i, say that an appraisal ph is rational at h if ui(ph) ≥ ui(pˆh) for every h-deviation pˆh. Say that an appraisal system P = (ph)h∈H̄+ is sequentially rational if ph is rational at h, for every h ∈ H̄+.

Here sequential rationality is deﬁned in terms of what are commonly called “one-shot deviations,” meaning that we evaluate player i’s rationality at a given information set h ∈ Hi by looking just at alternative choices at h rather than alternatives that would also adjust player i’s behavior at other information sets that may be reached in the continuation of the game.8 The familiar one-deviation principle—equivalence between single-deviation optimality and strategy-deviation optimality—holds here, assuming that player i’s appraisal system has the property I call “minimal consistency.” See the Appendix for more details. Minimal consistency is implied by the plain consistency condition deﬁned in the next section.
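Because an h-deviation holds the marginals at the other information sets fixed, and an appraisal treats the choice at h as independent, checking rationality at h amounts to comparing the expected payoffs of the actions available at h. A minimal sketch with assumed numbers (the belief and payoffs below are hypothetical, not taken from any of the figures):

```python
# Hypothetical data: player i moves at h with actions {d, e}; the
# appraisal's marginal over the other dimension of the profile is fixed.
belief_minus_h = {"a": 0.25, "b": 0.75}          # marginal on S_-h (assumed)
u_i = {("a", "d"): 2, ("a", "e"): 0,             # payoffs (assumed)
       ("b", "d"): 0, ("b", "e"): 3}

def expected_payoff(action):
    """Expected payoff at h when the mixture at h puts probability one on
    `action`, holding the marginal on S_-h fixed (an h-deviation)."""
    return sum(q * u_i[(s, action)] for s, q in belief_minus_h.items())

payoffs = {a: expected_payoff(a) for a in ("d", "e")}
best = max(payoffs, key=payoffs.get)   # the rational choice at h
```

Since expected payoffs are linear in the mixture at h, an appraisal is rational at h exactly when it puts positive probability only on payoff-maximizing actions, so comparing pure actions suffices.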

8Note that since all strategy proﬁles in the support of ph and pˆh reach h, the expected payoffs shown in the rationality deﬁnition are conditional on reaching information set h.


Joel Watson∗

February 2017

Abstract This paper develops a general deﬁnition of perfect Bayesian equilibrium (PBE) for extensive-form games. It is based on a new consistency condition for the players’ beliefs, called plain consistency, that requires proper conditional-probability updating on independent dimensions of the strategy space. The condition is familiar and convenient for applications because it constrains only how a player’s belief is updated on consecutive information sets. The PBE concept is deﬁned for inﬁnite games, implies subgame perfection, and captures the notion of “no signaling what you don’t know.” A key element of the approach taken herein is to express a player’s belief at an information set as a probability distribution over strategy proﬁles.

1 Introduction

Standard solution concepts for dynamic games are based on the notion of sequential rationality, which requires players to maximize their expected payoffs not just in the ex ante sense (strategy selection before the game is played) but at all contingencies within the game where they are called upon to take actions. Trembling-hand perfect equilibrium (Selten 1975) and sequential equilibrium (Kreps and Wilson 1982) ensure that the rationality test is applied to all information sets in an extensive-form game, because these concepts are deﬁned relative to convergent sequences of fully mixed behavior strategies.

Trembling-hand perfect equilibrium and sequential equilibrium aren’t always the best choice for applications, for the following reasons. First, constructing sequences of fully mixed strategies with the desired properties can be difﬁcult in complex games. Second, for some applications, more permissive concepts—allowing for a greater range of beliefs at information sets—may be desired. Third, while many applications are conveniently formulated with inﬁnite action spaces, trembling-hand perfect equilibrium and sequential equilibrium are not

∗UC San Diego; http://econ.ucsd.edu/∼jwatson/. The author thanks the NSF for ﬁnancial support (SES1227527) and the following people for their insightful input: Nageeb Ali, Pierpaolo Battigalli, Jesse Bull, Jack Fanning, Simone Galperti, Keri Hu, David Miller, Phil Reny, Karl Schlag, Joel Sobel, and two anonymous referees. This paper owes a great deal to Pierpaolo Battigalli and his work identifying independence conditions central to major solution concepts.


deﬁned for such games.1 All three of these factors are relevant for many modern applications, including some models of private contracting on networks, dynamic contracting, sequential evidence production, and repeated games on networks.

Practitioners therefore often turn to the perfect Bayesian equilibrium (PBE) concept, which is usually described in the same way as is sequential equilibrium—a behavior strategy proﬁle and a system of assessments that give the players’ beliefs at information sets as probability distributions over nodes—but puts structure on the assessments with consistency conditions that are formulated without reference to strategy trembles. At the heart of PBE is the idea that the players’ beliefs should be consistent with proper conditional probability updating (Bayes’ rule) where applicable. To state a formal deﬁnition of PBE, one must make “where applicable” precise.

Fudenberg and Tirole (1991) provide the leading formal deﬁnition of PBE in the literature, but this deﬁnition applies only to the class of “ﬁnite multi-period games with observed actions and independent types” whereas applications of PBE are increasingly outside this class. A more general deﬁnition of PBE based on Battigalli’s (1996) independence property for conditional probability systems has not been utilized in applications because it lacks a practicable formulation and because verifying the independence condition appears tantamount to ﬁnding a suitable sequence of fully mixed behavior strategies in a sequential-equilibrium construction.2 Further, an inﬁnite-game extension has not been worked out.

Although applications of “perfect Bayesian equilibrium” are widespread in the literature, a measure of ambiguity persists regarding the technical conditions that practitioners are actually utilizing in individual modeling exercises. In some articles, PBE is the stated solution concept but there is no reference to a formal deﬁnition. Thus, for games outside the class of ﬁnite multi-period games with observed actions and independent types, it is not always clear what researchers have in mind. There is a range of possible consistency assumptions, and the assumptions matter in applications. Questions linger about what “Bayes’ rule where applicable” should mean.

For the analysis of complex games, researchers sometimes retreat to the concept of weak PBE because of its simple structure and ﬂexibility, even though it was advanced as a pedagogical stepping stone.3 Weak PBE imposes no constraints on beliefs off the equilibrium path, and it does not imply subgame perfection.

This paper endeavors to support wider application of PBE by providing a general deﬁnition of perfect Bayesian equilibrium that meets several goals. First, it constrains only how individual players update beliefs on consecutive information sets—that is, from one information set to the next one that arises for the same player—thus lending itself to straightforward application in a way familiar to practitioners. Second, it applies to all ﬁnite games as well as to inﬁnite games with the appropriate measurability structure. Third, it is a reﬁnement

1Myerson and Reny (2015) discuss technical problems with extending sequential equilibrium to inﬁnite games and propose a new concept called open sequential equilibrium.

2On the unwieldy point, working toward a PBE deﬁnition using Fudenberg and Tirole’s (1991) and Battigalli’s (1996) framework would amount to the following: Postulate a conditional probability system over strategy proﬁles, ensure that it has the independence property, calculate an implied strategy proﬁle and a conditional conjecture system over terminal nodes, derive from it the assessments at information sets, and verify sequential rationality.

3See Myerson (1991) and Mas-Colell, Whinston, and Green (1995). Some stronger deﬁnitions that imply subgame perfection are discussed in Section 5.


of subgame perfection. And fourth, it captures the notion of “no signaling what you don’t know,” implying Fudenberg and Tirole’s (1991) reasonableness condition for multi-period games with observed actions and independent types.4

The PBE deﬁnition proposed herein is based on a new consistency notion called plain consistency, which emulates some of the structure on beliefs inherent in sequential equilibrium. Central to the deﬁnitions of trembling-hand perfect equilibrium and sequential equilibrium is the assumption that choices made at different information sets are independent of one another, as represented by behavior strategies. This assumption puts structure on the players’ beliefs, including at off-path information sets. Likewise, plain consistency imposes some independence in the operation of conditional-probability updating. For ease of use, the condition is limited to updating on consecutive information sets, but it is also strong enough to deliver the other desired properties stated above.5

To get the basic idea in the abstract, consider a setting in which values x ∈ {a, b, c} and y ∈ {d, e, f} will be realized. Suppose the prior belief of a Bayesian decision maker has x and y independently distributed, with positive probability on all outcomes. Imagine that the decision maker then learns that (x, y) ∈ E = {a, b}×{d, e}. Note that E, as a product set, is a conjunctive event: “x ∈ {a, b} AND y ∈ {d, e}.” Because the prior satisﬁes independence and E is conjunctive, updating about x and y can be done separately. The decision maker’s posterior belief will be given by the product of the conditional marginal probabilities,

Prob [(x, y) | {a, b}×{d, e}] = Prob [x | {a, b}] · Prob [y | {d, e}],    (1)

with the marginal conditional probabilities deﬁned by the conditional-probability formula. The key idea is that we want the conditional probabilities on the right side of Equation 1 to be deﬁned, and the equation to hold, even if the prior belief puts zero probability on x ∈ {a, b} or on y ∈ {d, e}. For instance, if the prior satisﬁes Prob [{a, b}] > 0 then the marginal conditional probability for x should be deﬁned by the conditional-probability formula, so that Prob [{a} | {a, b}] = Prob [{a}]/ Prob [{a, b}]. If Prob [{a, b}] = 0 then the marginal conditional probability for x is arbitrary but we still require it to be deﬁned, and we still require Equation 1 to hold. It is easy to verify that the conditional probabilities would be deﬁned and Equation 1 would always be satisﬁed if the decision maker’s prior and posterior beliefs were given by the limit of a sequence of fully mixed joint distributions with x and y independent.
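The two-step update just described can be sketched in a few lines of code. This is an illustration only: the prior distributions are invented numbers, and the uniform fallback for a zero-probability dimension is one arbitrary choice among many (the theory requires only that *some* well-defined marginal be selected and that Equation 1 hold).

```python
from itertools import product

def conditional_marginal(prior, event):
    """Condition a marginal distribution on an event; if the event has
    prior probability zero, fall back to an arbitrary (uniform) marginal."""
    total = sum(prior[v] for v in event)
    if total > 0:
        return {v: prior[v] / total for v in event}
    return {v: 1.0 / len(event) for v in event}  # arbitrary placeholder

prior_x = {"a": 0.5, "b": 0.3, "c": 0.2}
prior_y = {"d": 0.0, "e": 0.0, "f": 1.0}   # zero probability on {d, e}

post_x = conditional_marginal(prior_x, {"a", "b"})   # a: 0.625, b: 0.375
post_y = conditional_marginal(prior_y, {"d", "e"})   # arbitrary but defined

# Per Equation 1, the posterior over (x, y) is the product of the
# conditional marginals, even though Prob[{d, e}] = 0 under the prior.
posterior = {(x, y): post_x[x] * post_y[y]
             for x, y in product(post_x, post_y)}
```

Note that updating on the conjunctive event never forces the x-marginal and y-marginal to interact, which is exactly what makes the posterior well defined when one dimension is a surprise.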

Shifting our attention back to games, components x and y in the abstract example now represent different dimensions of the strategy proﬁle, and E now represents information a player receives about the strategy proﬁle by virtue of arriving at an information set. The independence condition is imposed on updating from this player’s previous information set if E is a conjunctive event.

A key element of the approach taken herein is to describe a player’s belief at an information set as a probability distribution over the strategy proﬁle, which I call an appraisal; it captures the player’s conjecture about both how the information set was reached and what will happen from this point in the game. Each player is assumed to have a conjecture system that maps the player’s information sets into appraisals. I specify that every player has a (possibly artiﬁcial)

4The idea is that a player i would not use the observation of a surprise choice by a player j to change i’s belief about a move of some other player k (which could be nature) that player j did not observe.

5Battigalli’s (1996) independence condition would be the strongest condition along these lines.


[Figure 1 omitted. Caption: Independence across information sets of the same player. The ﬁgure shows a game fragment in which player 1 moves at two information sets, choosing between a and b and then between c and d, with probability 1 on a and c and probability 0 on b and d; player 2’s information set h follows action b. A table lists the combinations of player 1’s actions.]

information set representing the beginning of the game. Thus, a player starts the game with an appraisal and then updates it from one information set to the next as play proceeds.

The following examples illustrate the building blocks of the theory. Consider ﬁrst the game fragment shown in Figure 1 above. Just part of the extensive form is pictured; the rest is inconsequential to the discussion here. Also pictured is a table indicating the possible combinations of actions for player 1 at his two information sets pictured in the tree. Suppose that backward induction identiﬁes strategy ac for player 1 so that, at the beginning of the game, player 2 believes that player 1 will select action a at his ﬁrst information set and would select action c at his second information set. This is indicated in the ﬁgure by the probability 1 on actions a and c, and probability 0 on actions b and d.

How should player 2 update her belief in the event that her information set h is reached? Backward-induction reasoning provides an answer: Player 2 has observed that player 1 selected b at his ﬁrst information set—which is a surprise given player 2’s initial belief—but this does not cause player 2 to change her belief that player 1 would select action c at his second information set.

We can dissect the logic as follows. At the beginning of the game, player 2 initially treats the actions at player 1’s information sets as independent. Further, from the structure of the game, arriving at h provides player 2 with information that we can describe as a conjunctive event: “Player 1’s action at his ﬁrst information set is in {b} AND player 1’s action at his second (as yet unreached) information set is in {c, d}.” Thus, the combination of player 1’s actions that are consistent with information set h being reached (the shaded region of the table) forms a product set, {b} × {c, d}. And in these circumstances, player 2 updates her belief about the action at player 1’s second information set based on only what she has learned from the structure of the game about the action at player 1’s second information set—which is, of course, nothing. Thus, she maintains the belief that player 1 would select c. Importantly, this logic applies even though reaching h is a surprise for player 2 in that her initial belief puts zero probability on h being reached.

Now let us apply the same logic to a player’s belief about the actions of two other players, which also illustrates the idea of “no signaling what you don’t know.” Consider the game fragment shown in Figure 2. At the beginning of the game, player 3 believes that player 1 will select b for sure and that player 2 will choose c, d, and e with probabilities 0.2, 0.2, and 0.6 respectively. At information set h, player 3 has learned that player 1 chose a and that player 2 did not select c. That is, the set of action proﬁles that reach h is the product set {a} × {d, e}, and so the information at h is a conjunctive event. Player 3 then updates her belief about player 2’s action on the basis of only what she has observed about player 2’s action. Player 3 does not use the surprise regarding player 1’s choice as an excuse to take liberties with the


[Figure 2 omitted. Caption: Independence and signaling. Player 1 chooses a (probability 0) or b (probability 1); player 2 chooses c, d, or e with probabilities 0.2, 0.2, and 0.6; player 3’s information set h is reached when player 1 chooses a and player 2 does not choose c. A table marks the product set {a}×{d, e}.]

probabilities of c, d, and e. Thus, player 3’s updated belief puts probability 0.25 on d and probability 0.75 on e, as Bayes’ rule requires for the marginal distribution over player 2’s action.
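The marginal update in this example is a one-line Bayes computation; the sketch below simply replays the numbers from the text, with player 2’s prior marginal conditioned on the event {d, e} that information set h reveals.

```python
# Player 3's prior marginal over player 2's action (from the text).
prior_p2 = {"c": 0.2, "d": 0.2, "e": 0.6}
observed = {"d", "e"}            # what reaching h reveals about player 2

mass = sum(prior_p2[a] for a in observed)                 # 0.8
posterior_p2 = {a: prior_p2[a] / mass for a in observed}
# posterior_p2: d -> 0.25, e -> 0.75 (up to floating-point rounding)
```

Independence is what licenses updating this marginal in isolation: the surprise about player 1’s action carries no information about player 2’s choice.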

The next example, shown in Figure 3 below, demonstrates that the logic of independence is embedded in the concept of subgame perfection and that weak PBE does not imply subgame perfection. The table in the ﬁgure shows the strategy proﬁles, with rows representing player 2’s actions and columns representing proﬁles of actions taken at the information sets of players 1 and 3. The shaded region denotes the strategy proﬁles that reach player 3’s information set, a conjunctive event. This game has a single subgame-perfect equilibrium, (w, a, x). In the proper subgame, player 3 selects x in response to player 2’s equilibrium action a. There is also a weak PBE in which (z, a, y) is the strategy proﬁle and, at information set h, player 3 believes that player 2 selected action b. Weak PBE allows for this belief because it imposes no restrictions on how beliefs are updated off the equilibrium path. Player 3 changes her belief about player 2 based on player 1’s surprise selection of action w, contrary to the independence notion. Clearly, (z, a, y) is not a subgame-perfect equilibrium.

In the formal deﬁnition of plain consistency, the foregoing logic is applied to the extent possible for belief updating on consecutive information sets. For instance, consider a player i who is on the move at information set h in a game, and let L denote any subset of the other players’ information sets (not necessarily a proper subset and possibly including information sets of nature). At h, player i will have a belief about the strategies that the other players are using, including the actions they take at the information sets in L. Suppose that this belief exhibits independence between the behavior at the information sets in L and the behavior

[Figure 3 omitted. Caption: Weak PBE and subgame perfection. Player 1 chooses w or z, player 2 chooses a, b, or c, and player 3 chooses x or y at information set h; the shaded region of the accompanying payoff table marks the strategy proﬁles that reach h.]


at other information sets, in that player i views behavior in L as uncorrelated with behavior outside of L.

Suppose that the next information set encountered by player i is h′, and suppose that the set of strategy proﬁles consistent with reaching h′ is a product set with respect to L. Then plain consistency requires that player i’s belief at h′ must be the product of marginals with respect to L, and the marginal distribution on the information sets in L should be consistent with the conditional-probability formula restricted to these information sets. That is, player i’s updated belief about actions taken at information sets in L is conditioned on only what player i has observed about these particular actions.

The deﬁnition of plain consistency goes a bit further by applying the same idea to subsets of strategy proﬁles. That is, we can impose the same logic on any subset Z of strategy proﬁles on which player i puts positive probability at both information sets h and h′. If Z is a product set with respect to L and the subset of Z that reaches h′ is also a product set, then what player i learns by arriving at h′ allows for the L dimension to be separated from its complement. Assuming that player i’s belief at h, when restricted to the set Z, exhibits independence between the behavior at the information sets in L and the behavior at other information sets, then his belief at h′ ought to exhibit similar independence and updating should obey Bayes’ rule on each dimension (as applicable).

Plain consistency puts no other restrictions on how players update their beliefs. The PBE deﬁnition combines plain consistency with the assumptions that the players’ beliefs at the beginning of the game are concentrated on the actual strategy proﬁle and that each player’s strategy is sequentially rational.

Before launching into the deﬁnitions, let me elaborate on why it is helpful to express beliefs in terms of appraisals rather than assessments. To describe whether beliefs exhibit independence regarding actions taken at different information sets, one must keep track of these various actions as separate components, as a strategy proﬁle does. To accomplish this with assessments, one must put structure on the nodes at every information set so that each node x describes the actions taken on the path to x. But such a structure is equivalent to keeping track of the strategy proﬁle, at least its restriction to information sets that came before the current information set. Further, even if we imagine adding this structure and then using the equilibrium strategy proﬁle for “future” information sets when calculating expected payoffs, another complication arises: At a given information set, the other information sets in the game generally cannot be neatly classiﬁed as coming either before or after the current one.6 Thus, I suggest that the most straightforward approach is to focus on strategy proﬁles and account for beliefs as appraisals.

The next section lays out the basic notation and deﬁnitions. Section 3 develops the notion of plain consistency and the equilibrium concept. Section 4 provides details on how to apply the PBE deﬁnition to inﬁnite games and extends the deﬁnition of plain consistency accordingly. Section 5 compares plain PBE with other equilibrium deﬁnitions. The Appendix contains additional deﬁnitions related to the one-deviation property and a proof.

6See Kreps and Ramey (1987) for an example in which the player on the move does not know whether a particular information set for another player was already reached. These complications presumably led Fudenberg and Tirole (1991) and Battigalli (1996) to describe equilibrium in general games as a combination of assessments and conditional probability systems on terminal nodes.


2 Basic Concepts

Information Sets, Strategies, and Payoffs

Consider any extensive-form game of perfect recall, with n players and nature taking the role of “player 0.” For convenience, in this and the next section the deﬁnitions and results are put in a form that applies to ﬁnite games; Section 4 extends the deﬁnitions to games with inﬁnite strategy spaces. The deﬁnitions also apply to other dynamic representations (substitute “personal history” or “contingency” for “information set”).

Deﬁne N ≡ {0, 1, . . . , n} and let N+ ≡ N \{0} denote the set of strategic players. Let H be the set of information sets. It is partitioned into sets H0, H1, . . . , Hn, where Hi denotes the set of information sets for player i. (As is standard, information sets are distinctly labeled so that the players’ individual sets of information sets are disjoint.) Let H+ ≡ ∪i∈N+Hi be the set of information sets for the strategic players. Denote by S the space of pure strategy proﬁles, including nature’s strategy. Let ∆S denote the space of probability distributions over S, which we call the mixed strategy proﬁles. For any subset T ⊂ S, let us take “∆T ” to mean the subset of ∆S with support in T . Note that I use the symbol “⊂” to mean “subset,” not “proper subset.” Finally, let u : S → Rn be the payoff function and extend it to the space of mixed strategies by the usual expected payoff calculation.

We will be dealing essentially with the agent form of the game, in that beliefs and choice are analyzed at individual information sets. A key element is that a player’s choice at one information set is independent of his choice at another information set. For example, if player i has two different information sets, h and h′, then we think of player i as being separated into two agents whose names are the information sets themselves: Agent h takes the action at information set h, agent h′ takes the action at information set h′, and these are independent choices. The agents in Hi all share the payoff function ui.

Note that a strategy proﬁle s ∈ S maps H to the space of actions and, for each information set, speciﬁes an action that is feasible at this information set. For any subset of information sets L ⊂ H, let sL denote the restriction of s to the subdomain L. That is, sL gives the proﬁle of actions that strategy s speciﬁes for the information sets in L. For any L ⊂ H, deﬁne −L ≡ H \L. Note that we can then write s = sLs−L.

For X ⊂ S, deﬁne XL ≡ {sL | s ∈ X}. In the case of L = {h} for a single h ∈ H, we simplify notation by dropping the brackets; so, for instance, we write Xh and sh instead of X{h} and s{h}. Note that Sh is the set of actions available at information set h. Also, for a given player i, the subscript “i” refers to the information sets Hi. For example, si means the same thing as sHi. Likewise, “−i” refers to H−i. Thus, subscripts “i” and “−i” have their usual meaning of identifying the strategies of player i and the other players.

Deﬁnition 1: For a given set L ⊂ H, say that a set X ⊂ S is a product set (relative to L) if X = XL × X−L.
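For ﬁnite games, Deﬁnition 1 is mechanical to check. The sketch below represents strategy proﬁles as tuples and information sets as coordinate indices; this encoding is chosen for illustration and is not part of the paper’s notation.

```python
from itertools import product

def project(X, idx):
    """Projection of a set of profile tuples onto the coordinates in idx."""
    return {tuple(s[i] for i in idx) for s in X}

def is_product_set(X, L, dims):
    """Check X = X_L x X_{-L}, reassembled in the original coordinate order."""
    comp = [i for i in range(dims) if i not in L]
    XL, Xc = project(X, sorted(L)), project(X, comp)
    rebuilt = set()
    for a, b in product(XL, Xc):
        s = [None] * dims
        for i, v in zip(sorted(L), a):
            s[i] = v
        for i, v in zip(comp, b):
            s[i] = v
        rebuilt.add(tuple(s))
    return X == rebuilt

# {b} x {c, d} is a product set relative to coordinate 0;
# a "diagonal" set of the same size is not.
assert is_product_set({("b", "c"), ("b", "d")}, {0}, 2)
assert not is_product_set({("a", "c"), ("b", "d")}, {0}, 2)
```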

The next deﬁnition identiﬁes whether a mixture of strategy proﬁles treats a speciﬁc set of information sets L ⊂ H independently of the rest, meaning that it can be expressed as the product of the marginal distribution on L and the marginal distribution on −L.


Deﬁnition 2: Given L ⊂ H and a product set Y = YL × Y−L ⊂ S, say that a distribution p ∈ ∆S exhibits independence on Y relative to L if for every product set X = XL×X−L ⊂ Y , we have p(X)p(Y ) = p(XL × Y−L) · p(YL × X−L). In the case of L = {h} for a single h ∈ H, let us drop the brackets and say “. . . relative to h.” Say that p exhibits complete independence if, for every h ∈ H, p exhibits independence relative to h on S.

Independence on Y means that the distribution conditional on Y , which is given by the standard conditional probability formula, exhibits independence across L and −L. That is, the conditional probability of a product set X = XL ×X−L ⊂ Y is the product of the conditional marginal probabilities:

Prob [X | Y ] = p(X)/p(Y ) = (p(XL ×Y−L)/p(Y )) · (p(X−L ×YL)/p(Y )) = Prob [XL ×Y−L | Y ] · Prob [X−L ×YL | Y ].

In the expression above, the second equality is due to independence on Y relative to L. Note that independence relative to L on S implies the same on every product set Y ⊂ S.
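Deﬁnition 2 can likewise be veriﬁed by brute force in small examples. The following sketch handles the two-coordinate case, with coordinate 0 standing in for L and coordinate 1 for −L; the distributions and names are illustrative, and Y is taken to be all of S.

```python
from itertools import product, chain, combinations

def powerset(s):
    s = list(s)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

def mass(p, X):
    return sum(p.get(s, 0.0) for s in X)

def exhibits_independence(p, YL, Yc, tol=1e-12):
    """p(X)p(Y) = p(X_L x Y_{-L}) * p(Y_L x X_{-L}) for all product X in Y."""
    Y = set(product(YL, Yc))
    for XL in powerset(YL):
        for Xc in powerset(Yc):
            X = set(product(XL, Xc))
            lhs = mass(p, X) * mass(p, Y)
            rhs = mass(p, set(product(XL, Yc))) * mass(p, set(product(YL, Xc)))
            if abs(lhs - rhs) > tol:
                return False
    return True

# An independent distribution over {a,b} x {c,d} passes; a correlated one fails.
indep = {("a","c"): 0.06, ("a","d"): 0.24, ("b","c"): 0.14, ("b","d"): 0.56}
corr  = {("a","c"): 0.5, ("b","d"): 0.5}
assert exhibits_independence(indep, {"a", "b"}, {"c", "d"})
assert not exhibits_independence(corr, {"a", "b"}, {"c", "d"})
```

The failing case mirrors the role of independence in the text: a perfectly correlated distribution cannot be written as a product of marginals on the two dimensions.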

Let ∆̂S ⊂ ∆S denote the set of mixed strategy proﬁles that exhibit complete independence. Note that a mixture in ∆̂S is equivalent to a behavior strategy proﬁle: It speciﬁes, for every information set h, a probability distribution over the actions available at h, and the speciﬁcation is independent across information sets. We typically write such a mixture as σ = (σh)h∈H, where σh denotes the mixed action choice at information set h. Nature’s mixed strategy is taken as exogenous and we assume that it exhibits independence relative to all of nature’s information sets.

It will be useful to think about information sets in terms of subsets of strategy proﬁles. For each h ∈ H and s ∈ S, let us say that s reaches h if the path of strategy proﬁle s includes a node in h. Denote by S(h) the set of strategy proﬁles that reach h. Note that, for any L ⊂ H, S(h)L is the set of action proﬁles for the information sets in L that are consistent with h being reached.7

Because of perfect recall, the information sets for an individual player have a particular product structure and precedence relation. For every pair of information sets h, h′ ∈ Hi for player i, it is the case that S(h) is a product set relative to h′. Further, for h, h′ ∈ Hi with h ≠ h′, either h is a successor of h′, in which case S(h) ⊂ S(h′); or h is a predecessor of h′, in which case S(h′) ⊂ S(h); or neither, in which case S(h) ∩ S(h′) = ∅. If h is a successor of h′ then every path through h also passes through h′. We call h ∈ Hi an immediate successor of h′ ∈ Hi for player i if h is a successor of h′ and there is no other information set for player i between the two; that is, there is no g ∈ Hi such that g is a successor of h′ and h is a successor of g.

7Expressing extensive-form information sets as subsets of strategy proﬁles is standard. Mailath, Samuelson, and Swinkels (1993) formulate solution concepts on the basis of “normal form information sets,” where there is no reference to an extensive form, and Shimoji and Watson (1998) take such “restrictions” as given (whether or not they are derived from extensive-form information sets). Note that, here, I am taking the conventional approach of examining standard extensive-form information sets but simply represent them as subsets of the strategy space.


[Figure 4 omitted. Caption: Example to illustrate theoretical components. A three-player game with strategy space S = {a, b, c}×{x, y}×{w, z}×{d, e}, drawn alongside a table whose rows are player 1’s actions and whose shaded region marks the proﬁles reaching player 3’s information set h.]

An Example

To review some of the deﬁnitions just described, consider the three-player game shown in Figure 4. The space of strategy proﬁles is S = {a, b, c}×{x, y}×{w, z}×{d, e}, which is depicted by the table on the right side of the picture. Note that the rows of the table are the actions feasible at player 1’s information set, whereas the columns are the proﬁles of actions for the other players’ information sets. For the information set h identiﬁed in the picture (player 3’s information set), we have

Sh = {d, e} and S−h = {a, b, c}×{x, y}×{w, z}.

The subset of strategy proﬁles that are consistent with reaching h is

S(h) = S(h)−h ×S(h)h = {(a, y, w), (a, y, z), (b, x, z), (b, y, z), (c, x, z), (c, y, z)}×{d, e}.

This set corresponds to the shaded region of the table in the ﬁgure. Clearly S(h) is a product set relative to h but it is not a product set relative to the information set of player 1. That is, letting h′ denote player 1’s information set, we have

S(h)h′ = {a, b, c} and S(h)−h′ = {(x, z), (y, w), (y, z)}×{d, e},

and S(h) does not equal the Cartesian product of S(h)h′ and S(h)−h′.
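These projections can be checked mechanically. The sketch below takes S(h) directly from the example, encoding each proﬁle as a 4-tuple in which coordinate 0 is player 1’s information set h′ and coordinate 3 is player 3’s information set h (an encoding chosen for illustration).

```python
# S(h)_{-h} as listed in the text, crossed with S(h)_h = {d, e}.
S_h_minus = {("a","y","w"), ("a","y","z"), ("b","x","z"),
             ("b","y","z"), ("c","x","z"), ("c","y","z")}
S_h = {pre + (act,) for pre in S_h_minus for act in ("d", "e")}

def is_product_relative_to(X, i):
    """Check X = X_{-i} x X_i with respect to coordinate i."""
    keep = {s[i] for s in X}                 # projection onto coordinate i
    rest = {s[:i] + s[i+1:] for s in X}      # projection onto the rest
    rebuilt = {r[:i] + (k,) + r[i:] for k in keep for r in rest}
    return X == rebuilt

assert is_product_relative_to(S_h, 3)       # product set relative to h
assert not is_product_relative_to(S_h, 0)   # not a product relative to h'
```

The failure at coordinate 0 reﬂects exactly the mismatch noted in the text: {a, b, c} crossed with the three surviving proﬁles of the other players would contain proﬁles, such as those pairing a with (x, z), that do not reach h.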

Appraisal Systems

We must consider the beliefs that the strategic players hold at their various information sets. It will be useful to think of each player as having an information set that refers to “before the game begins.” For this purpose, deﬁne initial information sets h̄1, h̄2, . . . , h̄n and extended sets H̄1, H̄2, . . . , H̄n for the strategic players as follows. For each strategic player i:


• If there exists ĥ ∈ Hi such that S(ĥ) = S then deﬁne h̄i ≡ ĥ and let H̄i ≡ Hi.

• Otherwise let h̄i be deﬁned as an artiﬁcial information set with the property S(h̄i) ≡ S and let H̄i ≡ Hi ∪ {h̄i}.

Assume the players’ artiﬁcial information sets are distinctly labelled so that H̄1, H̄2, . . . , H̄n are disjoint sets, and let H̄+ ≡ ∪i∈N+ H̄i.

Deﬁnition 3: For any strategic player i and h ∈ H̄i, call a distribution ph ∈ ∆S an appraisal at h if ph ∈ ∆S(h) and if ph exhibits independence on S relative to g for every g ∈ Hi. An appraisal system is a collection of appraisals, one for each information set of the strategic players, written P = (ph)h∈H̄+.

An appraisal contains two things: The marginal on Si gives player i’s own strategy and the marginal on S−i gives player i’s belief about the strategy proﬁle of the other players. In terms of player-agents, an appraisal at information set h describes agent h’s belief about the other agents’ behavior (this is the marginal on S−h) as well as agent h’s planned behavior (the marginal on Sh). The condition ph ∈ ∆S(h) means that the appraisal at h puts probability one on reaching h. The independence condition means that, at any information set h ∈ H̄+, player i views his strategy as independent of the other players’ strategy proﬁle, and player i’s strategy is represented as a behavior strategy. Conditions on the relation between appraisals at different information sets are developed in the following sections.
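As a concrete illustration of the two marginals inside an appraisal, consider a toy encoding in which a proﬁle is a pair (player i’s action, the others’ proﬁle); the numbers and names are assumptions for illustration only.

```python
def marginal(appraisal, coord):
    """Marginal of a distribution over profile tuples on one coordinate."""
    out = {}
    for profile, prob in appraisal.items():
        out[profile[coord]] = out.get(profile[coord], 0.0) + prob
    return out

# Player i plans to play a for sure and holds a 50/50 belief about the
# others; the appraisal is the product of these two marginals.
appraisal = {("a", "x"): 0.5, ("a", "y"): 0.5}

own = marginal(appraisal, 0)       # player i's own (behavior) strategy
belief = marginal(appraisal, 1)    # player i's belief about the others
```

Because the appraisal here is a product of its marginals, it satisﬁes the independence requirement of Deﬁnition 3 between player i’s strategy and the others’ proﬁle.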

Sequential Best Responses

We can test whether an appraisal at h speciﬁes rational behavior for the player on the move, meaning that the actions given positive probability at information set h maximize the player’s expected payoff.

Deﬁnition 4: For a given information set h ∈ H̄+ and two appraisals ph and p̂h, say that p̂h is an h-deviation from ph if ph and p̂h are identical on all other information sets; that is, ph(X−h ×Sh) = p̂h(X−h ×Sh), for all X−h ⊂ S−h.

Deﬁnition 5: For a strategic player i and an information set h ∈ H̄i, say that an appraisal ph is rational at h if ui(ph) ≥ ui(p̂h) for every h-deviation p̂h. Say that an appraisal system P = (ph)h∈H̄+ is sequentially rational if ph is rational at h, for every h ∈ H̄+.

Here sequential rationality is deﬁned in terms of what are commonly called “one-shot deviations,” meaning that we evaluate player i’s rationality at a given information set h ∈ Hi by looking just at alternative choices at h rather than alternatives that would also adjust player i’s behavior at other information sets that may be reached in the continuation of the game.8 The familiar one-deviation principle—equivalence between single-deviation optimality and strategy-deviation optimality—holds here, assuming that player i’s appraisal system has the property I call “minimal consistency.” See the Appendix for more details. Minimal consistency is implied by the plain consistency condition deﬁned in the next section.
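In ﬁnite examples, the one-shot test amounts to holding the appraisal’s marginal over the other information sets ﬁxed and comparing the expected payoff of each feasible action at h. The sketch below is a toy illustration; all payoffs, action names, and the belief are invented for this purpose and are not from the paper.

```python
def expected_payoff(action, belief_minus_h, u):
    """E[u_i] when the agent at h plays `action` against a belief over s_{-h}."""
    return sum(p * u[(action,) + s] for s, p in belief_minus_h.items())

def rational_actions(actions_at_h, belief_minus_h, u):
    """Actions at h that survive the one-shot deviation test."""
    vals = {a: expected_payoff(a, belief_minus_h, u) for a in actions_at_h}
    best = max(vals.values())
    return {a for a, v in vals.items() if v == best}

# Toy example: two actions at h, one opposing information set with
# actions x and y; payoffs u_i indexed by (own action, other's action).
u = {("L", "x"): 3, ("L", "y"): 0,
     ("R", "x"): 1, ("R", "y"): 1}
belief = {("x",): 0.25, ("y",): 0.75}

# E[L] = 0.75 and E[R] = 1.0, so only R is rational at h given this belief.
assert rational_actions({"L", "R"}, belief, u) == {"R"}
```

Any h-deviation is a re-mixing over {L, R} holding the belief ﬁxed, so comparing the pure actions sufﬁces here.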

8Note that since all strategy proﬁles in the support of ph and p̂h reach h, the expected payoffs shown in the rationality deﬁnition are conditional on reaching information set h.

