
ECON0027 Lecture notes

Section 1: Games in strategic form.


Nash equilibrium. Rationalizability
Reading: Osborne: Ch 2, 3, 12

1.1 Introduction
What is a game, or, more precisely, what does Game Theory study? Roughly speaking, a game
is a mathematical model of a situation in which:
• there are several economic agents, who are usually called players;

• these players make decisions;

• the outcome depends on the decisions made by all the players.


The main question that we ask in the context of a game-theoretic model is: How does a
rational player make decisions? If we examine this question, we will find that a rational player
would need a guess (or a theory) of how her opponents make decisions because the outcome of
the game depends on the opponents’ choices as well as on her own choices.
It is immediately obvious that, in order to answer the main question, we need to overcome
a fundamental difficulty: a player's optimal choice depends on her opponents' choices, but the
converse is true as well. Throughout the course we will find a way to cut into this circular logic
and provide an answer.
Throughout the course we will build different kinds of games—i.e., different kinds of mathematical
models—starting with the simplest and gradually adding various additional elements
and features. The first model is called a game in normal form.

1.2 Games in normal (or strategic) form


Let us start with an example that is called “prisoner’s dilemma”.
Two criminals who committed a crime together are put into separate rooms and questioned
by the police. Each criminal has to take one of the two available actions: he can either defect
and provide the police with evidence incriminating his accomplice (we will call this action
D), or cooperate with his accomplice and not reveal anything to the police (we will call this
action C). Suppose that the police rewards defectors and punishes criminals against whom they
have evidence. Let us assign numbers to each outcome of the game to represent the players’
preferences. We call these numbers payoffs. The payoffs are recorded in the table below:

player 2
c d
player 1 C 2,2 0,3
D 3,0 1,1

This is an example of a game in normal form. We can easily generalize this example and obtain
a formal definition of a game in normal form.

1.3 Definition of a game in normal form
A set of players is
N = {1, 2, ..., n}.
Each player i ∈ N has a set of strategies Si .
A strategy profile
s = (s1 , s2 , ..., sn )
is a vector of strategies such that the ith component of the vector—si ∈ Si —is a strategy of player i.
The set of all strategy profiles is
S = S1 × S2 × .. × Sn .
This set represents all possible outcomes of the game.
Player i’s payoff function assigns a real number (that we call a payoff ) to every possible
outcome of the game—i.e., to every strategy profile:
ui : S → R
A tuple Γ = (N, {Si }i∈N , {ui }i∈N ) is a game in normal (or strategic) form.
The way we interpret game Γ is the following: Each player has to choose a strategy si ∈ Si .
All players choose their strategies simultaneously and independently of each other (as if
they were locked in separate rooms and could not communicate with each other). This choice
results in a strategy profile s = (s1 , s2 , ..., sn ) that is translated into payoffs for each player
i ∈ N : ui (s).
One notation that will be extremely useful is the following: by s−i we denote the choices of
all players but i:

s−i = (s1 , s2 , .., si−1 , si+1 , .., sn ).

1.4 Payoffs
How should we interpret the payoff function ui (s)? It is a representation of the player's preferences
over the outcomes of the game.
A common misperception is that payoffs represent money. Players need not be concerned
only with money: they could be altruistic, they could value behaving according to some social
norm, etc. All of the players' concerns are captured in the payoff.
There could be some uncertainty about the outcomes of the game:
• opponents’ choices could be uncertain;
• other circumstances of the game—e.g., opponents’ preferences/payoffs—can be uncertain;
• uncertainty can be objective or subjective.
We model uncertainty using probability distributions and assume that payoffs are von
Neumann–Morgenstern utilities. Players seek to maximize the expected value of ui (s), conditional
on all the information Ii that player i has:
E[ui (s) | Ii ].
This expectation is taken using the corresponding probability distribution (more on that later).

1.5 Beliefs about opponents' choices
A belief of player i is a probability distribution on S−i . We can introduce some restrictions on
beliefs:

• opponents act independently

Σi ⊂ ∆(S−i ), such that any σi ∈ Σi is a product of marginal probabilities: σi = Π_{j≠i} pij .

This formulation implies (statistical) independence across choices of different opponents.

• opponents can coordinate (correlated strategies, this is less restrictive than the previous
assumption)
Σi = ∆(S−i ) = ∆(×_{j≠i} Sj ).

Thus, σi ∈ Σi is a probability distribution over possible choices of the opponents.

1.6 Back to prisoners’ dilemma


2
c d
1 C 2,2 0,3
D 3,0 1,1

For player 1:

u1 (D, c) > u1 (C, c)

u1 (D, d) > u1 (C, d)


For any belief (α, 1 − α) ∈ Σ1 of player 1 (where α is the probability that player 2 plays c),

αu1 (D, c) + (1 − α)u1 (D, d) > αu1 (C, c) + (1 − α)u1 (C, d)

D strictly dominates C:
no matter what player 2 plays, D is strictly better than C.
So 1 should play D.
The same is true for 2: she should play d as well.
The game theoretic prediction: (D, d)

Note: This is not socially efficient. (D, d) is Pareto-dominated by (C, c).

1.7 Strict dominance
Recall our notation:
s−i ≡ (s1 , s2 , .., si−1 , si+1 , ..., sn )
Let
S−i = ×_{j≠i} Sj
be the set of all such vectors.
A strategy si strictly dominates another strategy s′i if ∀s−i ∈ S−i :

ui (si , s−i ) > ui (s′i , s−i )

In this case, s′i is strictly dominated. If a strategy si strictly dominates every other
strategy in Si , then si is a strictly dominant strategy.

Observation: if a strategy s′i is strictly dominated, it should never be played by a rational
player i, because there is another strategy that gives him a strictly higher payoff independently
of the other players' actions.
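Strict dominance can be checked mechanically. Below is a minimal sketch (not part of the notes; the helper name `strictly_dominates` is ours) that verifies D strictly dominates C in the prisoner's dilemma from Section 1.2:

```python
# Prisoner's dilemma payoffs from Section 1.2:
# u[(s1, s2)] = (payoff of player 1, payoff of player 2).
u = {
    ("C", "c"): (2, 2), ("C", "d"): (0, 3),
    ("D", "c"): (3, 0), ("D", "d"): (1, 1),
}
S1, S2 = ["C", "D"], ["c", "d"]

def strictly_dominates(si, si_prime, opponent_strategies, payoff):
    """True if si yields a strictly higher payoff than si_prime
    against every opponent strategy."""
    return all(payoff(si, s) > payoff(si_prime, s) for s in opponent_strategies)

u1 = lambda s1, s2: u[(s1, s2)][0]           # player 1's payoff
u2 = lambda s2, s1: u[(s1, s2)][1]           # player 2's payoff, own strategy first

print(strictly_dominates("D", "C", S2, u1))  # True: D strictly dominates C
print(strictly_dominates("d", "c", S1, u2))  # True: d strictly dominates c
```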

1.8 A more complex example


L C R
T 2,5 3,2 0,3
M 5,4 1,1 7,5
B 3,1 0,1 5,0

M strictly dominates B. However, T and M are not dominated.


We cannot solve this game the same way we solved prisoners’ dilemma: We need more
assumptions.

1.9 Common knowledge of rationality


Each player seeks to maximize his expected payoff (in short, is rational). In addition to that,
we assume that he knows that every other player is also rational, and knows that every other
player knows that every player is rational and so on.
We assume common knowledge of rationality and of the game itself (the latter
assumption was always there, but we did not mention it earlier).
Is the assumption of common knowledge important? Can it make any difference?
Here is a puzzle to illustrate the importance of this assumption: Two monks, each with a black
dot on his forehead, live on an island with no mirrors and no other people. The monks do not
speak to each other and they obey the following rule: if a monk learns that he has a dot on his
forehead, he has to end his own life by the end of the day.
A traveler arrives on the island, meets the monks at dinner and says: “one of you has a black
dot on your forehead”. On the second day both monks end their lives.
Note that the traveler did not tell the monks anything that they did not already know—
each of them saw the black dot on the other's forehead.

However, since the traveler informed both of them at the same time, he made the statement
"one of you has a black dot on your forehead" common knowledge. That made
all the difference!

1.10 Iterated strict dominance


The assumption of common knowledge of rationality gives us power to use exclusion of strictly
dominated strategies iteratively:

1. Eliminate all strictly dominated strategies for all players. This will produce a new game
with fewer strategies.

2. In this new game, eliminate strictly dominated strategies for all players.

3. Repeat, until no further elimination is possible.
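The three steps above can be sketched in code. A minimal implementation for two-player games with pure-strategy dominance (the function names `iesd` and `dominated` are ours), run on the 3×3 game from Section 1.8:

```python
rows, cols = ["T", "M", "B"], ["L", "C", "R"]
# (u1, u2) for each (row, column), as in the table in Section 1.8.
U = {
    ("T", "L"): (2, 5), ("T", "C"): (3, 2), ("T", "R"): (0, 3),
    ("M", "L"): (5, 4), ("M", "C"): (1, 1), ("M", "R"): (7, 5),
    ("B", "L"): (3, 1), ("B", "C"): (0, 1), ("B", "R"): (5, 0),
}

def dominated(s, own, opp, payoff):
    # s is strictly dominated if some other surviving strategy beats it
    # against every surviving opponent strategy.
    return any(all(payoff(t, o) > payoff(s, o) for o in opp)
               for t in own if t != s)

def iesd(rows, cols, U):
    rows, cols = list(rows), list(cols)
    u1 = lambda r, c: U[(r, c)][0]
    u2 = lambda c, r: U[(r, c)][1]
    while True:
        new_rows = [r for r in rows if not dominated(r, rows, cols, u1)]
        new_cols = [c for c in cols if not dominated(c, cols, new_rows, u2)]
        if (new_rows, new_cols) == (rows, cols):
            return rows, cols
        rows, cols = new_rows, new_cols

print(iesd(rows, cols, U))  # (['M'], ['R'])
```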

1.11 Example: quality choice game


A firm chooses the quality of its product and a consumer decides whether to buy the product or
not (without observing the firm's choice).

Firm
H(igh) L(ow)
Consumer B(uy) 2, 2 −1, 3
N(o) 0, 0 0, 1

First round of exclusion of strictly dominated strategies:


For the firm, L strictly dominates H (3 > 2 and 1 > 0). For the consumer: neither B nor N
is strictly dominated.
Second round of exclusion of strictly dominated strategies:
Since the firm is rational, it will choose L and will not choose H. The consumer knows that
the firm is rational (thanks to common knowledge of rationality) and concludes that the firm
will not choose H.
In the new game without strategy H, the consumer has a strictly dominant strategy N :
indeed, N strictly dominates B.
Prediction: (N, L)
Here is another simple example:
L C R
T 2,5 3,2 0,3
M 5,4 1,1 7,5
B 3,1 0,1 5,0

We can iteratively exclude strictly dominated strategies in the following order: B, C, T, L. Thus,
players will play (M, R) in this game.

1.12 Some questions to think about
1. Can we exclude all the strategies for one of the players and be left with nothing?

2. Does the order of exclusion matter for the final answer?

The answer to both questions is no (but you should not believe me, instead, you should
formally prove both statements).
A strategy si weakly dominates another strategy s′i if ∀s−i ∈ S−i :

ui (si , s−i ) ≥ ui (s′i , s−i ),

with a strict inequality for at least one s−i ∈ S−i .

Note that the main difference with the definition of strict dominance is ≥ instead of >.

3. Can we safely exclude weakly dominated strategies?

The answer to this one is also no, but the more interesting question is why not?

1.13 Three-player example


The first player chooses a row (a, b or c), the second—a column (A, B or C), and the third—a
table (U or D).

U:
     A        B        C
a 3, 1, 1  3, 2, 1  3, 1, 1
b 0, 2, 1  0, 1, 1  0, 2, 1
c 2, 1, 2  2, 2, 2  2, 1, 2

D:
     A        B        C
a 0, 2, 2  0, 1, 2  0, 2, 2
b 3, 1, 2  3, 2, 2  3, 1, 2
c 2, 2, 1  2, 1, 1  2, 2, 1

If we restrict our attention to pure strategies (it’s ok if you do not know what pure strategy
is, more on that later), there are no strictly dominated strategies.
We can use a different approach: let’s look for strategies that are not the best in any of
the circumstances.
Step 1: consider Player 1’s incentives

• Suppose P1 believes that P3 plays U . Then no matter what P1 believes about P2, it is
best for P1 to play a.

• Suppose P1 believes that P3 plays D. Then no matter what P1 believes about P2, it is
best for P1 to play b.

Since strategy c was not mentioned, it will not be played by the rational player 1. Let’s get
rid of it.
Step 2: consider Player 3’s incentives
If player 3 believes that c will not be played, strategy D is always best for her.
Since strategy U was not mentioned, it will not be played by the rational player 3. Let’s get
rid of it.

Continue in this fashion until only (b, B, D) remains.


What we did here is we found the set of rationalizable strategies. Now we can formalize
and extend this approach to any game in normal form. To do that, we need to define a notion
of a best reply.

1.14 Best reply


Suppose a player “knows” what others will play (or believes in it). What should she play?

Take a strategy profile s. A best reply of player i to a strategy profile s is a strategy
that maximizes her payoff conditional on the others playing s−i :

BRi (s) = arg max_{x∈Si} ui (x, s−i )

Similarly, a best reply of player i to her belief σi ∈ Σi is

BRi (σi ) = arg max_{x∈Si} ui (x, σi )

Observation: a strictly dominated strategy can never be a best reply (Why?).
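A best reply is simply an argmax of expected payoff. The sketch below (the function `best_replies` is our own naming, not part of the notes) illustrates the observation for the prisoner's dilemma: for any belief over the opponent's strategies, the strictly dominated strategy C is never returned:

```python
def best_replies(own, opp, u, belief):
    """own: player's strategies; opp: opponent's strategies;
    u(s, t): the player's payoff; belief: dict t -> probability.
    Returns every own strategy maximizing expected payoff."""
    ev = {s: sum(belief[t] * u(s, t) for t in opp) for s in own}
    best = max(ev.values())
    return [s for s, v in ev.items() if v == best]

# Player 1's payoffs in the prisoner's dilemma (Section 1.2):
u1 = lambda s1, s2: {("C", "c"): 2, ("C", "d"): 0,
                     ("D", "c"): 3, ("D", "d"): 1}[(s1, s2)]

# For any belief (alpha, 1 - alpha), D is the unique best reply:
print(best_replies(["C", "D"], ["c", "d"], u1, {"c": 0.3, "d": 0.7}))  # ['D']
```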

1.15 Rationalizability
• Take
Ri0 = Si

• Define the sequence {Rik }i∈N ;k=1,2,... by the following induction:

Rik = {si ∈ Rik−1 | ∃σi ∈ ×_{j≠i} ∆(Rjk−1 ) : si ∈ BRi (σi )}

A strategy si of player i is rationalizable if

si ∈ Ri∞ = ∩_{k=0}^∞ Rik

This is similar to iterated strict dominance; if we allow for correlated beliefs, it is identical to
iterated strict dominance. The intuition behind this statement is simple: if a strategy is not a
best reply to anything the opponents may play, it will not be played. Let us examine this
similarity more formally.

1.16 Never-best replies and iterated strict dominance
If a strategy does not survive iterated strict dominance exclusion, it is never a best reply to any
strategy. The converse is not true if beliefs are independent.
The tables below give player 3's payoff from each of her strategies X1 –X4 , as a function of
the opponents' pure choices (rows a, b and columns A, B):

         A  B         A  B         A  B         A  B
X1 :  a  8  0   X2 : a  4  0   X3 : a  0  0   X4 : a  3  3
      b  0  0        b  0  4        b  0  8        b  3  3

Let p be the probability of playing a and q the probability of playing A.


Claim: X2 is never a best reply.
Proof: By contradiction, assume it is a best reply for some (p, q). The expected payoffs are
u3 (X1 , p, q) = 8pq
u3 (X2 , p, q) = 4pq + 4(1 − p)(1 − q) = 8pq + 4 − 4p − 4q
u3 (X3 , p, q) = 8(1 − p)(1 − q) = 8 + 8pq − 8p − 8q
u3 (X4 , p, q) = 3.
Thus, for X2 to be a best reply we need
8pq + 4 − 4p − 4q ≥ 8pq
8pq + 4 − 4p − 4q ≥ 8 + 8pq − 8p − 8q
8pq + 4 − 4p − 4q ≥ 3.
The first two inequalities imply that p + q = 1, since they reduce to
p + q ≤ 1
p + q ≥ 1.
Substituting q = 1 − p, the third inequality becomes

3 ≤ 8pq = 8p − 8p²

This inequality has no solutions in [0, 1] (the right-hand side is at most 2, at p = 1/2), hence
the contradiction. X2 is indeed never a best reply.
To complete the argument, note that X2 nevertheless survives the iterated elimination of strictly
dominated strategies. By contradiction, suppose it is strictly dominated by σ = αX1 + βX3 + (1 −
α − β)X4 . Comparing payoffs against (a, B), we need
3(1 − α − β) > 0
or, equivalently, α + β < 1. Comparing payoffs against (a, A), we also need
8α + 3(1 − α − β) > 4
or 5α − 3β > 1. Similarly, comparing against (b, B), 5β − 3α > 1. Adding these two inequalities
together we get α + β > 1, a contradiction.

Note that X2 is a best reply to the correlated strategy (1/2)(a, A) + (1/2)(b, B). In a finite game, the
set of correlated rationalizable strategies always coincides with the set of strategies that survive
iterated exclusion of strictly dominated strategies.
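Both claims can be checked numerically. The sketch below (not part of the notes) scans a grid of independent beliefs (p, q) and confirms that X2 is always strictly worse than the best of X1, X3, X4, while against the correlated belief (1/2)(a, A) + (1/2)(b, B) it attains the maximum:

```python
def u3(p, q):
    # Player 3's expected payoffs from X1..X4 given independent
    # probabilities p = Pr(a) and q = Pr(A).
    return {
        "X1": 8 * p * q,
        "X2": 4 * p * q + 4 * (1 - p) * (1 - q),
        "X3": 8 * (1 - p) * (1 - q),
        "X4": 3.0,
    }

# Under independent beliefs, X2 is strictly below the best alternative:
never_br = True
for i in range(101):
    for j in range(101):
        v = u3(i / 100, j / 100)
        if v["X2"] >= max(v["X1"], v["X3"], v["X4"]):
            never_br = False
print(never_br)  # True

# Against the correlated belief (1/2)(a, A) + (1/2)(b, B), X2 is a best reply:
corr = {"X1": 0.5 * 8, "X2": 0.5 * 4 + 0.5 * 4, "X3": 0.5 * 8, "X4": 3.0}
print(corr["X2"] == max(corr.values()))  # True
```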

1.17 Nash equilibrium
A strategy profile s∗ = (s∗1 , s∗2 , ..., s∗n ) is a Nash equilibrium if for every player i ∈ N ,

ui (s∗i , s∗−i ) ≥ ui (si , s∗−i ) for all si ∈ Si ,

or, equivalently, for every player i ∈ N ,

s∗i ∈ BRi (s∗ )

There are several interpretations of this definition. One of them is that player i assumes
that her opponents are playing according to a Nash equilibrium—i.e., s∗−i —and finds that it is
optimal for her to play according to the Nash equilibrium as well—i.e., it is optimal to play s∗i .
Put differently, when the others are playing s∗−i , player i has no strategy si that gives her
a strictly higher payoff than s∗i .

1.18 Three-player example and Nash equilibria


U:
     A        B        C
a 3, 1, 1  3, 2, 1  3, 1, 1
b 0, 2, 1  0, 1, 1  0, 2, 1
c 2, 1, 2  2, 2, 2  2, 1, 2

D:
     A        B        C
a 0, 2, 2  0, 1, 2  0, 2, 2
b 3, 1, 2  3, 2, 2  3, 1, 2
c 2, 2, 1  2, 1, 1  2, 2, 1

In this example, there is a unique Nash equilibrium: (b, B, D).


Here is another example: a game in which two animals compete for territory (Maynard
Smith), or two motorcyclists head towards each other on a collision course. This game is
commonly referred to as the Hawk-Dove game. There are two Nash equilibria (in pure strategies)
in this game: (H, D) and (D, H). Note that the game is symmetric but the Nash equilibria are
asymmetric.
H D
H −1, −1 4, 0
D 0, 4 2, 2
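Pure-strategy Nash equilibria can be found by brute force: a profile is an equilibrium exactly when no player has a profitable unilateral deviation. A sketch (the function name `pure_nash` is ours) applied to the Hawk-Dove payoffs above:

```python
# Hawk-Dove payoffs: U[(s1, s2)] = (payoff of player 1, payoff of player 2).
U = {("H", "H"): (-1, -1), ("H", "D"): (4, 0),
     ("D", "H"): (0, 4),   ("D", "D"): (2, 2)}
S = ["H", "D"]

def pure_nash(S1, S2, U):
    eq = []
    for s1 in S1:
        for s2 in S2:
            u1, u2 = U[(s1, s2)]
            # No profitable unilateral deviation for either player:
            if all(U[(t, s2)][0] <= u1 for t in S1) and \
               all(U[(s1, t)][1] <= u2 for t in S2):
                eq.append((s1, s2))
    return eq

print(pure_nash(S, S, U))  # [('H', 'D'), ('D', 'H')]
```

Both equilibria are asymmetric, as noted above, even though the game itself is symmetric.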

1.19 Existence of Nash equilibria


Consider the following example. None of the four (pure) strategy profiles is a Nash equilibrium.

2
a b
1 A 1,-1 -1,1
B -1,1 1,-1

This is a game of hide and seek, so working under the presumption that each player correctly
guesses what the other does is against the spirit of this game. Intuitively, each player's best
strategy is to confuse the opponent as much as possible—to randomize. This is the key to
restoring the existence of Nash equilibria in this and many other games.

1.20 Example with randomization


Suppose that instead of choosing A or B, player 1 would choose the probability q ∈ [0, 1] with
which he plays A (and, therefore, the probability 1 − q with which he plays B). Player 2 makes
an analogous choice of the probability p ∈ [0, 1].
This extension is called the mixed strategy extension of the game. It will help us find Nash
equilibria.

p 1-p
a b
q A 1,-1 -1,1
1-q B -1,1 1,-1

Player 1's best reply correspondence (note that it is not a function because there may be
multiple best replies) is:

           { 0,      if p < 1/2;
qBR (p) =  { 1,      if p > 1/2;
           { [0, 1], if p = 1/2.

Similarly, Player 2's best reply correspondence is:

           { 1,      if q < 1/2;
pBR (q) =  { 0,      if q > 1/2;
           { [0, 1], if q = 1/2.

The unique fixed point of these best reply correspondences is p = 1/2, q = 1/2. Now, let
us get back to the intuition of the hide and seek game: the player who hides wants to be as
unpredictable as possible (because if he is predictable, he is easy to find) so he randomizes and
the player who seeks wants to be as unpredictable as possible too (because if he is predictable,
he is easy to hide from) so he randomizes as well. The exact probabilities that the players use
for randomization depend on the payoffs in this game.
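The fixed point can be verified directly: at p = q = 1/2 each player is exactly indifferent between his two pure strategies, so any randomization (in particular 1/2) is a best reply. A minimal check, a sketch not from the notes:

```python
# Expected payoffs in the hide-and-seek game, with p = Pr(player 2 plays a)
# and q = Pr(player 1 plays A).
def u1_A(p): return p * 1 + (1 - p) * (-1)
def u1_B(p): return p * (-1) + (1 - p) * 1
def u2_a(q): return q * (-1) + (1 - q) * 1
def u2_b(q): return q * 1 + (1 - q) * (-1)

print(u1_A(0.5) == u1_B(0.5))  # True: player 1 indifferent, so q = 1/2 is a BR
print(u2_a(0.5) == u2_b(0.5))  # True: player 2 indifferent, so p = 1/2 is a BR
```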

1.21 Some questions to think about


In the context of the original game and its mixed strategy extensions:

1. What do payoffs represent?

2. What operations can I apply to the payoffs without modifying the strategic nature of the
game? Is the answer to this question different for the original game and its mixed strategy
extensions?

1.22 Mixed strategies
A mixed strategy of player i is mi ∈ ∆(Si ). Take a pure strategy (action) s ∈ Si : mi (s) is the
probability that player i will play s if he follows mi .
As before, a profile of strategies is

m = (m1 , ..., mn ) ∈ ×i∈N ∆(Si ).
Players randomize independently of each other:

Pr{s | m} = Π_{i∈N} mi (si )

Player i's expected payoff is

E[ui (mi , m−i )] = Σ_{s∈S} ui (si , s−i ) Π_{j∈N} mj (sj ).

However, we will never use this formula. Instead, we will always think in terms of the payoff
from a pure strategy when the others play mixed ones:

E[ui (si , m−i )] = Σ_{s−i ∈S−i} ui (si , s−i ) Π_{j≠i} mj (sj ).

The reason for this is that the expectation is a linear operator: if mi is a best reply to m,
then so is any pure strategy si in its support—i.e., any si ∈ Si with mi (si ) > 0. Conversely,
for any subset S̃i ⊂ Si that only contains best replies to m, any mixed strategy mi ∈ ∆(S̃i )
is also a best reply to m (Why?).

1.23 Existence of Nash Equilibrium in mixed strategies


Theorem (Nash): Let G be a game with finitely many players, where each player’s strategy
set is finite. G has a Nash equilibrium, possibly in mixed strategies.

Let m = (m1 , m2 , ..., mn ) be a Nash equilibrium. The support of mi is the set of pure
strategies s : mi (s) > 0. For every player i :

1. all strategies in the support of mi have the same payoff conditional on m−i .

2. any strategy that is not in the support of mi has a (weakly) lower payoff conditional on
m−i .

1.24 Cournot oligopoly
Two firms, 1 and 2, produce a homogeneous good. Both firms choose their quantities qi ≥ 0
simultaneously. The total quantity Q = q1 + q2 is placed on the market and gives rise to the
market price P (Q). Firm i's profit is πi (q1 , q2 ) = qi P (Q) (costs are zero). For simplicity,
assume that P (Q) = 90 − Q.

π1 (q1 , q2 ) = q1 (90 − q1 − q2 )

∂π1 /∂q1 = 90 − 2q1 − q2 .

Suppose firm 1 correctly anticipates q2 . To find the optimal quantity, set ∂π1 /∂q1 = 0.
Firm 1's best response function is

q̂1 (q2 ) = (90 − q2 )/2.

Similarly,

q̂2 (q1 ) = (90 − q1 )/2.

Setting q2 = q̂2 (q1 ),

q1 = (90 − (90 − q1 )/2)/2,

which solves to q1 = 30. The Nash equilibrium is q1∗ = q2∗ = 30.


The monopoly quantity is 45, yielding monopoly profit 45 · 45 = 2025. Hence each firm's
equilibrium profit of 900 is less than half the monopoly profit.
What is the socially optimal quantity?
Cournot outputs: the Nash equilibrium outputs in a game where firms choose quantities.
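The equilibrium can also be reached by iterating the best-response functions to their fixed point. A sketch (not part of the notes) with P(Q) = 90 − Q and zero costs:

```python
def br(q_other):
    # Best response from the first-order condition: (90 - q_other) / 2.
    return (90 - q_other) / 2

q1 = q2 = 0.0
for _ in range(200):          # iterate best replies until they settle
    q1, q2 = br(q2), br(q1)

print(q1, q2)                 # 30.0 30.0: the Cournot-Nash quantities
print(q1 * (90 - q1 - q2))    # 900.0: each firm's equilibrium profit
```

Iteration converges here because each best-response map is a contraction; this is a convenient check, not a general method.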

1.25 Bertrand Oligopoly


Same setting as in the Cournot model, but firms choose prices instead of quantities. Prices can be
varied continuously—i.e., pi is any real number. If prices are unequal, all consumers go to the
lower-priced firm. If they are equal, the market is shared.

1. There is a Nash equilibrium where p∗1 = p∗2 = 0.

2. There is no other Nash equilibrium in pure strategies.

Proof:
1) Check that player 1’s best reply contains p1 = 0. Indeed, player 1’s payoff is π1 (p1 , p2 =
0) = 0 for any p1 . The same is true for player 2.
2) By contradiction, assume there is a PSNE (p∗1 , p∗2 ) ≠ (0, 0). Cases:

• p∗1 = p∗2 = p > 0. Player 1's payoff is π1 (p∗1 , p∗2 ) = Q(p)p/2. Player 1 is better off deviating to
p′1 = p − ε for small ε > 0:
π1 (p′1 , p∗2 ) = Q(p − ε)(p − ε) > Q(p)p/2,
hence (p∗1 , p∗2 ) is not a NE.
• p∗1 > p∗2 > 0. Player 1's payoff is 0 in this case. He is better off matching player 2's price
and collecting
π1 (p′1 = p∗2 , p∗2 ) = Q(p∗2 )p∗2 /2 > 0.

• p∗1 > p∗2 = 0. Player 2's payoff is 0. He is better off matching player 1's price and collecting
π2 (p∗1 , p′2 = p∗1 ) = Q(p∗1 )p∗1 /2 > 0.

1.26 Political Party Competition


Voters
Each voter has a preferred political position, represented by
x ∈ [0, 1]
Think of 0 as extreme left, and 1 as extreme right. If policy y is adopted, then the voter's utility
is −(x − y)².
There is a continuum of voters—f (x) is the number of voters with preferred position x. More
accurately, f (x) is the density function of voters. The proportion of voters whose preferred point
is less than x is given by the cdf

F (x) = ∫_0^x f (t) dt

The median voter m satisfies F (m) = 0.5. Voters are not strategic—i.e., they simply vote for
the platform that gives them the highest payoff.
Parties
Two parties, A and B. Both parties simultaneously choose platforms—numbers between 0
and 1. If a and b are the two platforms, then voter x votes for whichever platform is closer. If
the two are equidistant, then the voter votes for each with probability 0.5. A party’s payoff is
1 if it wins the election, 0 if it loses. If both parties get the same number of votes, payoff is 0.5.
Equilibrium
Proposition (median voter theorem): The party competition game has a unique Nash equi-
librium where both parties locate at m.
1. (a = m, b = m) is a Nash equilibrium.
2. There is no other equilibrium.
Proof:
1) The payoff for party A at (m, m) is 0.5. Suppose instead that A chooses a < m; then it
gets exactly the voters with x < (a + m)/2. Since F ((a + m)/2) < 0.5, it loses the election.
If it chooses a > m, a similar argument applies.
2) Cases:

• If a < b, the party that does not win for sure can win with probability 0.5 by mimicking
the winner.
• If a = b ≠ m, then A can do better by choosing m and winning.
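The deviation argument in part 1) can be made concrete with a specific distribution. The sketch below assumes voters are uniform on [0, 1] (so F(x) = x and m = 0.5); the uniform choice is ours, for illustration only:

```python
def vote_share_A(a, b):
    """Share of voters for party A when a < b, under F(x) = x on [0, 1]:
    A gets every voter to the left of the midpoint (a + b) / 2."""
    return (a + b) / 2

m = 0.5
for a in [0.0, 0.2, 0.4, 0.49]:
    assert vote_share_A(a, m) < 0.5   # every deviation below m loses

print(vote_share_A(0.3, m))  # 0.4: party A gets 40% of the vote and loses
```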

1.27 Common knowledge of a game and profit maximization


Cournot duopoly: two firms face demand P = max{0, 90 − Q}. Firm 2 is the same as before: it
maximizes its profit.
Firm 1 is overcompetitive: instead of maximizing profit, it desires to be the leader in the
market. We model this as if firm 1 maximizes the difference between its own and the other
firm's profit:

u1 (q1 , q2 ) = π1 (q1 , q2 ) − π2 (q1 , q2 ) = (90 − q1 − q2 )(q1 − q2 )

First, let us recall the benchmark in which there are two profit-maximizing firms:
q1∗ = q2∗ = 30
π1 (q1∗ , q2∗ ) = π2 (q1∗ , q2∗ ) = 900
Now, suppose that firm 1 is overcompetitive. Its first-order condition, ∂u1 /∂q1 = 90 − 2q1 = 0, gives
q1 (q2 ) = 90/2 = 45,
while, as before,
q2 (q1 ) = (90 − q1 )/2.
Nash equilibrium:
q1∗ = 45
q2∗ = 22.5
π1∗ = 45²/2 = 1012.5
π2∗ = (45/2)² = 506.25
Firm 1 obtains a larger profit compared to the situation when firm 1 is a profit maximizer.
Why?
If both firms are overcompetitive (a zero-sum game):
q1∗ = q2∗ = 45
π1∗ = π2∗ = 0
The profit of both firms drops to 0. (Why? What does this example teach us?)
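The equilibrium numbers above can be sanity-checked numerically. A sketch (function names are ours) with demand P = max{0, 90 − Q}:

```python
def pi(qi, qj):
    # Ordinary profit with demand P = max{0, 90 - Q} and zero costs.
    return qi * max(0.0, 90 - qi - qj)

def u_over(qi, qj):
    # Overcompetitive objective: own profit minus the rival's profit.
    return pi(qi, qj) - pi(qj, qi)

q1, q2 = 45.0, 22.5
print(pi(q1, q2), pi(q2, q1))  # 1012.5 506.25

# Check on an integer grid that q1 = 45 maximizes firm 1's objective:
best = max(range(91), key=lambda q: u_over(float(q), q2))
print(best)  # 45
```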
