Chapter 6

The document discusses the concept of repeated games, particularly focusing on the Repeated Prisoners' Dilemma and the implications of Nash equilibria in such scenarios. It explains how strategies must be defined for each stage of the game, the concept of future discounted payoffs for indefinite repetitions, and presents the folk theorem which suggests that cooperation can be sustained in indefinitely repeated games. The document also details the GRIM strategy and its conditions for being a Nash equilibrium in the context of the Prisoners' Dilemma.

Uploaded by

ekosok1

6 Repeated Games

Example 6.1 (Repeated Prisoners’ Dilemma). Alice and Bob play the following version of
Prisoners’ Dilemma where they have to strategise whether to choose Y (yes, cooperate) or N
(no, do not co-operate) on a project:

                      Bob
                  Y        N
    Alice   Y   3, 3     1, 4
            N   4, 1     1, 1

How do we model this when it is played repeatedly?


A strategy must specify what move to make at every possible juncture. So Alice needs to
specify what action to take initially, and what action to take at the end of each round. Now,
each round can be played in 4 different ways. So Alice has one decision node initially,
4 decision nodes at the end of round one, $4^2$ decision nodes at the end of round two, and
so on. So if the game is to be repeated three times, then Alice needs to specify actions for
$1 + 4 + 4^2 = 21$ decision nodes. Since she chooses between 2 actions at each node, the total
number of strategies available to Alice in a game with 3 rounds is therefore
$2 \times 2^4 \times 2^{4^2} = 2^{21}$, which is infeasible to study explicitly.
(In general, if the game is repeated $n$ times, then the number of strategies available to each
player is $2^{4^0 + 4^1 + \cdots + 4^{n-1}}$.)
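The counting argument above can be checked numerically. The following is a minimal sketch, assuming 2 actions per decision node and 4 possible outcomes per round as in Example 6.1; the function names are illustrative and not part of the notes.

```python
def decision_nodes(rounds: int, outcomes_per_round: int = 4) -> int:
    """Decision nodes one player faces: 1 + 4 + 4^2 + ... + 4^(rounds - 1)."""
    return sum(outcomes_per_round ** i for i in range(rounds))


def pure_strategies(rounds: int, actions: int = 2, outcomes_per_round: int = 4) -> int:
    """One action is chosen independently at every decision node."""
    return actions ** decision_nodes(rounds, outcomes_per_round)


if __name__ == "__main__":
    for n in range(1, 4):
        print(n, decision_nodes(n), pure_strategies(n))
```

For three rounds this gives 21 decision nodes and $2^{21} = 2097152$ strategies, in line with the count in the text.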

Payoffs of repeated games. If the game is played finitely many times, then the natural payoff
should be the sum of the payoffs at each stage. For games which are played infinitely many times
or potentially infinitely many times (e.g., after each repetition a coin is tossed and the game is
stopped if Heads occurs), we will use the future discounted payoff defined below.

Definition 6.2. Let $r_0, r_1, \ldots$ be a given sequence of intermediate payoffs. The future
discounted payoff with discount factor $0 \le \delta < 1$ is then given by the (infinite) sum

$$r_0 + \delta r_1 + \delta^2 r_2 + \cdots.$$

Games with future discounted reward will be referred to as indefinite repetitions.
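The definition can be illustrated with a short computation. This is a sketch only: it evaluates a finite truncation of the infinite sum, so for an infinite payoff stream the result is an approximation, and the name `discounted_payoff` is introduced here for illustration.

```python
from typing import Iterable


def discounted_payoff(payoffs: Iterable[float], delta: float) -> float:
    """Return r_0 + delta*r_1 + delta^2*r_2 + ... for the payoffs supplied."""
    if not 0 <= delta < 1:
        raise ValueError("discount factor must satisfy 0 <= delta < 1")
    total = 0.0
    weight = 1.0  # current power of delta
    for r in payoffs:
        total += weight * r
        weight *= delta
    return total


if __name__ == "__main__":
    # A constant stream of 3 per round, truncated after 200 rounds.
    print(discounted_payoff([3] * 200, 0.5))
```

With a constant payoff of 3 and discount factor 0.5, the truncated sum is very close to the closed form $3/(1 - 0.5) = 6$.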

Remark 6.3. There are two main motivations behind the definition of future discounted payoffs.
Firstly, it arises where players can play a game repeatedly but have to discount the payoff by
a factor of $\delta$ for each repeat. Secondly, if the game is repeated at each stage with
probability $\delta$, then the payoff should be the expected value (which the future discounted
payoff is). Note that the future discounted payoff always exists when the initial game has
bounded payoffs.
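Both claims in the remark can be verified in a few lines. This is a sketch under stated assumptions: the game is played once for certain, after each round a further round is played with probability $\delta$ independently of the past, and $M$ is a bound on the stage payoffs (notation introduced here).

```latex
\begin{align*}
\Pr(\text{round } i \text{ is played}) &= \delta^{i}, \qquad i = 0, 1, 2, \dots \\
\mathbb{E}[\text{total payoff}] &= \sum_{i=0}^{\infty} \delta^{i} r_i , \\
\sum_{i=0}^{\infty} \bigl|\delta^{i} r_i\bigr|
  &\le M \sum_{i=0}^{\infty} \delta^{i} = \frac{M}{1-\delta} < \infty
  \qquad \text{whenever } |r_i| \le M \text{ for all } i .
\end{align*}
```

The last line shows that the series converges absolutely, which also justifies taking the expectation term by term in the second line.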

6.1 Nash equilibria of repeated games

A finitely repeated game where players know with certainty the number of repetitions works
just like a standard finite extensive game and is typically scrutinised as such.

Example 6.4 (Repeated Prisoner’s Dilemma, continued). Consider Prisoner’s Dilemma (Example 6.1)
repeated a finite number of times $k$ where we add up the payoffs at each stage. Since
N strictly dominates Y , we see that the backward induction solution of the repeated game is
to play N at every stage. One can in fact show that this is the only subgame-perfect Nash
equilibrium.

In general, we have the following result.

Theorem 6.5 (Nash equilibria of repeated games). Let $G = (S, T, u_1, u_2)$ be a game with
(pure- or mixed-strategy) Nash equilibrium $(s, t)$. Suppose the game $G$ is repeated finitely, or
indefinitely (with discounted payoff). Then playing $(s, t)$ repeatedly is a subgame-perfect Nash
equilibrium.

Proof. If the game is played a finite number of times $k$, and Alice plays $s_1, \ldots, s_k$
against Bob’s $t$, her payoff is
$u_1(s_1, t) + \cdots + u_1(s_k, t) \le u_1(s, t) + \cdots + u_1(s, t)$, since
$u_1(s_i, t) \le u_1(s, t)$ for each $i$. So Alice does not gain by deviating from $s$.
Similarly, Bob does not gain from deviating from $t$, and we conclude that playing $(s, t)$
repeatedly is a Nash equilibrium. This argument works for all subgames, so it is a
subgame-perfect Nash equilibrium.
We now consider indefinite games. Note that subgames of indefinitely repeated games are
identical to the whole game, so the restriction of the strategy of repeated $(s, t)$ gives the
same strategy on all subgames. So if the proposed strategy is a Nash equilibrium, then it is
automatically subgame perfect.
It remains to verify that repeated $(s, t)$ is a Nash equilibrium. Suppose that the first player
deviates and plays $s_k$ at the $k$-th stage of the game. Since $u_1(s_k, t) \le u_1(s, t)$ for
all $k$, we have

$$\sum_{k=0}^{\infty} \delta^k u_1(s_k, t) \le \sum_{k=0}^{\infty} \delta^k u_1(s, t).$$

In both cases, Alice does not gain anything by deviating from playing $s$ repeatedly. Similarly
for Bob.

6.2 The folk theorem: indefinite games

In many games, the Nash equilibrium solutions are not always the most desirable outcome.
For instance, in Prisoners’ Dilemma (Example 6.1), we would like the players to come to an
agreement and get a better payoff by playing $(Y, Y)$ of their own volition without external
regulations. Intuitively, it seems that players may give compliance a go if they know the game
will be repeated. We will make this idea precise.
Given a game $G = (S, T, u_1, u_2)$ and $0 < \delta < 1$ we define $G^\infty(\delta)$ to be the
indefinitely repeated game $G$ with discount factor $\delta$.

Theorem 6.6. Let $G = (S, T, u_1, u_2)$ be a game with (pure- or mixed-strategy) Nash equilibrium
$(p, q)$. If $(s, t)$ is a strategy profile with $u_1(s, t) > u_1(p, q)$ and
$u_2(s, t) > u_2(p, q)$, then there exists a $0 \le \delta_0 < 1$ such that for all
$\delta_0 < \delta < 1$ there is a subgame-perfect Nash equilibrium of $G^\infty(\delta)$ with
the same payoff as that of playing $(s, t)$ repeatedly.

Proof. Let $G_1$ be the strategy for player 1 in which she plays $s$ as long as player 2 sticks
to $t$, and plays $p$ if player 2 has ever deviated from $t$. Let $G_2$ be the strategy for
player 2 in which he plays $t$ as long as player 1 sticks to $s$, and plays $q$ if player 1 has
ever deviated from $s$.
We first show that if $\delta$ is close enough to 1, then $(G_1, G_2)$ is a Nash equilibrium for
$G^\infty(\delta)$. If $(G_1, G_2)$ is played, then the respective payoffs of the players are
given by

$$\sum_{i=0}^{\infty} \delta^i u_1(s, t) = \Big(\frac{1}{1-\delta}\Big) u_1(s, t)
\quad\text{and}\quad
\sum_{i=0}^{\infty} \delta^i u_2(s, t) = \Big(\frac{1}{1-\delta}\Big) u_2(s, t).$$

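If player 1 deviates from playing $s$ by playing $s'$ at the $k$-th stage of the game and $s_i$
($i > k$) thereafter, she gets the payoff

$$\sum_{i=0}^{k-1} \delta^i u_1(s, t) + \delta^k u_1(s', t) + \sum_{i=k+1}^{\infty} \delta^i u_1(s_i, q)
\le \sum_{i=0}^{k-1} \delta^i u_1(s, t) + \delta^k u_1(s', t) + \sum_{i=k+1}^{\infty} \delta^i u_1(p, q)$$

$$= \Big(\frac{1-\delta^k}{1-\delta}\Big) u_1(s, t) + \delta^k u_1(s', t) + \Big(\frac{\delta^{k+1}}{1-\delta}\Big) u_1(p, q).$$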

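We need to consider values of $\delta$ for which Player 1’s payoff from $(G_1, G_2)$ is strictly
bigger than the upper bound above in the case of deviation. That is, we need

$$\Big(\frac{1}{1-\delta}\Big) u_1(s, t) > \Big(\frac{1-\delta^k}{1-\delta}\Big) u_1(s, t) + \delta^k u_1(s', t) + \Big(\frac{\delta^{k+1}}{1-\delta}\Big) u_1(p, q).$$

After simplification (subtract the first term on the right from both sides, then multiply by
$(1-\delta)/\delta^k$), we obtain

$$u_1(s, t) > (1-\delta)\, u_1(s', t) + \delta\, u_1(p, q).$$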

As $\delta \to 1$, the right-hand side converges to $u_1(p, q) < u_1(s, t)$, hence a continuity
argument shows that there exists a $0 \le \delta_1 < 1$ for which the inequality holds for all
$\delta_1 < \delta < 1$. A similar argument shows that there exists a $0 \le \delta_2 < 1$ such
that whenever $\delta_2 < \delta < 1$, player 2 would not deviate from playing $G_2$. We deduce
that with $\delta_0 = \max\{\delta_1, \delta_2\}$ and $\delta \in (\delta_0, 1)$, $(G_1, G_2)$ is
a Nash equilibrium whose payoff is identical to the payoff of playing $(s, t)$ repeatedly.
To show that $(G_1, G_2)$ is a subgame-perfect Nash equilibrium we need to show that it is a
Nash equilibrium of its subgames: such a subgame occurs either after a defection or not. After
a defection $(G_1, G_2)$ calls for playing $(p, q)$ in every repetition of the game, which is a
Nash equilibrium by Theorem 6.5, and if no defection occurs the subgame is identical to the
original game.
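The threshold in the proof can be made explicit when the three payoffs are known. Solving $u_1(s,t) > (1-\delta)u_1(s',t) + \delta u_1(p,q)$ for $\delta$ gives $\delta > (u_1(s',t) - u_1(s,t)) / (u_1(s',t) - u_1(p,q))$ whenever the deviation payoff $u_1(s',t)$ exceeds $u_1(s,t)$. The sketch below assumes pure-strategy payoffs and takes $s'$ to be the most profitable one-shot deviation; the function and parameter names are hypothetical.

```python
from fractions import Fraction


def deviation_threshold(coop, best_deviation, punishment) -> Fraction:
    """Smallest delta_1 such that
    coop > (1 - delta) * best_deviation + delta * punishment
    holds for every delta in the open interval (delta_1, 1).

    coop           = u(s, t), payoff on the agreed path
    best_deviation = u(s', t), best one-shot deviation payoff (assumption)
    punishment     = u(p, q), payoff at the Nash equilibrium used as the threat
    """
    coop, best_deviation, punishment = map(Fraction, (coop, best_deviation, punishment))
    if not coop > punishment:
        raise ValueError("the theorem requires u(s, t) > u(p, q)")
    if best_deviation <= coop:
        return Fraction(0)  # deviating never pays, so any delta in (0, 1) works
    return (best_deviation - coop) / (best_deviation - punishment)


if __name__ == "__main__":
    # Example 6.1: cooperating pays 3, defecting against Y pays 4, (N, N) pays 1.
    print(deviation_threshold(3, 4, 1))
```

For the payoffs of Example 6.1 this evaluates to 1/3. Taking the maximum of the two players' thresholds gives a valid $\delta_0$ in the sense of the theorem.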

Indefinitely repeated Prisoners’ Dilemma. Consider an indefinitely repeated game of Prisoners’
Dilemma (Example 6.1) where we measure payoffs with the future discounted reward method with
discount factor $\delta$.
Let us label the strategy of always agreeing to cooperate YES and the strategy of never
cooperating NO. We also introduce the GRIM strategy: agree to cooperate and play $Y$ until the
other person plays $N$, after which always play $N$. We know that (NO, NO) is a
(subgame-perfect) Nash equilibrium. The profile (YES, YES) is not a Nash equilibrium but
gives a better payoff. Note that if both players play GRIM, then both players will play $Y$ at
every stage and get a payoff $3 + 3\delta + 3\delta^2 + \cdots = 3/(1-\delta)$. The folk theorem
says that this is in fact an equilibrium payoff provided the discount factor is large enough.
More precisely, we will show the following.
Claim 6.7. The profile (GRIM, GRIM) is a Nash equilibrium if and only if $\delta \ge 1/3$.

Suppose Alice deviates from GRIM for the first time at the $k$-th stage of the game (where
$k \ge 0$). In other words, Alice plays $N$ at the $k$-th stage and always played $Y$ before
that. Then Alice’s payoff before the $k$-th stage is $\sum_{j=0}^{k-1} 3\delta^j$ (which is 0
when $k = 0$ as the sum is empty). On the $k$-th round, Alice’s payoff is $4\delta^k$. However,
from the $(k+1)$-th round onwards, Bob will stick to $N$ and so Alice’s best response is to play
$N$ as well. Hence the best payoff Alice can obtain is

$$\Big(\sum_{j=0}^{k-1} 3\delta^j\Big) + 4\delta^k + \delta^{k+1} + \delta^{k+2} + \cdots.$$

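Alice therefore has a profitable deviation from playing (GRIM, GRIM) if and only if

$$\Big(\sum_{j=0}^{k-1} 3\delta^j\Big) + 4\delta^k + \delta^{k+1} + \delta^{k+2} + \cdots > 3(1 + \delta + \delta^2 + \cdots)$$
$$\iff 4\delta^k + \delta^{k+1} + \delta^{k+2} + \cdots > 3\delta^k + 3\delta^{k+1} + 3\delta^{k+2} + \cdots$$
$$\iff \delta^k > 2\delta^{k+1} + 2\delta^{k+2} + \cdots = 2\delta^{k+1}/(1-\delta)$$
$$\iff 1 - \delta > 2\delta$$
$$\iff \tfrac{1}{3} > \delta.$$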

The same analysis works when Bob deviates as the payoffs are symmetric. We can therefore
conclude that (GRIM, GRIM) is a Nash equilibrium precisely when $\delta \ge 1/3$.
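The threshold $\delta = 1/3$ can also be checked by simulating play. The sketch below is an illustration, not part of the original argument: it truncates the game after a fixed number of rounds (so payoffs are approximations of the infinite sums), uses the payoff table of Example 6.1, and the helper names `grim`, `defect_from` and `alice_payoff` are hypothetical.

```python
# Stage payoffs (Alice, Bob) from Example 6.1, indexed by (Alice's move, Bob's move).
PAYOFF = {("Y", "Y"): (3, 3), ("Y", "N"): (1, 4),
          ("N", "Y"): (4, 1), ("N", "N"): (1, 1)}


def grim(opponent_history):
    """Play Y until the opponent has played N once, then play N forever."""
    return "N" if "N" in opponent_history else "Y"


def defect_from(k):
    """Play Y in rounds 0..k-1 and N from round k onwards."""
    def strategy(opponent_history):
        # The opponent's history length equals the index of the current round.
        return "N" if len(opponent_history) >= k else "Y"
    return strategy


def alice_payoff(alice, bob, delta, rounds=500):
    """Alice's discounted payoff over a finite truncation of the repeated game."""
    alice_moves, bob_moves = [], []
    total, weight = 0.0, 1.0
    for _ in range(rounds):
        a = alice(bob_moves)
        b = bob(alice_moves)
        total += weight * PAYOFF[(a, b)][0]
        weight *= delta
        alice_moves.append(a)
        bob_moves.append(b)
    return total


if __name__ == "__main__":
    for delta in (0.2, 0.5):
        stay = alice_payoff(grim, grim, delta)
        deviate = max(alice_payoff(defect_from(k), grim, delta) for k in range(5))
        print(delta, stay, deviate, "deviation pays" if deviate > stay else "GRIM holds")
```

With $\delta = 0.2$ (below 1/3) defecting yields a higher payoff than staying with GRIM, while with $\delta = 0.5$ (above 1/3) it does not, in agreement with the claim.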
Alternative argument for the equivalence: (GRIM, GRIM) is a Nash equilibrium $\iff \delta \ge 1/3$.

First, suppose that (GRIM, GRIM) is a Nash equilibrium. Then, for either player, playing
NO against GRIM cannot be a profitable deviation from (GRIM, GRIM). That is, we must
have

$$4 + \delta + \delta^2 + \cdots \le 3 + 3\delta + 3\delta^2 + \cdots.$$

Simplification gives $1 \le 2\delta/(1-\delta)$, which leads to $\delta \ge 1/3$.
Conversely, assume $\delta \ge 1/3$ and suppose, for a contradiction, that Alice has a profitable
deviation from playing (GRIM, GRIM). Since both strategy profiles lead to the same play before
deviation, we may as well assume that Alice’s profitable deviation starts by playing $N$
initially. Bob will therefore play $N$ forever after the initial game, and the maximum payoff
for Alice comes from the run of games

$$(N, Y), (N, N), (N, N), \ldots$$

with payoff $4 + \delta + \delta^2 + \cdots$. This must be strictly greater than the payoff from
(GRIM, GRIM). Now

$$4 + \delta + \delta^2 + \cdots > 3 + 3\delta + 3\delta^2 + \cdots$$
$$\iff 1 > 2\delta + 2\delta^2 + \cdots = 2\delta/(1-\delta)$$

implies that $1/3 > \delta$, which is a contradiction. Therefore Alice cannot have a profitable
deviation. By the same argument, Bob does not have a profitable deviation. Hence (GRIM, GRIM) is
a Nash equilibrium.

Remark 6.8. The proof of Theorem 6.6 shows that (GRIM, GRIM ) is a subgame-perfect
Nash equilibrium. The argument, which we now repeat, is as follows.
To show (GRIM, GRIM ) is subgame perfect we need to show that it is a Nash equilibrium of
its subgames. Now, such a subgame occurs either after a defection or not. If no defection occurs,
then the subgame is identical to the original game and both players are playing (GRIM, GRIM )
which is a Nash Equilibrium. If there is a defection however, then (GRIM, GRIM ) calls for
playing (N, N ) in every repetition of the game, which is a Nash equilibrium by Theorem 6.5.
