COGS2010 Week 8 Questions: The Iterated Prisoner’s Dilemma (IPD)

 

(due at the end of the lab session, or before the lab by email to the tutor)

                      

Evolution is driven by “selfish” genes, which promote their own welfare above that of rival genes.  And yet, alongside the competition that one would expect from such a situation, cooperation is rife in nature.  Many animals cooperate with others of their own species, and not just with their own relatives.  The evolution of cooperation has long been studied in politics, as well as in computer science and biology.  The topic for this lab is a game based on an ethical puzzle known as the prisoner’s dilemma, which has been used in game theory since the 1950s.

 

The Prisoner’s dilemma

Two partners-in-crime have been nabbed.

•       If neither confesses, the police have enough evidence to send both away for 1 year.

•       If one confesses, he will be let off on good behaviour and the other will get 20 years.

•       If both confess, they will each get 10 years.

•       They are held incommunicado. What should they do?

What would you choose? Why?

 

The dilemma is that if the other person confesses, your best option is to confess (10 vs. 20 years), and even if the other person doesn’t confess, your best option is still to confess (you go free vs. 1 year). So each prisoner reasons that confessing is the best choice whatever the other does, both confess, and they both get 10 years.  If they had both kept quiet, they would each have had only 1 year.
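The reasoning above can be checked mechanically. The following is a minimal sketch, assuming the sentences listed in the scenario (with "let off" counted as 0 years); the names `YEARS` and `best_reply` are illustrative only.

```python
# Sentence in years for "me", indexed by (my_choice, other_choice).
# Values are taken from the scenario above; "let off" is treated as 0 years.
YEARS = {
    ("quiet", "quiet"): 1,
    ("quiet", "confess"): 20,
    ("confess", "quiet"): 0,
    ("confess", "confess"): 10,
}


def best_reply(other_choice):
    """Return the choice that minimises my sentence, given the other's choice."""
    return min(("quiet", "confess"), key=lambda mine: YEARS[(mine, other_choice)])


for other in ("quiet", "confess"):
    print(f"If the other prisoner chooses {other!r}, my best reply is {best_reply(other)!r}")
```

Confessing is the best reply in both cases, which is why both prisoners end up with 10 years even though mutual silence would have cost each only 1.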

 

 

If the situation is repeated between the same two players many times, each individual round is still a PD, but knowledge of past moves and the prospect of future interactions can engender trust and hence modify the other player’s behaviour, so that cooperation can become the best option.

 

Axelrod explains how the prisoner’s dilemma came to be involved in his studies of cooperation:

“My original interest in game theory arose from a concern with international politics and especially the risk of nuclear war. The iterated Prisoner's Dilemma game seemed to me to capture the essence of the tension between doing what is good for the individual (a selfish defection) and what is good for everyone (a cooperative choice).” Axelrod http://pscs.physics.lsa.umich.edu/Software/CC/ECHome/ECCitationClassic.html

 

Axelrod studied the prisoner’s dilemma and its iterated version (the IPD), in which the PD is played repeatedly.  Understanding the IPD as a toy problem provides a “hard-headed” rationale for cooperation and helps theorists in a wide range of disciplines to understand the situations in which cooperation is likely to be evolutionarily advantageous, and when it is likely to break down.

 

The goal of this week’s lab is to explore the behaviour of different strategies in the iterated prisoner’s dilemma.  This week’s lab uses online software that provides three applets as specified below.


Part 1: One round of the prisoner’s dilemma

 

•       Read through the basic description of the IPD used in the lab and the strategies encoded in the software at

http://www.lifl.fr/IPD/ipd.html#cipd

http://www.lifl.fr/IPD/ipd.html#strategy.

 

•       Load the first applet  http://www.lifl.fr/IPD/applet-match.html

 

The first applet allows you to choose two strategies and the number of iterations (moves). It then shows the choice made by each strategy on each move, and the final scores. (Note that the results are deterministic, and hence only have to be calculated once.)

 

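Payoff matrix (in each cell, the first payoff goes to the row player and the second to the column player)

                   Cooperate        Defect

Cooperate          R = 3, R = 3     S = 0, T = 5

Defect             T = 5, S = 0     P = 1, P = 1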

The total score for each player is the sum of that player’s payoffs over all the moves, using the payoff matrix for each move.
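As a rough illustration of how a total is built up from the payoff matrix, here is a short sketch. The function name `total_scores` and the move strings are made up for this example; the descriptive names in the comments are the usual ones for R, S, T and P.

```python
# Payoff to "me", indexed by (my_move, opponent_move), using the default matrix.
PAYOFF = {
    ("C", "C"): 3,  # R: reward for mutual cooperation
    ("C", "D"): 0,  # S: payoff for cooperating against a defector
    ("D", "C"): 5,  # T: temptation to defect
    ("D", "D"): 1,  # P: punishment for mutual defection
}


def total_scores(moves_a, moves_b):
    """Sum the per-move payoffs for two equal-length strings of 'C'/'D' moves."""
    if len(moves_a) != len(moves_b):
        raise ValueError("both players must make the same number of moves")
    score_a = sum(PAYOFF[(a, b)] for a, b in zip(moves_a, moves_b))
    score_b = sum(PAYOFF[(b, a)] for a, b in zip(moves_a, moves_b))
    return score_a, score_b


# Three moves: CC (3, 3), then C against D (0, 5), then DD (1, 1).
print(total_scores("CCD", "CDD"))  # (4, 9)
```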

 

Question 1.

a.       Set the number of moves to 10, and by choosing the relevant pairs of strategies, record the scores in the table of pairwise contests below for all_c, all_d, and tit_for_tat.

b.      The 4th strategy, “contrary”, is not included in the online program. The contrary strategy is to do the opposite of the opponent’s previous move. On the first move, it defects.  For example, against the “always defect” strategy, the moves would be

DDDDDDDDDD all_d = 46

DCCCCCCCCC contrary = 1

Write down the contests as shown above for “contrary” against itself, all_c and tit_for_tat, calculate the total scores for their contests and fill in the rest of the table.

c.       What is the highest score achieved in any of the contests? What is the highest score from a symmetric strategy (where both players benefit equally)?
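If you want to check your hand calculations for part b, the strategies can be sketched in a few lines of Python. This is only an illustration, not the applet's code: the function names and the `play_match` helper are invented here, and tit_for_tat is assumed to follow the usual definition (cooperate first, then copy the opponent's previous move).

```python
# Payoff to "me", indexed by (my_move, opponent_move), using the default matrix.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def all_c(opp_history):
    return "C"


def all_d(opp_history):
    return "D"


def tit_for_tat(opp_history):
    # Assumed standard definition: cooperate first, then copy the opponent.
    return opp_history[-1] if opp_history else "C"


def contrary(opp_history):
    # Defect first, then do the opposite of the opponent's previous move.
    if not opp_history:
        return "D"
    return "C" if opp_history[-1] == "D" else "D"


def play_match(strategy_a, strategy_b, n_moves=10):
    """Play n_moves rounds; return both move strings and both total scores."""
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(n_moves):
        # Each strategy sees only the opponent's earlier moves.
        a = strategy_a(moves_b)
        b = strategy_b(moves_a)
        moves_a.append(a)
        moves_b.append(b)
        score_a += PAYOFF[(a, b)]
        score_b += PAYOFF[(b, a)]
    return "".join(moves_a), "".join(moves_b), score_a, score_b


# Reproduces the worked example from part b (all_d against contrary).
print(play_match(all_d, contrary))  # ('DDDDDDDDDD', 'DCCCCCCCCC', 46, 1)
```

Swapping in other pairs of functions gives the remaining contests, but work them out by hand first.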

 

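Table of pairwise contests (each cell is the score of the row strategy when playing against the column strategy)

                              all_c     all_d     tit_for_tat     contrary

Always cooperate (all_c)      30        ____      ____            ____

Always defect (all_d)         ____      ____      ____            46

tit_for_tat                   ____      ____      ____            ____

contrary                      ____      1         ____            ____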


Part 2. Tournaments

http://www.lifl.fr/IPD/applet-tournament.html

 

 

 

Question 2. The round robin tournament allows you to see the contest scores for all the strategies, and the total score that would result from a round robin against one of each of those players.  Select all_c, all_d, tit_for_tat and 10 moves, and check your answers from Q1a.

a.       Which strategy has the highest score in a round robin with just these three strategies?

b.      The presence of other players and the number of moves can have a large impact on the total scores. Remove all_c from the round robin tournament.  What impact does this have on the scores?  Explain the changes.

c.       Calculate a round robin tournament for 20 moves with all_c, all_d,  tit_for_tat and “random” (the last strategy in the list, which cooperates or defects with 50% probability). In tournaments with “random” included, final scores vary somewhat, and at times different strategies score the highest (click the play button several times to run different random number trials). Explain which strategy wins most often, which other one(s) win on occasion, and why.
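The round robin totals in Question 2 are sums of pairwise contest scores. The sketch below shows one way to compute them. It is not the applet's implementation: the helper names are invented, tit_for_tat is assumed to follow the usual definition, and whether a strategy also plays a copy of itself is treated as an assumption exposed through the `include_self_play` flag.

```python
import random

# Payoff to "me", indexed by (my_move, opponent_move), using the default matrix.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def all_c(opp_history):
    return "C"


def all_d(opp_history):
    return "D"


def tit_for_tat(opp_history):
    # Assumed standard definition: cooperate first, then copy the opponent.
    return opp_history[-1] if opp_history else "C"


def make_random(seed=None):
    """Build a strategy that cooperates or defects with 50% probability."""
    rng = random.Random(seed)
    return lambda opp_history: rng.choice("CD")


def match_scores(strat_a, strat_b, n_moves):
    """Return the two total scores for one meeting of n_moves."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(n_moves):
        a, b = strat_a(hist_b), strat_b(hist_a)
        hist_a.append(a)
        hist_b.append(b)
        score_a += PAYOFF[(a, b)]
        score_b += PAYOFF[(b, a)]
    return score_a, score_b


def round_robin(strategies, n_moves, include_self_play=True):
    """strategies maps name -> function; returns name -> total score."""
    totals = {name: 0 for name in strategies}
    names = list(strategies)
    for i, name_a in enumerate(names):
        if include_self_play:
            strat = strategies[name_a]
            totals[name_a] += match_scores(strat, strat, n_moves)[0]
        for name_b in names[i + 1:]:
            score_a, score_b = match_scores(strategies[name_a], strategies[name_b], n_moves)
            totals[name_a] += score_a
            totals[name_b] += score_b
    return totals


players = {"all_c": all_c, "all_d": all_d, "tit_for_tat": tit_for_tat,
           "random": make_random()}
print(round_robin(players, n_moves=20))  # totals vary from run to run
```

Running the last two lines several times mimics clicking the play button repeatedly: the totals involving "random" change from trial to trial.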

 


Part 3. Evolution: variation in strategy frequencies over time

http://www.lifl.fr/IPD/applet-evolution.html

 

     

 

One of the simplest definitions of evolution is “the change in gene frequencies over time (measured in generations)”.  This applet demonstrates how the strategies fare when the payoffs from the round robin determine their frequencies in subsequent  generations (under fitness-proportional selection).
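To make "fitness-proportional selection" concrete, here is a minimal sketch of one generation of such an update. The exact rule used by the applet is not given here and may differ (for example in rounding to whole individuals), so treat the function `next_generation` and its details as assumptions for illustration.

```python
def next_generation(populations, score_matrix):
    """One step of fitness-proportional selection (illustrative sketch).

    populations:  dict mapping strategy name -> number of individuals
    score_matrix: score_matrix[a][b] is the score a earns in one meeting with b
    Assumption: each individual meets every member of the population, and the
    total population size stays constant. Sizes are kept as fractions.
    """
    total = sum(populations.values())
    # Fitness of one individual of each strategy against the whole population.
    fitness = {
        a: sum(score_matrix[a][b] * populations[b] for b in populations)
        for a in populations
    }
    # Each strategy's share of the next generation is proportional to
    # (number of individuals) x (fitness per individual).
    weighted = {a: populations[a] * fitness[a] for a in populations}
    norm = sum(weighted.values())
    if norm == 0:
        return dict(populations)
    return {a: total * weighted[a] / norm for a in populations}


# Example with two strategies and 10-move meetings under the default payoffs:
# all_c scores 30 against all_c and 0 against all_d;
# all_d scores 50 against all_c and 10 against all_d.
scores = {"all_c": {"all_c": 30, "all_d": 0},
          "all_d": {"all_c": 50, "all_d": 10}}
print(next_generation({"all_c": 100, "all_d": 100}, scores))
```

Applying the function repeatedly to its own output traces how the frequencies change over generations, which is what the applet's graph displays.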

 

Question 3.

•       Select the four strategies used in question 2 (all_c, all_d,  tit_for_tat and random), and click <next>
[NB random only generates a new random sequence when it is initially selected. Hence, to test a different random trial, deselect and then reselect the random strategy from the first screen]

•       Set the length of each meeting to 20 moves <next>

•       Use the default payoff matrix <next>

•       The score matrix shows the results for round one <next>

•       Use the default population sizes of 100 of each strategy <next>

•       The graph shows the final results.

 

a.       Examine the graph. Which strategy wins in the long run? At the end of the run, what percentages of the population are the other strategies?

b.      In the scores matrix, all_d had the highest score.  Why doesn’t it win over the longer term?

c.       Remove all_c from the round robin (click <previous> back to the beginning) and run the simulation again. What changes are there in the final results?

d.      Run a round robin with all_c, all_d, and tit_for_tat.  Compare the long term behaviour of all_c with its behaviour in 3a. Explain the differences.

 

Question 4.    The number of iterations of the game has a pronounced effect on whether cooperation or defection is a better long term strategy.

a.       Exaggerate the temptation to defect (e.g., set T=15 in the payoff matrix). Run a trial with all_c, all_d, and tit_for_tat, for 20 moves.  What difference does changing the temptation from 5 to 15 make to the long term behaviour?

b.      Keeping the temptation high, change the number of moves in each meeting to 100. How does the length of the meeting affect the long term outcomes in the simulation? [Hint: check which of the strategies are wiped out in each case]

 

Question 5. A situation is only a Prisoner’s Dilemma if T>R>P>S in the payoff matrix.

a.       Set T=5, R=3, and reverse P and S, so that P=0 and S=1. Using the 4 strategies (all_c, all_d,  tit_for_tat and random), with the number of moves set to 100, explain the differences this makes to the course of evolution.  (Run several random trials.)

b.      Remove all_c and rerun the simulation. Explain the differences from Q5a.
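The condition stated in Question 5 is easy to test for any payoff matrix you enter. The helper below is a small illustration (the function name is made up), applied to the payoff settings used in this lab.

```python
def is_prisoners_dilemma(T, R, P, S):
    """Return True if the payoffs satisfy the ordering T > R > P > S."""
    return T > R > P > S


print(is_prisoners_dilemma(T=5, R=3, P=1, S=0))   # default matrix: True
print(is_prisoners_dilemma(T=15, R=3, P=1, S=0))  # Question 4 (T=15): True
print(is_prisoners_dilemma(T=5, R=3, P=0, S=1))   # Question 5a (P and S swapped): False
```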

 


Part 4. Summary and Applications

 

Question 6. Summarize your findings from the simulations by describing the conditions under which you would expect cooperation to survive in a world in which many different strategies are possible.

 

Question 7. There are many applications of iterated prisoner’s dilemma and game theory in biological and political studies.  Find a practical application of game theory from one of these areas described in the literature or online, describe it briefly and give the reference.