Balancing Cooperativeness and Adaptiveness in the (Noisy) Iterated Prisoner's Dilemma
Ever since Axelrod's seminal work, tournaments have served as the main benchmark for evaluating strategies in the Iterated Prisoner's Dilemma (IPD). In this work, we first introduce a strategy for the IPD which outperforms previous tournament champions when evaluated against the 239 strategies in the Axelrod library, across a range of noise levels in the IPD. The idea behind our strategy is to start by playing a version of tit-for-tat which forgives unprovoked defections if their rate is not significantly above the noise level, while building a (memory-1) model of the opponent; it then switches to a strategy which is optimally adapted to that model of the opponent. We then argue that this strategy (like other prominent strategies) lacks two desirable properties which tournaments do not test for well, but which are relevant in other contexts: we want our strategy to be self-cooperating, i.e., to cooperate with a clone with high probability, even at high noise levels; and we want it to be cooperation-inducing, i.e., optimal play against it should entail cooperating with high probability. We show that we can guarantee both properties, at a modest cost in tournament performance, by reverting from the strategy adapted to the opponent to the forgiving tit-for-tat strategy under suitable conditions.
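As a rough sketch of the first phase described above (not the paper's actual implementation), a forgiving tit-for-tat rule could look like the following in Python. The function name, the definition of "unprovoked" as a defection played right after we cooperated, and the tolerance parameter `margin` are all assumptions made for illustration:

```python
def forgiving_tft_move(my_history, opp_history, noise, margin=0.05):
    """Sketch of a noise-tolerant tit-for-tat: forgive the opponent's
    defections as long as their rate of unprovoked defections stays
    within `margin` of the assumed noise level."""
    if not opp_history:
        return "C"  # cooperate on the first move
    # Pair each of our moves with the opponent's reply in the next round;
    # a "D" reply to our "C" counts as an unprovoked defection.
    pairs = list(zip(my_history, opp_history[1:]))
    unprovoked = sum(1 for mine, reply in pairs if mine == "C" and reply == "D")
    chances = sum(1 for mine, _ in pairs if mine == "C")
    rate = unprovoked / chances if chances else 0.0
    if rate <= noise + margin:
        return "C"  # defection rate is consistent with noise: forgive
    return opp_history[-1]  # otherwise fall back to plain tit-for-tat
```

In a full strategy, this rule would run only while the memory-1 model of the opponent is being estimated, before switching to the play that is optimal against that model.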