This article, which appeared recently in Scientific American, really got my goat. Its unquestioned assertion that the theory of a crazy mathematician is the “rational” answer while the vastly different human intuition is “irrational” is just plain silly. I recently ran an experiment to determine whether there was an answer – an objective answer – to the question of how to play Traveler’s Dilemma. By that I mean a number or numbers that net the best overall winnings when playing against all other possible strategies.
I previously outlined the basic algorithm but let me be more specific so that the results I present can be properly understood. The virtual tournament consists of 100 numbers competing in rounds of 10,000 games each, so on average each number competes in 200 games per round. At the end of each round the 50 losing numbers are replaced by the 50 winning numbers, plus ringers. Ringers come about by two forms of variation: drift and saltation. Drift is a small change from a winning number to a nearby number, while saltation is a change to a totally different number.
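For readers who want to try this themselves, here is a minimal Python sketch of one round as described above. It is not the original program. The claim range of 2 to 100, the bonus of 2, the variation rates `P_DRIFT` and `P_SALT`, the clamping of drift at the ends of the range, and the rule that each winner leaves one possibly varied copy are all assumptions made for illustration.

```python
import random

LOW, HIGH, BONUS = 2, 100, 2      # assumed claim range and reward/penalty
POP, GAMES = 100, 10_000          # from the text: 100 numbers, 10,000 games per round
P_DRIFT, P_SALT = 0.10, 0.02      # hypothetical variation rates


def payoff(a, b):
    """Return (points for a, points for b) for one game."""
    if a == b:
        return a, b
    low = min(a, b)
    return (low + BONUS, low - BONUS) if a < b else (low - BONUS, low + BONUS)


def vary(n, rng):
    """Copy a winning number, occasionally with drift or saltation."""
    r = rng.random()
    if r < P_SALT:                        # saltation: a totally different number
        return rng.randint(LOW, HIGH)
    if r < P_SALT + P_DRIFT:              # drift: step to a neighbouring number
        return min(HIGH, max(LOW, n + rng.choice((-1, 1))))
    return n


def play_round(population, rng):
    """Play one round of random games and return the next population."""
    scores = [0] * len(population)
    for _ in range(GAMES):
        i, j = rng.sample(range(len(population)), 2)   # two distinct players
        pi, pj = payoff(population[i], population[j])
        scores[i] += pi
        scores[j] += pj
    ranked = sorted(range(len(population)), key=lambda k: scores[k], reverse=True)
    winners = [population[k] for k in ranked[: len(population) // 2]]
    # The winning half survives; the losing half is replaced by varied copies.
    return winners + [vary(w, rng) for w in winners]
```

Calling `play_round([2] * POP, random.Random(0))` repeatedly gives a population history that can be plotted round by round.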
The other question I wanted to answer with this simulation was: is the Nash equilibrium of 2 a stable solution? In other words, perhaps if everyone is playing 2 there is no way for any different numbers to break out and do better. In some ways the equilibrium solution is compelling – after all, if everyone is playing 2 then any other number will always get zero while netting the Nash player 4 points instead of the nominal 2. Perhaps that means that 2, while pathetic in terms of absolute returns, might be able to dominate all other numbers. So the final condition of the simulation is that all the numbers start out as 2. In fact 2 is not stable, but it takes the experiment to understand why.
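The equilibrium argument can be checked directly. The sketch below assumes claims from 2 to 100 and a bonus of 2, and computes what a lone deviant earns per game when everyone else plays 2.

```python
BONUS = 2  # assumed reward/penalty


def payoff(mine, theirs):
    """Points I receive for claiming `mine` against `theirs`."""
    if mine == theirs:
        return mine
    low = min(mine, theirs)
    return low + BONUS if mine < theirs else low - BONUS


# Per-game score of a single deviant in a population that all plays 2.
deviant_scores = {claim: payoff(claim, 2) for claim in range(2, 101)}
best_reply = max(deviant_scores, key=deviant_scores.get)
```

Every claim above 2 scores zero here, so `best_reply` is 2: no single number can do better on its own against a population of 2's.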
Figure 1 shows the results of two typical runs of the simulation with different random seeds. Time runs horizontally, and the vertical lines in each plot are color-coded histograms of the distribution of numbers that won each round (actually each column represents the average of two rounds). At the far left side we find that all the winners are at 2, the starting value for all 100 numbers. Very shortly after that, however, the mean strategy jumps to much higher values. Then it decreases slowly until it jumps up again to a higher value and starts its inexorable drop again. How should one explain this behavior?
Imagine the first round. All the numbers are the same: 2; so all the games are the same: both numbers win 2 points each. The only difference is that games are chosen randomly, so some numbers play more often and some less often, and the ones that play more often get a higher total score. So at the end of the first round the losing 2’s are replaced with the winning 2’s, but with variation. Drift, which is more common, will result in some of the numbers next round being 3’s (1 would also be selected, but is impossible because of the rules of the game). The less common saltation will result in some of the numbers being randomly chosen from the whole range, like 22 or 59 or 87. The large number of 2’s and the few ringers then play another round, and for at least a few rounds the 2’s dominate. That is because every time a 2 plays against a ringer, be it 3 or 87, the 2 wins 4 points and the ringer wins 0. So there are nothing but 2’s for between 5 and 15 rounds. And then something happens.
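To see how much luck alone separates winners from losers in that first round, here is a small sketch that counts how often each player gets picked. It assumes each game draws two distinct players uniformly at random, which is only a guess at how games are chosen, and the seed is arbitrary.

```python
import random
from collections import Counter

rng = random.Random(1)            # arbitrary seed
players, games = 100, 10_000      # from the text

appearances = Counter()
for _ in range(games):
    i, j = rng.sample(range(players), 2)   # assumed pairing rule
    appearances[i] += 1
    appearances[j] += 1

# With everyone claiming 2, a score is simply 2 points per game played.
scores = {p: 2 * appearances[p] for p in range(players)}
mean_games = sum(appearances.values()) / players   # 200 by construction
spread = max(scores.values()) - min(scores.values())
```

The mean is fixed at 200 games, but `spread` shows how far apart the luckiest and unluckiest 2's end up.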
2’s do well as long as they are playing against other 2’s and against a rare higher value. The change that occurs – remarkably quickly in this simulation – comes about when the population of numbers contains two or more saltations in the same round. Each saltation – on average a number much greater than 2 – will still do very badly playing against 2’s, but if by chance they play against each other they will do incredibly well. A 2 can expect to net about 400 points for an average round of 200 games against other 2’s. Suppose a ringer plays a value of 50. Playing against mostly 2’s it will net zero, but if there is one other ringer out there with a value of 50 or higher then it only has to play 8 games against that other ringer to do as well as the average for 2’s. If this happens by chance then in the next round there will be twice as many 50’s, and the chances that they will meet each other increase, improving their scores spectacularly. Consequently the 2’s are rapidly replaced by 50’s, and as we’ll see, by much larger values as well.
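The break-even arithmetic can be written out as a short sketch. It assumes the ringer scores zero in every game against a 2 and earns exactly its own claim in each game against a partner claiming the same or more (it would actually earn 2 more against a strictly higher partner, so this is slightly conservative).

```python
import math

GAMES_PER_PLAYER = 200                 # average games per round, from the text
BASELINE = 2 * GAMES_PER_PLAYER        # what a 2 earns among other 2's


def break_even_games(claim):
    """Games against a cooperating ringer needed to match the 2's baseline."""
    return math.ceil(BASELINE / claim)
```

For a claim of 50 this gives 8 games out of roughly 200, and the requirement shrinks as the claim grows.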
The first general principle to understand here is that populations of large numbers do better globally than populations of smaller numbers because large numbers playing against large numbers net much higher results overall than small numbers playing against other small numbers. This explains the immediate jump from 2 to a much larger value as the equilibrium among the population. After that we see a series of characteristic ramps where the value decreases (at a remarkably consistent rate of around 1 every 8 rounds) until it jumps up again. The general downward trend is due to the payoff structure of the game. If all the other numbers are playing N, then the best individual strategy is to play N-1 in order to net N+1 while the opponent gets N-3. Over a lot of games the numbers just below the common average will tend to do slightly better. After a sequence of rounds selecting for the winners all of the numbers will drift downward.
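Both effects follow from the payoff rule alone. The sketch below assumes a bonus of 2 and compares a game between two players at the same number with a game in which one of them undercuts by 1.

```python
BONUS = 2  # assumed reward/penalty


def payoff(a, b):
    """Return (points for a, points for b) for one game."""
    if a == b:
        return a, b
    low = min(a, b)
    return (low + BONUS, low - BONUS) if a < b else (low - BONUS, low + BONUS)


def undercut(n):
    """Payoffs when one player claims n - 1 against a player claiming n."""
    return payoff(n - 1, n)


cooperative = payoff(90, 90)   # both players claim 90
defecting = undercut(90)       # one player claims 89 against 90
```

The undercutter gains only a single point over matching its opponent, which fits the slow downward ramps, while two players at a high number earn far more than two players at a low one.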
The second general principle then is that numbers smaller than the population mean can do better, but only marginally, and only if they are slightly smaller. We never see saltations downward, from 90 to 85 for example. The downward pressure is extremely subtle. On the other hand, once the population mean gets low enough we do see saltations upwards – in our examples increases in value of between 7 and 27. These upward jumps are motivated by the same cooperative success that overwhelmed the 2’s so early in the simulation. In Traveler’s Dilemma, where the differential rewards for defection are so minimal relative to the common rewards for cooperation, we see that in a very real, objective sense cooperation dominates as the strategy for success in the game.
Figure 2 is a plot of the total winning numbers in every round (derived from the second image in Figure 1). The spike at 2 is a fossil – the 2’s went extinct within a few rounds. The small spike at 50 is likewise a historical artifact, since 50 was the number that this set of players first found as a good cooperative number. Except for those, the winning numbers distribute themselves very similarly to the “sub-100” strategy normally employed by humans. In fact a value of 91, which might be expected from asking regular people about how they would answer the dilemma, is the statistical winner. While it’s not definitive or completely deterministic, it is objective and impersonal. Far from being irrational, humans, with brains long evolved for evaluating complex tradeoffs between selfishness and cooperation, behave much more rationally – in terms of maximizing their own reward – than the game theorists give them credit for.
- jack*