Numb3rs 321: The Art of Reckoning

In this episode, the main mathematical concepts derive from a field called game theory. This active area of research is one of the main facets of modern economics. The goal of game theory is to take some social situation with some notion of winning and come up with a formal, mathematical way of discussing it. Thus the parties involved in the situation are called "players" and the set of rules we define is called "the game." Usually the discussion focuses around what the best strategy for playing the game is, i.e. how best to go about winning. Sometimes, though, determining the strategy is difficult, and mathematicians will focus on determining whether an optimal strategy exists. Taking chess as an example, is it possible for black to always win or regardless of what black does can white always force a draw? Since a general game could be almost anything, mathematicians usually focus on classes of games with a few aspects in common. Most real life games can be categorized pretty easily.

Activity 1:

Make a list of games people play every day. Chess and poker are definitely both games, but don't limit yourself to the usual sense of the word. For example, at the supermarket there are many check-out lanes. Often you see people moving back and forth looking for the "best" one in the sense that they are attempting to minimize the time they'll spend in line. You can think of two people ready to buy their groceries as the players, and the winner is the one who gets to leave first. Looking at these games, what kind of rules do they have in common? For example, do players move one after another or all simultaneously?

A Curious Result: The Monty Hall Problem

In the 1960s and 1970s, Monty Hall hosted a game show called Let's Make a Deal. Though the following problem was never a game on the show exactly as stated, it is similar in format to games that appeared.

Suppose there are three closed doors before you. Behind one door is some prize of great value, but nothing lies behind the other two doors. You are given the opportunity to select one door. The host of the show then opens up one of the other two doors - always one with nothing behind it. You are given the choice to keep your original guess or to switch.

Activity 2:

Is it better to always switch or always pick the same one or is neither the case? If neither was better, you should have a 1/2 probability of winning. In either of the other two cases, you would have a probability of winning strictly greater than 1/2.

Let's go through how to solve this problem. We'll consider two different cases, one where our first guess is correct and one where our first guess is incorrect.

On our first selection, suppose we unknowingly choose the door with the prize behind it. This happens exactly 1/3 of the time. The remaining two doors are both losers, so after the host opens one door the one remaining is a loser. Then the strategy of never switching wins. The strategy of always switching loses.

Supposing on the other hand that on our first selection we chose incorrectly. This happens exactly 2/3 of the time. The door the host opens is the other losing door, so the remaining door is the winning door. If we stick with our initial choice, we lose. If we change, though, we win.

The strategy of always staying with your first guess loses 2/3 of the time. The strategy of always switching wins 2/3 of the time! Our gut instinct (usually) is that it shouldn't matter whether we switch or not. However, this perfectly exemplifies why rationality is a better problem solving tool than common sense: it is always better to switch.

The Prisoner's Dilemma and the Nash Equilibrium

Charlie mentions the Prisoner's dilemma. This is one of the most common basic problems in Game Theory. In this problem, there are two prisoners, 1 and 2, who are guilty of some crime which carries a sentence of 10 years. The police only have enough evidence to convict the two prisoners on lesser crimes which carry only a sentence of one year, so officers will put the prisoners in separate rooms and offer them only 5 years in prison if they will swear under oath that the other prisoner commited the crime. The two prisoners each have a choice of remaining loyal or betraying their partner.

The prisoner's dilemma is most easily understood by putting it into what is called normal form. In a two player game, the normal form is given by a matrix. Each column corresponds to a choice by one party, and each row corresponds to a choice by the second party. The entry where a particular row and column intersect is the outcome of each party taking those two actions. To the right is the normal form for the prisoner's dilemma. Each entry gives the sentence of prisoner 1 and then prisoner 2.

	1 Loyal	1 Betrays
2 Loyal	1, 1	0, 10
2 Betrays	10, 0	5, 5
Prisoner's Dilemma: Normal Form

We call the 1 betrays-2 betrays strategy pair of this problem the Nash Equilibrium. A Nash Equilibrium of a game is a collection of strategies for all players where no player can improve their result by changing their strategy (without regard to what other players might do). In the prisoner's dilemma, only the betray-betray is a Nash Equilbrium. For instance, in the loyal-loyal case, prisoner 1 can improve his result by changing his strategy to betrayal. Unfortunately, prisoner 2 has the same feeling, so they'll more than likely betray one another.

The Nash Equilibrium is named after John Nash - the subject of the book and movie A Beautiful Mind. In 1951, Nash published an article in which he proved that any finite game with any finite number of players - i.e. a game with a finite number of possible actions on the part of any player - has an equilibrium with the properties stated in the previous paragraph.

In any event, the result of the prisoner's dilemma is that the stable equilbrium of Nash is not the optimal one for both players - they each will end up with five-year sentences instead of one-year terms. One could think instead of an analogous situation of two-equally sized companies manufacturing some product. Naturally, they are both trying to maximize their profit. One company might undercut the other to increase their sales. However, the other company will likely attempt the same strategy. This results in both companies actually reducing their profit. Using more careful analysis, one can use this idea to actually determine when two companies are acting as a duopoly - that is, working together.

Tit for Tat

Tit for tat is a surprisingly successful strategy for prisoner's dilemma type problems with multiple rounds of competition. In this strategy, the player will follow the following rules each round against a competitor X:

The player will cooperate unless X attacked last round.
If attacked last round by X, the player will attack X.

In games where competitors meet each other multiple times with high probability, this strategy has shown in numerical tests to be the best possible. The following is an example of how the tit for tat strategy plays out:

Suppose that there are now 5 prisoners. Each prisoner will play the prisoner's dilemma game against each other prisoner a total of six times. Suppose that 3 of the prisoners are greedy and that the other two take the tit for tat strategy. Call the greedy players Gs, and the tit for tat players Ts.

Whenever two Gs face off, they both betray and receive 5 years each.
Whenever two Ts face off, they remain loyal and receive only 1 year each.
Whenever a G and a T face off the first time, the T receives 10 years and the G receives 0 years. In successive face-offs, they both act greedily and receive 5 years each.
The Gs get 5 years for each round against a G.
The Ts get 1 year for each round against a T.
In the first round between a G and a T, the G gets 0 years and the T gets 10 years. In the rounds that follow, though, they each get 5 years apiece since the T will respond by betraying the G.
The Gs get 2*5*6 (rounds against Gs) + 2*0*1 (1st round against Ts) + 2*5*6 (last 5 rounds against Ts) = 120 years.
The Ts get 1*1*6 (against T) + 3*10*1 (1st round against Gs) + 3*5*5 (last 5 rounds against Gs) = 111 years.

Thus the tit for tat strategy works better in this case than the greedy method. The greater the number of tit for tat players, the greater the disparity between the greedy players and the tit for tat players.