A disturbed student, tied to an animal rights group, is implicated in the death of a researcher at CalSci. The arrestation attempt of this lead suspect turns into a hostage situation. In order to rescue the hostages and get into the student's mind, Charlie and Amita investigate the student's work in mathematics. They find his math to be highly creative but formally incorrect. In some of the papers they discover an ecological version of the Prisoner's dilemma (a concept already presented in Episode 110 and Episode 321), which reveals the suspect's immense appreciation for animals.
In the beginning of the episode, while playing poker with Amita, Liz and Nikki, Larry mentions a well-known result in the mathematics of shuffling cards discovered by mathematician and magician Persi Diaconis at Stanford University.
"Actually, Nikki makes a critical point. Persi Diaconis at Stanford proved a minimum of five shuffles are required before a deck starts to become random in the sense of variation distance described in Markov chain mixing time, but seven shuffles are optimum."
Below, we will explain this and other related results in more detail.
The riffle or dovetail shuffle is the most commonly used method of shuffling cards. In this shuffle the deck is cut in two halves, which are then interleaved or riffled together (see Tangent). Gilbert and Shannon in 1955 and independently Reeds in 1985 introduced a mathematical model of this type of shuffle. In this probabilistic model, the probability that exactly k cards are cut off a deck with n cards is C(n,k)/2n, where C(n,k) is the binomial coefficient; i.e. the size of the cut off has a binomial distribution with p=1/2. The k cards that have been cut are held in the left hand and the remaining n-k in the right hand. After the cut, the cards are dropped from a given hand with probability proportional to the packet size. Thus, if at some point there are c cards on the left packet and d cards have been already dropped on the table the probability that the next card is dropped from the left is c/n-d. Diaconis in 1988 found evidence showing that this mathematical description is a reasonable approximation of the actual behavior of human shufflers.
There is a backwards description of the riffle shuffle that is helpful for its mathematical analysis. In an inverse riffle shuffle, each card in a sorted deck goes to either the left or right packet with probability 1/2. Then, the left packet is placed on top of the deck, keeping the cards in the same relative order. It turns out that the Gilbert-Shannon-Reeds model is equivalent to the inverse riffle shuffle from the probabilistic point of view. To illustrate this fact we present an example.
Suppose that we have a deck consisting of only 4 cards with initial configuration A♠ 2♠ 3♠ 4♠. Let's calculate the probability that after 1 forward riffle shuffle the new configuration is A♠ 2♠ 4♠ 3♠. In this case 3 cards are cut from the top, namely A♠ 2♠ 3♠. This happens with probability C(4,3)/24=1/4. Then the first card, 3♠, is dropped from the left packet and the second, 4♠, from the right packet. This occurs with probability 3/4*1/3=1/4. Multiplying these two probabilities out we conclude that under the Gilbert-Shannon-Reeds model the probability of this configuration is 1/16.
Let's now calculate the probability that after one inverse riffle shuffle we obtain the same configuration. In this case, A♠ 2♠ 4♠ go to the left packet, while 3♠ goes to the right packet. This occurs with probability 1/2*1/2*1/2*1/2=1/16. The probabilities under the two models are the same. This fact can be generalized to an arbitrary number of shuffles and the exact probability of a given configuration will be given below.
First we highlight an important observation about the riffle shuffle. Suppose that we number our cards from 1 to n. A riffle shuffle generates a permutation of this n numbers. If we denote this permutation by π, in our example above
π(1)=1, π(2)=2, π(3)=4 and π(4)=3,
i.e. π(i) gives us the new position of the i-th card. A rising sequence of a permutation is a maximal consecutively increasing sequence in the permutation. For instance, in our example there are two rising sequences, namely
π(1)<π(2)<π(3) and π(4).
If we had six cards and
π(1)=1, π(2)=2, π(3)=4, π(4)=3, π(5)=6 and π(6)=5,
there would be three rising sequences, namely
π(1)<π(2)<π(3), π(4)<π(5) and π(6).
It is easily seen that a permutation π of any number of cards made by a riffle shuffle has exactly two rising sequences, unless no shuffle is made whatsoever. Hence for instance the permutation of 6 cards presented above cannot come from a riffle shuffle. Conversely any permutation with 1 or 2 rising sequences can be achieved by a riffle shuffle and this type of shuffle can be characterized by this fact (see Tangent). It is not surprising then that the probability of a given configuration after an arbitrary number of riffle shuffles can be given in terms of the number of rising sequences. The following theorem was proven by Diaconis and Bayer in their celebrated paper "Trailing the dovetail shuffle to its lair" in 1992.
Theorem: If n cards are shuffled m times, then the chance that the deck is in arrangement π is
C(2m+n-r,n)/2nm,
where r is the number of rising sequences in π.
A♠ 2♠ 3♠ 4♠ A♥ 2♥ 3♥ 4♥.
Calculate the probability that after 2 riffle shuffles the new configuration is3♠ 3♥ 4♥ A♠ 4♠ 2♠ A♥ 2♥
A deck of n cards is perfectly random if the probability of any of the n!=1*2...*n possible orders of the deck is the same, i.e. if the distribution of all possible orders is uniform. One possible way of measuring how random is a deck of cards after m shuffles is by measuring the total variation distance between the distribution of the permutations after the m shuffles and the distribution in a perfectly random deck. Below we explain the definition of the total variation distance in our context. We also present Persi Diaconis and Dave Bayer's famous result on the number of riffle shuffles needed to achieve a good level of randomness in a deck of cards.
The total variation distance between two probabilities P and Q over the set of permutations is defined as
where P(π) and Q(π) denote the respective probabilities of seeing the configuration π. It can be shown that this distance is a number between 0 and 1. It can also be shown that this distance is the largest possible difference between the probabilities that the two probability distributions can assign to the same event. In the example above, all the possible 16 configurations of 4 cards after 1 riffle shuffle have the same probability equal to 1/16. There are 4!=24 possible permutations of 4 cards. Hence the total variation distance between the distribution after 1 riffle shuffle and the distribution of a perfectly random deck (for which all permutations have probability 1/24) is equal to 1/2*(1/24*8+16*(1/16-1/24))=1/3. The following table gives the total variation distance for different numbers of shuffles of decks with different number of cards.
Cards\Shuffles | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
25 | 1.000 | 1.000 | 0.999 | 0.775 | 0.437 | 0.231 | 0.114 | 0.056 |
32 | 1.000 | 1.000 | 1.000 | 0.929 | 0.597 | 0.322 | 0.164 | 0.084 |
52 | 1.000 | 1.000 | 1.000 | 1.000 | 0.924 | 0.614 | 0.334 | 0.167 |
Observe that, as Larry explained in the episode, for a regular deck with 52 cards the deck starts becoming random (i.e. the total variation distance is less than 1) after 5 shuffles and after 7 shuffles the total variation distance is less than a half. After this point, the total variation distance decreases by approximately a factor of 2 after each shuffle. When the number of cards in the deck increases towards infinity the graph of the number of shuffle vs. the total variation distance to the uniform distribution starts looking like the graph shown in the figure below.
This phenomenon is known as the cutoff phenomenon. The number K is known as the cutoff time. Persi Diaconis and Dave Bayer proved in 1992 in a famous paper titled "Trailing the dovetail shuffle to its lair" that the cutoff time for a deck with n cards is given by (3/2)log2n, where log2n denotes the logarithm in base 2 of n (see Tangent).