The main mathematical topic in this episode is the question of determining whether something is random or not. There is also a brief discussion of choosing road widths to optimize traffic flow, which is mentioned in a Tangent.
Let's study what we just did a little more carefully. First, let's look
at one roll of the dice. This is an example of a random variable. For
our purposes, we will define a random variable as set of pairs (number,
probability) such that the sum of all the probabilities is 1. The first number
in each pair is supposed to be the outcome of some random event, and the
second number is the probability of that outcome. For example,
we could describe the random variable corresponding to a fair dice as
X = {(1,1/6), (2,1/6), (3,1/6), (4,1/6), (5,1/6), (6,1/6)}. The random variable
corresponding to the unfair dice that was rolled in the first paragraph
seems likely to be X = {(1,1)}, i.e. the only possibility is to roll a 1.
In the above paragraph we said that the sum of the rolls we were testing was
100, which was far from the expected value of 350. Now can we describe
this in a more rigorous way? Yes, using the mathematical idea of
expected value (this is one of the cases where the name for a
mathematical concept means almost exactly what it does in normal English).
First we define the expected value, or expectation, of one random variable X.
The formula is
The equation has a couple funny symbols in it, but in words this means
that you take each possible outcome, multiply it by the
probability that it occurs, and then sum all of these numbers up. For a fair
dice with 6 sides, this leads to E(X) = 1/6 + 2/6 + 3/6 + 4/6 + 5/6 + 6/6
= 3.5. Taking the expectetation of a random variable is essentially
figuring out what its average value is.
Now that we've defined expectation, how can we apply it to multiple dice rolls, or random variables in general? We'd like to figure out the expected value of the sum of several dice rolls. To do this, we'll define addition of random variables in general, and then see how addition interacts with expectation.
This formula probably contains some symbols you aren't familiar with, so we'll
explain them. First, the outer squiggly braces mean that we are creating set.
In general, the expression
refers to a set. The
elements of the set are described in blank 1, and these elements are subject
to the restrictions listed in blank 2. In this case, the elements are pairs
(m+n,pq), and the restrictions are that (m,p) must be an possible event for
the random variable X and (n,q) must be a possible event for the random
variable Y. An observant reader will notice that we are making an
assumption here that hasn't been stated yet. We are implicitely assuming
that the probability of the event "m and n both happen" is the
product pq. However, this is only the case if the random variables X, Y are
completely unrelated to each other, or independent. For example,
if X is 1 roll of a dice, and Y is a second roll of a dice, then X and Y are
independent, but if X is the sum of a red and a blue dice and Y is the
value of the red one, then X and Y aren't independent. From now on we'll
assume that all of our random variables are independent.
Let's look at an example where X, Y are both random variables
corresponding to a flip of a fair coin with sides labelled by 1 and 2. Since
the possible outcomes of the new random variable X + Y correspond to pairs
of outcomes, one each from X and Y, we can write down the possible outcomes
for X + Y in a table.
| (1,1/2) in X | (2,1/2) in X | |
| (1,1/2) in Y | (2,1/4) | (3,1/4) |
| (2,1/2) in Y | (3,1/4) | (4,1/4) |
Going from the first to the second line (and the fifth to sixth)
we use the definition of expectation,
and going from the fourth to the fifth we use the fact that the sum of all the
probabilities for any random variable is 1. The rest is properties of
arithmatic, i.e. multiplication and addition distribute over each other.
Now we have completed the goal stated in the first paragraph. If we roll
a dice 100 times then the expected value of the sum of the rolls
is 100E(X) = 350. This would indicate that if the sum of the rolls is
100, then the dice probably isn't fair.
However, this isn't the whole story. For example, if you said you got 50
1's and 50 6's, then the sum would be 350, which is exactly what the
expected value is. We need some other kind of measurement to go along with
the expected value measurement. One candidate is the variance. This
is supposed to measure how different the values are from the average
value, and is defined as
. Notice that in the first
equation we are subtracting E(X) from X. This is the same as adding the
random variable {(-E(X),1)} to X. (E(X) is a constant, because it doesn't
depend on the possible outcomes of X, which we are summing over.) Now when
we square the random variable (X - E(X)) we aren't actually multiplying
two random variables together, we're just squaring the numbers that are the
possible outcomes. More precisely, the square is defined as
.
The obvious next question is how does variance interact with addition?
If the random variables are independent (which means that the outcome of one
doesn't depend at all on the outcome of the other), then variance is also
additive, as we show below. Throughout the mu's with the subscript refer
to the expectation of X and Y.
Plugging into the formula, if X is the random variable corresponding to the
roll of 1 dice, then V(X) = 1/6(2.5^2+1.5^2+.5^2+.5^2+1.5^2+2.5^2) =
3 - 1/12. (Unfortunately it isn't an integer or very nice fraction.)
This gives us another way to test if a sequence of dice rolls is random. If
someone claimed they rolled 50 1's and 50 6's, then the sum of their rolls
would be 350, which is exactly the expected value of the sum of 100 dice
rolls. However, the variance would be 50*(6-3.5)^2 + 50*(1-3.5)^2 = 625,
but the variance of the random variable corresponding to the sum of 100
dice rolls would be 300 - 25/3, or about 292, which is much smaller. Now we
have two ways to tell if a sequence isn't random. Of course, these will
miss some ways that sequences can fail to be random, for example, the
sequence of rolls 1,2,3,4,5,6,1,2,3,4,5,6,... will appear to be random
using these tests. Of course, this sequence of rolls will be impossible to
reject as nonrandom by any test which doesn't depend on the order of the
rolls, since the distribution of the numbers is exactly what it should be.
We've talked about addition of random variables, but we haven't talked
about multiplying them by a scalar quantity (this would be like multiplying
the values on a dice by some fixed number). If c is a fixed number
and we write cX to denote this process, then we can figure out what
happens when we take the expectation and variance of cX.
Therefore, if instead of adding up 100 dice rolls we wanted to average
100 dice rolls, our expectation would remain the same but the variance would
decrease. Specifically, if X_i corresponds to the roll of dice number i,
then
is the average of 100 rolls, and E(A) = 3.5 and
V(A) = (3-1/12)/100.