In this episode, the most significant mathematics involved sabermetrics: methods of collecting and analyzing baseball statistics in order to evaluate players. There was also a discussion of recovering data from hard drives, and a reference to evaluating people's economic potential from variables in their lives in a way similar to sabermetrics.

There are many different possible ways to measure the performance of baseball
players. However, before we talk about these we should probably review the
rules of baseball briefly (since it is a very American game, some
foreigners aren't familiar with its rules, or with apple pie). Of course,
the rules described here are only a brief summary of the main ideas.
If you are already familiar with
these rules, you can skip to the next paragraph. In the game of baseball,
there are two teams, and the object of the game is for your team
to score more *runs*
than the other team. At each point in time there is one team on offense
trying to score runs, and the other team is on defense trying to prevent this.
There are 9 players on the defensive team, 3 in the *outfield*, 4 in the
*infield*, one pitcher, and one catcher. The field is approximately
a quarter of a circle with an approximately 400 foot radius. There are
4 *bases* arranged in a diamond pattern with *home base* at the
center of the circle which the field is a quarter of. The side length of the
diamond the bases are vertices of is 90 feet, and two of the edges lie on the
edges of the field. The offensive player who is *at bat* stands beside
home plate holding a wooden stick called a *bat* (which cannot be
filled with cork). The pitcher stands 60.5 feet away on a raised mound of dirt
and throws the *baseball* to the catcher, who is standing behind the
batter. The batter's goal is to hit the ball with the bat, and
the pitcher's goal is to prevent this. If the batter hits the ball, then
he gets to run towards *first base*, which is the closest base to
him in the counterclockwise direction, and
several things can happen. If a defender catches the ball before
it touches the ground, then the batter is *out*. Similarly, if a defender
touches first base while holding the ball before the batter does, the
batter is out. If the batter touches a base and stays there, he is
*safe*. Then he gets to stay on the base and the next batter in line
gets to bat. If an offensive player ever gets back to home base, then his
team scores a run, and if the defense gets 3 outs, then the teams switch
offensive and defensive roles. Also, when the pitcher is pitching, he must
throw the ball through a specified zone. If he doesn't and the batter does not swing, it is a *ball*,
and if he does, or if the batter swings and misses, then it is a *strike*. If there
are 4 balls, the batter automatically advances to first base, and if there
are 3 strikes, the defensive team gets an out. If the pitcher hits the batter
with the ball, the batter gets to go to first base. The game consists of 9
*innings*, each of which consists of each team playing offense and
defense once. At the end of the game, the team with the most runs wins
(and if at the end of 9 innings the score is tied, more innings are played).

Now that we know the important rules and terminology of baseball, we can start talking about how to measure the value of different players mathematically. Some obvious choices are the number of hits, the batting average (the fraction of a batter's at bats that result in a hit), the total number of bases a batter has attained by his own hitting, etc. However, these simple traditional measurements aren't necessarily all that well correlated with the actual performance of a team. In the last few decades, several different measurements have been devised that predict a team's success or a player's value more accurately.
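The traditional measurements just described are simple enough to compute by hand, but a short sketch makes the definitions concrete. The function names and the season numbers below are hypothetical, chosen only for illustration, and the sketch assumes the usual scoring of a single, double, triple, and home run as 1, 2, 3, and 4 bases.

```python
def batting_average(hits, at_bats):
    """Fraction of at bats that result in a hit."""
    if at_bats == 0:
        return 0.0
    return hits / at_bats


def total_bases(singles, doubles, triples, home_runs):
    """Bases attained by the batter's own hitting."""
    return singles + 2 * doubles + 3 * triples + 4 * home_runs


# Hypothetical season line, not taken from any real player.
singles, doubles, triples, home_runs = 100, 30, 5, 15
at_bats = 500
hits = singles + doubles + triples + home_runs

print(f"hits: {hits}")
print(f"batting average: {batting_average(hits, at_bats):.3f}")
print(f"total bases: {total_bases(singles, doubles, triples, home_runs)}")
```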

The famous
Babe Ruth has some of the top scores under these measurements.
His on-base percentage is second all-time at .4740 (Ted Williams is first
with .4817). His slugging percentage is first at .6898 (Ted Williams is
second with .6338). Combining these two, his on-base plus slugging is first
at 1.1638 (Ted Williams is second at 1.1155 and Barry Bonds is fourth at
1.0513).
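As a check on the figures quoted above, here is a minimal sketch of the three measurements. The text does not spell out the definitions, so the code assumes the standard ones: on-base percentage is times on base (hits, walks, and hit-by-pitch) divided by at bats plus walks, hit-by-pitch, and sacrifice flies, and slugging percentage is total bases divided by at bats. The function names are illustrative only.

```python
def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sacrifice_flies):
    """Assumed standard definition: times on base / plate chances."""
    chances = at_bats + walks + hit_by_pitch + sacrifice_flies
    if chances == 0:
        return 0.0
    return (hits + walks + hit_by_pitch) / chances


def slugging_percentage(singles, doubles, triples, home_runs, at_bats):
    """Assumed standard definition: total bases / at bats."""
    if at_bats == 0:
        return 0.0
    bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return bases / at_bats


def on_base_plus_slugging(obp, slg):
    """OPS is simply the sum of the two percentages."""
    return obp + slg


# Career figures quoted in the text.
ruth_ops = on_base_plus_slugging(0.4740, 0.6898)
williams_ops = on_base_plus_slugging(0.4817, 0.6338)

print(f"Ruth OPS: {ruth_ops:.4f}")
print(f"Williams OPS: {williams_ops:.4f}")
```

The two sums reproduce the on-base plus slugging values given in the text for Ruth and Williams.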

One example of a sabermetric that has been studied rigorously is the Pythagorean expectation.
The name of the metric comes from its similarity
to the Pythagorean formula, which says that if a right triangle has
side lengths *a,b* and hypotenuse length *c*, then *a^2 + b^2 =
c^2*. If *RS,RA* are the runs scored and runs allowed, respectively,
by some particular team in a season,
then the Pythagorean expectation formula states that the winning percentage
of the team is approximately *RS^2/(RS^2+RA^2)*. In practice, this
formula fits better if the exponent 2 is changed to about 1.81. At first glance
this formula seems as ad hoc as the formulas mentioned before, but Professor
Steven J. Miller has written a paper which gives a theoretical derivation of this formula assuming
that the runs for each team follow a Weibull
distribution.
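The Pythagorean expectation is easy to experiment with. The sketch below takes the exponent as a parameter, so the value 2 and the adjusted value 1.81 can be compared directly. The function name and the run totals are hypothetical and are used only for illustration.

```python
def pythagorean_expectation(runs_scored, runs_allowed, exponent=2.0):
    """Estimated winning percentage: RS^k / (RS^k + RA^k)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# Hypothetical team that scored 800 runs and allowed 700 in a season.
for k in (2.0, 1.81):
    estimate = pythagorean_expectation(800, 700, exponent=k)
    print(f"exponent {k}: estimated winning percentage {estimate:.3f}")
```

A team that scores exactly as many runs as it allows gets an estimate of .500 for any exponent, and lowering the exponent pulls every estimate towards .500.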

We should also discuss one of the statements in the episode which was somewhat misleading. In the show, the head researcher who was murdered at the beginning was supposedly developing a method of evaluating a person's economic potential, that is, their potential contribution to the economy and hence to society, using techniques similar to the baseball metrics we have been talking about. However, one of the reasons these metrics are useful in baseball is that a player will presumably play for many seasons, and his performance in one season should be similar to his performance in the next. The difficulty in applying these ideas to economics is that each person lives only one life, and accurately predicting an individual's later contributions from measurements of various quantities in their upbringing would be nearly impossible. Of course, one could in theory derive probability distributions for predicting a person's "contribution to the economy", but these distributions would likely have a much higher variance than the metrics for baseball, and would hence be much less useful when applied to individuals.

As you may know, hard drives are the main devices used for storing permanent data in personal computers. They have several platters, which are shaped like CDs and have many concentric rings of tiny segments of magnetic material. These tiny magnets are used to store a sequence of 0's and 1's, which are organized into groups of 8 bits, each of which is called a byte. Modern hard drives can hold around 100 gigabytes of information, which means they hold approximately 100 billion bytes.

During the episode, Charlie is asked to try to recover data that has been erased from the murder victim's hard drive. At first one might think this is impossible, since the data has been erased. However, when computers erase data, they don't actually write over it; they just erase the information about where the files start and stop. They only write over the space that the deleted files occupied if they need that space to store other files. This means that it is often possible to recover erased data. In addition, due to slight misalignments in the hard drive, each bit often retains a slight trace of its previous magnetic state. That means that even if the data has been written over, it may be possible to see what had been written there before, making recovery of some of the old data possible. However, this is very difficult in practice, and rarely leads to full recovery of the data. The lesson here is that if you really want to erase data, you should delete the files and then fill your hard drive up with superfluous files, delete these files and fill it up again, and repeat this process several times. This will practically ensure that the segments that held the data you want to erase will have been written over several times, which will make recovery of the old data extremely difficult at best.
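The difference between deleting a file and overwriting it can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the `ToyDisk` class, its block layout, and its free-list policy are invented for illustration and do not describe any real file system, and the model ignores the magnetic traces mentioned above.

```python
class ToyDisk:
    """A toy disk: raw blocks plus a directory saying where files live."""

    def __init__(self, num_blocks):
        self.blocks = [b""] * num_blocks      # raw storage
        self.directory = {}                   # file name -> block indices
        self.free = list(range(num_blocks))   # blocks available for reuse

    def write_file(self, name, chunks):
        if len(chunks) > len(self.free):
            raise ValueError("not enough free space")
        indices = []
        for chunk in chunks:
            index = self.free.pop(0)
            self.blocks[index] = chunk        # old contents are overwritten here
            indices.append(index)
        self.directory[name] = indices

    def delete_file(self, name):
        # Only the bookkeeping is removed; the block contents stay untouched.
        self.free.extend(self.directory.pop(name))

    def raw_scan(self):
        """What a recovery tool reading every block directly would see."""
        return [block for block in self.blocks if block]


disk = ToyDisk(4)
disk.write_file("notes", [b"top ", b"secret"])
disk.delete_file("notes")
after_delete = disk.raw_scan()        # the deleted data is still on the disk

disk.write_file("junk", [b"x"] * 4)   # fill the whole disk with filler
after_fill = disk.raw_scan()          # the old data has been written over

print(after_delete)
print(after_fill)
```

In this model the deleted data survives until the disk is filled, which is why filling the drive with superfluous files is what actually removes it.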