Numb3rs 501: High Exposure

In this episode, two climbers are found dead in a national park, one with a large, high-value uncut diamond hidden in his pack. A report comes in that a diamond shipment has gone missing en route, and the team discovers that it was supposed to be delivered by a charter plane that left one airport but never arrived at its destination. Without transponder or radar information to help determine the flight path, the possible crash area is too big to cover in the time they have. The task is to determine a probabilistic flight path: to narrow the search by finding, for each possible route, the probability that the plane took it, given a set of conditions. These conditions could include the topography of the land, the weather, and whether the pilot wanted to minimize distance or take the scenic route.

Let’s take our map and overlay a grid, and just consider paths along the edges of the grid as a model. For each vertex, labeled by a letter and a number, we assign a probability to each edge leaving the vertex.

If we sum the probabilities on all the edges leaving a vertex, what should they add up to? Why?

To get the probability of a given path, we take the product of the probabilities of each step.

EXAMPLE

Consider the path C3 → C2 → C1 → B1 → A1. Then the probability that the plane took this path is

p_C3__C2__C1__B1__A1= p_C3__C2
*p_C2__C1* p_C1__B1* p_B1__A1

= (50%)(50%)(100%)(100%)

= 25%

So there is a 25% probability that the plane took this path.
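The product rule can also be checked with a few lines of code. This is only a minimal sketch: the four edge probabilities are the ones used in the worked example, while the names `edge_prob` and `path_probability` are made up for illustration.

```python
from math import prod

# Edge probabilities taken from the worked example (written as fractions of 1).
edge_prob = {
    ("C3", "C2"): 0.5,
    ("C2", "C1"): 0.5,
    ("C1", "B1"): 1.0,
    ("B1", "A1"): 1.0,
}

def path_probability(path, edge_prob):
    """Multiply the probabilities of each consecutive step along the path."""
    return prod(edge_prob[(a, b)] for a, b in zip(path, path[1:]))

example = path_probability(["C3", "C2", "C1", "B1", "A1"], edge_prob)
print(f"{example:.0%}")  # 25%
```

To try the other paths, add the remaining edge probabilities from the grid to `edge_prob` and change the list of vertices.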

For each of the following paths, calculate the probability that the plane took that route:

C3 → C2 → B2 → B1 → A1

C3 → C2 → B2 → A2 → A1

C3 → B3 → B2 → B1 → A1

C3 → B3 → B2 → A2 → A1

C3 → B3 → A3 → A2 → A1

Assume that we know the plane arrived, that it made it to A1. Then there were 6 possible paths that it could have followed. What is the probability that the plane is at A1?
What can you guess this means about the sum of the probabilities of all the possible routes? Check your answer.
Let’s think mathematically about why this would be true: Assuming that we know the plane landed at its destination, what is the probability of it being at A1?
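One way to check the answer is to list every route from C3 to A1 and add up the route probabilities. The sketch below assumes a symmetric grid: 50/50 wherever the plane has two choices and 100% where only one move remains. Only the four values from the worked example are confirmed by the text; the rest are an assumption, and should be replaced with the numbers on your own grid. The names `EDGES` and `all_paths` are illustrative.

```python
# ASSUMPTION: only C3->C2, C2->C1, C1->B1 and B1->A1 come from the worked
# example. Every other value is a guess at a symmetric grid.
EDGES = {
    "C3": {"C2": 0.5, "B3": 0.5},
    "C2": {"C1": 0.5, "B2": 0.5},
    "B3": {"B2": 0.5, "A3": 0.5},
    "C1": {"B1": 1.0},
    "B2": {"B1": 0.5, "A2": 0.5},
    "A3": {"A2": 1.0},
    "B1": {"A1": 1.0},
    "A2": {"A1": 1.0},
    "A1": {},  # destination: no outgoing edges
}

def all_paths(start, goal, edges):
    """Yield (path, probability) for every route from start to goal."""
    stack = [([start], 1.0)]
    while stack:
        path, p = stack.pop()
        if path[-1] == goal:
            yield path, p
            continue
        for nxt, step_p in edges[path[-1]].items():
            stack.append((path + [nxt], p * step_p))

routes = list(all_paths("C3", "A1", EDGES))
for path, p in sorted(routes):
    print(" -> ".join(path), p)
print("number of routes:", len(routes))
print("total probability:", sum(p for _, p in routes))
```

Because every vertex hands all of its probability on to its outgoing edges, the total does not depend on the particular values chosen, as long as the edges leaving each vertex sum to 1.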

Let us generalize this to say that the probability of the plane passing through a given point is the sum of the probabilities of all the paths leading to that point.

What is the probability that the plane passed through C2? B3? If we knew that the plane had only flown one distance unit, where would we want to start our search?
What is the probability that the plane passed through B2? C1? A3? If we knew that the plane had flown two distance units, where would we want to start our search?
Repeat the question for 3 distance units.
Repeat the question for 4 distance units.
In the episode, the team had no information about how long the plane was in the air. Assuming the plane was equally likely to have flown 1, 2, or 3 distance units, which location has the highest probability overall?
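The questions above can be explored by pushing the probability forward one step at a time, which gives the same result as summing over all paths that lead to each point. The sketch below again assumes the symmetric 50/50 grid (an assumption, apart from the four edges in the worked example); `location_after` and `overall` are illustrative names.

```python
from collections import defaultdict

# ASSUMPTION: symmetric grid; only the edges along C3->C2->C1->B1->A1 are
# taken from the worked example.
EDGES = {
    "C3": {"C2": 0.5, "B3": 0.5},
    "C2": {"C1": 0.5, "B2": 0.5},
    "B3": {"B2": 0.5, "A3": 0.5},
    "C1": {"B1": 1.0},
    "B2": {"B1": 0.5, "A2": 0.5},
    "A3": {"A2": 1.0},
    "B1": {"A1": 1.0},
    "A2": {"A1": 1.0},
    "A1": {},
}

def location_after(steps, edges, start="C3"):
    """Probability of being at each vertex after exactly `steps` moves."""
    dist = {start: 1.0}
    for _ in range(steps):
        nxt = defaultdict(float)
        for vertex, p in dist.items():
            for neighbor, step_p in edges[vertex].items():
                nxt[neighbor] += p * step_p  # add up every way of arriving
        dist = dict(nxt)
    return dist

by_distance = {n: location_after(n, EDGES) for n in (1, 2, 3)}

# Equal weight on 1, 2, or 3 distance units, as in the last question.
overall = defaultdict(float)
for dist in by_distance.values():
    for vertex, p in dist.items():
        overall[vertex] += p / len(by_distance)

for vertex, p in sorted(overall.items(), key=lambda kv: -kv[1]):
    print(vertex, round(p, 4))
```

Changing the tuple `(1, 2, 3)` to `(1, 2, 3, 4)` includes the case where the plane flew the full distance.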

In the show, the team actually has even more information. Before going missing, the climbers had talked about a great new cliff they had found. The team has a topographical map from the U.S. Geological Survey, and together with the climbers’ location and reasonable hiking distance, they can narrow down where this new cliff might be. If they can narrow down the path that the plane took, then they have a better chance of finding it, along with hopefully more evidence that will lead them to the killers.

Now let us incorporate more information into our model. Let’s say that at the time the plane left the airport, there was a thunderstorm at C1, so moving toward that vertex is less desirable. Suppose we also read off the USGS map that the land around A3 is mostly flat and probably doesn’t have any great climbing cliffs. While this wouldn’t necessarily have been a factor in the pilot’s choice, we know that he went down near a great climbing cliff, so a path through that region is less probable. Note that the choices are no longer symmetric!

Repeat the above exercise to determine where the plane most likely went down in this scenario.
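One possible way to build the new, asymmetric grid is to scale down the edges that lead into the less likely vertices and then rescale so that the edges leaving each vertex still sum to 1. Everything numeric in this sketch is hypothetical: the symmetric baseline is an assumption, and the penalty factors 0.2 and 0.5 are invented purely for illustration. Use the probabilities from your own grid if they differ.

```python
from collections import defaultdict

# ASSUMPTION: symmetric baseline grid (only four edges come from the example).
BASE = {
    "C3": {"C2": 0.5, "B3": 0.5},
    "C2": {"C1": 0.5, "B2": 0.5},
    "B3": {"B2": 0.5, "A3": 0.5},
    "C1": {"B1": 1.0},
    "B2": {"B1": 0.5, "A2": 0.5},
    "A3": {"A2": 1.0},
    "B1": {"A1": 1.0},
    "A2": {"A1": 1.0},
    "A1": {},
}

# HYPOTHETICAL penalty factors: storm at C1, flat land around A3.
PENALTY = {"C1": 0.2, "A3": 0.5}

def reweight(edges, penalty):
    """Scale edges into penalized vertices, then renormalize each vertex."""
    adjusted = {}
    for vertex, out in edges.items():
        scaled = {n: p * penalty.get(n, 1.0) for n, p in out.items()}
        total = sum(scaled.values())
        adjusted[vertex] = {n: p / total for n, p in scaled.items()}
    return adjusted

def location_after(steps, edges, start="C3"):
    """Probability of being at each vertex after exactly `steps` moves."""
    dist = {start: 1.0}
    for _ in range(steps):
        nxt = defaultdict(float)
        for vertex, p in dist.items():
            for neighbor, step_p in edges[vertex].items():
                nxt[neighbor] += p * step_p
        dist = dict(nxt)
    return dist

adjusted = reweight(BASE, PENALTY)
for n in (1, 2, 3, 4):
    print(n, location_after(n, adjusted))
```

Renormalizing after scaling is a design choice that keeps each vertex's outgoing probabilities summing to 1, so the rest of the exercise works unchanged.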

We can see that the more information we have about the situation, the better our model and our guesses become. This is important when we have finite resources, such as only a few field agents to cover the terrain, not enough helicopters, and limited search time. We also have to balance the number of grid lines against the information we have: with more grid lines we can localize our probabilities better, but it may be harder to determine what those probabilities should be.