Olympic Long Jumps

The modern Olympic Games, a modified revival of the Ancient Greek Olympian Games, were inaugurated in 1896, largely through the efforts of French sportsman and educator Baron Pierre de Coubertin. Since then, the Games have been held nearly every four years at various sites around the world, and have become a major international athletic competition.

During the past century, performances in the Olympic events have improved dramatically. In this assignment, you will examine the men's long jump. You will look at historical trends, explain unusual results, construct a good linear model, and then make predictions and interpret the results. The step-by-step instructions follow. Include the items marked with ‡ in your Layout for printing.

1. Datafile

Open the datafile. You should see these variables of interest:

  • year - the years the Games were held, with 1900 = 0 (so the first modern Olympics have year = -4; the 2000 Olympics would have year = 100).
  • dist - the gold-medal winning jumps, in inches.
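
For readers who want to check the coding outside DataDesk, here is a minimal Python sketch of the two conventions above. The function names are illustrative assumptions and are not part of the datafile; the only facts used are the 1900 = 0 coding and the standard conversion of 0.0254 meters per inch.

```python
def to_coded_year(calendar_year):
    """Convert a calendar year to the datafile's coding, where 1900 = 0."""
    return calendar_year - 1900


def to_calendar_year(coded_year):
    """Convert a coded year from the datafile back to a calendar year."""
    return coded_year + 1900


def inches_to_meters(inches):
    """Convert a jump recorded in inches to meters (1 inch = 0.0254 m)."""
    return inches * 0.0254


if __name__ == "__main__":
    # The two examples given in the variable description.
    print(to_coded_year(1896))
    print(to_coded_year(2000))
    # A jump of 336.6 inches expressed in meters, rounded to centimeters.
    print(round(inches_to_meters(336.6), 2))
```

Keeping the conversion in one place helps later, when the model is asked for predictions by calendar year rather than by coded year.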

2. History

First, look at the history of this event.

  • Select dist as the response variable Y and year as the explanatory variable X.
  • Plot the Scatterplot with the regression line ‡ (in the hypermenu, Add Regression Line). Describe what you see about the association.
  • Again using the hypermenu, do the Regression ‡ of dist vs year.
  • In the hypermenu for the regression analysis, Compute the Residuals.
  • Plot the Scatterplot ‡ of residual(Y) vs year(X), and Add Regression Line ‡.
  • Note that the residuals plot shows some interesting patterns and gaps. Drawing upon your knowledge of the history of the 20th Century, explain why the residuals plot looks like that. Also point out any outliers.
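
The regression and residual steps above can also be sketched by hand. This is a minimal illustration of an ordinary least-squares line and its residuals, not a description of how DataDesk computes them internally. The names fit_line and residuals are assumptions, and the demo numbers are made up for illustration; they are not Olympic results.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations in x, and sum of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def residuals(xs, ys, intercept, slope):
    """Return observed minus fitted value for each data point."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# MADE-UP illustrative numbers, NOT values from the Olympic datafile.
demo_year = [0, 4, 8, 12, 16]
demo_dist = [10.0, 12.5, 13.5, 16.5, 17.5]

demo_intercept, demo_slope = fit_line(demo_year, demo_dist)
demo_resid = residuals(demo_year, demo_dist, demo_intercept, demo_slope)

if __name__ == "__main__":
    print("intercept:", round(demo_intercept, 3), "slope:", round(demo_slope, 3))
    for year, r in zip(demo_year, demo_resid):
        print(year, round(r, 3))
```

Least-squares residuals always sum to zero when the line includes a constant term, so the regression line added to a residuals plot is flat. What matters in that plot is the pattern of the points around it: gaps, runs on one side, and isolated outliers.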

3. Model Considerations

Your goal is to create a good model for predicting Olympic long jump performances for the near future. Good models should be based on relevant data. Your historical analysis probably suggests that it would be unwise to use all the data for the first 100 years of the modern Games to predict the results for the early 21st century. Think about what part of the data you would consider most relevant.

4. Model Building

Now create the model (find the equation of the line of best fit) based upon your chosen data points.

  • Under Modify, show the Tools. Select the "lasso" from the upper left corner of the tools palette. Holding the mouse button down, draw a loop encircling the points you want to use. (If you make a mistake, or want to change your mind later, you can Modify Selection → Clear at any time.) Your chosen points should now be highlighted.
  • Under Modify Selection you want to Assign Selector. This creates a new variable that indicates which years' data you will use. A "selector button" will appear in the lower left corner of the DataDesk window. Be sure it is "on" (highlighted black). Now reselect dist and year as Y and X, and Plot a new Scatterplot from your chosen data.
  • Do a regression analysis and make a residuals plot for your chosen data. Do you think you have a good model? If so, continue with the rest of the lab. If not, clear this selection and trash the selector, then make a different choice of data and try again. You need to find a model you are satisfied with, and be able to justify your choice of which data to use. On the one hand is your desire to make the model really good by using only data that accurately represents typical contemporary performances. On the other hand is the scientifically indefensible practice of ignoring data just because you don't like it. A model that works well must both look good and take into account the reasonable variability and the trend seen in these performances. Be assured that there is no "right answer". Different people will make different decisions.
  • When you are satisfied, include your model's scatterplot (with line) ‡, regression analysis ‡, and residuals plot ‡ in your Layout.
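
The selector idea above can be sketched as a list of true/false flags, one per case, that decides which points enter the fit. This is a rough analogy to the DataDesk selector, not its implementation. The names apply_selector and fit_line, the cutoff of coded year 50, and the demo numbers are all hypothetical; the numbers are made up and are not Olympic results.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def apply_selector(xs, ys, selector):
    """Keep only the (x, y) pairs whose selector flag is True."""
    kept_x = [x for x, keep in zip(xs, selector) if keep]
    kept_y = [y for y, keep in zip(ys, selector) if keep]
    return kept_x, kept_y


# MADE-UP illustrative numbers, NOT values from the Olympic datafile.
demo_year = [0, 10, 20, 50, 60, 70]
demo_dist = [100.0, 101.0, 102.0, 110.0, 112.0, 114.0]

# Hypothetical choice: use only cases from coded year 50 onward.
selector = [year >= 50 for year in demo_year]
sub_year, sub_dist = apply_selector(demo_year, demo_dist, selector)

full_intercept, full_slope = fit_line(demo_year, demo_dist)
sub_intercept, sub_slope = fit_line(sub_year, sub_dist)

if __name__ == "__main__":
    print("all data:     ", round(full_intercept, 3), round(full_slope, 4))
    print("selected data:", round(sub_intercept, 3), round(sub_slope, 4))
```

Fitting both the full data and the selected subset side by side makes it easier to justify the choice: you can see how much the slope and constant term change when earlier results are left out.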

5. Model Justification

Justify and analyze your model.
  • What data points did you choose to base your model on? Why?
  • Do you think your model works well? Why?
  • What is the model; that is, what is the equation of the line of best fit? (The constant term and the slope for the equation of the line are the coefficients displayed in the bottom left corner of the regression analysis.)
  • What does the slope mean in this context?
  • What does the value of R-Squared mean in this context?
  • What does your model "predict" for the gold medal jump in the 2000 Sydney Games?
    • The actual Sydney jump was 336.6 inches. Comment.
  • What does your model "predict" for the gold medal jump in the 2004 Games?
    • The actual jump was 338.2 inches. Comment.
  • What does your model predict for the gold medal jump in the 2008 Games?
    • Comment on your faith in this prediction.
  • Predict the winning distance for the Olympics at the end of this century, in 2100.
    • Comment on your faith in this prediction.
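
As a rough aid for the questions above, the sketch below shows how a fitted line turns a calendar year into a prediction, how the prediction error against a known result is computed, and how R-Squared is defined as the fraction of variation in dist explained by the line. The intercept and slope are HYPOTHETICAL placeholders, so substitute the coefficients from your own regression analysis. The function names are assumptions; the two actual jumps are the ones quoted in this assignment.

```python
def predict(intercept, slope, calendar_year):
    """Predicted jump in inches for a calendar year, using the 1900 = 0 coding."""
    return intercept + slope * (calendar_year - 1900)


def r_squared(ys, fitted):
    """Fraction of the variation in ys explained by the fitted values."""
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_resid = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    return 1 - ss_resid / ss_total


# HYPOTHETICAL coefficients for illustration only; replace with your own model.
intercept, slope = 300.0, 0.35

# Actual gold-medal jumps (inches) quoted in the assignment text.
actual = {2000: 336.6, 2004: 338.2}

if __name__ == "__main__":
    for year, observed in actual.items():
        guess = predict(intercept, slope, year)
        # Positive error means the athlete jumped farther than the model predicted.
        print(year, "predicted:", round(guess, 1), "error:", round(observed - guess, 1))
    # Extrapolations: 2100 lies far beyond any data the line was fitted to.
    for year in (2008, 2100):
        print(year, "predicted:", round(predict(intercept, slope, year), 1))
```

The arithmetic is identical for 2008 and 2100, which is the point of the last two questions: the formula will return a number for any year, so your confidence has to come from how far the year lies from the data the model was built on.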

The End!

Your completed assignment is due in lab next week.

Revision: LabOnOlympicLongJumps - r1.10 04 Feb 2007 - 22:28 - Dick Furnas