The Central Limit Theorem

The goal here is to observe the general central limit theorem in computer simulation.

I. Simulation Study

Let X be a uniform random variable on the interval [0, 1], and let Y be the random variable given by $ Y = \sqrt{X} $. To study the distribution of Y, first use simulation to generate one DataDesk variable XR containing 200 realizations of X. Then take the square root of XR to obtain a second DataDesk variable containing 200 realizations of the random variable Y. We will refer to this second variable as YR. By default, DataDesk will name it something like $\sqrt{Un1}$, which serves as a reminder of how you arrived at the values in YR.

  • Computer Hint:
    1. Generate the realizations of X, XR, using the menu entry Manip > Generate Random Numbers... Specify 1 variable with 200 cases, and select the uniform distribution.
    2. Create the realizations of Y, YR, from XR (Manip > Transform > Sqrt(y)).

  1. Plot the histogram of YR.
  2. Based on the histogram of YR, does Y appear to have a uniform distribution? Does Y appear to have a normal distribution? Why or why not? Do the heights of the histogram bars generally increase or decrease as the value of Y increases between 0 and 1? Explain intuitively in terms of the graph of $ y = \sqrt{x} $.
  3. Based on the simulation study (i.e. using the sample mean ‡ and sample standard deviation ‡ of YR), what is your estimate of the mean of Y and standard deviation of Y?
  4. The mean and standard deviation of Y are exactly $ \frac{2}{3} $ and $ \frac{1}{\sqrt{18}} $ respectively. Are the estimates in part (3) close to the exact values?
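For readers without DataDesk, the steps of Problem I can be sketched in Python using only the standard library. This is a minimal illustration, not part of the DataDesk workflow: the names `xr`, `yr`, and `counts`, the seed, and the choice of 10 histogram bins are all assumptions made for the example.

```python
import random
import statistics

rng = random.Random(171)  # arbitrary seed (an assumption) so the run is repeatable

# XR: 200 realizations of X, uniform on [0, 1)
xr = [rng.random() for _ in range(200)]
# YR: 200 realizations of Y = sqrt(X)
yr = [x ** 0.5 for x in xr]

mean_yr = statistics.mean(yr)
sd_yr = statistics.stdev(yr)  # sample standard deviation (n - 1 divisor)

# Crude text histogram with 10 equal-width bins covering [0, 1]
counts = [0] * 10
for y in yr:
    counts[min(int(y * 10), 9)] += 1

print(f"mean(YR) = {mean_yr:.4f}   exact = {2 / 3:.4f}")
print(f"SD(YR)   = {sd_yr:.4f}   exact = {18 ** -0.5:.4f}")
for i, c in enumerate(counts):
    print(f"{i / 10:.1f}-{(i + 1) / 10:.1f} {'#' * c}")
```

The printed bars play the role of the histogram in part (1), and the two summary lines correspond to the comparison asked for in parts (3) and (4).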

II. Central Limit Theorem

For the rest of the lab you will be filling in the table below:

| n | exact mean | mean(YBAR) (From Simulation) | exact SD | SD(YBAR) (From Simulation) | exact variance | variance(YBAR) (From Simulation) |
|---|---|---|---|---|---|---|
| 1 | $ \frac{2}{3} $ | | $ \frac{1}{\sqrt{18}} $ | | $ (\frac{1}{\sqrt{18}})^2 = \frac{1}{18} $ | |
| 5 | | | | | | |
| 10 | | | | | $ \frac{1}{10^2} \cdot ( 10 \cdot \frac{1}{18} ) $ | |
| 40 | | | | | | |

Here we have $ Y_1, Y_2, \dots, Y_n $ following the same distribution as Y above with $ \bar{Y} = \frac{Y_1 + Y_2 + ... + Y_n}{n} $ and we want to study the distribution of $ \bar{Y} $.
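The exact entries in the table follow from the usual rules for the mean and variance of an average. The sketch below assumes that $ Y_1, \dots, Y_n $ are independent, which the handout does not state explicitly but which matches the way the variables are simulated:

$$
E[\bar{Y}] = \frac{1}{n} \sum_{i=1}^{n} E[Y_i] = \frac{2}{3},
\qquad
\mathrm{Var}(\bar{Y}) = \frac{1}{n^2} \sum_{i=1}^{n} \mathrm{Var}(Y_i) = \frac{1}{n^2} \cdot \left( n \cdot \frac{1}{18} \right) = \frac{1}{18n},
\qquad
\mathrm{SD}(\bar{Y}) = \frac{1}{\sqrt{18n}} .
$$

Setting $ n = 10 $ in the variance expression reproduces the entry already filled in for that row of the table.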

  • Be careful! As you look at each different sample size, be sure to:
    1. Generate the appropriate number of Uniform Random Variables, x
    2. Take their Square Roots to arrive at a corresponding number of Random Variables y
    3. Create a new derived variable for $ \bar{Y} $ based on the relevant sample size.
    4. Plot your histogram
    5. Calculate your summary statistics.
    6. Write values in the table above.

  1. For n = 5 what are the exact values of the mean and standard deviation of $ \bar{Y} $ ?
    For n = 5, generate 200 realizations of $ \bar{Y} $. Computer Hint: this means that you first have to generate 5 variables (each with 200 cases) of realizations of $ Y_1, Y_2, \dots, Y_5 $ using the method of Problem I. After you have generated these 5 variables, average them as follows. First use the menu entry Data > New > Derived Variable to write a formula, and assign the name YBAR to this variable. If your 5 variables with the same distribution as Y are named $\sqrt{Un1}, \dots, \sqrt{Un5} $, then enter $ \frac{( \sqrt{Un1} + \sqrt{Un2} + \sqrt{Un3} + \sqrt{Un4} + \sqrt{Un5} )}{5} $ as the defining formula of the derived variable YBAR. A way to do this with less typing is to select the 5 icons $\sqrt{Un1}, \dots, \sqrt{Un5} $ and drag them into the window where YBAR is being defined. This gives a list of these variables separated by commas. Now edit the commas, changing them to plus signs. This technique will really help in Problem III below, where you are asked to do this for 40 variables! An alternative is suggested at the end of the assignment.
  2. Based on your simulation of YBAR, estimate the mean ‡ and the variance ‡ of YBAR. How do they compare to the exact values in (1)?
  3. Plot the histogram ‡ of YBAR. Does the distribution look normal?
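The n = 5 simulation can also be sketched outside DataDesk. The Python below is an illustration under stated assumptions: `simulate_ybar` is a hypothetical helper name, the seed is arbitrary, and the exact values are computed from the independence-based formulas (mean 2/3, variance 1/(18n)).

```python
import random
import statistics

def simulate_ybar(n, reps=200, seed=0):
    """Return `reps` realizations of YBAR, each the average of n values of sqrt(U)."""
    rng = random.Random(seed)  # arbitrary seed (an assumption) for repeatability
    return [sum(rng.random() ** 0.5 for _ in range(n)) / n for _ in range(reps)]

n = 5
ybar = simulate_ybar(n)

sim_mean = statistics.mean(ybar)
sim_var = statistics.variance(ybar)  # sample variance (n - 1 divisor)

exact_mean = 2 / 3
exact_var = 1 / (18 * n)  # assumes Y_1, ..., Y_n are independent

print(f"n = {n}: mean(YBAR) = {sim_mean:.4f} (exact {exact_mean:.4f})")
print(f"n = {n}: variance(YBAR) = {sim_var:.5f} (exact {exact_var:.5f})")
```

Calling `simulate_ybar(10)` or `simulate_ybar(40)` gives the corresponding samples for the other rows of the table.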

III. Larger n

Redo Problem II for n = 10 and n = 40, and fill in the rest of the table above. (You need to redo all three parts of Problem II for n = 10 and n = 40.)

For n = 40, what does the Central Limit Theorem say that the approximate distribution of YBAR is? What are the mean and standard deviation of this approximating distribution?
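One rough way to judge how well a normal curve describes YBAR at n = 40 is to compare the fraction of simulated values within one standard deviation of the mean against the fraction a normal distribution predicts. The sketch below is only an illustration: the seed and variable names are assumptions, and the mean 2/3 and SD $ 1/\sqrt{18 \cdot 40} $ rely on the $ Y_i $ being independent.

```python
import random
from statistics import NormalDist

rng = random.Random(40)  # arbitrary seed (an assumption)
n, reps = 40, 200

# 200 realizations of YBAR, each the average of 40 values of sqrt(U)
ybar = [sum(rng.random() ** 0.5 for _ in range(n)) / n for _ in range(reps)]

mu = 2 / 3
sigma = (1 / (18 * n)) ** 0.5  # assumes independent Y_i

# Normal distribution with the same mean and SD as YBAR
approx = NormalDist(mu, sigma)

within = sum(abs(v - mu) <= sigma for v in ybar) / reps
expected = approx.cdf(mu + sigma) - approx.cdf(mu - sigma)

print(f"simulated fraction within 1 SD: {within:.3f}")
print(f"normal fraction within 1 SD:    {expected:.3f}")
```

A single summary number like this complements the histogram; it does not replace looking at the shape.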

Note: to select 40 icons, you can use your mouse. Start at the leftmost icon and, while holding the mouse button down, drag a rectangle touching all 40 icons to be included in the definition of YBAR. You will have to change 39 commas to plus signs to build this average. You can change them one at a time in DataDesk, but a better way is to copy and paste the list into a text editor with search and replace. Do the search and replace, then copy and paste the result back into DataDesk. When you copy and paste into your text editor, you may find that the √ symbol becomes something else. Don't worry about it. It should return to its original form when pasted back into DataDesk after the replacement.
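The search-and-replace step can be illustrated with a few lines of Python. This is a hypothetical sketch: it assumes the pasted list separates names with a comma and a space and that the variables are named √Un1 through √Un40, neither of which is guaranteed to match what DataDesk actually produces.

```python
# Hypothetical variable names; "\u221a" is the square-root symbol
names = [f"\u221aUn{i}" for i in range(1, 41)]

# What the dragged-in list might look like (assumed separator: ", ")
pasted = ", ".join(names)

# Replace each of the 39 separators with a plus sign and divide by 40
formula = "(" + pasted.replace(", ", " + ") + ") / 40"

print(formula)
```

Any text editor's replace-all command does the same job; the point is only that one replacement turns the comma-separated list into a sum.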

Note:

An alternative is to generate 200 variables of 40 cases each. All 200 variables are selected at this point. Use Manip > Transform > $ \sqrt{} $, then Calculate Summaries as Variables. A new window opens with variables such as mean, standard deviation, and count, depending on the summary statistics requested. Plot a histogram of the mean.
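The alternative layout, with one variable per realization of YBAR, can be sketched in Python as follows. The names `variables`, `means`, `sds`, and `counts` are assumptions chosen to mirror the summary variables described above, and the seed is arbitrary.

```python
import random
import statistics

rng = random.Random(2007)  # arbitrary seed (an assumption)

# 200 variables, each holding 40 cases of sqrt(U)
variables = [[rng.random() ** 0.5 for _ in range(40)] for _ in range(200)]

# One summary value per variable, analogous to "summaries as variables"
means = [statistics.mean(v) for v in variables]
sds = [statistics.stdev(v) for v in variables]
counts = [len(v) for v in variables]

# `means` now holds 200 realizations of YBAR for n = 40
print(f"mean of means = {statistics.mean(means):.4f}")
print(f"SD of means   = {statistics.stdev(means):.4f}")
```

Here the 200 entries of `means` are the values to plot in the histogram.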

This exercise is designed to illustrate ideas from Chapter 18 of Stats: Data and Models by De Veaux, Velleman, and Bock.

The density curve of $ Y = \sqrt{X} $ is $ f(y) = 2y, 0 \le y \le 1 $ .
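The density and the exact mean and standard deviation quoted in Problem I can be checked with a short calculation, assuming X is uniform on [0, 1]. For $ 0 \le y \le 1 $:

$$
P(Y \le y) = P(\sqrt{X} \le y) = P(X \le y^2) = y^2
\quad\Longrightarrow\quad
f(y) = \frac{d}{dy}\, y^2 = 2y .
$$

$$
E[Y] = \int_0^1 y \cdot 2y \, dy = \frac{2}{3},
\qquad
E[Y^2] = \int_0^1 y^2 \cdot 2y \, dy = \frac{1}{2},
$$

$$
\mathrm{Var}(Y) = E[Y^2] - (E[Y])^2 = \frac{1}{2} - \frac{4}{9} = \frac{1}{18},
\qquad
\mathrm{SD}(Y) = \frac{1}{\sqrt{18}} \approx 0.2357 .
$$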

To Turn in:

  • Please print out the results marked with ‡.
  • Hand in your completed assignment when your TA asks for it in lab next week.

Revision: LabOnTheCentralLimitTheorem - r1.15 13 Mar 2007 - 18:43 - Dick Furnas