Data Desk: First steps
Law of large numbers
1. Menu Manip, Generate random numbers
choose Bernoulli, prob. of success 0.5, 3 variables, 10 000 (or 100 000) cases.
2. Window appears, variables BTrials1, ..., BTrials3 appear.
3. Menu Manip, Transform, New derived variable
Window appears, write name (e.g. ave).
Window appears, write formula:
GetCase(CumSum('BTrials1'),'grid')/'grid'
4. Generate the variable 'grid' :
Menu Manip, generate patterned data.
Generate numbers from 1 to 10000 in steps of 1.
Variable appears, name it 'grid'.
5. Select the derived variable ave,
Menu plot, lineplots.
A plot (picture) appears. It shows how the cumulative average develops across the sample.
The picture resembles Brownian motion: wiggly at the beginning, then converging to the probability of success (here 0.5).
6. Look at the picture in detail. Expand the picture (magnifying lens), then drag it back into the center, magnify again, etc. The magnification works as follows: a mouse click with the cursor in the center of the picture zooms out (shrinks); a mouse click with the cursor in the outer region of the picture zooms in. You can see the "monster" in great detail but also as a whole. Each sample gives a different monster.
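
For readers without Data Desk at hand, the steps above can be sketched in Python. This is only an illustration, not Data Desk code: the function name running_average and the fixed seed are assumptions made for this sketch, and it uses the Python standard library only. It mirrors the derived variable ave, i.e. the cumulative sum of the Bernoulli trials divided by the case number (the role played by 'grid').

```python
import random
from itertools import accumulate


def running_average(n_cases=10_000, p=0.5, seed=1):
    """Return the cumulative average of n_cases Bernoulli(p) trials.

    Mirrors the derived variable 'ave': the cumulative sum of the trials
    divided by the case number (the role played by 'grid').
    """
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility
    trials = [1 if rng.random() < p else 0 for _ in range(n_cases)]
    cumsum = accumulate(trials)  # running count of successes
    return [total / k for k, total in enumerate(cumsum, start=1)]


if __name__ == "__main__":
    ave = running_average()
    # Print the running average at a few case numbers to see it settle.
    for k in (10, 100, 1_000, 10_000):
        print(k, round(ave[k - 1], 4))
```

Changing the seed corresponds to drawing a new sample (a different "monster"); plotting the returned list against the case number gives the same kind of line plot as in step 5.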
Central Limit theorem
(Unfortunately this does not work on the version of Data Desk in the Computer Lab.
The formula in 4. below has to be changed in some way to calculate binomial probabilities.
But it shows how to work with sliders.)
Window appears; name the new sliding variable p.
Another window appears with a picture: scale between 0 and 1.5,
Vertical axis at 1.
(This will be our variable probability of success).
Set upper bound 1 and lower bound 0, close dialog with OK.
In the "plot scale" dialog, set lower bound 0, upper bound none, but interval size 500.
(This will be our variable binomial sample size).
A window appears in which you can write a formula.
Write
N*BinomDistr(N*('grid'/10000),N,p)
(This calculates the binomial probabilities.)
Close this window by clicking on its upper right corner.
Write the formula
Sqrt(s*N)*BinomDistr((s*('grid'-5000)/10000)*(Sqrt(N*p*(1-p)))+N*p,N,p)
This also calculates binomial probabilities, but in the center of the distribution (around Np with an appropriate scale).
Close the formula window.
Also adjust the s slider (which is just the scale of the picture) to get a better view of the curve in ‘center’.
The curve ‘center’ approximates the normal density curve to a varying degree: the larger N is and the closer p is to ½, the better the approximation.
Watch the normal approximation break down as p approaches 0 or 1 or N becomes small.
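
The quality of the approximation can also be checked numerically outside Data Desk. The following is a minimal Python sketch under stated assumptions: it uses the textbook local form of the normal approximation (the binomial probability at k, multiplied by Sqrt(N*p*(1-p)), compared with the standard normal density at z = (k - N*p)/Sqrt(N*p*(1-p))). It is not claimed to reproduce the slider formula above term by term, and the names binom_pmf, normal_pdf and max_gap are made up for this illustration.

```python
import math


def binom_pmf(k, n, p):
    """Binomial probability P(X = k) for X ~ Bin(n, p), with 0 < p < 1.

    Computed on the log scale so that large n does not overflow.
    """
    log_pmf = (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log(1 - p)
    )
    return math.exp(log_pmf)


def normal_pdf(z):
    """Standard normal density at z."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)


def max_gap(n, p):
    """Largest absolute gap between the rescaled binomial probabilities
    and the standard normal density, over k = 0, ..., n."""
    sd = math.sqrt(n * p * (1 - p))
    gap = 0.0
    for k in range(n + 1):
        z = (k - n * p) / sd  # standardized position of k
        gap = max(gap, abs(sd * binom_pmf(k, n, p) - normal_pdf(z)))
    return gap


if __name__ == "__main__":
    # Small N, large N, and large N with p close to 0.
    for n, p in [(20, 0.5), (500, 0.5), (500, 0.02)]:
        print(n, p, max_gap(n, p))
```

Varying n and p in the last loop plays the role of moving the N and p sliders: the printed gap is a rough numerical stand-in for how far the curve is from the normal density.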