One important goal in statistics is to summarize data. Here we explore several important summary numbers as well as graphical presentations which allow direct visual comparison of data sets.
Read the Reference file in the data and then inspect the four DataDesk variables Soprano, Alto, Tenor and Bass contained within the Heights icon.
Make a histogram ‡ for the Alto variable (Plot → Histograms). Estimate the median height of the altos graphically from the histogram and mark your estimate on it. Describe how you made this estimate. Is the distribution skewed to the right or to the left?
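Note: One way to reason about the graphical estimate is that the median splits the total area of the bars in half. The Python sketch below (not part of DataDesk) applies that idea: it walks across the bins until half the cases have been counted and interpolates within that bin. The function name, bin edges and counts are all hypothetical and are not the real Alto data.

```python
def median_from_histogram(bin_edges, counts):
    """Estimate the median from histogram bins by linear interpolation.

    bin_edges has one more entry than counts; bin i spans
    bin_edges[i] to bin_edges[i + 1].
    """
    total = sum(counts)
    half = total / 2
    cumulative = 0
    for i, count in enumerate(counts):
        if count > 0 and cumulative + count >= half:
            # Fraction of the way through this bin where the halfway point falls
            fraction = (half - cumulative) / count
            width = bin_edges[i + 1] - bin_edges[i]
            return bin_edges[i] + fraction * width
        cumulative += count
    raise ValueError("histogram has no cases")


# Hypothetical bin edges (inches) and bar heights, NOT the real Alto data
edges = [60, 62, 64, 66, 68, 70]
counts = [4, 10, 12, 6, 3]
estimate = median_from_histogram(edges, counts)
print(round(estimate, 2))
```

The interpolation assumes the cases are spread evenly within each bin, so the result is only an estimate, just like the one you mark by eye.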
Collect the statistics ‡ to fill in the following table. Use Calc → Calculation Options → Select Summary Statistics to customize which values will appear and then Calc → Summary → Reports to actually see the values.
| | Standard Deviation | Interquartile Range | Mean | Median |
| --- | --- | --- | --- | --- |
| Soprano | | | | |
| Alto | | | | |
| Tenor | | | | |
| Bass | | | | |
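Note: To see how the four table entries are defined, here is a minimal Python sketch using the standard `statistics` module. The `summarize` helper and the sample values are hypothetical (they are not the singers' heights). Quartile conventions differ between software packages, so an interquartile range computed this way may not match the DataDesk report exactly.

```python
import statistics


def summarize(values):
    """Return the four summary numbers asked for in the table."""
    # quantiles(n=4) returns the three cut points Q1, Q2 (median), Q3
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "sd": statistics.stdev(values),      # sample standard deviation
        "iqr": q3 - q1,                      # interquartile range
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


# Hypothetical heights in inches, NOT the real data
sample = [60, 62, 63, 64, 65, 66, 68]
summary = summarize(sample)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Comparing the mean with the median, and the standard deviation with the interquartile range, is one way to see how sensitive each summary is to extreme values.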
Plot ‡ the box plots side by side for the four groups. To get your box plots, append the data for each singing part as described below, select Heights as y and Parts as x, and then choose Plot → boxplot y by x.
Note: Create an Append relation by selecting all four icons in the Heights folder at once and then choosing Manip → Append and Make Group Variable. A new window opens with two icons, data and group. Rename these to Heights and Parts. The boxplot can then be made with Heights selected as y and Parts as x.
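Note: The Append step stacks the four variables into one long column of measurements and builds a matching column of group labels. The Python sketch below illustrates that layout only; it is not how DataDesk implements the command. The `append_groups` helper and the values are hypothetical.

```python
def append_groups(groups):
    """Stack several unequal-length groups into two parallel lists."""
    heights = []  # every measurement, one group after another
    parts = []    # the group label for the matching entry in heights
    for part_name, values in groups.items():
        heights.extend(values)
        parts.extend([part_name] * len(values))
    return heights, parts


# Hypothetical heights in inches, NOT the real data; group sizes are unequal
groups = {
    "Soprano": [62, 64, 65],
    "Alto": [63, 66],
    "Tenor": [68, 69, 70, 71],
    "Bass": [70, 72],
}
heights, parts = append_groups(groups)
for height, part in zip(heights, parts):
    print(part, height)
```

Because every height carries its own label, the groups no longer need the same number of cases, which is what makes a y by x boxplot possible.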
Note: You don't need to do it on this problem, but you could also generate these boxplots individually by selecting one of the variables inside Heights (e.g. Soprano) and then choosing Plot → Boxplot Side by Side. However, you cannot use this command after selecting all four variables because the variables have different numbers of cases. That's why we used the approach above using an Append relation with Plot → boxplot y by x.
Note: Your textbook has further discussion of boxplots, medians and quartiles.