Stats.LabOnSummariesAndHistograms r1.13 - 22 Feb 2007 - 15:04 Dick Furnas


Summaries and Histograms

One important goal in statistics is to summarize data. Here we explore several important summary numbers as well as graphical presentations which allow direct visual comparison of data sets.


Preparation:

Retrieve the Singers data set.

Read the Reference file in the data set, then inspect the four DataDesk variables Soprano, Alto, Tenor and Bass contained within the Heights icon.

Problem 1:

Make a histogram ‡ for the Alto variable (Plot > Histograms). Estimate the median height of the altos graphically from the histogram. Mark your estimate on the histogram. Describe how you made this estimate. Is the distribution skewed to the right or to the left?
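
One way to reason about a graphical median estimate is to find the bin where the running count first reaches half the total, then interpolate inside that bin. The Python sketch below illustrates that idea only. The function name, bin edges and counts are hypothetical and are not taken from the Singers data; it also assumes heights are spread evenly within each bin.

```python
def median_from_histogram(bin_edges, counts):
    """Estimate the median from histogram bins.

    Finds the bin where the cumulative count first reaches half the
    total, then interpolates linearly inside that bin (assumes values
    are spread evenly within a bin).
    """
    total = sum(counts)
    half = total / 2
    cumulative = 0
    for i, count in enumerate(counts):
        if count > 0 and cumulative + count >= half:
            left, right = bin_edges[i], bin_edges[i + 1]
            fraction = (half - cumulative) / count
            return left + fraction * (right - left)
        cumulative += count
    raise ValueError("histogram has no counts")


# Hypothetical bin edges (inches) and counts, NOT the Singers data.
edges = [60, 62, 64, 66, 68, 70]
counts = [2, 6, 10, 5, 1]
print(median_from_histogram(edges, counts))
```

On paper the same steps apply: count the cases bar by bar from the left until you pass the halfway point, then place your mark part of the way across that bar.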

Problem 2:

Collect the statistics ‡ to fill in the following table. Use Calc > Calculation Options > Select Summary Statistics to choose which values will appear, and then Calc > Summary > Reports to display the values.

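           Standard     Interquartile
           Deviation    Range            Mean      Median
Soprano    ________     ________         ________  ________
Alto       ________     ________         ________  ________
Tenor      ________     ________         ________  ________
Bass       ________     ________         ________  ________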
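
For readers who want to check by hand what the four columns in the table above measure, the following Python sketch computes them with the standard library. The `summarize` helper and the sample values are hypothetical and are not the Singers data. Quartile conventions differ between packages, so the interquartile range computed here may not match DataDesk's value exactly.

```python
import statistics


def summarize(values):
    """Return the four summary numbers asked for in the table."""
    # statistics.quantiles with n=4 returns the three quartile cut points;
    # it uses Python's default "exclusive" method, which is an assumption
    # here and may differ from the rule DataDesk applies.
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "sd": statistics.stdev(values),      # sample standard deviation
        "iqr": q3 - q1,                      # interquartile range
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


# Hypothetical heights in inches, NOT the Singers data.
sample = [60, 62, 63, 64, 64, 65, 66, 68]
for name, value in summarize(sample).items():
    print(name, value)
```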

Problem 3:

Plot ‡ the box plots side by side for the four groups. To get your box plots, append the data for each singing part as described below, select Heights as y and Parts as x, and then choose Plot > Boxplot y by x.


Note: Create an Append relation by selecting all four icons in the Heights folder at once and then choosing Manip > Append and Make Group Variable. A new window opens with the icons data and group. Rename these to Heights and Parts. The boxplot can then be made with Heights selected as y and Parts as x.

Note: You don't need to do this for this problem, but you could also generate the boxplots individually by selecting one of the variables inside Heights (e.g. Soprano) and then choosing Plot > Boxplot Side by Side. However, you cannot use this command after selecting all four variables, because the variables have different numbers of cases. That is why we used the approach above, an Append relation with Plot > Boxplot y by x.
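
The effect of the Append relation can be pictured as stacking the four columns into one long column of values plus a parallel column of group labels, which is what lets groups of different sizes share one plot. The Python sketch below mimics that reshaping. It is an illustration only, not DataDesk's implementation; the function name and the values are hypothetical.

```python
def append_with_group(groups):
    """Stack several unequal-length groups into two parallel lists.

    Returns (heights, parts): every value in one list and, at the same
    position in the other list, the name of the group it came from.
    """
    heights, parts = [], []
    for name, values in groups.items():
        heights.extend(values)
        parts.extend([name] * len(values))
    return heights, parts


# Hypothetical values with different numbers of cases per group,
# NOT the Singers data.
groups = {
    "Soprano": [62, 64, 65],
    "Bass": [70, 72],
}
heights, parts = append_with_group(groups)
for part, height in zip(parts, heights):
    print(part, height)
```

Because each height carries its own label, the two stacked lists always have the same length even when the original groups do not.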


Problem 4:

  1. Classify these four groups of singers into two pairs so that within each pair the mean heights are similar. Would your conclusion be different if you were to use the median instead of the mean?
  2. Using the standard deviation, compare the spreads of the four groups. Now use the interquartile ranges to compare the spreads. Explain whether or not these two approaches to measuring spread are consistent.
  3. Hypothesize about why the four groups of singers break up into two pairs as you noted in part 1. Hint: Read the Reference file again.

Note: Your textbook has further discussion of boxplots, medians and quartiles.

To Turn In:

  • Please print out results marked with ‡.
  • Turn in your work in lab next week.



