Stats.LabOnConfidenceIntervals r1.7 - 04 Feb 2007 - 22:29 Dick Furnas


Confidence Intervals

We often need to know things about populations, but can seldom examine every individual. Instead we gather information from samples, often small, with the hope the sample will tell us something about the population. However, natural variability among samples (called sampling error) makes any guess about the population uncertain.

Confidence intervals allow us to make inferences about population parameters based upon sample statistics. We do not pretend that we will know an exact value. Instead we settle for being fairly confident that the population value lies within a (hopefully small) range. In completing this assignment you will examine how well a confidence interval succeeds, or doesn't. Here are the step-by-step instructions. (Again, the ‡ highlights things to include in your Layout for printing.)

1. Datafile

Open the HeartAttacks data file. Remember this from Lab 3? The variable cost contains the dollar amount hospitals billed each heart attack patient in New York State some years ago. These are the data from every individual in the population. Understand that you seldom have such a complete data set.

  • Plot a Histogram ‡ for this population. Does it appear to be normally distributed?
  • Calculate Summaries ‡. How many NY patients were there? What was the population mean cost of treatment? What was the population standard deviation?
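
For readers who want to see what the summaries compute, here is a minimal Python sketch of the same three quantities. The cost values are simulated, not the HeartAttacks data: the list `cost`, the seed, the lognormal shape, and the size of 12000 (chosen so that 60 patients is 0.5%) are all assumptions made for illustration, and `population_summary` is a hypothetical helper name.

```python
import random
import statistics

# Hypothetical stand-in for the cost variable. These are simulated,
# right-skewed values, NOT the real HeartAttacks data.
rng = random.Random(171)
cost = [rng.lognormvariate(9.0, 0.6) for _ in range(12000)]

def population_summary(values):
    """Return (count, mean, population standard deviation)."""
    n = len(values)
    mu = statistics.fmean(values)
    # pstdev divides by N (whole population), not N - 1 (sample).
    sigma = statistics.pstdev(values)
    return n, mu, sigma

n, mu, sigma = population_summary(cost)
print(f"N = {n}, mean = {mu:.2f}, SD = {sigma:.2f}")
```

Because the data file holds every patient, the population form of the standard deviation is the appropriate one here; with a sample you would normally use `statistics.stdev` instead.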

2. Estimate

Now, pretend that you do not actually know the population mean. Instead, suppose you need to estimate the typical patient cost for an insurance company, or a congressional committee drafting health care legislation. To produce this estimate, you will collect data from a random sample of the patients. Let's examine how well this process might work. We'll start with sample size n=60.

  • Select the cost icon as Y, then choose Manip > Sample. Remember that even though the sample size n is actually the important value for us here, DataDesk asks you to specify the percentage of the population to be chosen. Select 20 samples of size 60 (0.5%); do NOT create sample indices.
  • Using your chosen samples, estimate the population mean by creating a 75% confidence interval for each of the 20 samples. Use the population standard deviation you found in part 1. (In DataDesk, use Calc > Estimate... . You can do all 20 cases at once if you select them all first. Be sure to select z-interval, enter the standard deviation, and set the individual confidence level to 75%.) Print Results ‡.
  • Examine the list of intervals. How many of them successfully capture the actual population mean? Why does this sample-and-make-an-interval approach not always work?
  • Is the sample size big enough for inference?
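
The sample-and-make-an-interval process in this part can also be sketched outside DataDesk. The Python below draws 20 simple random samples of size 60, builds a z-interval (sample mean ± z*·σ/√n, with σ known) for each, and counts how many intervals contain the population mean. The population is simulated as in any stand-in example, so the count it prints says nothing about the real data; the function names `z_interval` and `count_captures` are hypothetical.

```python
import math
import random
import statistics
from statistics import NormalDist

def z_interval(sample, sigma, level):
    """Confidence interval for the mean when the population SD is known."""
    z_star = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z_star * sigma / math.sqrt(len(sample))
    xbar = statistics.fmean(sample)
    return xbar - margin, xbar + margin

def count_captures(population, n, level, n_samples, rng):
    """Count how many of n_samples intervals contain the population mean."""
    mu = statistics.fmean(population)
    sigma = statistics.pstdev(population)
    hits = 0
    for _ in range(n_samples):
        sample = rng.sample(population, n)  # drawn without replacement
        low, high = z_interval(sample, sigma, level)
        if low <= mu <= high:
            hits += 1
    return hits

# Simulated, right-skewed population (an assumption, not the lab's data).
rng = random.Random(171)
population = [rng.lognormvariate(9.0, 0.6) for _ in range(12000)]

hits = count_captures(population, n=60, level=0.75, n_samples=20, rng=rng)
print(f"{hits} of 20 intervals captured the population mean")
```

Changing the seed changes which samples are drawn, which is a quick way to see that the number of successful intervals itself varies from one run to the next.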

3. Estimate further

Using the same samples, create 90% confidence intervals ‡ and 95% confidence intervals ‡. Compare the three sets of intervals. Which were most successful in correctly estimating the population mean? When we request greater confidence, how does the margin of error change?
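
To see how the confidence level alone drives the margin of error, the sketch below holds the sample size and standard deviation fixed and varies only the level. The value of `sigma` is a made-up placeholder rather than the standard deviation from part 1, and `critical_z` and `margin_of_error` are hypothetical helper names.

```python
import math
from statistics import NormalDist

def critical_z(level):
    """z* such that the central `level` of a standard normal lies within +/- z*."""
    return NormalDist().inv_cdf(0.5 + level / 2)

def margin_of_error(sigma, n, level):
    """Half-width of a z-interval for the mean with known sigma."""
    return critical_z(level) * sigma / math.sqrt(n)

sigma = 10000.0  # hypothetical population SD in dollars, not the lab's value
n = 60
for level in (0.75, 0.90, 0.95):
    m = margin_of_error(sigma, n, level)
    print(f"{level:.0%} confidence: z* = {critical_z(level):.3f}, margin = {m:.0f}")
```

Only z* changes between the three lines of output, so any difference in interval width across the three sets comes from that multiplier.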

4. Estimate from a Larger Sample

Now reselect the initial population, and create a new set of 95% confidence intervals ‡ based on much larger samples of around n = 600 (5%). Compare these to the old set of 95% confidence intervals. What is better about the new intervals?
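
As a rough guide to what to expect, the margin of error m of a z-interval depends on the sample size only through the square root of n. The short derivation below compares n = 600 with n = 60 at the same confidence level; it treats the population as large relative to the sample and so ignores any finite-population adjustment, which is an assumption of this sketch.

```latex
m = z^{*}\,\frac{\sigma}{\sqrt{n}}
\qquad\Longrightarrow\qquad
\frac{m_{600}}{m_{60}} = \sqrt{\frac{60}{600}} = \frac{1}{\sqrt{10}} \approx 0.316
```

In other words, ten times as much data shrinks the interval to a bit less than one third of its former width, not to one tenth.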

5. Higher Confidence

We could create 99% confidence intervals. What are the advantages and disadvantages of doing that? (Note that you do not need to actually create them.)

The End!

Your completed assignment is due in lab next week.

