Applications of statistics typically involve the analysis of data in an attempt to glean some
understanding. What are the data telling us? Are the data themselves any good? If not, where
are the problems? These are all great questions, but answering them requires not only that you have the
data in hand but also that the data be in a form you can analyze with the tools
available to you. In this lab you will look at several useful techniques for coaxing data from
various sources into DataDesk. You will then take a quick look at some of the exploratory tools
that DataDesk provides. The lab is intended as a set of examples and a quick reference rather than a
contemplative assignment to be written up and handed in. You will be familiar with some of the
techniques used here from earlier work in the course. As the course continues, however, you will need to know
how to do these things, so view the items below as a checklist of skills to review and develop.
In particular, you will:
- enter data directly in DataDesk
- gather some data from a web page
- transfer the data to an Excel spreadsheet
- massage the data into clean columns
- export the data to a text file
- import the data into DataDesk
- sanity check the data
- generate summaries of data
- generate histograms
- generate boxplots
- generate a timeplot (scatterplot)
- do a Layout of some kind
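If it helps to see the middle of that checklist in one place, here is a minimal Python sketch of the massage, export, sanity-check, and summarize steps. It is not part of the lab, which uses Excel and DataDesk for these steps. The rows in `RAW_ROWS` are made up for illustration, the function names are hypothetical, and the quartile rule used here may differ from the one DataDesk applies when it draws a boxplot.

```python
import csv
import io
import statistics

# Hypothetical rows (invented for illustration) as they might look after
# pasting a web table into a spreadsheet: stray spaces, thousands
# separators, and one missing value.
RAW_ROWS = [
    ["Year", "Count"],
    [" 1995 ", "1,200"],
    ["1996", " 1,350"],
    ["1997", "n/a"],
    ["1998", "1,500 "],
    ["1999", "1,425"],
]


def clean_rows(rows):
    """Strip whitespace, drop thousands separators, skip unparseable rows."""
    header = [name.strip() for name in rows[0]]
    cleaned = []
    for year_text, count_text in rows[1:]:
        try:
            year = int(year_text.strip())
            count = int(count_text.strip().replace(",", ""))
        except ValueError:
            continue  # e.g. "n/a"; in real work, note which cases were dropped
        cleaned.append((year, count))
    return header, cleaned


def to_tab_delimited(header, rows):
    """Return the table as tab-delimited text, one case per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def five_number_summary(values):
    """Min, Q1, median, Q3, max: the values a boxplot is built from."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


header, rows = clean_rows(RAW_ROWS)
text = to_tab_delimited(header, rows)
counts = [count for _, count in rows]
summary = five_number_summary(counts)

# Sanity check: every raw row should be either kept or knowingly dropped.
dropped = len(RAW_ROWS) - 1 - len(rows)
print(text)
print("cases kept:", len(rows), "dropped:", dropped)
print("five-number summary:", summary)
```

The tab-delimited text is the kind of file you would save from the spreadsheet before importing it. Counting kept and dropped cases is the simplest sanity check: if the numbers surprise you, look at the raw data again before making any plots.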
Much of the emphasis here is on working with data from the web.
Assignment:
- Work through the exercises which continue at:
- Turn in your layout on paper in lab.
- Send an email from your Cornell netID to web171@ref2.net with:
  - a URL of an interesting data set you may have encountered in your web meanderings, together with a brief comment.
  - any spreadsheet tips you'd like to pass along (optional)
For your amusement...
here are finds from web meanderings of previous students: