Stats.LabWhenceData r1.8 - 03 Apr 2007 - 20:26 Dick Furnas


Whence Data?

Applications of statistics typically involve analyzing data in an attempt to glean some understanding. What are the data telling us? Are the data themselves any good? If not, where are the problems? These are all great questions, but they require not only that you have the data in hand but also that the data be in a form you can analyze with the tools available to you. In this lab you will look at several useful techniques for coaxing data from various sources into DataDesk. You will then take a quick look at some of the exploratory tools that DataDesk provides. The lab is intended as a set of examples and a quick reference rather than a contemplative assignment to be written up and handed in. You will be familiar with some of these techniques from earlier work in the course. As the course continues, however, you will need to know how to do these things, so view the items below as a checklist of skills to review and develop.

  • In particular, you will:
    • enter data directly in DataDesk
    • gather some data from a web page
    • transfer the data to an Excel spreadsheet
    • massage the data into clean columns
    • export the data to a text file
    • import the data into DataDesk
    • sanity check the data
    • generate summaries of data
    • generate histograms
    • generate boxplots
    • generate a timeplot (scatterplot)
    • do a Layout of some kind
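
For readers who want to see the "massage, export, sanity check, summarize" steps spelled out, here is a minimal sketch in Python. It is not part of the lab, which uses Excel and DataDesk; Python is swapped in purely for illustration. The pasted text, the column names, and the numbers are all invented for the example, and the helper functions are hypothetical names, not anything DataDesk or Excel provides.

```python
import csv
import io
import statistics

# Invented example of text copied from a web page: ragged spacing,
# thousands separators, and one missing value. Not real data.
RAW = """Year   Count
2001   1,200
2002   1,350
2003   n/a
2004   1,275
"""

def clean_rows(raw):
    """Split lines on whitespace; strip commas; turn non-numbers into None."""
    lines = [ln for ln in raw.splitlines() if ln.strip()]
    header = lines[0].split()
    rows = []
    for ln in lines[1:]:
        cleaned = []
        for field in ln.split():
            try:
                cleaned.append(int(field.replace(",", "")))
            except ValueError:
                cleaned.append(None)  # e.g. "n/a"
        rows.append(cleaned)
    return header, rows

def to_tab_delimited(header, rows):
    """Render the cleaned table as tab-delimited text, blanks for missing."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    for row in rows:
        writer.writerow(["" if v is None else v for v in row])
    return buf.getvalue()

def sanity_check(header, rows):
    """Report rows with the wrong number of fields or a missing value."""
    problems = []
    for i, row in enumerate(rows, start=1):
        if len(row) != len(header):
            problems.append(f"row {i}: expected {len(header)} fields, got {len(row)}")
        if any(v is None for v in row):
            problems.append(f"row {i}: missing value")
    return problems

header, rows = clean_rows(RAW)
text = to_tab_delimited(header, rows)
problems = sanity_check(header, rows)

# Summarize the second column, skipping rows flagged as missing.
values = [r[1] for r in rows if len(r) == len(header) and r[1] is not None]
summary = {
    "n": len(values),
    "min": min(values),
    "median": statistics.median(values),
    "max": max(values),
}

print(text)
print(problems)
print(summary)
```

The point of the sketch is the order of operations: clean first, write a plain delimited text file second, and check for missing or malformed rows before trusting any summary or plot.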

Much of the emphasis here is on working with data from the web.

Assignment:

  1. Work through the exercises, which continue at:
  2. Turn in your layout on paper in lab.
  3. Send an email from your Cornell netID to web171@ref2.net with:
    1. a URL of an interesting data set you may have encountered in your web meanderings, together with a brief comment.
    2. any spreadsheet tips you'd like to pass along (optional)

For your amusement...

here are finds from web meanderings of previous students:
URL Comment
http://bbr.icc.utexas.edu/index.php?pid=100&pub=115 Historical crime rates for selected cities
http://digitalcommons.uconn.edu/ctiwr_specreports/36/ Precipitation/Snowfall in CT
http://dpb.cornell.edu/F_Undergraduate_Enrollment.htm http://dpb.cornell.edu/documents/1000178.pdf Undergraduate Enrollment
http://lib.stat.cmu.edu/DASL/Datafiles/cigcancerdat.html Cigarettes sold and Cancer Death rates.
http://lib.stat.cmu.edu/DASL/Datafiles/cigcancerdat.html ibid
http://lib.stat.cmu.edu/datasets/ Assorted Data Sets. Arsenic in wells vs. toenails.
http://lib.stat.cmu.edu/datasets/baseball.data  
http://lib.stat.cmu.edu/datasets/bodyfat  
http://mathforum.org/workshops/sum96/data.collections/datalibrary/data.set6.html Numerous data sets
http://nber.org/oww/ Wage data for 161 occupations x 150 Countries
http://tkyte.blogspot.com/2006/08/interesting-data-set.html search hits
http://www.agingstats.gov/update2006/spreadsheets.html Population 65 and over by county, from a US government website.
http://www.bom.gov.au/cgi-bin/wrap_fwo.pl?IDD60029.html Weather in Australia
http://www.census.gov/cgi-bin/ipc/idbsum.pl?cty=US Age Distribution of US Population
http://www.census.gov/cgi-bin/ipc/pcwe US Census world population growth estimates
http://www.census.gov/ipc/www/idbsum.html Growth Rate in China
http://www.census.gov/ipc/www/worldpop.html US Census world population growth rate
http://www.netcraft.com website statistics
  Health Insurance Coverage by age
http://mathforum.org/workshops/sum96/data.collections/datalibrary/data.set6.html Assorted Data Sets, one about American Colleges particularly interesting.
http://www.earth.rochester.edu/ees207/Mass_Ext/higgins_mass2.html Extinction Rates vs. Time
http://www.ers.usda.gov/Data/FoodConsumption/FoodAvailSpreadsheets.htm#fruitveg  
* US Census Bureau

