Data for 85-222: Treatment of Experimental Data

For further assistance, email libdata@uwindsor.ca or come to the data centre during open hours.

Types of data

There are two main types of data you will be using. Time series data consists of repeated observations on a single unit, such as a country or region, over time: for example, Canada's annual GDP. Cross-sectional data consists of observations on multiple units at the same point in time: for example, a survey in which 500 people were asked the same questions, or the population of 100 countries in 2003.
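As a rough illustration of the difference, the sketch below lays out each type as a small Python dictionary. All names and values are hypothetical placeholders chosen for this example, not real statistics.

```python
# Placeholder values only; these are NOT real statistics.

# Time series: ONE unit observed repeatedly, so observations are keyed by period.
time_series = {
    2001: 1.0,
    2002: 1.1,
    2003: 1.3,
}

# Cross-section: MANY units observed once, so observations are keyed by unit.
cross_section = {
    "Country A": 5.2,
    "Country B": 31.0,
    "Country C": 0.8,
}

# In both layouts, the number of observations is the number of entries.
n_time_series = len(time_series)
n_cross_section = len(cross_section)
print(n_time_series, "periods;", n_cross_section, "units")
```

In the first layout a longer series means more periods; in the second, more observations means more units.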

How many observations

You need many observations to conduct an analysis. For a bivariate analysis (an analysis with two variables where you use one variable to predict the other) 30 observations may be enough. 20 is iffy, 10 ridiculous.

While it is mathematically possible to run statistical procedures with fewer observations, you are unlikely to get useful results, and your professor is unlikely to accept your paper.

With annual time series data it can be difficult to get enough observations - many time series don't go back more than 20 years or so. However, some time series are available monthly, giving you 120 observations in 10 years, which is plenty. With cross-sectional data you need enough units - data on Canada's 10 provinces would not be enough, but data on 100 Canadian cities would.
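The arithmetic above can be sketched in a few lines of Python. The threshold of 30 is the rule of thumb mentioned earlier for a bivariate analysis, treated here as a hard cutoff purely for illustration; the function and variable names are made up for this example.

```python
# Rule of thumb from the text for a bivariate analysis (an assumption, not a law).
MIN_OBSERVATIONS = 30


def time_series_observations(years, periods_per_year):
    """Observations in a complete series: one per period in each year."""
    return years * periods_per_year


def enough(n, minimum=MIN_OBSERVATIONS):
    """True if n observations meet the rule-of-thumb minimum."""
    return n >= minimum


annual_20 = time_series_observations(20, 1)    # 20 years of annual data
monthly_10 = time_series_observations(10, 12)  # 10 years of monthly data
provinces = 10                                 # cross-section of provinces
cities = 100                                   # cross-section of cities

for label, n in [
    ("annual, 20 years", annual_20),
    ("monthly, 10 years", monthly_10),
    ("10 provinces", provinces),
    ("100 cities", cities),
]:
    print(f"{label}: {n} observations, enough: {enough(n)}")
```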

Data, particularly international data, frequently has missing observations, so check how many observations are actually present before relying on a series.
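One way to see the effect of missing observations is to count only the rows where both variables are present. The sketch below assumes missing values are stored as None and uses made-up numbers; real files may mark missing values differently (blank cells, "..", -999, and so on).

```python
def usable_pairs(x, y):
    """Keep only the positions where both variables are present (not None)."""
    return [(a, b) for a, b in zip(x, y) if a is not None and b is not None]


# Made-up example values; None marks a missing observation.
x = [1.0, None, 3.0, 4.0, None, 6.0]
y = [2.0, 2.5, None, 4.5, 5.0, 6.5]

pairs = usable_pairs(x, y)
print(len(x), "rows,", len(pairs), "usable observations")
```

Here six rows yield only three usable observations, because a row is lost whenever either variable is missing.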

Combining data

Sometimes you can combine data from different sources to get the variables you need. With time series data you need to make sure the time periods match up. With cross-sectional data on countries or regions you can also combine sources, as long as the same units appear in each. With survey data this is not possible, as there is no way to make different groups of people match up.
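A minimal sketch of matching two sources, assuming each one is stored as a dictionary keyed by year (for time series) or by country or region name (for cross-sectional data). The function name and all values are hypothetical.

```python
def combine_by_key(first, second):
    """Join two {key: value} datasets, keeping only keys present in both."""
    shared = sorted(set(first) & set(second))
    return {key: (first[key], second[key]) for key in shared}


# Made-up annual series from two hypothetical sources, keyed by year.
source_a = {2000: 10.0, 2001: 11.0, 2002: 12.0, 2003: 13.0}
source_b = {2002: 0.5, 2003: 0.6, 2004: 0.7}

combined = combine_by_key(source_a, source_b)
print(len(combined), "matching periods:", sorted(combined))
```

Only the periods that appear in both sources survive, so the combined dataset can be much shorter than either source. Two surveys of different people have no shared key to join on, which is why this approach does not work for them.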

Environmental and scientific data

International Environmental Performance Index 2010.
All the environmental indicators you could possibly want, by country.

Canadian Energy Use Data from the Office of Energy Efficiency

U.S. Environmental Protection Agency data
Air and water quality, pollutant emissions, hazardous waste, etc.

Sunspots - Solar Influence Data Centre

Automotive, transport, related

Car Fuel Economy Data from the U.S. Dept. of Energy

Ward's Key Automotive Data

Statistical Handbook - Canadian Association of Petroleum Producers

Aerospace Industries Association Data

Canadian Road Safety Statistics and Reports

Finance and economic data

World Bank Development Data
Data on countries: economics, living standards, health

Economagic - Economic Time Series Data

U.S. Bureau of Economic Analysis Data

FRED - Federal Reserve Economic Data

Dow Jones Historic Indexes
Stock market data. Register (free) for access.

Miscellaneous

Scientists and Engineers Statistical Data System
Employment, educational, and demographic characteristics of scientists and engineers in the United States.

CANSIM - Statistics Canada's Time Series database
Time series data, on Canada and provinces: economic, elections, social and health, and environmental sources. Be careful when using this data - some series have enough observations for an analysis, others do not.

Sports Statistics
Links from the American Statistical Association.

Canada's Best Places to Live
Data comparing 154 communities on a variety of factors.

UN Office on Drugs and Crime
International statistics on drugs, organized crime, human trafficking, etc.

Statistical Abstract of the United States
Lots of stuff on the U.S.: tax revenue, crime, sports, business and industry.

CDC Wonder
Data from the U.S. Centers for Disease Control and Prevention.

U.N. Databases
Assorted international data. Includes some environmental as well as industrial, economic and population.

Religion Data Archive
See particularly QuickStats and QuickLists

Archives of datasets specifically prepared for teaching and learning

These datasets are usually ready-to-use but may be limited in scope. Most include some data with an engineering or experimental focus.

Time Series Data Library topics include electrical usage, manufacturing processes, chemistry, finance, etc.

UCI Machine Learning Repository includes the Challenger shuttle O-Ring failure data, web usage/tracking, robots, OCR

ICPSR Instructional Data Modules includes social, political, criminological and other data

The Data and Story Library from Carnegie Mellon

Australasian Data and Story Library with a focus on Asian and Australian data

Journal of Statistics Education Teaching Data Archive including the famous "cars" datasets

University of Massachusetts Amherst teaching data archive includes several medical experiment datasets

UCLA Statistics Data Sets has some prepared datasets under "Datasets for teaching"


Your Contact

Research Data Services Coordinator
(519) 253-3000 ext.3858
Leddy Library 1104 - Main