For further assistance, email libdata@uwindsor.ca or come to the data centre during open hours.
- Types of data
There are two main types of data you will be using. Time series data consists of repeated observations of the same unit, such as a country or region, over time - for example, Canada's annual GDP. Cross-sectional data covers multiple units at a single point in time - for example, a survey where 500 people were asked the same questions, or the population of 100 countries in 2003.
- How many observations
You need many observations to conduct an analysis. For a bivariate analysis (an analysis with two variables where you use one variable to predict the other) 30 observations may be enough. 20 is iffy, 10 ridiculous.
While it is mathematically possible to run statistical procedures with fewer observations, you are unlikely to get useful results, and your professor is unlikely to accept your paper.
With annual time series data it can be difficult to get enough observations - many time series don't go back more than 20 years or so. However, some time series are available monthly, giving you 120 observations in 10 years, which is plenty. With cross-sectional data you need enough units - data on Canada's 10 provinces would not be enough, but data on 100 Canadian cities would.
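The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the exact cut-offs between "may be enough", "iffy" and "too few" are an assumption based on the rough figures of 30, 20 and 10 given above, not a statistical rule.

```python
def observation_count(years, periods_per_year=1):
    """Number of observations in a complete series with no gaps."""
    return years * periods_per_year


def rough_verdict(n):
    """Rule-of-thumb verdict for a bivariate analysis.

    The thresholds are assumptions taken from the rough guidance
    in this guide (30 may be enough, 20 is iffy, 10 is too few).
    """
    if n >= 30:
        return "may be enough"
    if n >= 20:
        return "iffy"
    return "too few"


# 20 years of annual data versus 10 years of monthly data.
annual = observation_count(20)
monthly = observation_count(10, periods_per_year=12)

print(annual, rough_verdict(annual))    # 20 iffy
print(monthly, rough_verdict(monthly))  # 120 may be enough
```

The same check applies to cross-sectional data: count the units (provinces, cities, survey respondents) instead of the time periods.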
Frequently data - particularly international data - will have missing observations.
- Combining data
Sometimes you can combine data from different sources to get the variables you need. With time series data you need to make sure the time periods match up. With cross-sectional data on countries or regions this can also work, as long as the same countries or regions appear in each source. With survey data this is not possible, as there is no way to make different groups of people match up.
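As a minimal sketch of what "matching up" means, the Python below keeps only the years that appear in both of two series. The function name and the numbers are placeholders invented for this illustration, not real statistics; in practice you would load the values from the sources listed below.

```python
def combine_by_period(series_a, series_b):
    """Keep only the periods present in both series, pairing their values."""
    shared = sorted(set(series_a) & set(series_b))
    return {period: (series_a[period], series_b[period]) for period in shared}


# Placeholder values for illustration only, not real statistics.
gdp = {2001: 1.0, 2002: 1.1, 2003: 1.2, 2004: 1.3}
emissions = {2002: 50.0, 2003: 52.0, 2004: 51.0, 2005: 53.0}

combined = combine_by_period(gdp, emissions)

print(sorted(combined))  # [2002, 2003, 2004]
print(len(combined))     # 3
```

Note that the combined data has only three observations even though each source has four, so check the observation count again after combining. The same approach works for cross-sectional data if the keys are country or region names rather than years.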
- Environmental and scientific data
International Environmental Performance Index 2010.
All the environmental indicators you could possibly want, by country.
Canadian Energy Use Data from the Office of Energy Efficiency
U.S. Environmental Protection Agency data
Air and water quality, pollutant emissions, hazardous waste, etc.
Sunspots - Solar Influence Data Centre
- Automotive, transport, related
Car Fuel Economy Data from the U.S. Dept. of Energy
Ward's Key Automotive Data
Statistical Handbook - Canadian Association of Petroleum Producers
Aerospace Industries Association Data
Canadian Road Safety Statistics and Reports
- Finance and economic data
World Bank Development Data
Data on countries: economics, living standards, health
Economagic - Economic Time Series Data
U.S. Bureau of Economic Analysis Data
FRED - Federal Reserve Economic Data
Dow Jones Historic Indexes
Stock market data. Register (free) for access.
- Miscellaneous
Scientists and Engineers Statistical Data System
Employment, educational, and demographic characteristics of scientists and engineers in the United States.
CANSIM - Statistics Canada's Time Series database
Time series data, on Canada and provinces: economic, elections, social and health, and environmental sources. Be careful when using this data - some series have enough observations for an analysis, others do not.
Sports Statistics
Links from the American Statistical Association.
Canada's Best Places to Live
Data comparing 154 communities on a variety of factors.
UN Office on Drugs and Crime
International statistics on drugs, organized crime, human trafficking, etc.
Statistical Abstract of the United States
Lots of stuff on the U.S.: tax revenue, crime, sports, business and industry.
CDC Wonder
Data from the U.S. Centers for Disease Control and Prevention.
U.N. Databases
Assorted international data. Includes some environmental as well as industrial, economic and population.
Religion Data Archive
See particularly QuickStats and QuickLists
- Archives of datasets specifically prepared for teaching and learning
These datasets are usually ready-to-use but may be limited in scope. Most include some data with an engineering or experimental focus.
Time Series Data Library topics include electrical usage, manufacturing processes, chemistry, finance, etc.
UCI Machine Learning Repository includes the Challenger shuttle O-Ring failure data, web usage/tracking, robots, OCR
ICPSR Instructional Data Modules includes social, political, criminological and other data
The Data and Story Library from Carnegie Mellon
Australasian Data and Story Library with a focus on Asian and Australian data
Journal of Statistics Education Teaching Data Archive including the famous "cars" datasets
University of Massachusetts Amherst teaching data archive includes several medical experiment datasets
UCLA Statistics Data Sets has some prepared datasets under "Datasets for teaching"