Course Guide - data for 78-632, Quantitative Studies

Types of data

There are two main types of data you will be using. Time series data consists of repeated observations on a single unit, such as a country or region, over time - for example, Canada's annual GDP. Cross-sectional data consists of observations on multiple units at a single point in time - for example, a survey in which 500 people were asked the same questions, or the populations of 100 countries in 2003.

How many observations

You need many observations to conduct an analysis. For a bivariate analysis (an analysis with two variables, where you use one variable to predict the other), 30 observations may be enough. 20 is iffy, 10 ridiculous.

With annual time series data it can be difficult to get enough observations - many time series don't go back more than 20 years or so. However, some time series are available monthly, giving you 120 observations in 10 years, which is plenty. With cross-sectional data you need enough units - data on Canada's 10 provinces would not be enough, but data on 100 Canadian cities would.
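The arithmetic above can be checked with a short sketch. The function name n_observations and the frequency labels are illustrative assumptions, not part of any course software, and the count assumes no periods are missing.

```python
# Observations available per year at each (assumed) reporting frequency.
PERIODS_PER_YEAR = {"annual": 1, "quarterly": 4, "monthly": 12}

def n_observations(years, frequency):
    """Return the number of observations in a series with no gaps."""
    return years * PERIODS_PER_YEAR[frequency]

print(n_observations(20, "annual"))   # 20 observations: iffy
print(n_observations(10, "monthly"))  # 120 observations: plenty
```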

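Data - particularly international data - will frequently have missing observations, so the number of observations you can actually use may be smaller than the number of units or time periods listed.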
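One way to see how missing observations affect your count is to keep only the units that have a value for every variable you plan to use. The sketch below is a minimal illustration: the records, the variable names x and y, and the helper complete_cases are all made up for the example, and None stands in for a missing value.

```python
# Hypothetical cross-sectional records; None marks a missing observation.
records = [
    {"unit": "A", "x": 1.0, "y": 2.0},
    {"unit": "B", "x": None, "y": 3.5},
    {"unit": "C", "x": 4.2, "y": None},
    {"unit": "D", "x": 0.7, "y": 1.1},
]

def complete_cases(rows, variables):
    """Keep only rows that have a value for every listed variable."""
    return [r for r in rows if all(r.get(v) is not None for v in variables)]

usable = complete_cases(records, ["x", "y"])
print(len(records), "units listed,", len(usable), "usable")
```

Here only two of the four units could be used in a bivariate analysis of x and y, which is the count to compare against the guidelines above.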

Combining data

Sometimes you can combine data from different sources to get the variables you need. With time series data you need to make sure the time periods match up. Combining can also work with cross-sectional data on countries or regions, as long as both sources cover the same units.
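As a rough sketch of what matching up time periods means, the example below combines two annual series and keeps only the years that appear in both. The series names, the values, and the helper merge_on_period are hypothetical and stand in for whatever two sources you are combining.

```python
# Two hypothetical annual series from different sources, keyed by year.
series_a = {2000: 1.0, 2001: 1.2, 2002: 1.5, 2003: 1.4}
series_b = {2001: 10.0, 2002: 11.0, 2003: 12.5, 2004: 13.0}

def merge_on_period(a, b):
    """Pair up values for the periods that both series cover."""
    common = sorted(set(a) & set(b))
    return [(period, a[period], b[period]) for period in common]

merged = merge_on_period(series_a, series_b)
for year, value_a, value_b in merged:
    print(year, value_a, value_b)
```

Only 2001 to 2003 survive the merge, so combining sources can leave you with fewer observations than either source has on its own. The same idea applies to cross-sectional data if you key on the country or region name instead of the year.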

Canadian data

  • CANSIM - Statistics Canada's Time Series database
    Time series data on Canada and the provinces, covering economic, electoral, social, health, and environmental topics. Be careful when using this data - some series have enough observations for an analysis, others do not.
  • Canada's Best Places to Live
    Data comparing 154 communities on a variety of factors.
