Computer Science Data

Data for Computer Scientists to play with

Any machine-readable data can be used as fodder for computer programming. This page merely attempts to list some datasets of particular interest to computer science researchers: data on computer hardware performance and reliability, data on networks, datasets developed to investigate problems in computer science and artificial intelligence, and so on. Some of these datasets may also be of interest to researchers in mathematics.

Note that some of these datasets come in non-standard or custom formats, so ADC staff may not always be able to provide assistance, though we are always willing to try.

Internet / Network Traffic Data

Social Network Analysis

  • University of California - Irvine: Network Data Repository
    Interpersonal network datasets (fraternity students, southern women, etc.), ham radio interactions, an email communication network derived from the USC Enron data, and several others.
  • Data from the International Conference on Weblogs and Social Media (ICWSM): 2013, 2012, 2011.
    Free after signing a license agreement. Includes data on Facebook, Twitter, wiki edits, weblog comments, etc.
  • The Facebook100 Data Set
    A snapshot, taken in September 2005, of the complete set of people and friendships in the Facebook networks of 100 colleges and universities. The earlier Facebook5 (5 colleges/universities) is also available.
  • The Twitter Project Page at MPI-SWS
    An anonymized topology of the Twitter social network as of August 2009.
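
Network topologies like those listed above are often distributed as edge lists. The following is a minimal sketch of how such a file might be read, assuming a plain-text layout with one whitespace-separated pair of node IDs per line and "#" marking comment lines. That layout, the function name degree_counts, and the sample data are all hypothetical; the datasets above may use other formats, so check each dataset's own documentation first.

```python
from collections import defaultdict
import io


def degree_counts(lines):
    """Return {node: number of distinct neighbors} for an undirected edge list.

    Assumes (hypothetically) one "u v" pair per line, separated by whitespace,
    with blank lines and lines starting with "#" ignored.
    """
    neighbors = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        parts = line.split()
        if len(parts) < 2:
            continue  # skip malformed lines rather than guessing
        u, v = parts[0], parts[1]
        if u == v:
            continue  # ignore self-loops
        # Treat the edge as undirected: record each endpoint as the other's neighbor.
        neighbors[u].add(v)
        neighbors[v].add(u)
    return {node: len(adj) for node, adj in neighbors.items()}


# Made-up sample data standing in for a real file; with a real dataset,
# pass an open file object instead, e.g. open("edges.txt").
sample = io.StringIO("# hypothetical sample\n1 2\n2 3\n1 3\n3 4\n")
degrees = degree_counts(sample)
```

Because the function reads one line at a time, it also works on files too large to load into memory at once, although the neighbor sets themselves still grow with the size of the network.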

Computer and Component Operations and Failure Data

Machine Learning