
Data Science

Locating datasets

When learning data science, you'll often want to locate existing datasets to use for projects. This is a list of places to find datasets on a variety of topics. Unless otherwise noted, the data is public (i.e., it's free to use for your projects).


Resources by topic

Biological sciences
  • Shark Attack Data - A robust source for data on human/shark encounters from 1900 to the present.
Business & industry
Cartographic data & GIS
  • OpenStreetMap - Mapping tool with built-in data
  • Social Explorer - Another mapping tool with datasets, focused on social indicators. Although a paid version of this product is sold, there is also a robust free version.
Criminal justice
  • Fatal Encounters - A national archive of people killed during interactions with the police.
Entertainment
Health & medicine
Idaho
International
News
  • Juicy Data - The blog Information is Beautiful has compiled a number of datasets that are available for use.
  • ProPublica Data - Data and reports on criminal justice, politics, and policy topics.
Politics
  • Follow the Money - Data on campaign contributions and other conflicts of interest in the political arena.
Sports
United States 
Data repositories & search engines
  • Google Dataset Search Engine - A lot of the data that you'll find via this search engine is proprietary, but many data-for-sale companies will give you a table or two for free.
  • Data and Story Library - This library of data and stories from Carnegie Mellon is meant for statistics instructors. Includes a search by statistical method. 
  • ICPSR - A gigantic archive of social sciences data.
  • OECD.Stat - A topical directory of statistical data.
  • Roper Center for Public Opinion - Survey data on a range of political, economic, and social topics.

Credits

Many of these resources were sourced from blogger, educator, & statistician Amy Hogan.