Research Data Management

Background and links to more information about data management issues.

Sample Datasets for Data Science Practice

Have you acquired some new data science skills but need some data to practice with?

This guide points you to lists and repositories of free digital datasets. Use these to apply your new data analysis or visualization skills or to practice with a new tool.

Note: these datasets are not necessarily vetted for accuracy or consistency. If you are a researcher searching for datasets curated for research purposes, please see our Find Data or Data Sharing and Repositories guides, or the Dartmouth Libraries' many subject-area reference guides, instead.

Looking to learn some new data science skills? See our list of upcoming workshops. Need help or guidance with a data science or digital humanities project? Contact us at: ResearchDataHelp@groups.dartmouth.edu.

Repositories of Small, Sample Datasets for Practice

Sample datasets used in tutorials and demo materials

Datasets used by visualization teams and data journalists:

Try to re-create these teams' beautiful visualizations:

Lists of practice datasets for data science

Large, Curated Lists or Repositories of Datasets

  • Data is Plural list: Google Docs list of 1,700 datasets (and growing)
  • DH Resources for Project-Building: List of datasets for the Digital Humanities under the following categories:
    • Demo Corpora (ready to use texts)
    • Document / Image Collections
    • Linguistic corpora
    • Map Collections
    • Datasets (with a focus on humanities datasets)
    • Datasets in Other Disciplines (i.e. Social and Natural Sciences)
  • Kaggle.com datasets (nearly 300,000 user-submitted datasets)
    • Note: these are not vetted, so if you use them for research you will need to verify the accuracy and consistency of the data
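
As a starting point for checking an unvetted dataset, the sketch below runs a few basic consistency checks on a CSV file using only the Python standard library: it counts empty cells per column, rows whose length does not match the header, and exact duplicate rows. The function name `check_csv` and the sample data are made up for illustration, and these checks only surface obvious problems; they cannot confirm that the values themselves are accurate.

```python
import csv
import io
from collections import Counter


def check_csv(text):
    """Run basic consistency checks on CSV text and return a small report."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    rows = [row for row in reader if row]  # skip completely blank lines

    missing = Counter()  # empty cells, counted per column
    ragged = 0           # rows with more or fewer fields than the header
    for row in rows:
        if len(row) != len(header):
            ragged += 1
        for column, value in zip(header, row):
            if not value.strip():
                missing[column] += 1

    # Rows that exactly repeat an earlier row.
    duplicates = len(rows) - len({tuple(row) for row in rows})

    return {
        "rows": len(rows),
        "missing": dict(missing),
        "ragged": ragged,
        "duplicates": duplicates,
    }


# Made-up sample data, for illustration only.
sample = """id,name,score
1,alpha,10
2,beta,
1,alpha,10
"""

report = check_csv(sample)
print(report)
```

To check a downloaded file, you could read it with `open(path, newline="", encoding="utf-8").read()` and pass the result to `check_csv`; the path and encoding depend on the dataset you choose.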

Datasets for Machine Learning