Define your data requirements
Understanding your data needs will narrow and ease your search. Before you look for data, consider the following questions:
What is the unit of analysis of your project?
The unit of analysis is the entity that you want to draw conclusions about. Think of it as the rows of your dataset. Examples include:
- Individuals
- Geographic or administrative units (grid cells, precincts, towns, cities, countries)
- Groups or organizations (classrooms, NGOs)
- Events (conflicts, disasters, elections)
- Structures/features (buildings, rivers, trees)
- Time-based combinations, common in time-series or panel data (city-week, person-year, household-month)
Match your unit of analysis to the goals of your project. If you’re studying individual political ideology, you need individual-level data. If you’re examining city-level flood resilience, city-level data is preferred.
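To make the "rows of your dataset" idea concrete, here is a minimal Python sketch that reshapes event-level records into a city-week panel. The records, field names, and the `build_city_week_panel` helper are hypothetical and exist only for illustration.

```python
from collections import Counter
from datetime import date

# Hypothetical event-level records: each row is one event (e.g., a flood report).
events = [
    {"city": "Hanover", "date": date(2024, 1, 2)},
    {"city": "Hanover", "date": date(2024, 1, 4)},
    {"city": "Lebanon", "date": date(2024, 1, 3)},
    {"city": "Hanover", "date": date(2024, 1, 9)},
]

def build_city_week_panel(records):
    """Count events per (city, ISO year, ISO week); each output row is one city-week."""
    counts = Counter()
    for rec in records:
        iso = rec["date"].isocalendar()  # (ISO year, ISO week, ISO weekday)
        counts[(rec["city"], iso[0], iso[1])] += 1
    return [
        {"city": city, "year": year, "week": week, "n_events": n}
        for (city, year, week), n in sorted(counts.items())
    ]

panel = build_city_week_panel(events)
for row in panel:
    print(row)
```

The input rows are events, while the output rows are city-weeks, so the unit of analysis has changed even though the underlying information is the same. This sketch only produces rows for city-weeks that had at least one event; a complete panel would also need rows with a count of zero.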
What time period do you need the data to cover?
Determine your temporal scope: the span of time over which you want to study your unit of analysis. This might be hundreds of years for a project on the evolution of economic markets, or days or weeks for a project examining the impact of an acute event.
What is the geographic scope of your project?
Define your geographic boundaries: global, regional, national, or local. Geographic scope affects data availability because data is often collected along political and administrative boundaries, which influences what is available where.
Does your project require a specific type of data?
Data types include:
- Spatial/GIS data
- Survey data
- News event data
- Social media data
- Administrative data
- Satellite data
Some projects can use multiple types (e.g., economic development could be measured with survey data, spatial data, or administrative data); others require a specific type (e.g., survey data for measuring attitudes and opinions).
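One way to keep these questions in view while searching is to write your answers down as a small checklist and compare each candidate dataset against it. The following Python sketch assumes a hypothetical `DataRequirements` structure and a hand-entered description of a candidate dataset; none of these names come from a real catalog or library.

```python
from dataclasses import dataclass

@dataclass
class DataRequirements:
    unit_of_analysis: str   # e.g., "city-year"
    start_year: int         # first year the data must cover
    end_year: int           # last year the data must cover
    geography: str          # e.g., "United States"
    data_types: tuple       # acceptable data types

def unmet_requirements(req, candidate):
    """Return the names of the requirements that a candidate dataset does not meet."""
    problems = []
    if candidate["unit"] != req.unit_of_analysis:
        problems.append("unit of analysis")
    if candidate["start_year"] > req.start_year or candidate["end_year"] < req.end_year:
        problems.append("time period")
    if candidate["geography"] != req.geography:
        problems.append("geographic scope")
    if candidate["data_type"] not in req.data_types:
        problems.append("data type")
    return problems

# Hypothetical project requirements and a hypothetical candidate dataset.
req = DataRequirements("city-year", 2000, 2020, "United States", ("administrative", "survey"))
candidate = {"unit": "city-year", "start_year": 2005, "end_year": 2022,
             "geography": "United States", "data_type": "administrative"}

print(unmet_requirements(req, candidate))
```

Here the candidate starts in 2005, later than the required 2000, so only the time period is flagged. The exact-match comparisons are deliberately simplistic; in practice you would judge each mismatch by hand.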
Are any of your data requirements flexible?
While clearly defined data requirements are key to tailoring your search, research design is iterative and often needs to adapt to data availability. Sometimes your ideal data doesn’t exist or isn’t accessible. If this is the case, consider:
- Scaling up your unit of analysis: city-level data is often more available than individual-level data
- Adjusting geographic scope: expanding or narrowing your study area based on data availability
- Modifying temporal scope: changing your time period to match available datasets
- Creative substitution: identifying alternative data types that can capture your core concepts
Think strategically about what's possible to measure with accessible data rather than abandoning your research question entirely.
Search for data
Once you define your data requirements, there are several strategies you can use to search for data. No single strategy is best, but turning to data repositories and common producers is a good first step. If you can't find what you need there, work backward from published literature or statistics you encounter while researching. Consider these strategies:
Search data repositories
Data repositories are centralized places for storing, sharing, and organizing data. Some essential data repositories include:
- Google Dataset Search
- Google Dataset Search is a tool for locating datasets hosted in repositories (including many of the repositories on this list). It functions much like a regular Google search, but retrieves only datasets.
- Registry of Research Data Repositories (re3data)
- The mission of re3data is to create a global registry of all data repositories. It currently has over 3,400 repositories from across academic disciplines.
- Dataverse Network
- The Dataverse Network is an interdisciplinary repository where researchers can share, archive, cite, access, and explore research data for free. You can browse for both dataverses and datasets by subject, publication year, and author.
- Datasets at Dartmouth
- Dartmouth Library hosts a selection of licensed datasets, many of which you can download directly from Datasets at Dartmouth when logged in with your NetID.
- Inter-university Consortium for Political and Social Research (ICPSR)
- ICPSR is the largest archive of social science data. Though the data comes from a range of sources, it is primarily drawn from surveys, censuses, and administrative records. You can search data by keyword or individual variable. You can also browse by thematic topic.
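Some repositories can also be searched programmatically. As a minimal sketch, the Python below builds a keyword query against the Dataverse Search API, assuming the Harvard Dataverse installation at `dataverse.harvard.edu`; other installations use a different base URL. The helper names are hypothetical, and the response fields read in `search_datasets` (`data`, `items`, `name`, `global_id`) are assumptions to verify against the installation's API guide.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the Harvard Dataverse installation; swap in another installation's base URL if needed.
BASE_URL = "https://dataverse.harvard.edu"

def build_search_url(keywords, per_page=10):
    """Build a Search API URL that asks only for datasets matching the keywords."""
    params = {"q": keywords, "type": "dataset", "per_page": per_page}
    return f"{BASE_URL}/api/search?{urlencode(params)}"

def search_datasets(keywords, per_page=10):
    """Fetch matching datasets as (name, persistent identifier) pairs. Requires network access."""
    with urlopen(build_search_url(keywords, per_page), timeout=30) as resp:
        payload = json.load(resp)
    # Assumed response shape: {"data": {"items": [{"name": ..., "global_id": ...}, ...]}}
    return [(item.get("name"), item.get("global_id")) for item in payload["data"]["items"]]

# Building the URL needs no network access; call search_datasets(...) to run the query.
print(build_search_url("flood resilience"))
```

Keeping the URL construction separate from the network call makes it easy to inspect or paste the query into a browser before running it in a script.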
There are also numerous subject-specific data repositories, many of which can be found in subject-specific data guides from Dartmouth Libraries:
Dartmouth’s subject librarians can help you locate subject-specific data.
Mine existing literature related to your project
Build your data search into other parts of the research process, such as the literature review. When reading articles online, look for "Additional Materials" sections where you can download replication packages containing both data and analysis code (example from the American Economic Review below). If no replication package is linked, search the journal's data repository or try searching "[article name] + replication data."
Brainstorm potential producers of data (private sector, government, NGO, open source)
Think about who has a stake in collecting the type of data you are interested in, and go directly to that source.
For example, government agencies commonly collect and publish data on land use, population demographics, the economy, etc. NGOs and multilateral institutions have incentives to collect data that supports their mission and goals. Here are some common sources of data you might turn to:
- Government agencies (US and international)
- Private sector actors
- NGOs
- Multilateral institutions and donors (e.g., World Bank, OECD, UN)
The library also has guides on specific sources/categories of data:
Work backwards from statistics
Individual statistics can sometimes lead to bigger datasets. Again, think about your data search as an integrated part of the research process. If you come across an interesting statistic when digging into your topic, trace it back to its source. Does that source offer additional relevant data?
Ask for help
If you’re not finding what you need using these strategies, or you’d like support in implementing them, please reach out to the Research Facilitation team at Dartmouth Libraries! Our team email is here.