
Data Management

Background and links to more information about data management issues.

Workshops and Training

We offer a variety of workshops to help you strengthen your research data practices. See our current workshop offerings on the Dartmouth Library calendar.

Contact us at ResearchDataHelp@groups.dartmouth.edu to request a workshop.

Fall 2021 Research Training

Interested in learning new tools and skills to create more organized and useful research?

The Library and RTL @ ITC are pleased to announce a virtual workshop series aimed at expanding your research best practices. This introductory series will explore different tools, concepts, and strategies to help you get started and improve your work. Join us to learn new tools and skills that take your research or work skills to the next level.

Get started with text analysis, LaTeX, File Management, Excel, GIS Spatial Data, Databases, Stata, Python and more!

Current workshop information and registration: https://dartgo.org/RRADworkshops

ON DEMAND / VIEW ANYTIME SESSIONS

Python for Reproducible Research

Instructor: Paige Scudder

Format: Pre-recorded, Available now

 

Python is a free, open-source programming language used by programmers and researchers of all levels.  In this hands-on session, we will introduce basic programming concepts using Python, and show you how they can save you time and increase the reproducibility of your research.

Python for Reproducible Research - Videos and Accompanying Files

Reproducible Research with Spatial Data

Instructor: Steve Gaughan

Format: Pre-recorded, Available now

 

We'll introduce the concepts of spatial data analysis and show how to create reproducible spatial workflows, from data input to maps to exporting spatial overlay and other analysis results. 

 

Reproducible Research with Spatial Data Workshop - Videos and Accompanying Files

The first two videos are slide presentations. The last two are screen captures of live coding in R and RStudio. Instructions for installing R and RStudio can be found at the "Accompanying Files" link above, which also provides sample datasets and a solution file for the R code.

Stata Refresher for Reproducible Research 

Instructors: John Cocklin, Catrina Cuadra

Format: Pre-recorded, Available now

 

Stata is a popular statistical analysis package used extensively by economists. In session one, attendees will load datasets into Stata. In session two, attendees will wrangle the data in preparation for analysis and reproducible research. In session three, attendees will run basic summary statistics and a linear regression on the data they prepared in sessions one and two. This workshop is appropriate for those who have learned Stata in the past and would like a refresher, or for those who are using Stata for the first time.

 

This is a pre-recorded session, available for viewing anytime at https://researchguides.dartmouth.edu/econ/stata

 

Stata Refresher for Reproducible Research - A Three Part Video Tutorial

If you would like to follow the videos using Stata, please download these files and place them in a single folder on your computer where you will be able to find them. For information on downloading Stata, see the "Statistical Software" research guide under Additional Resources below.

SPRING 2020 PREVIOUS SESSIONS

Introduction to Version Control with Git

Instructor: Lora Leligdon 

Date and time: Tuesday, April 28, 2020,  3 - 4 pm

 

Version control allows you to keep track of changes in documents over time and easily revert to a previous version. Originally developed for source code management, it is a practical way to work with any text document and more. Git, in conjunction with online platforms like GitHub, GitLab, or Bitbucket, allows you to back up your work in the cloud, share it with collaborators anywhere in the world, synchronize it between several machines (including HPC environments) regardless of operating system, and publish your work so it is accessible across the internet.

 

This was a live session via Zoom; no recording is available.

 

Workshop Slides: https://drive.google.com/open?id=1lIVOtiIJrUHJUEjAx5zIdsVuZnBBSD29

 

Webinar: Introduction to Database Design and Implementation

Instructor: Christian Darabos

Date and time: Thursday, April 30, 2020,  3 - 5 pm

 

Research Computing offers this hands-on workshop providing an introduction to database design. Using a relational database can help you store and analyze your research data and results more efficiently than flat text files. We will work with the relational database paradigm, the Unified Modeling Language (UML), and the Entity-Relationship (ER) model, and implement a simple MySQL database.
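
As a rough illustration of how an ER model maps to relational tables, the sketch below turns a one-to-many relationship (one researcher, many datasets) into two tables linked by a foreign key. It is not taken from the workshop materials: the table and column names are hypothetical, and it uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a database server.

```python
import sqlite3

# Hypothetical schema: one researcher can own many datasets (one-to-many).
SCHEMA = """
CREATE TABLE researcher (
    researcher_id INTEGER PRIMARY KEY,
    name          TEXT NOT NULL
);
CREATE TABLE dataset (
    dataset_id    INTEGER PRIMARY KEY,
    title         TEXT NOT NULL,
    researcher_id INTEGER NOT NULL REFERENCES researcher(researcher_id)
);
"""


def create_db():
    """Create an in-memory database with the two related tables."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def add_dataset(conn, title, researcher_id):
    """Insert a dataset row and return its generated primary key."""
    cur = conn.execute(
        "INSERT INTO dataset (title, researcher_id) VALUES (?, ?)",
        (title, researcher_id),
    )
    conn.commit()
    return cur.lastrowid


if __name__ == "__main__":
    conn = create_db()
    conn.execute("INSERT INTO researcher (researcher_id, name) VALUES (1, 'Example Researcher')")
    print(add_dataset(conn, "Survey 2020", 1))
```

With the foreign key in place, the database itself rejects a dataset that points at a researcher who does not exist, a consistency check that flat text files cannot provide.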

 

MySQL Workbench: https://dev.mysql.com/downloads/workbench/

 

This was a live session via Zoom; no recording is available.

 

The Reproducible Research Workflow

Instructor: Lora Leligdon

Date and time: Tuesday, May 5, 2020, 3 - 4 pm

 

A research project can be considered reproducible if a second investigator (including you in the future!) can recreate the final reported results of the project, including key quantitative findings, tables, and figures, given only a set of files and written instructions. In this session, we will discuss best practices and a reproducible workflow that will help make your research clearer, more transparent, and better organized from the start.

 

This was a live session via Zoom; no recording is available.

 

Workshop slides: https://drive.google.com/open?id=1-qY2HsbqrZSe9pQ6w7UKqKVji6DagmBs

 

Workshop handout: https://drive.google.com/open?id=1vrU0CBQtrQ-OFYKy_W0wL9-X8l0B9lOw

 

Workshop reproducibility checklist: https://drive.google.com/open?id=1xLvFUWreuBqi-oo3jOarR_MlP_evTSAO

Webinar: Introduction to Database Query and Analytics

Instructor: Christian Darabos

Date and time: Thursday, May 7, 2020, 3 - 5 pm 

 

Research Computing offers this hands-on workshop providing an introduction to database querying in Structured Query Language (SQL). Using a relational database can help you store and analyze your research data and results more efficiently than flat text files. We will use SQL on a pre-populated database to extract and filter data and to answer simple analytics questions. If time permits, we will explore ways of programmatically accessing databases from tools such as R or Python in order to automate the analytical process.
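
To give a sense of what programmatic access from Python can look like, here is a minimal sketch that filters and aggregates rows with a parameterized SQL query. It is an illustration under stated assumptions rather than the workshop's own code: the `measurements` table and its contents are made up, and the built-in sqlite3 module stands in for whichever database server is actually in use.

```python
import sqlite3


def build_demo_db():
    """Create a small in-memory database with made-up measurements."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE measurements (site TEXT, value REAL)")
    conn.executemany(
        "INSERT INTO measurements VALUES (?, ?)",
        [("A", 1.0), ("A", 3.0), ("B", 2.0), ("B", 0.5)],
    )
    conn.commit()
    return conn


def mean_by_site(conn, min_value):
    """Return (site, mean value) pairs for rows with value >= min_value."""
    query = """
        SELECT site, AVG(value) AS mean_value
        FROM measurements
        WHERE value >= ?
        GROUP BY site
        ORDER BY site
    """
    # The "?" placeholder lets the driver pass min_value safely,
    # instead of pasting it into the SQL string.
    return conn.execute(query, (min_value,)).fetchall()


if __name__ == "__main__":
    print(mean_by_site(build_demo_db(), 1.0))
```

Because the query lives in a function, the same analysis can be rerun with a different threshold or a refreshed database, which is the kind of automation the session description refers to.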

 

This was a live session via Zoom; no recording is available.

Excel Best Practices for Reproducible Research 

Instructor: Pamela Bagley

Date and time: Tuesday, May 12, 2020,  3 - 4 pm

 

Actively managing your research data is an important part of reproducible research, and Excel is one of the most widely used tools.  In this session, we will discuss best practices for data management in Excel, along with tips on filenames, README files, and metadata.  We’ll introduce the free version of Colectica for Excel, a tool that can help you document your spreadsheet data. 

 

This was a live session via Zoom; no recording is available.

 

Reproducible Statistical Data Analysis with R

Instructor: Jianjun Hua

Date and time: Thursday, May 14, 2020, 11 am  - 12 pm 

 

R is a free, open-source programming language that is known for its approachability and is an increasingly popular tool for data analysis and visualization. In this hands-on session, you will learn how to use R to conduct basic statistical data analysis, and how it can save you time and increase the reproducibility of your research.

 

This was a live session via Zoom; no recording is available.

 

Getting Organized for Reproducible Research: File Management Systems

Instructors: Pamela Bagley and Elaina Vitale

Date and time: Tuesday, May 19, 2020, 3 - 4 pm

Format: Live Zoom

 

This workshop will introduce basic concepts for keeping track of data files. Learn good habits in file organization: naming files, creating organized file structures, tagging files, version control, and documenting your process. Following these habits will help you locate files easily, avoid confusion when working on teams or sharing files, and prevent data loss caused by accidentally overwriting data files.
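
Two of these habits, consistent file naming and protection against accidental overwrites, can be sketched in a few lines of Python. The naming pattern (date, project, description, version) and the function names below are hypothetical choices for illustration, not a convention prescribed by the workshop.

```python
import tempfile
from datetime import date
from pathlib import Path


def versioned_name(project, description, day, version, suffix=".csv"):
    """Build a sortable file name such as 2020-05-19_survey_cleaned_data_v02.csv."""
    slug = "_".join(description.lower().split())
    return f"{day.isoformat()}_{project}_{slug}_v{version:02d}{suffix}"


def write_new_file(folder, name, text):
    """Write text to folder/name, refusing to overwrite an existing file."""
    path = Path(folder) / name
    # Mode "x" raises FileExistsError if the file is already there.
    with open(path, "x", encoding="utf-8") as handle:
        handle.write(text)
    return path


if __name__ == "__main__":
    name = versioned_name("survey", "Cleaned Data", date(2020, 5, 19), 2)
    with tempfile.TemporaryDirectory() as folder:
        print(write_new_file(folder, name, "id,value\n1,3.5\n"))
```

Starting the name with an ISO date keeps files in chronological order when sorted alphabetically, and the zero-padded version number does the same for successive revisions.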

 

This was a live session via Zoom; no recording is available.