Materials for the 2023 GESIS workshop "Workflows for Reproducible Research with R & Git"
by Johannes Breuer, Bernd Weiß, and Arnim Bleier
Please link to the workshop GitHub repository
The workshop focuses on reproducible research in the quantitative social and behavioral sciences. In the context of this workshop, reproducibility means that other researchers can fully understand and rerun your data preparation and statistical analyses. However, the workflows and tools covered in this workshop will also facilitate your own work, as they allow you to automate and track analysis and reporting tasks. In addition to a conceptual introduction to the methods and key terms around reproducible research, this workshop focuses on procedures for maximizing the reproducibility of data analyses using R. After discussing essential definitions and dimensions of reproducibility, we will cover some computer literacy and project organization basics that are helpful for conducting reproducible research (e.g., folder structures, naming schemes, and command-line interfaces). After that, we will focus on version control, dependency management, and computational reproducibility. The tools we will use for that include Git and GitHub, R packages for dependency management, as well as Binder, a tool to package and share reproducible and interactive analysis environments.
The workshop is targeted at participants who have (at least some) experience with R and want to learn (more) about workflows and tools for making the results of their research reproducible.
By the end of the course, participants should:
- have gained important insights into key concepts of reproducible research and recommended best practices
- be able to work with frameworks and tools that can be used for maximizing reproducibility, such as Git, R packages for dependency management, or Binder
- be able to publish reproducible computational analysis pipelines with R
Participants should have some basic knowledge of R and RStudio (e.g., installing and loading packages, importing different data types, basic data wrangling, and analyses). To make it easier to apply the methods covered in the course to their own work, we recommend that participants install all necessary software on their computers before the start of the course.
Time | Topic | Slides | Exercises | Solutions |
---|---|---|---|---|
09:30 - 10:45 | Introduction | HTML, PDF | - | - |
10:45 - 11:00 | Coffee Break | - | - | - |
11:00 - 12:00 | Computer literacy | HTML, PDF | see slides | see slides |
12:00 - 13:00 | Lunch Break | - | - | - |
13:00 - 15:00 | Git & GitHub - Part I | HTML, PDF (also contain Part II) | see slides | see slides |
15:00 - 15:15 | Coffee Break | - | - | - |
15:15 - 16:30 | Git & GitHub - Part II | HTML, PDF | HTML | HTML |
16:30 - 17:00 | Q&A | - | - | - |
Time | Topic | Slides | Exercises | Solutions |
---|---|---|---|---|
09:00 - 09:30 | Recap Day 1 | HTML, PDF | - | - |
09:30 - 11:00 | Dependency management | HTML, PDF | HTML | HTML |
11:00 - 11:15 | Coffee Break | - | - | - |
11:15 - 12:00 | Binder & Notebooks | see slides | - | - |
12:00 - 13:00 | Lunch Break | - | - | - |
13:00 - 14:30 | Build your own Binder | see slides | Project | - |
14:30 - 14:45 | Coffee Break | - | - | - |
14:45 - 16:00 | Saving computational environments | Project | - | - |
16:00 - 17:00 | Recap & Outlook | HTML, PDF | - | - |
The R Markdown parts of this workshop were created using the R packages xaringan, unilur, and woRkshoptools. The materials are based on an earlier version of this workshop and a similar course by Frederik Aust and Johannes Breuer.