Data Wrangling Workshop

Published

March 30, 2026

Required Readings

  • None

Overview

Today we will practice intermediate data wrangling by working with two real datasets: Olympic Games medal counts and World Bank development indicators. We’ll clean each dataset, merge them, and analyze whether national wealth predicts Olympic success.

Step 1: Set Up Your Project

Create a folder called wrangling_workshop on your computer. Inside it, create three subfolders:

wrangling_workshop/
├── input/       ← raw data goes here (never modify these files)
├── output/      ← cleaned data, figures, tables go here
└── code/        ← your scripts go here

You can create these in Finder/File Explorer, or in the Terminal:

mkdir wrangling_workshop
cd wrangling_workshop
mkdir input output code
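
If you would rather stay inside R, the same layout can be created from the R console. The following is a minimal sketch using base R only. The helper name create_workshop_dirs is made up for illustration and is not part of the workshop script, and the sketch assumes your current working directory is where you want the wrangling_workshop folder to live.

```r
# Hypothetical helper (not part of the workshop script): create the project
# folder and its three subfolders under `root`.
create_workshop_dirs <- function(root = "wrangling_workshop") {
  subfolders <- file.path(root, c("input", "output", "code"))
  for (folder in subfolders) {
    # recursive = TRUE also creates `root` itself if it does not exist yet;
    # showWarnings = FALSE keeps the call quiet when a folder already exists.
    dir.create(folder, recursive = TRUE, showWarnings = FALSE)
  }
  # Return (invisibly) whether each subfolder now exists.
  invisible(dir.exists(subfolders))
}

# Assumption: the working directory is the place where the project should go.
create_workshop_dirs()
```

Running it a second time is harmless, since existing folders are left alone.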

Step 2: Create an RStudio Project

  1. Open RStudio
  2. Go to File → New Project → Existing Directory
  3. Browse to your wrangling_workshop folder and click Create Project

This creates a .Rproj file. From now on, always open your project by double-clicking the .Rproj file. Doing so sets the working directory to the project folder, so you can use relative paths like read_csv("input/olympics_raw.csv") instead of long absolute paths that break on other computers.
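
To see what a relative path means in practice, here is a small sketch in base R. It assumes the project was opened through the .Rproj file, in which case getwd() should report the project folder. The variable names are chosen only for illustration.

```r
# With the project open, the working directory should be the project folder.
project_dir <- getwd()

# A relative path names a file starting from the working directory.
relative_path <- file.path("input", "olympics_raw.csv")

# The full location is the working directory followed by the relative path.
absolute_path <- file.path(project_dir, relative_path)

print(relative_path)
print(absolute_path)
```

The relative path is the same on every computer, while the absolute path changes with each person's folder layout. That is why scripts written with relative paths keep working when the project folder is moved or shared.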

Step 3: Download the Data

Right-click each link below and choose “Save Link As…” (or “Download Linked File”). Save both files into your input/ folder.

  • olympics_raw.csv — Olympic athletes and medals, Athens 1896 to Rio 2016
  • wb_indicators_panel.csv — GDP per capita, population, and female labor force participation from the World Bank (2000-2016, Olympic years only)
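
After downloading, it can be worth a quick look at the first few lines of each file from the R console. The sketch below uses base R only and assumes the project is open and both files were saved under input/. The helper peek_file is a made-up name for illustration. If the first line looks like HTML rather than comma-separated column names, the browser probably saved a web page instead of the data, and the download should be repeated.

```r
# Hypothetical helper: return the first `n` lines of a downloaded file,
# or NULL (with a message) if the file is not there yet.
peek_file <- function(path, n = 3) {
  if (!file.exists(path)) {
    message("Not found: ", path)
    return(invisible(NULL))
  }
  readLines(path, n = n, warn = FALSE)
}

# Paths are relative to the project folder.
for (path in c("input/olympics_raw.csv", "input/wb_indicators_panel.csv")) {
  first_lines <- peek_file(path)
  if (!is.null(first_lines)) print(first_lines)
}
```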

Step 4: Download the Workshop Script

Save the workshop R script, wrangling_workshop.R, into your code/ folder.

Check Your Setup

Before we start, your folder should look like this:

wrangling_workshop/
├── wrangling_workshop.Rproj
├── input/
│   ├── olympics_raw.csv
│   └── wb_indicators_panel.csv
├── output/
│   └── (empty for now)
└── code/
    └── wrangling_workshop.R

Open wrangling_workshop.Rproj, then open code/wrangling_workshop.R in RStudio. We’ll work through it together in class.
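
One way to confirm the layout without clicking through folders is to ask R which of the expected files exist. This is a minimal sketch in base R. The helper check_setup is a made-up name, not part of the workshop script, and it assumes it is run from the project folder (which is the case when the project was opened through the .Rproj file).

```r
# Hypothetical helper: report which of the expected workshop files exist
# under `root` (by default, the current working directory).
check_setup <- function(root = ".") {
  expected <- c(
    "wrangling_workshop.Rproj",
    "input/olympics_raw.csv",
    "input/wb_indicators_panel.csv",
    "code/wrangling_workshop.R"
  )
  found <- file.exists(file.path(root, expected))
  names(found) <- expected
  found
}

status <- check_setup()
print(status)

# List anything that still needs to be downloaded or moved.
if (!all(status)) {
  message("Missing: ", paste(names(status)[!status], collapse = ", "))
}
```

Every entry should read TRUE. A FALSE usually means a file landed in the wrong folder or was saved under a different name.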