Advanced Data Wrangling in R
Welcome
This is the course site for the Advanced Data Wrangling in R workshop, created by Kelsey Gonzalez for the Critical Path Institute in Tucson, Arizona, and held May 26-28, 2021.
This nine-hour hands-on workshop is an exploration of the Tidyverse as a tool to wrangle your data in preparation for analysis.
Day 1 will cover basic dplyr functionality, including select, filter, arrange, mutate, group_by, and summarize, and then move into the concept of tidy data and pivots.
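As a rough preview of the Day 1 material, the sketch below chains the core dplyr verbs and then reshapes the data with a pivot. The `scores` table and its column names are invented for illustration and are not taken from the workshop materials; it uses `pivot_longer` from tidyr as one example of a pivot.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data, invented for illustration only
scores <- tibble(
  student = c("Ana", "Ben", "Cy", "Dee"),
  group   = c("a", "a", "b", "b"),
  math    = c(90, 70, 80, 60),
  reading = c(85, 75, 65, 95)
)

# Chain the core verbs: keep columns, filter rows, add a column,
# then summarize within groups and sort the result
group_means <- scores %>%
  select(student, group, math, reading) %>%
  filter(math >= 70) %>%
  mutate(total = math + reading) %>%
  group_by(group) %>%
  summarize(mean_total = mean(total), n = n()) %>%
  arrange(desc(mean_total))

# Pivot from wide (one column per subject) to long
# (one row per student-subject pair)
scores_long <- scores %>%
  pivot_longer(cols = c(math, reading),
               names_to = "subject",
               values_to = "score")

print(group_means)
print(scores_long)
```

Each verb takes a data frame and returns a data frame, which is what lets the steps be piped together in reading order.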
Day 2 will cover special variable types and advanced creation of new variables. Specifically, we will cover case_when, factors with forcats, dates with lubridate, and strings and regular expressions with stringr.
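To give a flavor of the Day 2 topics, here is a minimal sketch that touches each package once. The `visits` table, its column names, and the age cutoffs are hypothetical and chosen only to keep the example small.

```r
library(dplyr)
library(forcats)
library(lubridate)
library(stringr)

# Hypothetical toy data, invented for illustration only
visits <- tibble(
  id         = c("pt-001", "pt-002", "pt-003"),
  visit_date = c("2021-05-26", "2021-05-27", "2021-05-28"),
  age        = c(15, 40, 70)
)

visits_clean <- visits %>%
  mutate(
    # case_when: the first condition that is TRUE wins
    age_group = case_when(
      age < 18 ~ "child",
      age < 65 ~ "adult",
      TRUE     ~ "senior"
    ),
    # forcats: put factor levels in a meaningful order, not alphabetical
    age_group = fct_relevel(age_group, "child", "adult", "senior"),
    # lubridate: parse year-month-day strings into Date values
    visit_date = ymd(visit_date),
    # stringr: pull the first run of digits out of the id with a regex
    id_number = as.integer(str_extract(id, "[0-9]+"))
  )

print(visits_clean)
```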
Day 3 will cover relational data and joins, in addition to more advanced and cutting-edge dplyr functionality like across, colwise, and rowwise operations. After diving into purrr for iteration, we will work through case studies.
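The Day 3 ideas can be sketched briefly as follows. The `patients` and `labs` tables are hypothetical, and `left_join` is just one of several join types; this is an illustration of the general techniques rather than the workshop's own code.

```r
library(dplyr)
library(purrr)

# Hypothetical toy tables, invented for illustration only
patients <- tibble(id = c(1, 2, 3), site = c("north", "south", "north"))
labs     <- tibble(id = c(1, 2, 4), test_a = c(5, 7, 9), test_b = c(1, 3, 5))

# left_join keeps every row of patients; id 3 has no match in labs,
# so its lab columns are filled with NA
joined <- left_join(patients, labs, by = "id")

# across applies one function to several columns at once
site_means <- joined %>%
  group_by(site) %>%
  summarize(across(c(test_a, test_b), ~ mean(.x, na.rm = TRUE)))

# rowwise switches to per-row computation across columns
row_totals <- joined %>%
  rowwise() %>%
  mutate(total = sum(c_across(c(test_a, test_b)))) %>%
  ungroup()

# purrr::map_dbl iterates over columns and returns a numeric vector
col_max <- map_dbl(labs[c("test_a", "test_b")], max)

print(joined)
print(site_means)
print(row_totals)
print(col_max)
```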
Prework
We will use RStudio Desktop for this workshop. You’re also welcome to use RStudio Cloud.
- Install R and RStudio Desktop on your computer. You can find step-by-step instructions for installing these here: macOS, Windows.
- Install the following packages:
# From CRAN
install.packages("tidyverse")
install.packages("NHANES")
install.packages("janitor")
install.packages("gapminder")
install.packages("glue")
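One optional way to confirm the installs worked is to check that each package can be found. This is a small sketch using base R only; the `needed`, `installed`, and `missing_pkgs` names are illustrative.

```r
# Package names taken from the install list above
needed <- c("tidyverse", "NHANES", "janitor", "gapminder", "glue")

# requireNamespace returns TRUE when a package can be loaded, FALSE otherwise
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)

# Names of any packages that still need installing (empty if all are present)
missing_pkgs <- needed[!installed]
print(missing_pkgs)
```

If `missing_pkgs` is not empty, rerunning `install.packages()` for those names should resolve it.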
Links
- Link to this website: https://bit.ly/cpath-wrangling
- Day 1: Slides - Live-coded.Rmd
- Day 2: Slides - Live-coded.Rmd
- Day 3: Slides - Live-coded.Rmd