Data Management in R

Data Management and Cleaning in R

This course introduces data management and cleaning for analysis using R. Each of the four modules includes a slide deck, training data, R code, and associated exercises for practice. Topics covered include:

1. Introduction to Administrative Data
2. Introduction to data cleaning and management
3. Tidyverse packages to clean data
4. Creating analysis workbooks in R Markdown
5. Subsetting variables and data cleaning
6. Advanced filtering, selection, reshaping, functions and regex

Please install and familiarize yourself with R prior to the start of the webinar.

This course expects familiarity with R and base R functions, and focuses on the tidyverse libraries and R Markdown.

You can learn more about the Tidyverse here and R Markdown here.

The best way to learn how to use R is to practice, make mistakes, learn and repeat. These exercises are an integral part of the webinar series. By completing them, students will be introduced to data management and data cleaning. If you experience any difficulties or require the files in a different format, please contact me at lauren@mapdatascience.com and I can send them to you.

Session 1:

  • Brief Introduction to Administrative Data
  • Data Cleaning and Data Management
  • Data Management with R
  • Introduction to Tidyverse

Session 2:

  • Tidyverse verbs: mutate, group_by, summarize
  • Advanced Tidyverse in dplyr
  • Reshaping Data
  • Joins

Session 3:

  • Types of Data
  • Working with Dates
  • Working with lists and purrr
  • Introduction to regex, stringr and stringi

Session 4:

  • R Markdown Continued
  • Embedding tables and plots
  • GitHub version control