DataThink Development

Case Study 5: I can clean your data

Background

Scientific American argues that humans have been getting taller over the years. As the data scientists we are becoming, we would like to find data that supports this claim. Our challenge is to show how male heights have changed across the centuries.

This project is not as severe as the two quotes below suggest, but it will give you a taste of pulling various data sources and file formats together into “tidy” data for visualization and analysis. You will not need to search for data, as all the files are listed here.

  1. “Classroom data are like teddy bears and real data are like a grizzly bear with salmon blood dripping out its mouth.” - Jenny Bryan
  2. “Up to 80% of data analysis is spent on the process of cleaning and preparing data” - Hadley Wickham
  • Course Website

Reading

This reading will help you complete the tasks below.

  • foreign R package and read.dbf()
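
As a rough sketch of how read.dbf() from the foreign package is typically called, the example below writes a tiny data frame to a temporary .dbf file and reads it back. The column names and height values are made up for illustration and are not the case-study data; the temporary file stands in for whichever .dbf file you actually need to load.

```r
# foreign is a recommended package that ships with standard R installations.
library(foreign)

# Made-up example values, for illustration only (not the case-study data).
heights <- data.frame(
  year = c(1800L, 1900L, 2000L),
  height_cm = c(165.5, 168.25, 176.0)
)

# Write a throwaway .dbf file so the example is self-contained.
dbf_path <- tempfile(fileext = ".dbf")
write.dbf(heights, dbf_path)

# Read the .dbf file into a data frame.
# as.is = TRUE keeps character fields as character instead of factors.
heights_back <- read.dbf(dbf_path, as.is = TRUE)

# Inspect the column types before tidying or plotting.
str(heights_back)
```

In practice you would replace dbf_path with the path to a downloaded file. Checking the result with str() first is a reasonable habit, since .dbf fields often arrive with types or names that need cleaning.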

Tasks