Week 3: Tabular Data
This week provides an overview of the most common kind of data used in data science—tabular data—including an introduction to the principles of tidy data and core ideas behind database systems.
Lecture Slides
Supporting Code
Note: materials will be posted before lecture and seminar each week.
Additional resources
Wickham, Hadley and Garrett Grolemund. 2017. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. Sebastopol, CA: O’Reilly. Part II Wrangle, Tibbles, Data Import, Tidy Data (Ch. 7-9 of the print edition).
Note: there is a newer version of the Wickham and Grolemund text from 2023, which is available at https://r4ds.hadley.nz/.
The Tidyverse collection of packages for R.
{lubridate} docs for analysing dates in R.
Lake, P. and Crowther, P. 2013. Concise Guide to Databases: A Practical Introduction. London: Springer-Verlag. Chapter 1, Data, an Organizational Asset.
Beaulieu. 2009. Learning SQL. O’Reilly. (Chapter 1)
Stephens et al. 2009. Teach yourself SQL in one hour a day. Sams Publishing.