MY472: Data for Data Scientists

Autumn Term 2025

Course instructors

Dr Ryan Hübert (course convenor)      
Associate Professor, Department of Methodology

Dr Yuanmo He      
LSE Fellow, Department of Methodology

Dr Thomas Robinson      
Assistant Professor, Department of Methodology

Salvatore Finizio (GTA)      
PhD Student, Department of Geography and Environment

Charlotte Kuberka (GTA)      
PhD Student, Department of Government

Asami Narita (GTA)     
PhD Student, Department of Methodology

Booking office hours

All office hours must be pre-booked via LSE’s StudentHub. Per department policy, GTAs do not hold regular office hours.

Keep in mind that office hours are a limited resource. Please do not regularly book office hours and cancel them at the last minute, as this prevents other students from accessing this limited resource.

Prerequisites

There are no formal prerequisites, but the course relies heavily on the R programming language. We will presume familiarity with core R programming skills. If you have no experience using R (or any other programming language), you are still welcome to join the course. Just keep in mind that the learning curve will be steeper for you and you will need to set aside sufficient time to get up to speed with the programming skills required to be successful in the course.

Mandatory R course

All students should complete the R Advanced for Methodology Preparatory Course before the end of Week 1 of the Autumn Term. If you already have advanced R skills (covering all the modules in the preparatory course), then you do not need to complete it.

During the first lecture, we will provide a link to complete a course on-boarding survey.

Technical requirements

For this course, you must have a laptop that is capable of running the required software and doing the kinds of data science tasks we will cover. Any recent laptop running macOS or Windows with at least 8 GB of RAM should be sufficient, but please email the course convenor if you are unsure. Also, if you are considering purchasing a new laptop and want some advice, feel free to contact the course convenor.

Minimum hardware requirements

You will not be able to use a tablet or iPad for the work in this course. You also might not be able to use a very old laptop or one with minimal technical specifications, such as some Chromebooks.

Most of the materials in this course have been developed on Apple’s macOS, and the lectures and seminars will be primarily taught with Apple laptops. In our experience, Apple computers are the most commonly used for data science in academia and in industry. You may use other operating systems (Windows or Linux), but please be aware that we make no guarantees about the seamless interoperability of the course materials across all operating systems and/or all computers.

We expect students to install R and Positron on their computers, in addition to other software as necessary. You can install R from https://www.r-project.org/ and Positron from https://positron.posit.co/.

All students should create a GitHub account (if they do not already have one) and sign up for GitHub Education benefits. See https://github.com/education.

Technical requirements for in-person assessment

Details about the technical requirements for the in-person practical test will be announced during the term.

Registration and auditing

Any questions or problems relating to course registration should be directed to the Methodology Department staff at methodology.admin@lse.ac.uk.

Students interested in auditing this course should complete this form and await a reply.

Course format and scheduling

During Autumn Term, there will be 20 hours of lectures and 15 hours of seminars. There are no lectures or seminars during Week 6, which is LSE’s reading week.

There is a two-hour lecture each week during the term on Wednesdays from 13:00 to 15:00 in CLM.2.02 (see the campus map). Lecture slides and example code will be updated in advance of each week’s lecture.

There is a 1.5-hour “lab-style” seminar each week during the term. See the LSE Timetable for the schedule and locations for the seminars. Seminar exercises will be updated in advance of each week’s seminars. Students must attend their assigned seminar.

Administrative support

If you have any administrative queries about this course or would like to book a Zoom call with the course administrator, please contact methodology.admin@lse.ac.uk.

Attribution statement

The teaching materials in this course have been iteratively developed by current and former instructors, including Pablo Barberá, Ken Benoit, Daniel de Kadt, Friedrich Geiecke, Ryan Hübert, Akitaka Matsuo and Thomas Robinson.