MY472: Data for Data Scientists
Autumn Term 2025
Course instructors
Dr Ryan Hübert (course convenor)
Associate Professor, Department of Methodology
Dr Yuanmo He
LSE Fellow, Department of Methodology
Dr Thomas Robinson
Assistant Professor, Department of Methodology
Salvatore Finizio (GTA)
PhD Student, Department of Geography and Environment
Charlotte Kuberka (GTA)
PhD Student, Department of Government
Asami Narita (GTA)
PhD Student, Department of Methodology
All office hours must be pre-booked via LSE’s StudentHub. Per department policy, GTAs do not hold regular office hours.
Keep in mind that office hours are a limited resource. Please do not regularly book office hours and cancel them at the last minute, as this prevents other students from accessing this limited resource.
Prerequisites
There are no formal prerequisites, but the course relies heavily on the R programming language. We will presume familiarity with core R programming skills. If you have no experience using R (or any other programming language), you are still welcome to join the course. Just keep in mind that the learning curve will be steeper for you and you will need to set aside sufficient time to get up to speed with the programming skills required to be successful in the course.
All students should complete the R Advanced for Methodology Preparatory Course before the end of Week 1 of the Autumn Term. If you already have advanced R skills (covering all the modules in the preparatory course), then you do not need to complete it.
During the first lecture, we will provide a link to complete a course on-boarding survey.
Technical requirements
For this course, you must have a laptop that is capable of running the required software and doing the kinds of data science tasks we will cover. Any recent laptop running macOS or Windows with at least 8 GB of RAM should be sufficient, but please email the course convenor if you are unsure. Also, if you are considering purchasing a new laptop and want some advice, feel free to contact the course convenor.
You will not be able to use a tablet or iPad for the work in this course. You also might not be able to use a very old laptop or one with minimal technical specifications, such as some Chromebooks.
Most of the materials in this course have been developed on Apple’s macOS, and the lectures and seminars will be primarily taught with Apple laptops. In our experience, Apple computers are the most commonly used for data science in academia and in industry. You may use other operating systems (Windows or Linux), but please be aware that we make no guarantees about the seamless interoperability of the course materials across all operating systems and/or all computers.
We expect students to install R and Positron on their computers, in addition to other software as necessary. You can install R from https://www.r-project.org/ and Positron from https://positron.posit.co/.
All students should create a GitHub account (if they have not already), and sign up for GitHub Education benefits. See https://github.com/education.
Details about the technical requirements for the in-person practical test will be announced during the term.
Registration and auditing
Any questions or problems relating to course registration should be directed to the Methodology Department staff at methodology.admin@lse.ac.uk.
Students interested in auditing this course should complete this form and await a reply.
Course format and scheduling
During Autumn Term, there will be 20 hours of lectures and 15 hours of seminars. There are no lectures or seminars during Week 6, which is LSE’s reading week.
There is a two-hour lecture each week during the term on Wednesdays from 13:00 to 15:00 in CLM.2.02 (see the campus map). Lecture slides and example code will be updated in advance of each week’s lecture.
There is a 1.5-hour “lab-style” seminar each week during the term. See the LSE Timetable for the schedule and locations for the seminars. Seminar exercises will be updated in advance of each week’s seminars. Students must attend their assigned seminar.
Administrative support
If you have any administrative queries about this course or would like to book a Zoom call with the course administrator, please contact methodology.admin@lse.ac.uk.
Attribution statement
The teaching materials in this course have been iteratively developed by current and former instructors, including Pablo Barberá, Ken Benoit, Daniel de Kadt, Friedrich Geiecke, Ryan Hübert, Akitaka Matsuo and Thomas Robinson.