PP422
Data Science for Public Policy
This information is for the 2023/24 session.
Teacher responsible
Casey Kearney
Availability
This course is compulsory on the MPA in Data Science for Public Policy. This course is not available as an outside option.
Pre-requisites
Students must have completed Pre-Sessional Coding and Mathematics Bootcamp (PP407).
This will ensure that students have basic fluency in Python and its main Data Science libraries.
Course content
This course covers the theory and practice of the Data Science project lifecycle in Python for Public Policy, from problem definition and data sourcing and cleaning to exploration, visualization, and modelling. Emphasis will be placed on identifying problems that are suitable for different Data Science techniques and on good practices for managing data. Linear and logistic models, regularization techniques, and basic time-series models will be covered in the AT; more advanced time-series and ML/AI models will be left for the WT. Key concepts and ideas underlying modelling (bias vs. variance, types of error, training vs. test data), together with data ethics and data science ethics, will be illustrated and implemented with examples from healthcare, education, urban policy, international development, and other policy areas. By the end of the course, students will have a strong coding workflow and will be able to source and experiment with data for analysis and research, both individually and in a collaborative environment.
Teaching
15 hours of lectures and 15 hours of seminars in the AT. 15 hours of lectures and 15 hours of seminars in the WT.
Formative coursework
Students will be expected to produce 2 pieces of coursework in the AT and WT.
Indicative reading
These books provide an excellent starting point and can be used as the main reference for many topics. A full reading list will be provided at the beginning of the course.
- Gareth James, Daniela Witten, Trevor Hastie and Rob Tibshirani (2021) - An Introduction to Statistical Learning
- Jeffrey C. Chen, Edward A. Rubin and Gary J. Cornwall (2021) - Data Science for Public Policy
- Claus O. Wilke (2019) - Fundamentals of Data Visualization
Assessment
Exam (40%, duration: 3 hours, reading time: 15 minutes) in the spring exam period.
Coursework (30%), policy memo (15%) and presentation (15%) in the AT and WT.
Coursework comprises weekly coding notebooks, to be completed by the student, and in-class participation. Students will also complete a policy memo, give a presentation, and take a final exam for the course.
Key facts
Department: School of Public Policy
Total students 2022/23: Unavailable
Average class size 2022/23: Unavailable
Controlled access 2022/23: No
Value: One Unit
Personal development skills
- Leadership
- Problem solving
- Application of information skills
- Communication
- Application of numeracy skills
- Specialist skills