Not available in 2024/25
ST101A      Half Unit
Programming for Data Science

This information is for the 2024/25 session.

Teacher responsible

Dr Christine Yuen

Availability

This course is compulsory on the BSc in Data Science and BSc in Mathematics with Data Science. This course is available on the BSc in Accounting and Finance, BSc in Actuarial Science, BSc in Finance, BSc in Mathematics, Statistics and Business and BSc in Politics and Data Science. This course is available with permission as an outside option to students on other programmes where regulations permit and to General Course students.

This course has a limited number of places (it is capped) and demand is typically very high. Students who have this course as a compulsory course are guaranteed a place. 

Pre-requisites

Although not a formal requirement, it is preferable that students have some familiarity with the basic concepts of probability and statistics, to the level of the first two chapters of ST102/ST107 (data visualisation and descriptive statistics, and probability theory).

Course content

The course covers the principles of computer programming, with a focus on data science applications.

The topics covered will include variables, basic data types, data structures and sequences, control flow structures, modularisation, functions, variable and function scoping, testing and debugging, errors and exception handling, and data input-output operations using file systems and operating system standard input-output; principles of object-oriented programming, including objects, classes, methods, encapsulation, inheritance, and polymorphism; principles of functional programming, such as the use of immutable data and flow control using function calls and recursion; and practical aspects of algorithmic concepts such as searching.
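As a purely illustrative sketch, not taken from the course materials, the short Python program below touches several of the listed topics at once: classes, inheritance and polymorphism, immutable data (a tuple), exception handling, recursion, and searching. All names in it (Shape, Square, Rectangle, binary_search) are hypothetical and chosen only for this example.

```python
from typing import Optional, Sequence


class Shape:
    """Base class: subclasses are expected to override area()."""

    def area(self) -> float:
        raise NotImplementedError("subclasses must implement area()")


class Square(Shape):
    def __init__(self, side: float) -> None:
        # Leading underscore marks the attribute as internal (encapsulation by convention).
        self._side = side

    def area(self) -> float:
        return self._side ** 2


class Rectangle(Shape):
    def __init__(self, width: float, height: float) -> None:
        self._width = width
        self._height = height

    def area(self) -> float:
        return self._width * self._height


def binary_search(items: Sequence[float], target: float,
                  lo: int = 0, hi: Optional[int] = None) -> int:
    """Recursive binary search over a sorted sequence.

    Returns the index of target, or -1 if it is not present.
    """
    if hi is None:
        hi = len(items) - 1
    if lo > hi:  # empty range: target is absent
        return -1
    mid = (lo + hi) // 2
    if items[mid] == target:
        return mid
    if items[mid] < target:
        return binary_search(items, target, mid + 1, hi)
    return binary_search(items, target, lo, mid - 1)


# Polymorphism: the same area() call works on every Shape subclass.
shapes = (Square(2.0), Rectangle(2.0, 4.5), Square(4.0))

# Immutable data: a sorted tuple of areas.
areas = tuple(sorted(shape.area() for shape in shapes))

# Exception handling: tuples do not support item assignment.
try:
    areas[0] = 0.0  # type: ignore[index]
except TypeError as err:
    print("Cannot modify a tuple:", err)

print(binary_search(areas, 9.0))  # index of the rectangle's area
print(binary_search(areas, 5.0))  # -1, since no shape has area 5.0
```

Binary search is used here only as one common example of searching; the course guide does not specify which search algorithms are taught.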

The course will primarily use the Python programming language, but may also discuss and provide references to how the fundamental programming concepts are implemented in other programming languages, in particular R.

Teaching

This course will be delivered through a combination of classes and lectures totalling a minimum of 35 hours in Autumn Term.

Students are required to install Python on their own laptops and use their own laptops in the classes and lectures.

Formative coursework

Students will be expected to produce 7 problem sets in the AT.

The problem sets will consist of computer programming exercises in the Python programming language.

Indicative reading


Essential Reading: 

  • J. V. Guttag, Introduction to Computation and Programming using Python, Second Edition, The MIT Press, 2017
  • A. B. Downey, Think Python: How to Think like a Computer Scientist, 2nd Edition, O'Reilly Media, 2015

Additional Reading: 

  • W. McKinney, Python for Data Analysis, 2nd Edition, O'Reilly, 2017
  • J. Zelle, Python Programming: An Introduction to Computer Science, 3rd edition, Franklin, Beedle & Associates, 2016
  • M. Lutz, Learning Python, 5th Edition, O'Reilly Media, 2013
  • M. Dawson, Python Programming for the Absolute Beginner, 3rd Edition, Course Technology, 2010

Assessment

Exam (70%, duration: 2 hours, reading time: 10 minutes) in the spring exam period.
Coursework (30%) in the AT.

Students are required to hand in solutions to 3 problem sets using Python, each accounting for 10% of the final assessment.

Key facts

Department: Statistics

Total students 2023/24: 101

Average class size 2023/24: 26

Capped 2023/24: Yes (120)

Value: Half Unit

Personal development skills

  • Self-management
  • Problem solving
  • Application of information skills
  • Communication
  • Application of numeracy skills
  • Commercial awareness
  • Specialist skills