MG4E1 Half Unit
Algorithmic Techniques for Data Mining
This information is for the 2015/16 session.
Teacher responsible
Dr Laszlo Vegh, NAB 3.05
Availability
This course is available on the MSc in Management Science (Decision Sciences) and MSc in Management Science (Operational Research). This course is available as an outside option to students on other programmes where regulations permit.
Pre-requisites
Students are not permitted to take this course alongside ST443 Machine Learning and Data Mining.
Students must have basic knowledge of Mathematics and Statistics, in particular familiarity with hypothesis testing and with linear and logistic regression.
Course content
Data Mining is an interdisciplinary field that has developed over the last three decades. Vast quantities of data are available today in marketing, in other areas of business including demand forecasting, and in various fields of science and technology. The main goal of data mining is to extract previously unknown, useful information from such massive-scale data. The aim of the course is to equip students with theoretically founded and practically applicable knowledge of data mining. The theoretical foundations of the field come from statistics, computer science and artificial intelligence.
The course introduces fundamental methods and algorithms for basic data analytics problems. These methods include algorithms for tree construction and for rule generation, instance-based learning, regression methods, support vector machines, nearest-neighbour methods, Bayesian networks, website ranking, principal component analysis, association rule mining, and distance-based and density-based clustering.
The methods are illustrated on practical problems arising from various fields. The course also gives an introduction to the usage of the data mining software package Weka.
Teaching
20 hours of lectures and 13 hours and 30 minutes of seminars in the LT. 1 hour and 30 minutes of seminars in the ST.
A reading week will take place in W6. There will be no teaching during this week.
Formative coursework
Students will be expected to produce 1 project in the LT and 1 problem set in the ST.
A mock exam and a mock project will be given. The mock project will be similar to the group project, but with the dataset provided.
Indicative reading
Main textbook:
I. H. Witten, E. Frank, M. A. Hall: Data Mining - Practical Machine Learning Tools and Techniques.
Further reading:
T. Hastie, R. Tibshirani, J. Friedman: The Elements of Statistical Learning - Data Mining, Inference and Prediction.
P. Flach: Machine Learning: The Art and Science of Algorithms that Make Sense of Data, Cambridge University Press, 2012.
Assessment
Exam (45%, duration: 2 hours) in the main exam period.
Project (45%) in the ST.
Coursework (10%) in the LT.
Key facts
Department: Management
Total students 2014/15: Unavailable
Average class size 2014/15: Unavailable
Controlled access 2014/15: No
Value: Half Unit
Personal development skills
- Self-management
- Team working
- Problem solving
- Application of information skills
- Communication
- Application of numeracy skills
- Commercial awareness
- Specialist skills