School of Information Technology and Electrical Engineering
Semester 1, 2008
COMP4702/COMP7703 - Machine Learning
Course Material
Lecture Notes
Notes are listed here in the order that we will cover them in the course. These slides are based on those provided by Alpaydin (the author of the text), with modifications made where possible. The original versions by Alpaydin are also here as the "pre-lecture" version, for students who wish to read ahead.
- Introduction (Ch. 1) - Slides
- Occam's Razor (wikipedia entry)
- VC Dimension: a useful reference is available via the "Support Vector Machines" .org website (see later in the course for SVMs!).
- PAC Learning: for a better overview of this area, see the Russell and Norvig text, Section 18.5.
- Supervised Learning (Ch. 2) - Slides
- Bayesian Decision Theory (Ch. 3) - Slides
- Thomas Bayes (wikipedia entry)
- SpamBayes - popular spam filter based on Naive Bayes models.
- An article on Bayesian networks in the Microsoft Assistant (link).
- A truly awesome list of links to Bayesian net resources (link). Look here for many real-world example applications of Bayesian nets, as well as information about advanced topics such as learning in Bayesian nets and inference in non-DAG graphical models.
- Parametric Methods (Ch. 4) - Slides
- Dimensionality Reduction (Ch. 6) - Slides
- Clustering (Ch. 7) - Slides
- k-means figures: [Bis] pic1, [Bis] pic2, [HTF] pic3.
- Gaussian mixture model figures: [Bis] pic1, [Bis] pic2.
- Dendrogram figures: [HTF] pic1, [HTF] pic2.
- Jain, A., Murty, M. and Flynn, P. Data Clustering: A Review. ACM Computing Surveys 31(3), 1999.
- Xu, R. and Wunsch II, D. Survey of Clustering Algorithms. IEEE Transactions on Neural Networks 16(3), 2005.
- A nice applet illustrating Gaussian mixture models and the EM algorithm.
- Nonparametric Methods (Ch. 8) - Slides
- Kernel density estimation figures: [Bis] pic1, [HTF] pic2, [HTF] pic3.
- k-Nearest Neighbour figures: [Bis] pic1 (density estimation), [HTF] pic2 (classification), [HTF] pic3.
- Linear Discrimination (Ch. 10) - Slides
- A few supplementary slides (from lectures, summarizing motivation for logistic sigmoid, cross-entropy, etc.)
- A nice applet illustrating Support Vector Machines for 2-D classification.
- Multilayer Perceptrons (Ch. 11) - Slides
- Assessing and Comparing (Ch. 14) - Slides
- Combining Learners (Ch. 15) - Slides
- Hidden Markov Models (Ch. 13) - Lecture Notes: same as (pre-lecture version)
Textbooks
- Course text: Introduction to Machine Learning. Ethem Alpaydin, The MIT Press, October 2004. Book Website (including errata)
- Reference texts:
- The text for the AI course (COMP3702) is a useful reference - Russell S. and Norvig P., Artificial Intelligence: A Modern Approach, 2nd ed., 2003. Prentice Hall.
- R. Duda, P. Hart and D. Stork. Pattern Classification, Second edition. Wiley, 2001.
- [Bis] Bishop, C. M. Pattern Recognition and Machine Learning. Springer, 2006.
- [HTF] T. Hastie, R. Tibshirani and J. Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer. 2001.
- D. Hand, H. Mannila and P. Smyth, Principles of Data Mining, MIT Press, 2001.
Pracs
- Prac 1 (5/3): Introduction to Classification and Weka. Note: this would also be a good chance to have a look at Matlab if you are not familiar with it - working through some of the "Matlab Primer" document below would be a good start.
- Prac 2 (12/3): Bayesian Networks
- Prac 3 (19/3): Regression and Parametric Models
- Temperature dataset
- Iris dataset (in plain text format from UCI repository)
- (2/4): There will be no new prac sheet this week, to allow us to catch up with the lecture material. The prac session will still run, so please use this opportunity to catch up if you haven't completed the previous pracs.
- Prac 4 (9/4): Dimensionality Reduction using Principal Component Analysis
- Prac 5 (16/4): Clustering
- Prac 6 (23/4): Nonparametric techniques
- Prac 7 (7/5): Support Vector Machines
- Sonar dataset
- Ionosphere dataset
- Glass dataset
- Gorman, R. P.; Sejnowski, T. J.; Analysis of Hidden Units in a Layered Network Trained to Classify Sonar Targets, Neural Networks, 1, 75-89, 1988 (PDF)
- Prac 8 (14/5): Single and Multilayer Perceptrons and Trajan
- Prac 9 (21/5): Assessing algorithms, Bagging and Boosting
- (Use datasets from Pracs 4 and 7).
- D. Opitz and R. Maclin. Popular ensemble methods: an empirical study. Journal of Artificial Intelligence Research v.11 (1999) pp.169-198. (PDF)
Assignments
The assignments will consist of some of the questions from the pracs. If you complete each prac, producing your assignment will be quite easy. (NB: in the assignments, question a.b refers to question b from Prac a.) Assignments should be submitted in hardcopy to the submission box in level 1, GP South, or electronically via submit.itee.uq.edu.au.
- Assignment 1: Questions 1.4, 2.4, 2.5, 3.1, 3.3, 4.1, 4.3, 4.4. Due Wednesday 12pm, 16/4/08.
- Assignment 2: Questions 5.1, 5.2, 6.2, 6.3, 7.3, 7.4, 8.6, 8.7, 9.5, 9.6. Due Friday 5pm, 30/5/08.
Exam
The 2007 and 2006 exams are available from the library web.
The 2005 exam is also available. Note, however, that the course content has changed to some extent, so some of the 2005 exam is irrelevant for you. In particular, you should ignore questions: 3(a), most of (b), (d); 6. Some of Q1 is a little out of context also. Please ask the lecturer if you need more clarification about the 2005 exam questions.
Study Guide/notes
Reference Material
- Matrix Identities, by Sam Roweis (source)
- Introduction to Probability Models, by Em Prof Tom Downs.
- Weka software website.
- Weka Explorer User Guide
- Neural Computing notes 9
- Neural Computing notes 10
- UCI Machine Learning Repository
- Matlab primer (An Introduction to Matlab for Cognitive Programming) by Scott Bolland.
- Hidden Markov Model tutorial (from University of Leeds).
Last modified: 10/06/08
