School of Information Technology and Electrical Engineering
Semester 1, 2008
COMP4702/COMP7703 - Machine Learning
Course Material
Lecture Notes
Notes are listed here in the order in which we will cover them in the course. The slides are based on those provided by Alpaydin (the author of the course text), with modifications where appropriate.
- Introduction (Ch. 1) - Slides
- Supervised Learning (Ch. 2) - Slides
- Occam's Razor (wikipedia entry)
- VC Dimension: a useful reference is available via the "Support Vector Machines" .org website (see later in the course for SVMs!).
- PAC Learning: for a better overview of this area, see the Russell and Norvig text, Section 18.5.
- Bayesian Decision Theory (Ch. 3) - Slides
- Thomas Bayes (wikipedia entry)
- SpamBayes - popular spam filter based on Naive Bayes models.
- An article on Bayesian networks in the Microsoft Assistant (link).
- A truly awesome list of links to Bayesian net resources (link). Look here for many real-world example applications of Bayesian nets, as well as information about advanced topics such as learning in Bayesian nets and inference in non-DAG graphical models.
- Parametric Methods (Ch. 4) - Slides
- Dimensionality Reduction (Ch. 6) - Slides
- Clustering (Ch. 7) - Slides
- k-means figures: [Bis] pic1, [Bis] pic2, [HTF] pic3.
- Gaussian mixture model figures: [Bis] pic1, [Bis] pic2.
- Dendrogram figures: [HTF] pic1, [HTF] pic2.
- Jain, A., Murty, M. and Flynn, P. Data Clustering: A Review. ACM Computing Surveys 31(3), 1999.
- Xu, R. and Wunsch II, D. Survey of Clustering Algorithms. IEEE Transactions on Neural Networks 16(3), 2005.
- A nice applet illustrating Gaussian mixture models and the EM algorithm.
- Nonparametric Methods (Ch. 8) - Slides
- Kernel density estimation figures: [Bis] pic1, [HTF] pic2, [HTF] pic3.
- k-Nearest Neighbour figures: [Bis] pic1 (density estimation), [HTF] pic2 (classification), [HTF] pic3.
- Linear Discrimination (Ch. 10) - Slides
- A few supplementary slides (from lectures, summarizing motivation for logistic sigmoid, cross-entropy, etc.)
- A nice applet illustrating Support Vector Machines for 2-D classification.
- Multilayer Perceptrons (Ch. 11) - Slides
- MLP figures: [Bis] pic1, [DHS] pic1, [DHS] pic2, [DHS] pic3, [HTF] pic2, [HTF] pic3.
- Assessing and Comparing (Ch. 14) - Slides
- Combining Learners (Ch. 15) - Slides
- Hidden Markov Models (Ch. 13) - Slides
- "Bayesian tie-up" - a few slides I used in the final lecture to connect a few different topics in the course together.
Textbooks
- Course text: Introduction to Machine Learning. Ethem Alpaydin, The MIT Press, October 2004. Book Website (including errata)
- Reference texts:
- The text for the AI course (COMP3702) is a useful reference - Russell S. and Norvig P., Artificial Intelligence: A modern approach, 2nd ed., 2003. Prentice Hall.
- [DHS] R. Duda, P. Hart and D. Stork. Pattern Classification, Second edition. Wiley, 2001.
- [Bis] Bishop, C. M. Pattern Recognition and Machine Learning. Springer, 2006.
- [HTF] T. Hastie, R. Tibshirani and J. Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, 2001.
- D. Hand, H. Mannila and P. Smyth, Principles of Data Mining, MIT Press, 2001.
Pracs
- Prac 1 (12/3): Introduction to Classification and Weka. Note: this would also be a good chance to have a look at Matlab if you are not familiar with it - working through some of the "Matlab Primer" document below would be a good start.
- Prac 2 (19/3): Bayesian Networks
- Prac 3 (26/3): Regression and Parametric Models
- Temperature dataset
- Iris dataset (in plain text format from UCI repository)
- Prac 4 (2/4): Dimensionality Reduction using Principal Component Analysis
- Prac 5 (23/4): Clustering
- Prac 6 (30/4): Nonparametric techniques
- Prac 7 (7/5): Support Vector Machines
- Sonar dataset
- Ionosphere dataset
- Glass dataset
- Gorman, R. P. and Sejnowski, T. J. Analysis of Hidden Units in a Layered Network Trained to Classify Sonar Targets. Neural Networks 1, 75-89, 1988. (PDF)
- Prac 8 (21/5): Single and Multilayer Perceptrons and Trajan
- Prac 9 (28/5): Assessing algorithms, Bagging and Boosting
- (Use datasets from Pracs 4 and 7).
- D. Opitz and R. Maclin. Popular ensemble methods: an empirical study. Journal of Artificial Intelligence Research v.11 (1999) pp.169-198. (PDF)
Assignments
The assignments consist of selected questions from the pracs. If you complete each prac, producing your assignment will be quite easy. (NB: in the assignments, question a.b refers to question b from Prac a.) Assignments should be submitted in hardcopy to the submission box in level 1, GP South, or electronically via submit.itee.uq.edu.au.
- Assignment 1: Questions 1.4, 2.4, 2.5, 3.1, 3.3, 4.1, 4.3, 4.4. Due 5pm Thursday, 9/4/09.
- Assignment 2: Questions 5.1, 5.2, 6.2, 6.3, 7.3, 7.4, 8.6, 8.7, 9.5, 9.6. Due 5pm Friday, 5/6/09.
Exams
The 2008, 2007 and 2006 exams are available from the library website.
The 2005 exam is also available. Note, however, that the course content has changed to some extent, so some of the 2005 exam is irrelevant for you. In particular, you should ignore questions 3(a), most of 3(b), 3(d), and 6. Some of Q1 is also a little out of context. Please ask the lecturer if you need more clarification about the 2005 exam questions.
Study Guide/notes
Note that we do NOT cover the following material in lectures (i.e. it is not examinable/assessable):
- Chapter 7: 7.5, mixture of mixtures model (under 7.6).
- Chapter 10: 10.8, the formulation of the optimization problems for SVMs in 10.9, 10.9.4.
- Chapter 11: 11.9-11.11 only very briefly covered, 11.12 not covered.
- Chapter 12: not covered at all (we didn't get time for this).
- Chapter 13: 13.8-13.10. Also, the derivations/inner workings of the Baum-Welch, Viterbi and Forward-backward algorithms do not need to be remembered in detail - just the general principles of how they work.
- Chapter 14: 14.5 and 14.6 covered only briefly; 14.7 and 14.8 not covered.
- Chapter 15: 15.3, 15.7, 15.8.
Reference Material
- Matrix Identities, by Sam Roweis (source)
- Introduction to Probability Models, by Em Prof Tom Downs.
- Weka software website.
- Weka Explorer User Guide
- Neural Computing notes 9
- Neural Computing notes 10
- UCI Machine Learning Repository
- Matlab primer (An Introduction to Matlab for Cognitive Programming) by Scott Bolland.
- Hidden Markov Model tutorial (from University of Leeds).
Last modified: 19/06/09
