This course serves as a broad introduction to machine learning and data mining. We will cover the fundamentals of supervised and unsupervised learning, with a focus on neural networks and on policy gradient methods in reinforcement learning. We will use the Python NumPy/SciPy stack. Students should be comfortable with calculus, probability, and linear algebra.
Required Math Background
- Linear Algebra
- Vectors: the dot product, vector norm, vector addition
- Matrices: matrix multiplication
- Rotation and translation, linear spaces, basis and dimension, eigenvectors
- Calculus: derivatives, derivatives as the slope of the function; integrals
- Probability: random variables, expectation, independence, conditional probability, Bayes' Rule
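The concepts above map directly onto NumPy, which we use throughout the course. A minimal sketch (the specific numbers are just examples):

```python
import numpy as np

# Vectors: dot product, norm, and addition
u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])
dot = u @ v                    # 1*4 + 2*5 + 3*6 = 32
norm_u = np.linalg.norm(u)     # sqrt(1^2 + 2^2 + 3^2)
w = u + v                      # elementwise sum

# Matrices: matrix multiplication
A = np.array([[1.0, 0.0],
              [0.0, 2.0]])
B = np.array([[3.0, 1.0],
              [1.0, 3.0]])
C = A @ B

# Eigenvectors: solutions of A x = lambda x
eigvals, eigvecs = np.linalg.eig(A)

# Calculus: the derivative as the slope of a function,
# approximated here with a central finite difference
f = lambda x: x ** 2
h = 1e-6
slope_at_3 = (f(3 + h) - f(3 - h)) / (2 * h)  # close to f'(3) = 6

# Probability: Bayes' Rule, P(A|B) = P(B|A) P(A) / P(B)
p_a, p_b_given_a, p_b = 0.3, 0.5, 0.4
p_a_given_b = p_b_given_a * p_a / p_b
```

If any line here looks unfamiliar, the math background problem set is a good place to start.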
Math background problem set: complete ASAP
Other topics will be needed, but are not part of the prerequisites, so we will devote an appropriate amount of lecture time to them.
The recommended textbooks for this course are below. Additional useful resources are further down the page.
- Pattern Recognition and Machine Learning by Christopher M. Bishop. A very detailed and thorough book on the foundations of machine learning. A good textbook to have as a reference.
- Deep Learning by Yoshua Bengio, Ian Goodfellow, and Aaron Courville. An advanced textbook with good coverage of deep learning, a brief introduction to machine learning, and several chapters covering the math you need for this course.
- Introduction to Machine Learning, Second Edition by Ethem Alpaydin. This textbook contains explanations at about the level of this course (although we sometimes go beyond what is in this textbook).
The course discussion forum is on Piazza. Please ask questions on Piazza if they are relevant to everyone.
The course mark will be the maximum of the marks computed under the following two schemes. Students enrolled in CSC2515 must submit a graduate course project instead of writing the exam. A minimum grade of 40% on the final exam/graduate course project is required to pass the course.
Scheme 1:
- 40%: Projects (10% each)
- 30%: Midterm
- 30%: Final Exam
Scheme 2:
- 40%: Projects (10% each)
- 10%: Midterm
- 50%: Final Exam
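For concreteness, here is how the final mark works out under the two breakdowns above (the sample marks are made up):

```python
# Hypothetical component marks out of 100 (made-up numbers for illustration)
projects, midterm, final_exam = 80.0, 60.0, 90.0

# The two weighting schemes listed above
scheme_1 = 0.40 * projects + 0.30 * midterm + 0.30 * final_exam
scheme_2 = 0.40 * projects + 0.10 * midterm + 0.50 * final_exam

# The course mark is the better of the two
course_mark = max(scheme_1, scheme_2)
print(course_mark)  # the second scheme wins here: it downweights the weak midterm
```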
Policy on special consideration
Special consideration will be given in cases of documented and serious medical or personal circumstances.
The midterm will take place on Friday, March 2nd, 2018, 6–8pm. In case of a scheduling conflict, please contact the instructors by Feb. 23 with documentation of the conflict to arrange an alternate sitting.
Other recommended resources
- The Elements of Statistical Learning by Trevor Hastie, Robert Tibshirani, and Jerome Friedman is also an excellent reference book, available for free on the web.
- An Introduction to Statistical Learning with Applications in R by Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani is a more accessible version of The Elements of Statistical Learning.
- Deep Learning by Yoshua Bengio, Ian Goodfellow, and Aaron Courville is an advanced textbook with good coverage of deep learning and a brief introduction to machine learning.
- Learning Deep Architectures for AI by Yoshua Bengio is in some ways better than the Deep Learning book, in my opinion.
- Reinforcement Learning: An Introduction by R. Sutton and A. Barto will be useful when we discuss reinforcement learning.
- Geoffrey Hinton's Coursera course contains great explanations of the intuition behind neural networks.
- The CS229 Lecture Notes by Andrew Ng are a concise introduction to machine learning.
- Andrew Ng's Coursera course contains excellent explanations of basic topics (note: registration is free).
- Pedro Domingos's CSE446 at UW (slides available here) is a somewhat more theoretically flavoured machine learning course. Highly recommended.
- CS231n: Convolutional Neural Networks for Visual Recognition at Stanford (archived 2015 version) is an amazing advanced course, taught by Fei-Fei Li and Andrej Karpathy (a UofT alum). The course website contains a wealth of materials.
- CS224d: Deep Learning for Natural Language Processing at Stanford, taught by Richard Socher. CS231n, but for NLP rather than vision. More details on RNNs are given here.
- Python Scientific Lecture Notes by Valentin Haenel, Emmanuelle Gouillart, and Gaël Varoquaux (eds) contains material on NumPy and working with image data in SciPy. (Free on the web.)
We will be using the Python NumPy/SciPy stack in this course. It is installed in the Teaching Labs.
The most convenient Python distribution to use is Anaconda. If you use an IDE alongside Anaconda, make sure the IDE is configured to use the Anaconda Python interpreter.
We will be using PyTorch later on in the course.
Detexify2 - LaTeX symbol classifier
The LaTeX Wikibook.