
CSC411/2515: Machine Learning and Data Mining

Winter 2018

About CSC411/2515

This course serves as a broad introduction to machine learning and data mining. We will cover the fundamentals of supervised and unsupervised learning, with a focus on neural networks and on policy gradient methods in reinforcement learning. We will use the Python NumPy/SciPy stack. Students should be comfortable with calculus, probability, and linear algebra.

All announcements will be made on Piazza.

Teaching Team

Instructor | Section | Office Hour | Email
Michael Guerzhoy | Th6-9 (SF1101) | M6-7, W6-7 (BA3219) | guerzhoy [at] cs.toronto.edu
Lisa Zhang | T1-3, Th2-3 (RW117) | Th11-12, 3:30-4:30 (BA3219) | lczhang [at] cs.toronto.edu

When emailing instructors, please include "CSC411" in the subject.

Please ask questions on Piazza if they are relevant to everyone.

TA office hours are listed here.

Study Guide

The CSC411/2515 study guide.

Tentative Schedule

Review of Probability | Deep Learning 3.1-3.9.3 | Math Background Problem Set (complete ASAP)
Review of Linear Algebra | Deep Learning 2.1-2.6
Week 1 | Jan 4 | Jan 4 | Welcome; K-Nearest Neighbours

Reading: CIML 3.1-3.3

Video: A No Free Lunch theorem

Videos: Partial derivatives Part 1, Part 2

Jan 9 | Jan 4 | Linear Regression

Reading: Roger Grosse's Notes on Linear Regression

Reading: Alpaydin Ch. 4.6

Just for fun: why don't we try to look for all the minima, or the global minimum, of the cost function? Because it's an NP-hard task: if we could find global minima of arbitrary functions, we could also solve any combinatorial optimization problem. The objective functions that correspond to combinatorial optimization problems will often look "peaky": exactly the kind of functions that are intuitively difficult to optimize.
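A small sketch of the practical side of this point: plain gradient descent only finds a nearby local minimum, and which one it finds depends on where it starts. The function f, the learning rate, the step count, and the starting points below are made up for illustration and are not taken from the course materials. The example does not demonstrate NP-hardness, only that a local method can miss the global minimum.

```python
def f(x):
    # Hypothetical non-convex cost with two local minima.
    return x**4 - 3 * x**2 + x

def grad_f(x):
    # Derivative of f, computed by hand.
    return 4 * x**3 - 6 * x + 1

def gradient_descent(x0, lr=0.01, steps=2000):
    # Repeatedly step against the gradient, starting from x0.
    x = x0
    for _ in range(steps):
        x -= lr * grad_f(x)
    return x

x_from_right = gradient_descent(2.0)   # start to the right of both minima
x_from_left = gradient_descent(-2.0)   # start to the left of both minima

print("started at  2.0:", x_from_right, f(x_from_right))
print("started at -2.0:", x_from_left, f(x_from_left))
```

The two runs settle in different minima, and only the run started at -2.0 reaches the lower of the two.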

Jan 9 | Jan 4
Week 2 | Jan 11 | Jan 11 | Numpy Demo: html ipynb
Numpy Images: html ipynb
3D Plots (contour plots, etc): html ipynb
Gradient Descent (1D): html ipynb
Gradient Descent (2D): html ipynb
Linear Regression: html ipynb

What is the direction of steepest ascent at the point (x, y, z) on a surface plot? Solution (Video: Part 1, Part 2).
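A rough numerical check of this question, assuming the standard calculus fact that for a surface z = f(x, y) the direction in the (x, y) plane along which z increases fastest is the gradient (∂f/∂x, ∂f/∂y). The function f, the point, and the step sizes are hypothetical choices for illustration; this is not the linked solution.

```python
import math

def f(x, y):
    # Hypothetical surface z = f(x, y), chosen only for illustration.
    return x**2 + 3 * y**2

def numerical_gradient(func, x, y, h=1e-6):
    # Central differences approximate the two partial derivatives.
    dfdx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    dfdy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return dfdx, dfdy

def best_direction_by_search(func, x, y, eps=1e-4, n_angles=3600):
    # Try many unit directions and keep the one with the largest increase in z.
    best_gain, best_dir = -math.inf, None
    for k in range(n_angles):
        theta = 2 * math.pi * k / n_angles
        ux, uy = math.cos(theta), math.sin(theta)
        gain = func(x + eps * ux, y + eps * uy) - func(x, y)
        if gain > best_gain:
            best_gain, best_dir = gain, (ux, uy)
    return best_dir

x0, y0 = 1.0, 1.0
gx, gy = numerical_gradient(f, x0, y0)
norm = math.hypot(gx, gy)
grad_dir = (gx / norm, gy / norm)            # unit vector along the gradient
search_dir = best_direction_by_search(f, x0, y0)

# Cosine of the angle between the two directions; close to 1 means they agree.
cosine = grad_dir[0] * search_dir[0] + grad_dir[1] * search_dir[1]
print(grad_dir, search_dir, cosine)
```

The brute-force search and the normalized gradient should point the same way up to the angular resolution of the search.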

Video: 3Blue1Brown, Gradient Descent.

Jan 16 | Jan 11 | Multiple Linear Regression
Linear Classification (Logistic Regression)
Maximum Likelihood
Bayesian Inference

Reading: Andrew Ng's Notes on Logistic Regression
Maximum Likelihood

Reading: Andrew Ng, CS229 notes

Reading: Alpaydin, Ch. 4.1-4.6

Jan 16 | Jan 11
Week 3 | Jan 18 | Jan 18 | Bayesian inference and regularization (updated)
Bayesian inference: html ipynb
Overfitting in linear regression: html ipynb

Videos: Unicorns and Bayesian Inference, Why I don't believe in large coefficients, Why L1 regularization drives some coefficients to 0

Reading: Alpaydin, Ch. 14.1-14.3.1

Jan 22 | Jan 18 | Neural Networks

Reading: Deep Learning, Chapter 6. CS231n notes, Neural Networks 1. Start reading CS231n notes, Backpropagation.

Videos on computation for neural networks: Forward propagation setup, Forward Propagation Vectorization, Backprop specific weight, Backprop specific weight pt 2, Backprop generic weight, Backprop generic weight: vectorization.

Optional video: 3Blue1Brown, But what *is* a Neural Network?, Gradient Descent, What is Backpropagation really doing?

Just for fun: Formant Frequencies

Just for fun: Andrej Karpathy, Yes you should understand Backprop

Jan 22 | Jan 18
Week 4 | Jan 25 | Jan 25 | PyTorch Basics (ipynb, html); Maximum Likelihood with PyTorch (ipynb, html); Neural Networks in PyTorch, low-level programming (ipynb, html) and high-level programming (ipynb, html); if there is time, Justin Johnson's Dynamic Net (ipynb, html)

Reading: PyTorch with Examples

Just for fun: Who invented reverse-mode differentiation?

Project #1 due Jan 29th
Jan 30 | Jan 25

Neural Networks, continued.

Neural Networks Optimization, Activation functions, multiclass classification with maximum likelihood

Reading: Deep Learning, Ch. 7-8.

Just for fun: Brains, Sex and Machine Learning, Prof. Geoffrey Hinton's talk on Dropout (also see the paper)

Jan 30 | Jan 25


How Neural Networks see

Week 5 | Feb 1 | Feb 1

Convolutional Networks

How Deep Neural Networks see

AlexNet demo

Reading: Deep Learning, Ch. 9, CS231n notes on ConvNets

Video: Guided Backprop: idea (without the computational part)

Just for fun: the Hubel and Wiesel experiment

Just for fun: Andrej Karpathy, What I learned from competing against a ConvNet on ImageNet.

Project #1 bonus due Feb 5th
Feb 6 | Feb 1
Feb 6 | Feb 1

Backpropagating through a Conv Layer

Generative classifiers and Naive Bayes

Reading: CIML Ch. 8, Bishop 4.2.1-4.2.3, Andrew Ng, Generative Learning Algorithms
Week 6 | Feb 8 | Feb 8

The EM Algorithm

Gaussian classifiers, Multivariate Gaussians (ipynb), Mixtures of Gaussians and k-Means

Reading: Andrew Ng, Generative Learning Algorithms, Andrew Ng, Mixtures of Gaussians and the EM Algorithm. A fairly comprehensive introduction to multivariate Gaussians (more than you need for the class): Chuong B. Do, The Multivariate Gaussian Distribution.

Reading: Alpaydin, 7.1-7.6

Videos: NB model setup, P(x), EM for the NB model.

Just for fun: Radford Neal and Geoffrey Hinton, A View of the EM Algorithm that Justifies Incremental, Sparse, and other Variants -- more on maximizing the likelihood of the data using the EM algorithm (optional and advanced material).

Feb 13 | Feb 8
Feb 13 | Feb 8
Week 7 | Feb 15 | Feb 15 | EM Tutorial: Gaussians, Binomial.

Reading: CIML Ch. 15, Bishop Ch. 12.1.

Reading: Alpaydin, Ch. 6.1-6.3

Just for fun: average faces are more attractive, but not the most attractive. Francis Galton computed average faces over 100 years ago pixelwise (by using projectors), just as we do when centering data before performing PCA.
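A minimal NumPy sketch of the centering step mentioned above, followed by PCA computed through an SVD of the centered data. The random array X is a hypothetical stand-in for a set of flattened face images, and the number of components kept (5) is an arbitrary choice; neither comes from the course materials.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a face dataset: 20 "images" of 8x8 pixels, flattened.
X = rng.random((20, 64))

# Pixelwise average over all images: the "average face".
mean_face = X.mean(axis=0)

# Centering: subtract the average face from every image.
X_centered = X - mean_face

# PCA via SVD of the centered data; rows of Vt are the principal directions.
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
components = Vt[:5]

# Project onto the top components, then map back to pixel space.
projections = X_centered @ components.T
reconstruction = mean_face + projections @ components

print(mean_face.shape, components.shape, reconstruction.shape)
```

Adding mean_face back in the last step is what makes the reconstruction comparable to the original images, since the components describe variation around the average rather than the images themselves.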

Project #2 due Feb 23rd

Grad Project Proposal due Feb 28th
Reading Week | Feb 27 | Feb 15

About the midterm

Principal Component Analysis (Night Section)

Decision Trees (Day Section)

Feb 27 | Feb 15
Week 8 | Mar 1 | Mar 1 | Review Tutorial
(8pm-9pm for evening section)

Reading: Alpaydin Ch. 9, CIML Ch. 1.

Just for fun: For more on information theory, see David MacKay, Information Theory, Inference, and Learning Algorithms or Richard Feynman, Feynman Lectures on Computation, Ch. 4.

Midterm Mar 2nd

Midterm | Solution
Mar 6 | Mar 1

Decision Trees (Night Section)

Principal Component Analysis (Day Section)

Mar 6 | Mar 1
Week 9 | Mar 8 | Mar 8

Midterm postmortem.

Reinforcement Learning

Neural Style Transfer, Deep Dream (evening section only)

Evolution Strategies for RL (evening section only)

Midterm solutions. Video solutions for Q6: data likelihood, the E-step, the M-step. Extra optional material: Preview: MLE for Poisson, deriving the E-step and the M-step by maximizing the expected log-likelihood.

Reading: Ch. 1 and Ch. 13.1-13.3 of Sutton and Barto, Reinforcement Learning: An Introduction (2nd ed.)

Reading: Alpaydin, 18.1-18.4 (for general background)

Just for fun: Deep Dream images, Deep Dream grocery trip

Just for fun: Reinforcement Learning for Atari

Just for fun: A very fun paper on learning to play Super Mario from the SIGBOVIK conference (link to pdf).

Just for fun: the OpenAI blog post on Evolution Strategies in RL

Video of the Bipedal Walker learning to walk

Mar 13 | Mar 8
Mar 13 | Mar 8
Week 10 | Mar 15 | Mar 15 | Reinforcement Learning tutorial

Reading: Alpaydin, Ch. 13.1-13.5

Project #3 due Mar 19th
Mar 20 | Mar 15 | SVMs and Kernels
Mar 20 | Mar 15
Week 11 | Mar 22 | Mar 22

Review of the Policy Gradients starter code for P4


Reading: Alpaydin, Ch. 17.1-17.2, 17.6, 17.7 (?)

Mar 27 | Mar 22

Mar 27 | Mar 22
Week 12 | Mar 29 | Mar 29 | Review Tutorial | Project #4 due Apr 2
Apr 3 | Mar 29

Overview: successes in supervised learning. Overview: unsupervised learning

GANs, adversarial examples

Suggested reading: Ian Goodfellow, Generative Adversarial Networks tutorial at NIPS 2016 (skim for applications and the description of GANs).
Apr 3 | Mar 29

Final Examination (CSC411)

April 2018 exam timetable
CSC2515 Project due Apr. 26