MAS 622J/1.126J: PATTERN RECOGNITION AND ANALYSIS


FALL 2010 Class Information:

Lectures: Tuesdays and Thursdays 9:30-11:00, 8-205
Recitations: Fridays 10-11, 8-205
Textbook: Pattern Classification by Duda, Hart, and Stork, with other readings


Staff | Announcements | Assignments| Syllabus | Policies

Instructor:

Prof. Rosalind W. Picard
Office: E14-374g
Office hours: I'm available Tuesdays 11 am-12 pm weekly, and for additional hours during project planning season: Nov 5, Nov 9, and Nov 16. Outside of posted hours, you can meet with me by scheduling an appointment with Alex at the email below. I'm also usually available right after class, walking back to the lab, and am always happy to try to answer your questions in person (I usually don't have enough hours to answer by email).
Phone: 617-253-0611
picard ( @ media dot mit dot edu)
 

Teaching Assistants:

Daniel McDuff
Office: E14-374a
Office hours: Monday 5-6pm
Phone: 617-254-2136
djmcduff ( @ media dot mit dot edu)

Javier Hernandez Rivera
Office: E14-374a
Office hours: Wednesday 10-11am
Phone: 617-253-8628
javierhr ( @ mit dot edu)
 

Support Staff:

Lillian Lai
Office: E14-374h
Phone: 617-253-0369
lillian ( @ media dot mit dot edu)



 

Announcements:


R 9-9 First day of class. (Visitors welcome, although class size is limited to 20.)

The first recitation will be held Friday Sep 10, providing an overview for those who would like help with MATLAB, probability, or the first problem set (PS). This class has traditionally made heavy use of MATLAB. Here is a MATLAB Tutorial you may find helpful. You might also enjoy this short, insightful (optional) piece on nuances in the use of probability by Media Lab graduate Tom Minka, Ph.D.

There is a useful website for the text, which includes not only ppt slides for the chapters, but also errata for each edition of the text. Please correct the errors in your copy of the text now; it will save you time later when you are reading and trying to understand the material.

The IAPR Pattern Recognition Education Resources web site was initiated by the International Association for Pattern Recognition (http://www.iapr.org/). The goal was a web site that can support students, researchers and staff. Of course, advances in pattern recognition and its subfields mean that developing the site will be a never-ending process. What resources does the IAPR Education web site have? The most important resources are for students, researchers and educators. These include lists with URLs to:
     - Tutorials and surveys
     - Explanatory text
     - Online demos
     - Datasets
     - Book lists
     - Free code
     - Course notes
     - Lecture slides
     - Course reading lists
     - Coursework/homework
     - A list of course web pages at many universities

OLD EXAMPLES OF PROJECTS

Here are the 2002 Class Projects Page, the 2004 Class Projects Page, the 2006 Class Projects Page, and the 2008 Class Projects Page.



 
 

Assignments:


Note: Future assignments below, beyond what has been stated in class each week, are predictions. They are subject to revision, with each revision a more accurate prediction than the one before.

"If you must predict, predict often." -- Prof. Paul Samuelson, 1970 Nobel Prize in Economics


R 9-9 Lecture 1 Introduction, Reading: DHS Chap 1, A.1-A.2

PS 1 + Dataset (500x2 array, 500 points in 2-dim space, each row a sample point (x,y))

F 9-10 Recitation 1 Matlab Introduction [file + tutorial]

T 9-14 Lecture 2 Reading: DHS Chap A.2-A.5.
  Recommended readings on probability:
     - Alvin W. Drake. Fundamentals of applied probability theory.
     - William Feller. An introduction to probability theory and its applications.
     - Richard W. Hamming. The art of probability for scientists and engineers.
     - Andrew Moore's tutorials: [ probabilistic analytics, probability densities, and gaussians ]

R 9-16 Lecture 3 Reading: DHS Chap 2.1-2.4 (can skip 2.3.1, 2.3.2) Notes (courtesy of Rob Speer)

F 9-17 Recitation 2

M 9-20 First submission PS 1

T 9-21 Lecture 4 Reading: DHS Chap 2.5-2.7 Notes

Solution PS 1

PS 2 (data)

R 9-23 Lecture 5 Reading: DHS Chap 2.8.3, 2.9, 3.1-3.2

Final submission PS 1

F 9-24 Recitation 3

T 9-28 Lecture 6 DHS 3.1-3.5.1 Slides

R 9-30 Lecture 7 Guest lecture from Javier Hernandez Rivera: Dimensionality Reduction DHS 3.7.1-.3, 3.8 Slides

F 10-1 Recitation 4

Solution PS 2

PS 3 + data prob 1, data prob 4

M 10-4 Final submission PS 2

T 10-5 Lecture 8 HMMs, reading: A Tutorial On Hidden Markov Models and Selected Applications in Speech Recognition, L.R. Rabiner, Proceedings of the IEEE, Vol 77 No 2, Feb. 1989; optional extra reading: DHS 3.10 (beware, as DHS uses non-standard notation in this section).

R 10-7 Lecture 9 HMMs, same reading. Notes

F 10-8 Recitation 5

T 10-12 Lecture 10 DHS 2.10 and 3.9, Missing Data and Expectation Maximization, brief intro to Bayes Nets, 2.11 Notes

First submission PS 3 (prob. 1, 2 and 3)

R 10-14 Lecture 11 (Picard away co-directing TTT meeting) Inference on Bayes Nets and Dynamic Bayes Nets, guest lecture from Dr. Rana el Kaliouby, Reading: Introduction to Graphical Models and Bayesian Networks, Kevin Murphy, 1998. (DBN Lecture PDF slides)

PS 4 (data and tutorial)

First submission PS 3 (prob. 4 and 5) + solution

F 10-15 Recitation 6 (during 25th anniversary sponsor/alumni event)

M 10-18 Final submission PS 3

T 10-19 Lecture 12 DHS 10.2-10.4.3 Mixture densities, K-means clustering, Quiz review

R 10-21 Lecture 13 DHS 10.4.3, 10.4.4, 10.6-10.10 Clustering

F 10-22 Recitation 7 (Midterm 2008)

M 10-25 First submission PS 4 + (solution + code)

T 10-26 Lecture 14 DHS 4.5.1, 4.5.4, K-nn classifier, DHS 5.1-5.3, 5.8.1 Linear Discriminants. Slides

W 10-27 Final submission PS 4

R 10-28 MIDTERM QUIZ, Covers Lectures 1-13 (Nov 17, 2010 = DROP DATE)

F 10-29 Recitation 8

PS 5 (paper and datasets: prob. 1 and prob. 3)

T 11-2 Lecture 15 (Picard in UK), Provided data for projects ready; presentations on project data available. Optional readings (short and very interesting articles to discuss with your friends, given Election Day):
"Election Selection: Are we using the worst voting procedure?" Science News, Nov 2 2002.
Range voting: Best way to select a leader?
Slides on Class Project

R 11-4 Lecture 16 Guest lecture on regression by Sophia Yuditskaya Slides

Project Plan Due if you're using your own data

F 11-5 Recitation 9

M 11-8 First submission PS 5 + solution

T 11-9 Lecture 17 DHS 6.1-6.6, 6.8 Multilayer Neural Nets

Project Plan Due if you're using class-provided data

PS 6 (data)

R 11-11 Veterans Day Holiday - No Class

F 11-12 Recitation 10

Final submission PS 5

T 11-16 Lecture 18 Feature Selection, webpage shown in class: http://ro.utia.cz/fs/fs_guideline.html DHS reading: Entropy/Mutual information A.7, Decision Trees, 8.1-8.4

R 11-18 Lecture 19 Project Progress Presentations/ Critique Day/ Attendance counts toward grade today

F 11-19 Recitation 11

M 11-22 First submission PS 6 + (solution + code)

T 11-23 Lecture 20 Project Progress Presentations/ Critique Day/ Attendance counts toward grade today

W 11-24 Final submission PS 6

R 11-25 Thanksgiving Holiday - No Class

F 11-26 No Recitation, Thanksgiving Vacation

T 11-30 Lecture 21 DHS 5.11 SVM, 9.1-9.2.1 No free lunch, 9.2.3-9.2.5 MDL, Occam; 9.3 Bias and Variance, 9.5 Bagging, Boosting, Active Learning, 9.6 Estimating and Comparing Classifiers, 9.7 Classifier Combination

R 12-2 Lecture 22 Guest lecture from Dan McDuff: Gaussian Processes. Slides

F 12-3 Recitation 12 Project help session - your staff has lots of experience - get our input and help

T 12-7 Final Project Presentations: All students required to attend: Attendance counts significantly for grade today.

R 12-9 Final Project Presentations, Last Day of Class: All students required to attend; attendance counts significantly for grade today.



 



Fall 2010 Syllabus: (subject to adjustment)

Intro to pattern recognition, feature detection, classification

Review of probability theory, conditional probability and Bayes rule

Random vectors, expectation, correlation, covariance

Review of linear algebra, linear transformations

Decision theory, ROC curves, Likelihood ratio test

Linear and quadratic discriminants

Sufficient statistics, coping with missing or noisy features

Template-based recognition, feature extraction

Eigenvector and Fisher linear discriminant analysis

Independent component analysis

Training methods, Maximum likelihood and Bayesian parameter estimation

Linear discriminant/Perceptron learning, optimization by gradient descent

Support Vector Machines

K-Nearest-Neighbor classification

Non-parametric classification, density estimation, Parzen estimation

Unsupervised learning, clustering, vector quantization, K-means

Mixture modeling, Expectation-Maximization

Hidden Markov models, Viterbi algorithm, Baum-Welch algorithm

Bayesian networks

Bagging, boosting

Decision trees, Multi-layer Perceptrons

Optional other topics toward end of term




 


Grading:


30% Homework/Mini-projects, due every 1-2 weeks up until 3 weeks before the end of the term. These will involve both programming (Matlab) and non-programming assignments.

New homework submission and grading policy:

You are NOT ALLOWED to look at old homeworks before handing in your first pass at your homework. If we catch you doing this, you will get a zero on the homework. This year we want to help you get as far as possible on your own or by working collaboratively with the other students; after you hand in that work, we will hand out the solutions and let you fix your homework and improve your grade. This policy has multiple goals: (1) it forces you to learn by figuring things out, which develops your abilities better; (2) you are allowed to collaborate, which also facilitates discussion-based learning and meeting other people in the class (although you should not copy anyone's answers, and don't let them copy yours; write your own, please); (3) it levels the playing field and is fair to those who don't have access to old homeworks; and (4) if many of you don't do well, it shows us we need to teach better, giving us important feedback so we can make the course better. Finally, because looking at solutions can raise your grade and be educational, we will hand out the solutions in a fair way to everyone after you hand your problem set in. If you fix your answers by the grading deadline, we will regrade your homework and give you the average of the two grades, before and after the solutions.


Here is how it works: Homework will be due by 5pm on the due date. On the day the homework is due, submit your best effort and keep a copy (photocopy or otherwise) of what you handed in. We will hand out the homework solution and you will have a chance to refine your work given the solution. Submit your revised homework by the date and time specified (please do not be late - it is not fair to our graders and TAs). Do not copy your revised answers directly from the solution sheet: answers copied directly from the solution sheet will not be counted. Our goal is to help you learn and understand the ideas in the course materials. Optimizing your grades in grad school is not as important as getting depth of knowledge and developing your learning skills. Feel free to add comments to your solution to make the graders aware of specific things you have learnt from the 'second pass' - this will improve your chance of achieving the maximum bonus.

30% Project with approximate due dates:


25% Midterm Quiz: R 10-28 (Drop Date is W 11-17)

15% Your presence and interaction in lectures (especially your presence during the two days of project critiques and two days of final project presentations, which is 10%), in recitation, and with the staff outside the classroom.
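
As a rough illustration of how the percentages above combine, here is a minimal Python sketch. It is not the staff's actual grading procedure: the function names, the 0-100 scale, and the example scores are all assumptions made for illustration. It also shows the homework rule of averaging the grade on the first submission with the grade on the revised submission.

```python
# Illustrative sketch only: the names, the 0-100 scale, and the example
# scores are assumptions, not the staff's actual grading procedure.

# Component weights as listed in the grading breakdown.
WEIGHTS = {
    "homework": 0.30,
    "project": 0.30,
    "midterm": 0.25,
    "participation": 0.15,
}


def homework_score(first_pass: float, revised: float) -> float:
    """Average of the grades before and after the solutions are handed out."""
    return (first_pass + revised) / 2.0


def final_grade(scores: dict) -> float:
    """Weighted sum of the four component scores (each on a 0-100 scale)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


if __name__ == "__main__":
    # Hypothetical student: 70 on the first pass, 90 after revising.
    hw = homework_score(first_pass=70.0, revised=90.0)
    total = final_grade(
        {"homework": hw, "project": 85.0, "midterm": 75.0, "participation": 100.0}
    )
    print(f"homework={hw:.1f}, final={total:.2f}")
```

Because the two homework grades are averaged, a strong first pass counts just as much as a strong revision.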
 

Late Policy:

Assignments are due by 5pm on the due date. Please bring them to E14-374A (slide them under the door if no one is there) or email them to both of the TAs (djmcduff@mit.edu & javierhr@mit.edu). If you are late, you will get a zero on the assignment.
 

Collaboration/Academic Honesty:

The goal of the assignments is to help you learn, not to see how many points you can get. Grades in graduate school do not matter as much as they do in undergraduate studies; what you learn matters. Thus, if you stumble across old course material with similar-looking problems, you are NOT allowed to look at their solutions. It is unfair to other students if you do this, and we will hand out the solutions in an ethically fair way to everyone with time to revise and resubmit. Start early, and don't be disappointed if you get stuck when you try to do the homework solo; that frustrating experience can lead to greater inquiry and effective learning. Please feel free to come to the staff for help, and also to collaborate on the problems and projects with each other. Collaboration should be at the "whiteboard" level: discuss ideas, techniques, even details - but write your answers independently. This includes writing MATLAB code independently, and not copying code or solutions from each other or from similar problems from previous years. If you are caught violating this policy (and we have caught people in the past), it will result in an automatic F for the assignment AND may result in an F for your grade for the class as well as a strong reprimand from the department. (This has happened to people before - it is not an empty threat.) If you team up on the final project (teams of two are encouraged), then you may submit one report which includes a jointly written and signed statement of who did what.

The midterm will be closed-book, but we will allow a cheat sheet.
 

Course feedback:

The staff welcomes your comments on the course at any time. There is an anonymous service for providing feedback. Please feel free to send us comments -- in the past, we have obtained helpful remarks that allow us to make improvements mid-course. We want to maximize the value of this course for everyone and welcome your input, positive or negative.
 

Attendance:

Attendance is especially important for this class on project critique days and on final project presentation days. These are highly valuable sessions for learning practical issues that the book/lectures do not delve into. Attendance on these four days will contribute 10% to your final grade.