MAS S68 F’19 | Computer Visions: Generative Machine Learning Tools for Creative Applications
About
Machine learning is transforming our reality. Today’s models learn deep representations of their input domains and reach beyond-human performance on a growing list of tasks. While learning to predict, many ML models construct a view of their input worlds that allows them to also generate new data – they are generative models. In this fast-paced class, students will use generative ML models to “paint” and “sketch”, “write” a poem or a whole (fake-)news article, transfer visual artistic style, hallucinate structure out of noise, generate 3D models, and “compose” music. The class will lightly cover topics in applied mathematics for machine learning but will focus on hands-on, practical programmatic methods for implementing computer visions in Python. Students will learn how machine learning models are used to generate information in multiple media (text, sound, image, 3D geometry), and by the end of the course will be able to apply these tools to their own domain of interest. The course is designed for people without prior experience in deep learning, but it can also benefit advanced students looking for an overview of the latest generative methods.
Logistics
Location: E15-359
Time: Thursdays from 1-3pm, on a once-every-two-weeks schedule.
Instructors: Prof. Pattie Maes, Dr. Roy Shilkrot, Guillermo Bernal
Units: 0-6-0
Using Google Cloud: https://courses.media.mit.edu/2019fall/mass68/howto-using-google-cloud/
Syllabus | Topics Covered
- Deep learning mechanics: ML concepts (model, classification, regression, supervision, train-test, loss, overfitting, regularization), linear models, neural nets, back-propagation optimization, convolution, recurrence, attention, transformers
- Deep learning practicalities: TensorFlow-Keras, working with data, inference environment stack (CUDA, Docker, Jupyter), retraining and transfer learning
- Generative model patterns: encoder-decoders, autoencoders, adversarial nets (a minimal code sketch follows this list)
- Significant ML models of interest: VAEs, CNNs (VGG, Inception, ResNeXt, DenseNet, WaveNet, Pixel), RNNs (Char, DRAW, Sketch, Melody, ELMo), GANs (DC, VAE, C/Info, Cycle, Big, Style, Pro, pix2pix, Gau, Stacked, 3D, …), Transformers (BERT, GPT-1/2)
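To make the generative-model patterns above concrete, here is a minimal autoencoder sketch in TensorFlow-Keras. It is illustrative only; the dataset (MNIST), layer sizes, and training settings are assumptions, not the class notebooks’ actual code. An encoder compresses each digit image into a small latent vector, a decoder reconstructs it, and decoding random latent vectors yields rough new samples.

```python
# A minimal autoencoder sketch in TensorFlow-Keras (illustrative, not the
# official class code). Dataset and layer sizes are assumptions.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

latent_dim = 32  # size of the compressed (latent) representation

# Encoder: compress a flattened 28x28 image (784 values) into a latent vector
encoder = models.Sequential([
    layers.Dense(128, activation="relu", input_shape=(784,)),
    layers.Dense(latent_dim, activation="relu"),
])

# Decoder: reconstruct the 784-value image from the latent vector
decoder = models.Sequential([
    layers.Dense(128, activation="relu", input_shape=(latent_dim,)),
    layers.Dense(784, activation="sigmoid"),
])

# Autoencoder = encoder followed by decoder, trained to reproduce its input
autoencoder = models.Sequential([encoder, decoder])
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")

# Load MNIST, scale pixels to [0, 1], and flatten the images
(x_train, _), (x_test, _) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
x_test = x_test.reshape(-1, 784).astype("float32") / 255.0

autoencoder.fit(x_train, x_train, epochs=5, batch_size=256,
                validation_data=(x_test, x_test))

# "Generate": decode random latent vectors. A plain autoencoder's latent space
# is not regularized, so samples are rough; this is the gap VAEs close.
samples = decoder.predict(np.random.uniform(size=(4, latent_dim)).astype("float32"))
print(samples.shape)  # (4, 784)
```

The class assignments build on the same encoder-decoder skeleton with richer architectures (convolutional encoders, VAEs, GANs); this sketch only shows the moving parts.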
Timeline
# | Date | Topics | Assignment | Slides | Video
1 | 9/5 | Intro, Machine & Deep Learning, CNNs | ImageNet: Style Transfer, Deep Dream, Neural Doodle | Link | Link
2 | 9/19 | Generative models, VAEs, GANs | VAE, DCGAN, C/Info GAN, BigGAN, Cycle | Link | Link
3 | 10/3 | Text: RNNs, Attention, Transformers | Word2Vec, BERT, GPT, ELMo, Txt2img | Link | Link
4 | 10/17 | Quick Draw!, DRAW, Sketch-RNN | Sketch, draw, paint | Link | Link
5 | 10/31 | Music, audio, representation, melody, rhythm | WaveNet, Melody RNN | Link |
6 | 11/14 | 3D models, transfer learning | Image to 3D, 3D GAN | Link |
7 | 12/5 | Project presentations | | |
Assignments and Grading
Each week a home assignment will be given in the form of a Jupyter notebook. The notebook will contain code that follows that week’s class, as well as open segments where students run their own code and tweak parameters to generate new artifacts. Students will be encouraged to post their successful creations on the class website. Assignments count toward the final grade, and feedback will be given.
The class will have a final project centered on the student’s domain of interest, applying the tools introduced in and out of class. Instructors will provide starting points, data, and help finding a suitable project. Projects may be done individually or in groups of two or three students.
Final grades will be given after project submissions are evaluated. Project grading criteria: creative value, technical contribution, and academic contribution.
Three special awards will be given to extraordinary teams to earn extra credit:
- The Transfer-Learning award – given to students who successfully apply a model pre-trained in one domain to generate output in a different domain (e.g. a text model to create images, or vice versa).
- The C-AI-borg award – given to students who demonstrate that their generative model produces human-level outputs or otherwise surprising capabilities.
- The Hinton-LeCun-Bengio award – given to students who train their model from scratch and demonstrate its utility towards generation.
Grading scheme
- Breakdown:
- Assignments: 50%
- Final project: 50%
- Extra credit: 5%
- Letter grading policy: http://catalog.mit.edu/mit/procedures/academic-performance-grades/#gradestext
Class Structure
The class meets for two hours once every two weeks, with biweekly readings and home assignments.
Class participation is encouraged, and active-learning practices will be applied.
Prerequisite knowledge
- Programming and scripting: Python, command line scripting (Linux/Mac). If you’re already comfortable with programming, or are a quick learner, you can take this class. Working in a Jupyter environment is recommended (see the quick environment check below).
- Basic mathematics: linear algebra, statistics and probability, multivariate calculus. Only cursory knowledge is required; however, the basics will not be repeated in class for lack of time.
If interested in an in-depth understanding of Machine Learning, Deep Learning, Numeric Optimization or Statistical Modeling – consider taking a dedicated class on these subjects.
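As a suggested (unofficial) first step, the short snippet below can be run in a Jupyter cell to confirm that Python and TensorFlow are installed and a GPU is visible; the exact versions on the class’s Google Cloud setup may differ.

```python
# Quick, optional environment sanity check (run in a Jupyter cell).
# Assumes TensorFlow is the backend used in the assignments.
import sys

print("Python:", sys.version.split()[0])

try:
    import tensorflow as tf
    print("TensorFlow:", tf.__version__)
    # Works on TF 1.x and 2.x; newer TF prefers tf.config.list_physical_devices("GPU")
    print("GPU available:", tf.test.is_gpu_available())
except ImportError:
    print("TensorFlow is not installed; see the Google Cloud how-to linked above.")
```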
Recommended literature
- Hands-On Machine Learning with Scikit-Learn and TensorFlow, by Aurélien Géron.
- Deep Learning with Python, by François Chollet.
Recommended courses
- Machine learning, by Andrew Ng, Coursera
- Deeplearning.ai
- Dive into Deep Learning
- MIT EdX Machine Learning with Python (Barzilay & Jaakkola)
- MIT CSAIL’s 6.036 Introduction to Machine Learning
- MIT EECS 6.883/6.S083 Modeling with Machine Learning: from Algorithms to Applications
- MIT 6.S191
Related Classes / Initiatives
- MIT EECS IAP 6.S192: Deep Learning AI for Arts, Aesthetics, and Creativity
- A class given in IAP’19 focused on image creation.
- Our class will extend to other media: text, audio, music, 3D, sketch, etc.
- Machine Learning for Artists, Gene Kogan: https://ml4a.github.io/
- A book in progress and a series of talks (a class) given at NYU’s ITP.
- Artists + Machine Intelligence (AMI): https://medium.com/artists-and-machine-intelligence
- An initiative from Google to bring AI tools into the hands of designers.
- Machine Learning for Musicians and Artists: https://www.kadenze.com/courses/machine-learning-for-musicians-and-artists/info
- A class on connecting classical ML models to artistic tools such as MAX/MSP, PD, Processing etc.
- Creative Applications of Deep Learning with TensorFlow: https://www.kadenze.com/courses/creative-applications-of-deep-learning-with-tensorflow/info