MAS S68 F’19 | Computer Visions: Generative Machine Learning Tools for Creative Applications
About
Machine learning is transforming our reality. Today’s models learn deep representations of their input domains and reach beyond-human performance on a growing list of tasks. While learning to predict, many ML models construct a view of their input worlds that allows them to also generate new data – they are generative models. In this fast-paced class, students will use generative ML models to “paint” and “sketch”, “write” a poem or a whole (fake-)news article, transfer visual artistic style, hallucinate structure out of noise, generate 3D models, and “compose” music. The class will lightly cover topics in applied mathematics for machine learning but will focus on hands-on, practical programmatic methods for implementing computer visions in Python. Students will learn how machine learning models are used to generate information in multiple media (text, sound, image, 3D geometry), and by the end of the course will be able to apply these tools to their own domain of interest. The course is designed for people without prior experience in deep learning, but it can also benefit advanced students looking for an overview of the latest generative methods.
Logistics
Location: E15-359
Time: Thursdays from 1-3pm, on a once-every-two-weeks schedule.
Instructors: Prof. Pattie Maes, Dr. Roy Shilkrot, Guillermo Bernal
Units: 0-6-0
Using Google Cloud: https://courses.media.mit.edu/2019fall/mass68/howto-using-google-cloud/
Syllabus | Topics Covered
- Deep learning mechanics: ML concepts (model, classification, regression, supervision, train-test, loss, overfitting, regularization), linear models, neural nets, back-propagation optimization, convolution, recurrence, attention, transformers
- Deep learning practicalities: TensorFlow-Keras, working with data, inference environment stack (CUDA, Docker, Jupyter), retraining and transfer learning
- Generative model patterns: encoder-decoders, autoencoders, adversarial nets (a minimal code sketch follows this list)
- Significant ML models of interest: VAEs, CNNs (VGG, Inception, ResNeXt, DenseNet, WaveNet, Pixel), RNNs (Char, DRAW, Sketch, Melody, ELMo), GANs (DC, VAE, C/Info, Cycle, Big, Style, Pro, pix2pix, Gau, Stacked, 3D, …), Transformers (BERT, GPT-1/2)
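To make the generative-model patterns above concrete, here is a minimal autoencoder sketch in TensorFlow-Keras. It is illustrative only; the dataset (MNIST), layer sizes, and training settings are assumptions, not the class notebooks’ actual code. An encoder compresses each digit image into a small latent vector, a decoder reconstructs it, and decoding random latent vectors yields rough new samples.

```python
# A minimal autoencoder sketch in TensorFlow-Keras (illustrative, not the
# official class code). Dataset and layer sizes are assumptions.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

latent_dim = 32  # size of the compressed (latent) representation

# Encoder: compress a flattened 28x28 image (784 values) into a latent vector
encoder = models.Sequential([
    layers.Dense(128, activation="relu", input_shape=(784,)),
    layers.Dense(latent_dim, activation="relu"),
])

# Decoder: reconstruct the 784-value image from the latent vector
decoder = models.Sequential([
    layers.Dense(128, activation="relu", input_shape=(latent_dim,)),
    layers.Dense(784, activation="sigmoid"),
])

# Autoencoder = encoder followed by decoder, trained to reproduce its input
autoencoder = models.Sequential([encoder, decoder])
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")

# Load MNIST, scale pixels to [0, 1], and flatten the images
(x_train, _), (x_test, _) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
x_test = x_test.reshape(-1, 784).astype("float32") / 255.0

autoencoder.fit(x_train, x_train, epochs=5, batch_size=256,
                validation_data=(x_test, x_test))

# "Generate": decode random latent vectors. A plain autoencoder's latent space
# is not regularized, so samples are rough; this is the gap VAEs close.
samples = decoder.predict(np.random.uniform(size=(4, latent_dim)).astype("float32"))
print(samples.shape)  # (4, 784)
```

The class assignments build on the same encoder-decoder skeleton with richer architectures (convolutional encoders, VAEs, GANs); this sketch only shows the moving parts.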
Timeline
# | Date | Topics | Assignment | Slides | Video
1 | 9/5 | Intro, Machine & Deep Learning, CNNs | ImageNet: Style Transfer, Deep Dream, Neural Doodle | Link | Link
2 | 9/19 | Generative models, VAEs, GANs | VAE, DCGAN, C/Info GAN, BigGAN, Cycle | Link | Link
3 | 10/3 | Text: RNNs, Attention, Transformers | Word2Vec, BERT, GPT, ELMo, Txt2img | Link | Link
4 | 10/17 | Quick Draw!, DRAW, Sketch-RNN | Sketch, draw, paint | Link | Link
5 | 10/31 | Music, audio, representation, melody, rhythm | WaveNet, Melody RNN | Link |
6 | 11/14 | 3D models, transfer learning | Image to 3D, 3D GAN | Link |
7 | 12/5 | Project presentations | | |
Assignments and Grading
Each week a home assignment will be given in the form of a Jupyter notebook. The notebook will contain code that follows that week’s class, as well as open segments where students run their own code and tweak parameters to generate new artifacts. Students will be encouraged to post their successful creations on the class website. Assignments count toward the final grade, and feedback will be given.
The class will have a final project centered on the student’s domain of interest, applying the tools introduced in and out of class. Instructors will provide starting points, data, and help finding a suitable project. Projects may be done individually or in groups of two or three students.
Final grades will be given after project submissions are evaluated. Project grading criteria: creative value, technical contribution, and academic contribution.
Three special awards will be given to extraordinary teams to earn extra credit:
- The Transfer-Learning award – given to students who successfully apply a model pre-trained in one domain to generate output in a different domain (e.g. a text model to create images, or vice versa).
- The C-AI-borg award – given to students who demonstrate that their generative model produces human-level outputs or otherwise surprising capabilities.
- The Hinton-LeCun-Bengio award – given to students who train their model from scratch and demonstrate its utility towards generation.
Grading scheme
- Breakdown:
- Assignments: 50%
- Final project: 50%
- Extra credit: 5%
- Letter grading policy: http://catalog.mit.edu/mit/procedures/academic-performance-grades/#gradestext
Class Structure
The class meets for two hours once every two weeks, with biweekly readings and home assignments.
Class participation is encouraged, and active-learning practices will be applied.
Prerequisite knowledge
- Programming and scripting: Python, command line scripting (Linux/Mac). If you’re already comfortable with programming, or are a quick learner, you can take this class. Working in a Jupyter environment is recommended (see the quick environment check below).
- Basic mathematics: linear algebra, statistics and probability, multivariate calculus. Only cursory knowledge is required; however, the basics will not be repeated in class for lack of time.
If interested in an in-depth understanding of Machine Learning, Deep Learning, Numeric Optimization or Statistical Modeling – consider taking a dedicated class on these subjects.
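As a suggested (unofficial) first step, the short snippet below can be run in a Jupyter cell to confirm that Python and TensorFlow are installed and a GPU is visible; the exact versions on the class’s Google Cloud setup may differ.

```python
# Quick, optional environment sanity check (run in a Jupyter cell).
# Assumes TensorFlow is the backend used in the assignments.
import sys

print("Python:", sys.version.split()[0])

try:
    import tensorflow as tf
    print("TensorFlow:", tf.__version__)
    # Works on TF 1.x and 2.x; newer TF prefers tf.config.list_physical_devices("GPU")
    print("GPU available:", tf.test.is_gpu_available())
except ImportError:
    print("TensorFlow is not installed; see the Google Cloud how-to linked above.")
```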
Recommended literature
- Hands-On Machine Learning with Scikit-Learn and TensorFlow, by Aurélien Géron.
- Deep Learning with Python, by François Chollet.
Recommended courses
- Machine learning, by Andrew Ng, Coursera
- Deeplearning.ai
- Dive into Deep Learning
- MIT EdX Machine Learning with Python (Barzilay & Jaakkola)
- MIT CSAIL’s 6.036 Introduction to Machine Learning
- MIT EECS 6.883/6.S083 Modeling with Machine Learning: from Algorithms to Applications
- MIT 6.S191
Related Classes / Initiatives
- MIT EECS IAP 6.S192: Deep Learning AI for Arts, Aesthetics, and Creativity
- A class given in IAP’19 focused on image creation.
- Our class will extend to other media: text, audio, music, 3D, sketch, etc.
- Machine Learning for Artists, Gene Kogan: https://ml4a.github.io/
- A book in progress and a series of talks (a class) given at NYU’s ITP.
- Artists + Machine Intelligence (AMI): https://medium.com/artists-and-machine-intelligence
- An initiative from Google to bring AI tools into the hands of designers.
- Machine Learning for Musicians and Artists: https://www.kadenze.com/courses/machine-learning-for-musicians-and-artists/info
- A class on connecting classical ML models to artistic tools such as MAX/MSP, PD, Processing etc.
- Creative Applications of Deep Learning with TensorFlow: https://www.kadenze.com/courses/creative-applications-of-deep-learning-with-tensorflow/info