12.S592

Fall 2023, MIT: 12.S592  U/G      MIT REGISTRATION   Lec: F. 9-11  Rm: 35-308 and Zoom
Simultaneously offered at Instituto Politécnico Nacional (IPN), Querétaro (via Zoom).
Instructor: Sai Ravela (ravela@mit.edu)

  • Offered since 2017, in earlier form since 2013
  • Dynamics, Optimization, and Information foundations of Machine Learning
  • ML-based System Dynamics and Optimization for Earth, Planets, Climate, and Life
  • Flexible participation model
  • Pre-reqs: Linear Algebra, Probability & Statistics, a first course in ML, and some Systems preparation (at least one of Statistical Signal Processing, Estimation, Control, or Optimization)

Seminar course 12.S592 is offered every term. Its syllabus cycles annually through four primary themes: an almost periodic, ahem, "50 shades of challenging grey matter" spread over approximately 50 weeks. Some participants have stayed with the course for six years!

    Theme By Season

    The Spring theme is Informativeness for Estimation, Control, Learning and Inference.

    Topics:

    1. System Dynamics and Optimization (SDO):
      • Using Models for Prediction and Discovery in Earth, Planets, Climate and Life.
      • Errors in states, model parameters and structure. 
      • Inference using Theory, Analogs, Experts, and Data for prediction and discovery. 
      • Interacting with a complex non-stationary real-world. The System Dynamics and Optimization cycle.
      • The notion of Information Gain. Information principles for efficacy, robustness, reliability, and resilience.
      • Introduction to Co-Active Systems Theory.
    2. Linear Gaussian World: 
      1. Introduction to Optimal Estimation
      2. Introduction to Optimal Control.
      3. Introduction to Gaussian Graphical Models
      4. Building on optimal estimation, we develop basic ideas in organizing data into features, inference on graphs, and learning for information gain. We close the loop for sequential processing through recursive estimation, and examine how the Linear-Gaussian world degrades in a nonlinear, non-Gaussian world with complex structure.
    3. Regularization and Sparsity Promotion
    4. Reproducing Kernel Hilbert Spaces
    5. Towards a Non-Linear World
      1. Ensemble-Approximated Gaussian Processes
      2. Mixtures and Kernels
        1. Expectation Maximization
      3. Sampling the Posterior
        1. Markov Chain Monte Carlo and Hamiltonian Monte Carlo
        2. Particle Filters and Smoothers
    6. Entropy and Measures of Information
      1. Entropy as Sparsity
      2. Tunable Entropy
      3. Conditional Entropy and the Information Bottleneck
    7. Variational Bayes
    8. Bayesian Optimization
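    To ground the Linear-Gaussian material above, here is a minimal sketch of recursive estimation: a scalar Kalman filter tracking a constant state from noisy observations. The model, numbers, and function names are illustrative, not course code.

```python
import numpy as np

# Illustrative sketch: a scalar Kalman filter for the linear-Gaussian
# setting. All models and numbers are made up for the example.

def kalman_filter(ys, A, C, Q, R, x0, P0):
    """Recursively estimate a scalar latent state from noisy observations."""
    x, P = x0, P0
    estimates = []
    for y in ys:
        # Predict: propagate state and uncertainty through the dynamics.
        x_pred = A * x
        P_pred = A * P * A + Q
        # Update: weigh the new observation by the Kalman gain.
        K = P_pred * C / (C * P_pred * C + R)
        x = x_pred + K * (y - C * x_pred)
        P = (1.0 - K * C) * P_pred
        estimates.append(x)
    return estimates

rng = np.random.default_rng(0)
truth = 1.0
ys = truth + rng.normal(0.0, 0.5, size=50)  # noisy observations of a constant
est = kalman_filter(ys, A=1.0, C=1.0, Q=0.0, R=0.25, x0=0.0, P0=1.0)
```

    Each pass through the loop closes the predict-update cycle; with static dynamics (Q = 0) the estimate converges toward the sample mean of the observations.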

     

    In Summer 2024, we will study the "backends" of interesting learning machines, including effective ways to train them, with applications to Earth, Planets, Climate, and Life:

    • Convolutional Neural Networks and Residual Networks
    • Long Short-Term Memory
    • Encoder-Decoder Machines
    • Generative Adversarial Learning
    • Stable Diffusion
    • Graph Neural Networks
    • Transformers
    • Neural Dynamical Systems
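    As a small taste of these backends, the residual-network idea reduces to a block computing y = x + F(x), so the layers learn a correction to the identity map. A minimal sketch with random placeholder weights (not a trained model):

```python
import numpy as np

# Illustrative residual block: y = x + F(x), with F a small
# linear -> ReLU -> linear transform. Weights are random placeholders.

rng = np.random.default_rng(1)
d = 8
W1 = rng.normal(0.0, 0.1, (d, d))
W2 = rng.normal(0.0, 0.1, (d, d))

def residual_block(x):
    # The skip connection adds x back, so the block only needs to
    # learn a (typically small) correction to the identity map.
    h = np.maximum(0.0, W1 @ x)  # ReLU nonlinearity
    return x + W2 @ h

x = rng.normal(size=d)
y = residual_block(x)
```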
       
    1. Earth, Planet, Climate and Life Applications and Associated Systems Problems
      • Prototype Application: A Planet-scale Observatory in a Changing Climate.
      • Problems: Reduced and Induced Models, Uncertainty Quantification, Planning, Detection, Estimation and Control.
    2. Opportunities 
      • Learning in Physics -- what opportunities for learning in Earth, Planets, Climate, and Life exist? 
      • Learning from Physics and Data -- Can physics train machines to be more effective in handling the data? How can we combine the two?
      • The Physics of Learning -- How to understand Learning as a process?
      • Learning the Physics -- Discovering Equations from Data
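    "Learning the Physics" can be illustrated with a SINDy-style sparse regression: fit a library of candidate terms to derivative data, then threshold away the small coefficients. A sketch on synthetic data (the system, library, and threshold are illustrative):

```python
import numpy as np

# Illustrative sketch of discovering equations from data: recover
# dx/dt = -2x by sparse regression over candidate terms.

rng = np.random.default_rng(2)
x = rng.uniform(-1.0, 1.0, 200)
dxdt = -2.0 * x + rng.normal(0.0, 1e-3, 200)  # noisy derivative "measurements"

# Candidate library of terms: [1, x, x^2, x^3]
Theta = np.column_stack([np.ones_like(x), x, x**2, x**3])

# Least squares, then hard-threshold small coefficients to promote sparsity.
coef, *_ = np.linalg.lstsq(Theta, dxdt, rcond=None)
coef[np.abs(coef) < 0.1] = 0.0
```

    The surviving coefficient identifies the governing term, dx/dt ≈ -2x.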

    Winter is meant to deal with "unsupervised" problems (no pun intended), and to prepare ourselves in terms of techniques. In Winter 2024, we will look at the following topics:

    • Trees and Hashes
    • Principal and Independent Components
    • Mixtures in the Exponential Family
    • Kernels and Reproducing Kernel Hilbert Spaces, including
      • the Moore-Aronszajn Theorem
      • the Riesz Representation Theorem
      • Mercer's Theorem
    • Kernel Machines -- e.g., the Gaussian Process
    • Kernels and Embeddings
    • Graph Laplacians (and other properties)
    • Manifolds
    • Co-Active Data Organization
    • Some Applied Examples: developing simple models for reproducing the behavior of a flock of birds, a peloton, an airplane, the Gulf Stream, a cyclone, or an atmospheric river.
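    As one concrete kernel machine from the list above, here is a minimal Gaussian-process regression sketch with an RBF kernel; the posterior mean is an expansion in kernel evaluations, as the representer theorem suggests. The data and hyperparameters are illustrative.

```python
import numpy as np

# Illustrative Gaussian-process regression with an RBF kernel.

def rbf_kernel(A, B, lengthscale=0.5):
    # k(a, b) = exp(-||a - b||^2 / (2 * lengthscale^2))
    d2 = (A[:, None] - B[None, :]) ** 2
    return np.exp(-d2 / (2.0 * lengthscale**2))

rng = np.random.default_rng(3)
X = np.linspace(0.0, 2.0 * np.pi, 30)
y = np.sin(X) + rng.normal(0.0, 0.05, X.shape)  # noisy samples of sin(x)

noise = 0.05**2
K = rbf_kernel(X, X) + noise * np.eye(len(X))
alpha = np.linalg.solve(K, y)  # weights of the kernel expansion

X_test = np.array([np.pi / 2])
y_pred = rbf_kernel(X_test, X) @ alpha  # posterior mean at the test point
```

    The predicted mean at π/2 should land close to sin(π/2) = 1, with the lengthscale controlling how far each training point's influence extends.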