VAEs and Normalizing Flows: An Introduction to Modeling High-Dimensional Data with Deep Learning

Of great interest to research and industry is the ability to model and simulate very high-dimensional data, such as images, audio or text. We provide an introduction to a powerful and general set of techniques for high-dimensional modeling that simultaneously allows for efficient learning, inference and synthesis. We introduce the framework of variational autoencoders (VAEs), which uses amortized variational inference to efficiently learn deep latent-variable models. We also introduce Normalizing Flows (NFs), an equally useful and closely related technique. NFs allow us to improve variational inference, and can even remove the need for variational inference entirely. We explain common methods such as NICE, IAF, and Glow.
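As a rough illustration of how a flow yields an exact likelihood, the sketch below implements a single affine coupling layer in the spirit of NICE and its successors. It is not taken from the lectures: the one-layer tanh conditioner, the parameter names `w` and `b`, and the standard-normal base distribution are all assumptions chosen to keep the example short.

```python
import numpy as np

def coupling_forward(x, w, b):
    """Affine coupling: the first half of x passes through unchanged and
    parameterizes a log-scale and shift applied to the second half."""
    d = x.shape[-1] // 2
    x1, x2 = x[..., :d], x[..., d:]
    h = np.tanh(x1 @ w + b)              # hypothetical one-layer conditioner
    log_s, t = h[..., :d], h[..., d:]
    z2 = x2 * np.exp(log_s) + t
    z = np.concatenate([x1, z2], axis=-1)
    log_det = log_s.sum(axis=-1)         # Jacobian is triangular, so log|det| is a sum
    return z, log_det

def coupling_inverse(z, w, b):
    """Exact inverse of coupling_forward; used for sampling (synthesis)."""
    d = z.shape[-1] // 2
    z1, z2 = z[..., :d], z[..., d:]
    h = np.tanh(z1 @ w + b)
    log_s, t = h[..., :d], h[..., d:]
    x2 = (z2 - t) * np.exp(-log_s)
    return np.concatenate([z1, x2], axis=-1)

def log_likelihood(x, w, b):
    """Change of variables: log p(x) = log N(z; 0, I) + log|det dz/dx|."""
    z, log_det = coupling_forward(x, w, b)
    dim = z.shape[-1]
    log_pz = -0.5 * (z ** 2).sum(axis=-1) - 0.5 * dim * np.log(2 * np.pi)
    return log_pz + log_det
```

Here `x` has an even number of features `2d`, `w` has shape `(d, 2d)` and `b` has shape `(2d,)`. Practical flows stack many such layers and alternate which half is transformed; this single layer only shows why both the inverse and the log-determinant are cheap to compute.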

Flow Contrastive Estimation: An (Un?)-Holy Trinity of Energy-Based Models, Likelihood-Based Models and GANs

Learning generative models with tractable likelihoods is fairly straightforward, as we have shown. But what about learning energy-based models, with intractable partition functions? Few methods exist that scale to high-dimensional data. To this end we introduce Flow Contrastive Estimation (FCE), a new method for estimating energy-based models. FCE was primarily conceived as a version of noise-contrastive estimation (NCE) with an adaptive noise distribution, which makes it scale well to high-dimensional data. Secondarily, FCE is also a method for optimizing likelihood-based generative models w.r.t. the Jensen-Shannon divergence, as an alternative to the usual Kullback-Leibler divergence. Lastly, we show that the FCE method is also a special case of the GAN method, where the generator is given by a flow-based model, and the discriminator is parameterized by contrasting the likelihoods of an energy-based model and the flow-based model.
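The discriminator described above can be sketched as a logistic classifier whose logit is the difference between the two models' log-densities. The code below is a minimal illustration under stated assumptions, not the lecturer's implementation: the function name `fce_value`, the equal weighting of data and noise samples, and the convention that the energy-based model's log-density already includes a learned log-normalizer (as is common in NCE) are all choices made here for illustration.

```python
import numpy as np

def log_sigmoid(u):
    """Numerically stable log(1 / (1 + exp(-u)))."""
    return -np.logaddexp(0.0, -u)

def fce_value(log_p_data, log_q_data, log_p_noise, log_q_noise):
    """GAN-style value function with discriminator D(x) = p(x) / (p(x) + q(x)).

    log_p_*: log-density under the energy-based model (assumed to include
             a learned log-normalizer), evaluated on data / flow samples.
    log_q_*: exact log-density under the flow, evaluated on the same points.
    """
    logit_data = log_p_data - log_q_data      # log D(x) = log_sigmoid(logit)
    logit_noise = log_p_noise - log_q_noise   # log(1 - D(x)) = log_sigmoid(-logit)
    return log_sigmoid(logit_data).mean() + log_sigmoid(-logit_noise).mean()
```

In this sketch the energy-based model would be updated to increase the value and the flow to decrease it. When the two models assign identical densities everywhere, the discriminator outputs 1/2 and the value equals -log 4, the familiar GAN equilibrium.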

Finding Deeply Hidden Truths: Breakthroughs in Nonlinear Identifiability Theory

Is principled disentanglement possible? Equivalently, do nonlinear models converge to a unique set of representations when given sufficient data and compute? We provide an introduction to this problem, and present some surprising recent theoretical results that answer this question in the affirmative. In particular, we show that the representations learned by a very broad family of neural networks are identifiable up to only a trivial transformation. The family of models for which we derive strong identifiability results includes a large fraction of models in use today, including supervised models, self-supervised models, flow-based models, VAEs and energy-based models.

Lecture 1: Evolving Neural Networks for POMDP Tasks

Abstract TBA

Lecture 2: Evolutionary Neural Architecture Search

Abstract TBA

Lecture 1: Beyond Backpropagation: Cognitive Architectures for Object Recognition in Video – Requisites for a Cognitive Architecture

Lecture I – Requisites for a Cognitive Architecture

  • Processing in space
  • Processing in time with memory
  • Top-down and bottom-up processing
  • Extraction of information from data with generative models
  • Attention mechanisms and foveal vision

Lecture 2: Beyond Backpropagation: Cognitive Architectures for Object Recognition in Video – Putting it all together

Lecture II – Putting it all together

  • Empirical Bayes with generative models
  • Clustering of time series with linear state models
  • Information Theoretic Autoencoders

Lecture 3: Beyond Backpropagation: Cognitive Architectures for Object Recognition in Video – Modular Learning for Deep Networks

Lecture III – Beyond Backpropagation: Modular Learning for Deep Networks

  • Reinterpretation of neural network layers
  • Training each layer without backpropagation
  • Examples and advantages in transfer learning

Lecture 1: Machine Learning for Medicine – a new research frontier


Lecture 2: Causal Inference and Estimating Individualized Treatment Effects


Lecture 3: From Black Boxes to White Boxes: Machine Learning Interpretability, Explainability and Trustworthiness