##### Lecture 1: An Introduction to Deep Reinforcement Learning

Reinforcement Learning (RL) is a subfield of machine learning concerned with training agents to make decisions that maximize a cumulative reward signal in an environment. It provides a highly general and elegant toolbox for building intelligent systems that learn by interacting with their environment rather than from supervision. In the past few years, RL in combination with deep neural networks has shown impressive results on a variety of challenging domains, from games to robotics. It is also seen by some as a possible path towards general human-level intelligent systems. I will explain some of the basic algorithms the field is based on (Q-Learning, Policy Gradients), as well as a few extensions to these algorithms that are used in practice (PPO, IMPALA, and others).
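As a taste of the tabular setting, the Q-Learning update can be sketched in a few lines of Python. The two-state toy environment below is invented for this sketch, not an example from the lecture:

```python
import random

# Tabular Q-learning on a hypothetical two-state chain.
# From state 0, action 1 moves to state 1; from state 1, action 1
# reaches the goal (reward 1). Any other action resets to state 0.
def step(state, action):
    if state == 0 and action == 1:
        return 1, 0.0, False
    if state == 1 and action == 1:
        return 1, 1.0, True
    return 0, 0.0, False

alpha, gamma, eps = 0.5, 0.9, 0.1   # learning rate, discount, exploration
Q = [[0.0, 0.0], [0.0, 0.0]]        # Q[state][action]

random.seed(0)
for _episode in range(200):
    s = 0
    for _t in range(20):
        # epsilon-greedy action selection
        a = random.randrange(2) if random.random() < eps else max((0, 1), key=lambda i: Q[s][i])
        s2, r, done = step(s, a)
        # Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s',a'),
        # with no bootstrapping from terminal states
        target = r + (0.0 if done else gamma * max(Q[s2]))
        Q[s][a] += alpha * (target - Q[s][a])
        s = s2
        if done:
            break
```

After training, the learned values prefer action 1 in both states, recovering the optimal policy for this toy chain.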

##### Lecture 2: Milestones in Large-scale Reinforcement Learning: AlphaZero, OpenAI Five and AlphaStar

Over the past few years, we have seen a number of successes in Deep Reinforcement Learning: among other results, RL agents have been able to match or exceed the strength of the best human players at the games of Go, Dota 2 and StarCraft II. These were achieved by AlphaZero, OpenAI Five and AlphaStar, respectively. I will go into the details of how these three systems work, highlighting similarities and differences. What lessons can we draw from these results, and what is still missing to apply Deep RL to challenging real-world problems?

##### Lecture 3 – Tutorial: JAX, A new library for building neural networks

JAX is a new framework for deep learning developed at Google AI. Written by the authors of the popular autograd library, it is built on the concept of function transformations: higher-order functions like `grad`, `vmap`, `jit`, `pmap` and others are powerful tools that allow researchers to express ML algorithms succinctly and correctly, while making full use of hardware resources like GPUs and TPUs. Most importantly, solving problems with JAX is fun! I will give a short introduction to JAX, covering the most important function transformations and demonstrating how to apply JAX to several ML problems.
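A minimal, self-contained taste of these transformations might look as follows (a generic sketch, not code from the tutorial itself):

```python
import jax
import jax.numpy as jnp

# grad: differentiate a scalar-valued Python function.
def loss(w):
    return jnp.sum(w ** 2)

g = jax.grad(loss)(jnp.array([1.0, 2.0, 3.0]))   # gradient of sum(w^2) is 2w

# vmap: vectorize a per-example function over a batch dimension.
def predict(w, x):
    return jnp.dot(w, x)

batched = jax.vmap(predict, in_axes=(None, 0))   # share w, map over rows of x
ys = batched(jnp.ones(3), jnp.arange(6.0).reshape(2, 3))

# jit: compile the whole computation with XLA.
fast_loss = jax.jit(loss)
```

Because each transformation returns an ordinary Python function, they compose freely, e.g. `jax.jit(jax.vmap(jax.grad(loss)))`.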

##### Lecture 1: Autoencoders and Deep Learning

TBA

##### Lecture 2: Deep Learning in the Physical Sciences

Abstract TBA

##### Lecture 3: Deep Learning in the Life Sciences

TBA

##### Lecture 1: Introduction to the Value of Information Theory

Data analysis allows us to create models of and extract information about various phenomena (e.g. recognition, classification, or prediction of some objects or events). What is the value of this information? Ideally, information should improve the quality of decisions, which manifests itself in a reduction of errors or an increase of expected utility. The maximum possible 'improvement' represents the value of information. The corresponding mathematical theory originated in the works of Claude Shannon on rate distortion and was later developed in the 1960s by Ruslan Stratonovich and his colleagues. In this lecture, I will recall the solution of the maximum entropy problem with a linear constraint. This will outline the main mathematical ideas leading to other variational problems with constraints on different types of information (namely, of the Hartley, Boltzmann and Shannon types). The optimal values of these problems are the values of the different types of information. The value of Shannon's information is particularly interesting, because in many cases it has a nice analytical solution, and it provides a theoretical upper bound for the values of all other types of information. Geometric analysis of solutions to problems on the values of information of different types gives interesting insights into the role of randomization in various learning and optimization algorithms.
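For orientation, the maximum entropy problem with a linear constraint that the lecture starts from can be stated as follows (standard notation, assumed here rather than taken from the lecture):

```latex
\max_{p}\; H[p] = -\sum_x p(x)\,\ln p(x)
\qquad \text{subject to} \qquad
\sum_x p(x)\,u(x) = \bar{u}, \qquad \sum_x p(x) = 1,
```

whose solution, obtained by introducing a Lagrange multiplier $\beta$ for the linear constraint, is the exponential (Gibbs) distribution

```latex
p^{*}(x) = \frac{e^{\beta u(x)}}{Z(\beta)},
\qquad Z(\beta) = \sum_x e^{\beta u(x)} .
```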

##### Lecture 2: Applications of the Value of Information: Graphs, Evolutionary and Learning Algorithms

I will show that power-law graphs are solutions to the maximum entropy problem, and that the preferential attachment procedure generating such graphs can be derived as a solution to the dual problem of path-length minimization with constraints on information, which is the value of information problem. Another application is the optimal control of mutation rates in evolutionary algorithms. One interesting solution to this problem can be obtained by maximizing the expected fitness of a population subject to constraints on information divergence. In the end, I will discuss how optimal learning can be defined as a dynamical generalization of the value of information problem.
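The preferential-attachment procedure mentioned above can be sketched as follows (a generic, Barabási–Albert-style sketch; the function name and parameters are illustrative, not the lecture's own derivation):

```python
import random

def preferential_attachment(n, seed=0):
    """Grow a graph in which each new node attaches to an existing node
    with probability proportional to that node's current degree."""
    rng = random.Random(seed)
    edges = [(0, 1)]          # start from a single edge
    # Each node appears in this list once per unit of degree, so sampling
    # uniformly from it is exactly degree-proportional sampling.
    endpoints = [0, 1]
    for new in range(2, n):
        target = rng.choice(endpoints)
        edges.append((new, target))
        endpoints.extend([new, target])
    return edges

edges = preferential_attachment(1000)
```

The resulting degree distribution is heavy-tailed: a few early nodes accumulate large degree, which is the power-law behaviour the maximum entropy analysis explains.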

##### Lecture 3: Tutorial on Quantum Probability

The interest in quantum computing has resulted in a greater penetration of ideas from quantum physics into broader areas of science, including not only computer science, but also the social sciences and even psychology. Yet many results and much of the notation of quantum physics and quantum information remain alien and difficult to understand. This talk is intended for those familiar with the basic ideas of classical probability theory, and it will summarize the main facts about its non-commutative (i.e. quantum) generalization. If time allows, I will also mention some curious facts about the quantum-classical interface.
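A minimal illustration of the non-commutative setting (a generic example, not taken from the talk): classical random variables commute, while quantum observables generally do not, and expectations are computed as traces against a density matrix, the quantum analogue of a probability distribution.

```python
import numpy as np

# Pauli matrices: two observables that do not commute.
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# A density matrix: unit trace, positive semidefinite.
# Here the pure state |0><0|.
rho = np.array([[1, 0], [0, 0]], dtype=complex)

commutator = X @ Z - Z @ X          # nonzero: X and Z do not commute
exp_Z = np.trace(rho @ Z).real      # expectation <Z> = Tr(rho Z)
exp_X = np.trace(rho @ X).real      # expectation <X> = Tr(rho X)
```

In the state `rho`, `Z` has expectation 1 while `X` has expectation 0, even though both are valid observables of the same system.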

##### Lecture: Geometric deep learning: history, successes, promises, and challenges

In the past decade, deep learning methods have achieved unprecedented performance on a broad range of problems in various fields, from computer vision to speech recognition. So far, research has mainly focused on developing deep learning methods for Euclidean-structured data. However, many important applications have to deal with non-Euclidean structured data, such as graphs and manifolds. Such data are becoming increasingly important in computer graphics and 3D vision, sensor networks, drug design, biomedicine, high energy physics, recommendation systems, and social media analysis. The adoption of deep learning in these fields lagged behind until recently, primarily because the non-Euclidean nature of the objects involved makes the very definition of the basic operations used in deep networks rather elusive. In this talk, I will introduce the emerging field of geometric deep learning on graphs and manifolds, give an overview of existing solutions, and outline the key difficulties and future research directions. As examples of applications, I will show problems from the domains of computer vision, graphics, medical imaging, and protein science.

##### Tutorial 1: From grids to graphs

Fundamental challenges of building deep learning architectures for non-Euclidean structured data:

- Signal processing perspective
- Convolutions in spectral and spatial domains
- Basic recipes for building graph neural networks
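One common spatial-domain recipe can be sketched as a single graph-convolution layer (a generic sketch in the style of Kipf and Welling's GCN, written with NumPy; the toy graph and variable names are invented here, not part of the tutorial):

```python
import numpy as np

def gcn_layer(A, H, W):
    """One spatial graph-convolution layer: symmetrically normalize the
    adjacency (with self-loops), aggregate neighbor features, then apply
    a linear map and a ReLU nonlinearity."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(d ** -0.5)
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt  # D^{-1/2} (A + I) D^{-1/2}
    return np.maximum(A_norm @ H @ W, 0.0)    # aggregate, transform, ReLU

# A three-node path graph with two-dimensional node features.
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
H = np.eye(3, 2)          # toy node features
W = np.ones((2, 2))       # toy weight matrix
out = gcn_layer(A, H, W)
```

Stacking such layers propagates information over increasingly large graph neighborhoods, playing the role that stacked convolutions play on grids.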

##### Tutorial 2: Theory and practice

##### Tutorial 3: Manifolds, meshes, and point clouds

##### Lecture 1

TBA

##### Lecture 2

TBA

##### Lecture 1: A Constraint-based approach to learning and reasoning

TBA

##### Lecture 2

TBA

##### Finding Deeply Hidden Truths: Breakthroughs in Nonlinear Identifiability Theory

Is principled disentanglement possible? Equivalently, do nonlinear models converge to a unique set of representations when given sufficient data and compute? We provide an introduction to this problem, and present some surprising recent theoretical results that answer this question in the affirmative. In particular, we show that the representations learned by a very broad family of neural networks are identifiable up to only a trivial transformation. The family of models for which we derive strong identifiability results includes a large fraction of models in use today, including supervised models, self-supervised models, flow-based models, VAEs and energy-based models.

##### Lecture 1: Evolving Neural Networks for POMDP Tasks

TBA

##### Lecture 2: Evolutionary Neural Architecture Search

Abstract TBA

##### Lecture 3: Evolutionary Surrogate-Assisted Optimization

Abstract TBA

##### Lecture 1: Beyond Backpropagation: Cognitive Architectures for Object Recognition in Video – Requisites for a Cognitive Architecture

- Processing in space
- Processing in time with memory
- Top-down and bottom-up processing
- Extraction of information from data with generative models
- Attention mechanisms and foveal vision

##### Lecture 2: Beyond Backpropagation: Cognitive Architectures for Object Recognition in Video – Putting it all together


- Empirical Bayes with generative models
- Clustering of time series with linear state models
- Information Theoretic Autoencoders

##### Lecture 3: Beyond Backpropagation: Cognitive Architectures for Object Recognition in Video – Modular Learning for Deep Networks


- Reinterpretation of neural network layers
- Training each layer without backpropagation
- Examples and advantages in transfer learning

##### Lecture 1: Bayesian hierarchical models for single-cell ‘omics – Foundations and problem description

TBA

##### Lecture 2: Hierarchical models for gene expression in single cells

TBA

##### Lecture 3: Single-cell epigenetics and multi-omics

TBA

##### Lecture 1: Machine Learning for Medicine – a new research frontier

TBA

##### Lecture 2: Causal Inference and Estimating Individualized Treatment Effects

TBA

##### Lecture 3: From Black Boxes to White Boxes: Machine Learning Interpretability, Explainability and Trustworthiness

TBA