References

The main reference for the first half of the course is Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville, freely available online.

Additional colabs

Playing with parity: Here we train a simple neural net to learn the parity function, and look at how well it generalizes.
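
A minimal PyTorch version of this experiment might look as follows; the architecture, bit-length, and training details are illustrative guesses, not the colab's actual code:

```python
# Train a small MLP to predict the parity of a random bitstring.
# All hyperparameters here are illustrative, not those of the colab.
import torch
import torch.nn as nn

n = 8                                         # number of input bits
X = torch.randint(0, 2, (2048, n)).float()    # random training bitstrings
y = X.sum(dim=1).remainder(2).long()          # parity = sum of bits mod 2

net = nn.Sequential(nn.Linear(n, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(500):                       # full-batch gradient descent
    opt.zero_grad()
    loss = loss_fn(net(X), y)
    loss.backward()
    opt.step()

# Accuracy on fresh bitstrings (for small n these can overlap with the
# training patterns, which is exactly the generalisation subtlety at play)
Xt = torch.randint(0, 2, (1024, n)).float()
yt = Xt.sum(dim=1).remainder(2).long()
print((net(Xt).argmax(dim=1) == yt).float().mean().item())
```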

Classifying descents in Sₙ: Here we train a simple neural network to classify the descent set of a permutation.
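
Recall that the descent set of w ∈ Sₙ is {i : w(i) > w(i+1)}; these sets are the classification targets. A one-function illustration of the labelling (the network itself lives in the colab):

```python
# Compute the descent set of a permutation given in one-line notation.
def descent_set(w):
    """Positions i (1-indexed) with w(i) > w(i+1)."""
    return {i + 1 for i in range(len(w) - 1) if w[i] > w[i + 1]}

print(descent_set((3, 1, 4, 2)))   # {1, 3}: descents at positions 1 and 3
```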

Weekly overview

Week 1, Geordie Williamson

Basics of Machine Learning

Classic problems in machine learning, kernel methods, deep neural networks, supervised learning, and basic examples.

Week 2, Joel Gibson

What neural networks can and can't do

Universal approximation theorem and convolutional neural networks

Lecture links: TensorFlow playground, approximation by bumps/ridges, convolution filters
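
The "approximation by bumps/ridges" idea fits in a few lines: a hat (tent) function is a combination of three ReLUs, and a sum of hats scaled by sample values is the piecewise-linear interpolant of a continuous function, which converges uniformly as the grid refines. A sketch, with target function and grid chosen purely for illustration:

```python
# One-dimensional universal approximation via "bumps" built from ReLUs.
import numpy as np

relu = lambda t: np.maximum(t, 0.0)

def tent(x, c, width):
    """Hat function: 1 at c, 0 outside (c - width, c + width); three ReLUs."""
    return (relu(x - c + width) - 2 * relu(x - c) + relu(x - c - width)) / width

x = np.linspace(0.0, 1.0, 1000)
target = np.sin(2 * np.pi * x)                 # any continuous function works
nodes = np.linspace(0.0, 1.0, 21)
width = nodes[1] - nodes[0]

# A sum of scaled tents is the piecewise-linear interpolant of the target
approx = sum(np.sin(2 * np.pi * c) * tent(x, c, width) for c in nodes)
print(np.abs(target - approx).max())           # shrinks as the grid refines
```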

Week 3, Georg Gottwald

How to think about machine learning

Borrowing from statistical mechanics, dynamical systems and numerical analysis to better understand deep learning.

Week 4, Georg Gottwald

Regularisation

Week 4, Joel Gibson

Recurrent Neural Nets

Week 5, Geordie Williamson

Geometric Deep Learning; or never underestimate symmetry

Week 6, Georg Gottwald

Geometric Deep Learning II

Week 6, Geordie Williamson

Saliency and Combinatorial Invariance

Week 7, Adam Zsolt Wagner

A simple RL setup to find counterexamples to conjectures in mathematics

Abstract: In this talk we will leverage a reinforcement learning method, specifically the cross-entropy method, to search for counterexamples to several conjectures in graph theory and combinatorics. We will present a very simplistic setup, in which only minimal changes need to be made (namely the reward function used for RL) in order to successfully attack a wide variety of problems. As a result, we will resolve several open problems, and find more elegant counterexamples to previously disproved ones.
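
In its simplest form the cross-entropy method needs no neural network: maintain a sampling distribution over candidate constructions, score a batch, keep the highest-reward "elite" samples, and refit the distribution to them. The sketch below uses an independent-Bernoulli distribution over bitstrings and a placeholder reward; the talk's setup uses a neural network policy and graph-theoretic rewards, so everything here is illustrative:

```python
# Cross-entropy method over bitstrings (e.g. adjacency vectors of a graph).
import numpy as np

N = 20                                  # length of a candidate construction
def reward(x):                          # placeholder: swap in a conjecture's score
    return x.sum(axis=1)

rng = np.random.default_rng(0)
p = np.full(N, 0.5)                     # independent-Bernoulli distribution
for step in range(100):
    batch = (rng.random((512, N)) < p).astype(float)   # sample candidates
    scores = reward(batch)
    elite = batch[scores >= np.quantile(scores, 0.9)]  # keep the top 10%
    p = 0.7 * elite.mean(axis=0) + 0.3 * p             # smoothed CE update
print(p.round(2))                       # concentrates on high-reward strings
```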

Related paper:  Constructions in combinatorics via neural networks.

Week 8, Bamdad Hosseini

Perspectives on graphical semi-supervised learning [no recording, see lecture notes]

Abstract: Semi-supervised learning (SSL) is the problem of extending label information from a small subset of a data set to the entire set. In low-label regimes the geometry of the unlabelled set is a crucial aspect that should be leveraged in order to obtain algorithms that outperform standard supervised learning. In this talk I will introduce graphical SSL algorithms that rely on manifold regularization in order to incorporate this geometric information. I will discuss interesting connections to linear algebra and matrix perturbations, kernel methods, and theory of elliptic partial differential equations.
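
The simplest graphical SSL algorithm already exhibits the linear-algebra connection: build a weighted graph on the data, then harmonically extend the known labels by solving a graph-Laplacian linear system. A sketch with illustrative choices of kernel and data, not the lecture's code:

```python
# Harmonic label propagation on a k-NN-style Gaussian-kernel graph.
import numpy as np

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.array([0.0] * 50 + [1.0] * 50)      # ground truth, mostly hidden
labelled = np.array([0, 50])               # indices of the two known labels

# Gaussian-kernel weights and unnormalised graph Laplacian L = D - W
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
W = np.exp(-d2 / 2.0)
L = np.diag(W.sum(axis=1)) - W

# Harmonic extension: minimise f^T L f with f fixed on the labelled set,
# i.e. solve L_uu f_u = -L_ul f_l for the unlabelled block
u = np.setdiff1d(np.arange(len(X)), labelled)
f = np.zeros(len(X))
f[labelled] = y[labelled]
f[u] = np.linalg.solve(L[np.ix_(u, u)], -L[np.ix_(u, labelled)] @ f[labelled])
print(((f > 0.5) == (y > 0.5)).mean())     # accuracy on all 100 points
```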

Week 9, Carlos Simpson

Machine learning for optimizing certain kinds of classification proofs for finite structures

Abstract: We'll start by looking at the structure of classification proofs for finite semigroups and how to program these in PyTorch. (That could be the subject of the tutorial.) A proof by cuts generates a proof tree (think of solving Sudoku), and its size depends on the choice of cut locations at each stage. This leads to the question of how to choose the cuts in an optimal way. We'll discuss the Value-Policy approach to RL for this, and some of the difficulties, notably in sampling. Then we'll look at another approach, somewhat more heuristic, that aims to provide a faster learning process, with the goal of an overall gain in time when the training plus the proof are counted together.
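
The point about cut choices can be made concrete on a toy problem (my example, not the speaker's): in any backtracking case analysis, the size of the search tree depends on which variable you split on, and a smarter splitting rule shrinks the tree. Here, exhaustively enumerating proper 3-colourings of a small graph:

```python
# Search-tree size depends on the branching ("cut") rule.
n = 6
colours = range(3)
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3),
         (3, 4), (2, 5), (4, 5), (0, 5), (1, 4)]
adj = {v: set() for v in range(n)}
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)

def count_nodes(assign, pick):
    """Enumerate all proper 3-colourings, counting search-tree nodes."""
    free = [v for v in range(n) if v not in assign]
    if not free:
        return 1
    v = pick(free, assign)                           # the "cut" choice
    nodes = 1
    for c in colours:
        if all(assign.get(u) != c for u in adj[v]):  # consistent extension
            assign[v] = c
            nodes += count_nodes(assign, pick)
            del assign[v]
    return nodes

fixed = lambda free, assign: free[0]                 # split in a fixed order
most_constrained = lambda free, assign: max(         # split where most is known
    free, key=lambda v: sum(u in assign for u in adj[v]))

# The second rule typically visits fewer nodes; an RL policy would learn
# such a rule (and better ones) from experience.
print(count_nodes({}, fixed), count_nodes({}, most_constrained))
```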

Related paper: Learning proofs for the classification of nilpotent semigroups

Week 10, Alex Davies

A technical history of AlphaZero

Abstract: In 2016 AlphaGo defeated the world champion Go player Lee Sedol in a historic five-game match. In this lecture we will discuss the research behind this system and the innovations that ultimately led to AlphaZero, which can learn to play multiple board games, including Go, from scratch without human knowledge.

Week 11, Daniel Halpern-Leistner

Learning selection strategies in Buchberger’s algorithm 

Abstract: Studying the set of exact solutions of a system of polynomial equations largely depends on a single iterative algorithm, known as Buchberger’s algorithm. Optimized versions of this algorithm are crucial for many computer algebra systems (e.g., Mathematica, Maple, Sage). After discussing the problem and what makes it challenging, I will discuss a new approach to Buchberger’s algorithm that uses reinforcement learning agents to perform S-pair selection, a key step in the algorithm. In certain domains, the trained model outperforms state-of-the-art selection heuristics in total number of polynomial additions performed, which provides a proof-of-concept that recent developments in machine learning have the potential to improve performance of algorithms in symbolic computation.
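
To make the selection step concrete, here is a toy Buchberger implementation in SymPy with a pluggable S-pair selection strategy; the `select` argument is exactly the interface a learned agent would fill. A sketch under my own simplifications (two variables, naive pair handling, reduction count as a crude proxy for polynomial additions), not the paper's code:

```python
# Toy Buchberger's algorithm with a pluggable S-pair selection strategy.
from sympy import LM, LT, expand, lcm, reduced, symbols

x, y = symbols('x y')
ORDER = 'grevlex'

def s_poly(f, g):
    """S-polynomial: cancel the leading terms of f and g."""
    m = lcm(LM(f, x, y, order=ORDER), LM(g, x, y, order=ORDER))
    return expand(m / LT(f, x, y, order=ORDER) * f
                  - m / LT(g, x, y, order=ORDER) * g)

def buchberger(F, select):
    """Groebner basis of F; `select` chooses which S-pair to process next."""
    G = list(F)
    pairs = [(i, j) for i in range(len(G)) for j in range(i)]
    reductions = 0
    while pairs:
        i, j = select(pairs, G)            # <- the step an RL agent can learn
        pairs.remove((i, j))
        _, r = reduced(s_poly(G[i], G[j]), G, x, y, order=ORDER)
        reductions += 1
        if r != 0:                         # nonzero remainder joins the basis
            pairs += [(len(G), k) for k in range(len(G))]
            G.append(r)
    return G, reductions

first = lambda pairs, G: pairs[0]          # naive "first in, first out" rule
basis, cost = buchberger([x**2 + y, x*y - 1], first)
print(basis, cost)
```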

Related paper: Learning selection strategies in Buchberger’s algorithm

Week 12, Gitta Kutyniok

Deep Learning meets Shearlets: Explainable Hybrid Solvers for Inverse Problems in Imaging Science 

Abstract: Today, pure model-based approaches are often insufficient for solving complex inverse problems in medical imaging. At the same time, methods based on artificial intelligence, in particular, deep neural networks, are extremely successful, often quickly leading to state-of-the-art algorithms. However, pure deep learning approaches often neglect known and valuable information from the modeling world and suffer from a lack of interpretability.

In this talk, we will develop a conceptual approach towards inverse problems in imaging sciences by combining the model-based method of sparse regularization by shearlets with the data-driven method of deep learning. Our solvers pay particular attention to the singularity structures of the data. Focussing then on the inverse problem of (limited-angle) computed tomography, we will show that our algorithms significantly outperform previous methodologies, including methods entirely based on deep learning. Finally, we will also touch upon the issue of how to interpret the results of such algorithms, and present a novel, state-of-the-art explainability method based on information theory.
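
For orientation, the model-based ingredient here is the standard sparse-regularization formulation of an inverse problem $y = Kf + \eta$, with sparsity measured in shearlet coefficients (a generic formulation, not the talk's specific hybrid):

```latex
\hat{f} \;=\; \operatorname*{arg\,min}_{f}\;
  \tfrac{1}{2}\,\lVert K f - y \rVert_2^2
  \;+\; \lambda\,\lVert \mathcal{SH}(f) \rVert_1
```

Here $K$ is the forward operator (for computed tomography, a limited-angle Radon transform), $\mathcal{SH}$ the shearlet transform, and $\lambda > 0$ the regularization weight; the hybrid solvers of the talk combine this model-based pipeline with learned components.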

Week 13, Qianxiao Li

Deep learning for sequence modelling

Abstract: In this talk, we introduce some deep-learning-based approaches for modelling sequence-to-sequence relationships that are gaining popularity in many applied fields, such as time-series analysis, natural language processing, and data-driven science and engineering. We will also discuss some interesting mathematical issues underlying these methodologies, including approximation theory and optimization dynamics.
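
As the simplest instance of such an approach, a recurrent network can be trained on a toy sequence-to-sequence task in a few lines of PyTorch (the task and hyperparameters are mine, purely for illustration):

```python
# A GRU that maps a digit sequence to its running sum mod 10.
import torch
import torch.nn as nn

class SeqModel(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(10, 16)
        self.rnn = nn.GRU(16, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 10)

    def forward(self, x):
        h, _ = self.rnn(self.embed(x))
        return self.head(h)                  # one prediction per position

model = SeqModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(300):
    x = torch.randint(0, 10, (64, 12))       # batch of digit sequences
    t = x.cumsum(dim=1).remainder(10)        # target: running sum mod 10
    opt.zero_grad()
    loss = loss_fn(model(x).reshape(-1, 10), t.reshape(-1))
    loss.backward()
    opt.step()

print(loss.item())                           # chance level is ln(10) ~ 2.30
```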

Qianxiao has provided some notebooks on GitHub.

Week 14, Lars Buesing

Searching for Formulas and Algorithms: Symbolic Regression and Program Induction

Abstract: In spite of their enormous success as black-box function approximators in many fields such as computer vision, natural language processing and automated decision making, deep neural networks often fall short of providing interpretable models of data. In applications where aiding human understanding is the main goal, describing regularities in data with compact formulae promises improved interpretability and better generalization. In this talk I will introduce the resulting problem of Symbolic Regression and its generalization to Program Induction, highlight some learning methods from the literature, and discuss challenges and limitations of searching for algorithmic descriptions of data.
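
The core search problem can be stated in a dozen lines: enumerate expressions from a small grammar and score them by fit to data. The naive random-search sketch below (all choices mine) can recover a hidden quadratic; the methods in the talk replace the random search with learned or evolutionary proposals:

```python
# Symbolic regression by brute-force random search over a tiny grammar.
import math
import random

OPS = [('+', lambda a, b: a + b),
       ('-', lambda a, b: a - b),
       ('*', lambda a, b: a * b)]
LEAVES = ['x', 1.0, 2.0]

def random_expr(depth=3):
    """Sample a random expression tree from the grammar."""
    if depth == 0 or random.random() < 0.3:
        return random.choice(LEAVES)
    name, _ = random.choice(OPS)
    return (name, random_expr(depth - 1), random_expr(depth - 1))

def evaluate(e, x):
    if e == 'x':
        return x
    if isinstance(e, float):
        return e
    name, a, b = e
    return dict(OPS)[name](evaluate(a, x), evaluate(b, x))

xs = [i / 10 for i in range(-20, 21)]
ys = [x * x + 2 * x for x in xs]          # the hidden formula to recover

best, best_err = None, math.inf
for _ in range(20000):
    e = random_expr()
    err = sum((evaluate(e, x) - y) ** 2 for x, y in zip(xs, ys))
    if err < best_err:
        best, best_err = e, err
print(best, best_err)                     # may find ('+',('*','x','x'),('*',2.0,'x'))
```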

Lecture notes

Week 1 - Lecture notes.pdf

Week 2 - Lecture notes.pdf

Week 3 - Lecture notes.pdf

Week 4 - Lecture notes - Part 1.pdf

Week 4 - Lecture notes - Part 2.pdf

Week 5 - Lecture notes.pdf

Week 6 - Lecture Notes - Part 1.pdf

Week 6 - Lecture Notes - Part 2.pdf

Week 7 - Lecture notes.pdf

Week 9 - Lecture notes.pdf

Week 13 - Lecture notes.pdf

Week 14 - Lecture notes.pdf

Jupyter notebooks

MLWM Jupyter files:

Week 1 Workshop - Introduction to PyTorch.ipynb

Week 2 Workshop - The Möbius Function.ipynb

Week 3 Workshop - Continue with week 2!.ipynb

Week 4 Workshop - RNNs and LSTM.ipynb

Week 5 Workshop_ First steps with GNNs

Week 6 Workshop_ More fun with GNNs

Week 7 V2 - Using reinforcement learning to solve simple math problems.ipynb

Week 7 Workshop_ Using reinforcement learning to solve simple math problems.ipynb

Week 9 - Classifying semigroups.ipynb

Week 10 - Deep Sets.ipynb

Week 11 - RL with Buchberger_s algorithm.ipynb

Week 12 - CNNs, the MNIST database, and normalisation.ipynb