## References

The main reference for the first half of the course is *Deep Learning* by Ian Goodfellow, Yoshua Bengio, and Aaron Courville, freely available online.

## Additional colabs

Playing with parity: Here we train a simple neural net to learn the parity function and look at how well it generalizes.

Classifying descents in $S_n$: Here we train a simple neural network to classify the descent set of a permutation.
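To make the two learning targets concrete, here is a minimal sketch of the functions the networks are asked to learn. The helper names (`parity`, `descent_set`, `labelled_permutations`) are illustrative and not taken from the colabs, and recording descents as 1-indexed positions is an assumed convention.

```python
from itertools import permutations


def parity(bits):
    """Parity of a 0/1 sequence: 1 if it contains an odd number of ones, else 0."""
    return sum(bits) % 2


def descent_set(perm):
    """Positions i (1-indexed, an assumed convention) with perm(i) > perm(i+1)."""
    return {i + 1 for i in range(len(perm) - 1) if perm[i] > perm[i + 1]}


def labelled_permutations(n):
    """Pair every permutation of 1..n with its descent set, as (input, label) data."""
    return [(p, descent_set(p)) for p in permutations(range(1, n + 1))]


# Example: the permutation 3 1 4 2 drops at positions 1 and 3.
example = descent_set((3, 1, 4, 2))
```

Both targets are cheap to compute exactly, which is what makes them convenient test cases for studying generalization.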

# Weekly overview

### Week 1, Geordie Williamson

*Basics of Machine Learning*

Classic problems in machine learning, kernel methods, deep neural networks, supervised learning, and basic examples.

### Week 2, Joel Gibson

*What can and can’t neural networks do*

The universal approximation theorem and convolutional neural networks.

**Lecture links**: TensorFlow playground, approximation by bumps/ridges, convolution filters
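The "approximation by bumps" idea behind the universal approximation theorem can be sketched in a few lines. The code below is an illustrative construction, not the material from the lecture: it builds a one-hidden-layer sigmoid network by hand, where each pair of steep sigmoids forms a bump on one subinterval of [0, 1] and the bump is scaled by the target function's value at the midpoint. The names `bump` and `bump_approximation` and the default `steepness` are assumptions made for this sketch.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def bump(x, left, right, steepness=2000.0):
    """Difference of two steep sigmoids: close to 1 on (left, right), close to 0 elsewhere."""
    return sigmoid(steepness * (x - left)) - sigmoid(steepness * (x - right))


def bump_approximation(f, num_bumps, steepness=2000.0):
    """Approximate f on [0, 1] by a sum of num_bumps scaled bumps.

    This is a network with 2 * num_bumps hidden sigmoid units whose weights
    are chosen by hand rather than learned.
    """
    width = 1.0 / num_bumps

    def approx(x):
        return sum(
            f((k + 0.5) * width) * bump(x, k * width, (k + 1) * width, steepness)
            for k in range(num_bumps)
        )

    return approx


# Example: approximate one period of a sine wave with 50 bumps.
target = lambda x: math.sin(2 * math.pi * x)
network = bump_approximation(target, 50)
```

Increasing `num_bumps` (and `steepness` along with it) shrinks the error for any continuous target, which is the intuition behind the theorem. It says nothing about whether gradient descent would find such weights.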

### Week 3, Georg Gottwald

*How to think about machine learning*

Borrowing from statistical mechanics, dynamical systems and numerical analysis to better understand deep learning.

### Week 4, Georg Gottwald

*Regularisation*

### Week 4, Joel Gibson

*Recurrent Neural Nets*

### Week 5, Geordie Williamson

*Geometric Deep Learning; or never underestimate symmetry*

### Week 6, Georg Gottwald

*Geometric Deep Learning II*

### Week 6, Geordie Williamson

*Saliency and Combinatorial Invariance*

### Week 7, Adam Zsolt Wagner

*A simple RL setup to find counterexamples to conjectures in mathematics*

**Abstract**: In this talk we will leverage a reinforcement learning method, specifically the cross-entropy method, to search for counterexamples to several conjectures in graph theory and combinatorics. We will present a very simplistic setup, in which only minimal changes need to be made (namely the reward function used for RL) in order to successfully attack a wide variety of problems. As a result we will resolve several open problems, and find more elegant counterexamples to previously disproved ones.

Related paper: *Constructions in combinatorics via neural networks*.
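As a rough illustration of the cross-entropy method mentioned in the abstract, here is a deliberately simplified sketch. It is not the setup from the talk or the paper: the sampling distribution here is a vector of independent Bernoulli probabilities rather than a neural network, and the function name `cross_entropy_search` and all hyperparameters are assumptions. The one feature it does share with the abstract is that the problem enters only through the `reward` function.

```python
import random


def cross_entropy_search(reward, length, iterations=30, population=200,
                         elite_fraction=0.1, smoothing=0.7, seed=0):
    """Search for a high-reward 0/1 string of the given length.

    Each round samples a population, keeps the top-scoring "elite" samples,
    and moves the sampling probabilities towards the elite frequencies.
    """
    rng = random.Random(seed)
    probs = [0.5] * length
    n_elite = max(1, int(elite_fraction * population))
    best, best_reward = None, float("-inf")

    for _ in range(iterations):
        samples = [[1 if rng.random() < p else 0 for p in probs]
                   for _ in range(population)]
        samples.sort(key=reward, reverse=True)
        elite = samples[:n_elite]

        top_reward = reward(elite[0])
        if top_reward > best_reward:
            best, best_reward = elite[0], top_reward

        for i in range(length):
            freq = sum(s[i] for s in elite) / n_elite
            p = smoothing * freq + (1 - smoothing) * probs[i]
            # Clip so that no coordinate becomes fully deterministic.
            probs[i] = min(0.98, max(0.02, p))

    return best, best_reward


# Placeholder reward: count the ones. A real application would instead
# decode the string (for example as the edge list of a graph) and score
# how close it comes to violating a conjectured inequality.
best_string, best_value = cross_entropy_search(sum, length=20)
```

Swapping in a different `reward` is the only change needed to attack a different problem, at the cost of a search that comes with no optimality guarantee.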

### Week 8, Bamdad Hosseini

*Perspectives on graphical semi-supervised learning* [no recording, see lecture notes]

**Abstract**: Semi-supervised learning (SSL) is the problem of extending label information from a small subset of a data set to the entire set. In low-label regimes the geometry of the unlabelled set is a crucial aspect that should be leveraged in order to obtain algorithms that outperform standard supervised learning. In this talk I will introduce graphical SSL algorithms that rely on manifold regularization in order to incorporate this geometric information. I will discuss interesting connections to linear algebra and matrix perturbations, kernel methods, and theory of elliptic partial differential equations.
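One standard example of a graphical SSL method is the harmonic extension of the labels: each unlabelled node takes the weighted average of its neighbours' values, which amounts to solving a graph Laplace equation with the labelled nodes as boundary data. The sketch below is illustrative and not necessarily an algorithm from the talk. The name `harmonic_labels`, the dictionary-based graph format, and the use of Gauss-Seidel sweeps are all assumptions.

```python
def harmonic_labels(adjacency, labels, sweeps=500):
    """Extend labels to all nodes of a weighted graph by harmonic averaging.

    adjacency: dict mapping node -> dict of neighbour -> positive edge weight
    labels:    dict mapping each labelled node -> its known value
    Unlabelled nodes are updated in place (Gauss-Seidel) until each one is
    approximately the weighted mean of its neighbours.
    """
    values = {node: labels.get(node, 0.0) for node in adjacency}
    for _ in range(sweeps):
        for node, neighbours in adjacency.items():
            if node in labels:
                continue  # labelled nodes act as fixed boundary values
            total_weight = sum(neighbours.values())
            values[node] = sum(
                weight * values[other] for other, weight in neighbours.items()
            ) / total_weight
    return values


# Example: a path graph 0 - 1 - 2 - 3 - 4 with only the endpoints labelled.
path = {i: {j: 1.0 for j in (i - 1, i + 1) if 0 <= j <= 4} for i in range(5)}
extended = harmonic_labels(path, {0: 0.0, 4: 1.0})
```

On the path graph the result interpolates linearly between the two labelled endpoints, the discrete analogue of solving Laplace's equation on an interval. This is where the connection to elliptic PDEs in the abstract comes from.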

### Week 9, Carlos Simpson

*Machine learning for optimizing certain kinds of classification proofs for finite structures*

**Abstract**: We’ll start by looking at the structure of classification proofs for finite semigroups and how to program these in PyTorch. (That could be the subject of the tutorial.) A proof by cuts generates a proof tree (think of solving Sudoku). Its size depends on the choice of cut locations at each stage. This leads to the question of how to choose the cuts in an optimal way. We’ll discuss the Value-Policy approach to RL for this, and discuss some of the difficulties, notably in sampling. Then we’ll look at another approach, somewhat more heuristic, that aims to provide a faster learning process with the goal of obtaining an overall gain in time when the training plus the proof are counted together.

Related paper: *Learning proofs for the classification of nilpotent semigroups*

### Week 10, Alex Davies

*A technical history of AlphaZero*

**Abstract**: In 2016 AlphaGo defeated the world champion Go player Lee Sedol in a historic five-game match. In this lecture we will discuss the research behind this system and the innovations that ultimately led to AlphaZero, which can learn to play multiple board games, including Go, from scratch without human knowledge.

### Week 11, Daniel Halpern-Leistner

*Learning selection strategies in Buchberger’s algorithm*

**Abstract**: Studying the set of exact solutions of a system of polynomial equations largely depends on a single iterative algorithm, known as Buchberger’s algorithm. Optimized versions of this algorithm are crucial for many computer algebra systems (e.g., Mathematica, Maple, Sage). After discussing the problem and what makes it challenging, I will discuss a new approach to Buchberger’s algorithm that uses reinforcement learning agents to perform S-pair selection, a key step in the algorithm. In certain domains, the trained model outperforms state-of-the-art selection heuristics in total number of polynomial additions performed, which provides a proof-of-concept that recent developments in machine learning have the potential to improve performance of algorithms in symbolic computation.

Related paper: *Learning selection strategies in Buchberger’s algorithm*

### Week 12, Gitta Kutyniok

*Deep Learning meets Shearlets: Explainable Hybrid Solvers for Inverse Problems in Imaging Science*

**Abstract**: Pure model-based approaches are today often insufficient for solving complex inverse problems in medical imaging. At the same time, methods based on artificial intelligence, in particular, deep neural networks, are extremely successful, often quickly leading to state-of-the-art algorithms. However, pure deep learning approaches often neglect known and valuable information from the modeling world and suffer from a lack of interpretability.

In this talk, we will develop a conceptual approach towards inverse problems in imaging sciences by combining the model-based method of sparse regularization by shearlets with the data-driven method of deep learning. Our solvers pay particular attention to the singularity structures of the data. Focussing then on the inverse problem of (limited-angle) computed tomography, we will show that our algorithms significantly outperform previous methodologies, including methods entirely based on deep learning. Finally, we will also touch upon the issue of how to interpret the results of such algorithms, and present a novel, state-of-the-art explainability method based on information theory.

### Week 13, Qianxiao Li

*Deep learning for sequence modelling*

**Abstract**: In this talk, we introduce some deep-learning-based approaches for modelling sequence-to-sequence relationships that are gaining popularity in many applied fields, such as time-series analysis, natural language processing, and data-driven science and engineering. We will also discuss some interesting mathematical issues underlying these methodologies, including approximation theory and optimization dynamics.

Qianxiao has provided some notebooks on GitHub.

### Week 14, Lars Buesing

*Searching for Formulas and Algorithms: Symbolic Regression and Program Induction*

**Abstract**: In spite of their enormous success as black-box function approximators in many fields such as computer vision, natural language processing and automated decision making, deep neural networks often fall short of providing interpretable models of data. In applications where aiding human understanding is the main goal, describing regularities in data with compact formulas promises improved interpretability and better generalization. In this talk I will introduce the resulting problem of Symbolic Regression and its generalization to Program Induction, highlight some learning methods from the literature and discuss challenges and limitations of searching for algorithmic descriptions of data.
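To give a feel for what "searching for formulas" means, here is a toy sketch of symbolic regression by exhaustive enumeration. It is not a method from the talk: practical systems use much larger grammars together with heuristic or learned search. The grammar (the variable `x`, the constants 1 and 2, and the operators `+` and `*`) and all function names are assumptions chosen to keep the example small.

```python
import itertools

TERMINALS = ("x", 1.0, 2.0)
OPERATORS = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}


def expressions(depth):
    """Yield every expression tree of at most the given depth."""
    yield from TERMINALS
    if depth > 0:
        smaller = list(expressions(depth - 1))
        for op in OPERATORS:
            for left, right in itertools.product(smaller, repeat=2):
                yield (op, left, right)


def evaluate(expr, x):
    """Evaluate an expression tree at the point x."""
    if isinstance(expr, tuple):
        op, left, right = expr
        return OPERATORS[op](evaluate(left, x), evaluate(right, x))
    return x if expr == "x" else expr


def size(expr):
    """Number of nodes in the tree, used as a simple complexity measure."""
    if isinstance(expr, tuple):
        return 1 + size(expr[1]) + size(expr[2])
    return 1


def symbolic_regression(xs, ys, depth=2):
    """Return the expression with the smallest squared error, ties broken by size."""
    def loss(expr):
        return sum((evaluate(expr, x) - y) ** 2 for x, y in zip(xs, ys))

    return min(expressions(depth), key=lambda e: (loss(e), size(e)))


# Example: data generated by the hidden formula x*x + x.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x * x + x for x in xs]
formula = symbolic_regression(xs, ys)
```

The number of candidate trees grows doubly exponentially with depth, which is one reason exhaustive search gives way to learned search strategies on realistic problems.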