Maths and machine learning

Can artificial intelligence help solve humanity’s tough problems? This is one of the most pressing questions in contemporary science.

Over the last two decades, we have seen neural networks starting to perform well on tasks that humans find easy, like image and speech recognition. However, most tasks that require conscious thought, like mathematics problems, are beyond the capacity of current neural networks.

Nature cover: AI-guided intuition

SMRI Director Geordie Williamson and a team of mathematicians from Oxford University have collaborated with AI lab DeepMind, using machine learning to help prove or suggest new mathematical theorems. This is one of the first examples where machine learning has been used to guide human intuition on decades-old problems. The results were published in the preeminent journal Nature in December 2021.

Geordie applied the power of DeepMind’s AI processes to explore conjectures in representation theory, a branch of pure mathematics. This has brought him closer to proving a conjecture concerning deep symmetry in higher-dimensional algebra, which has been unsolved for 40 years.

In parallel work, the Oxford mathematicians used the AI to discover surprising connections in the field of knot theory, establishing a completely new mathematical theorem.

View external media

Read the University of Sydney news article and view more press on the SMRI in the News page. You can also join the conversation on Twitter.

AP Archive video: “Can the latest AI technology out-math the mathematicians?” feat. Geordie Williamson and Lindon Roberts, University of Sydney (22 May 2023)
SBS News in Depth: “Mathematicians say artificial intelligence just doesn’t add up” feat. Geordie Williamson and Lindon Roberts (20 May 2023)

 SMRI YouTube video (14 March 2022): “Intuition meets AI in pure mathematics”, also featured in Math off the grid Carnival of Mathematics blog post (3 April 2022)
 Quanta magazine article (15 Feb 2022): “Machine Learning Becomes a Mathematical Collaborator”
 University of Sydney news feat. Geordie Williamson (2 Dec 2021): “Mathematicians use DeepMind AI to create new methods…”
 The Conversation article by Geordie Williamson (2 Dec): “Mathematical discoveries take intuition and creativity…”
Nature News feature (1 Dec): “DeepMind’s AI helps untangle the mathematics of knots”
 DeepMind research blog post (1 Dec): “Exploring the beauty of pure mathematics in novel ways”
 New Scientist technology news article (1 Dec): “DeepMind AI collaborates with humans on two mathematical breakthroughs”
 University of Oxford news release (1 Dec): “Machine learning helps mathematicians make new connections”
 VentureBeat article (1 Dec): “DeepMind claims AI has aided new discoveries and insights in mathematics”
 TechCrunch article (2 Dec): “AI proves a dab hand at pure mathematics and protein hallucination”
 COSMOS article (3 Dec): “The AI making waves in complex mathematics”
Numberphile podcast feat. Alex Davies, DeepMind & Marcus du Sautoy, University of Oxford (3 Dec): “Google’s ‘DeepMind’ does Mathematics”
2ser Science Spotlight with Cameron Furlong feat. Geordie Williamson (9 Dec): “Mathematics and AI”
 Silicon Reckoner blog post (3 Dec): “News flash: DeepMind and ‘the beauty of pure mathematics’”
 Combinatorics and more blog post (4 Dec): “To cheer you up in difficult times 33: Deep learning leads to progress in knot theory and on the conjecture that Kazhdan-Lusztig polynomials are combinatorial”
 Science Alert tech news (4 Dec): “AI Is Discovering Patterns in Pure Mathematics That Have Never Been Seen Before”
 Towards AI Deep Learning blog post (4 Dec): “Inside DeepMind’s New Efforts to Use Deep Learning to Advanced Mathematics”
 DailyAlts Artificial Intelligence news (6 Dec): “Artificial Intelligence: New Revelations In Pure Mathematics By AI”
 Live Science news (7 Dec): “DeepMind cracks ‘knot’ conjecture that bedeviled mathematicians for decades”
 SingularityHub article (7 Dec): “How DeepMind’s AI Helped Crack Two Mathematical Puzzles That Stumped Humans for Decades”
 IANS news (2 Dec): “DeepMind & Mathematicians Use AI To Solve The Knot Problem”
 Analytics Drift data science news (6 Dec): “DeepMind Makes Huge Breakthrough by Discovering New Insights in Mathematics”
 Analytics India Mag opinions piece (7 Dec): “Australian mathematician cuts through knotty questions with AI”
 Ask Innovative India AI news (9 Dec): “DeepMind’s AI aids in the deciphering of knot mathematics!”
 Article in Sky News (2 Dec): “Mathematicians hail breakthrough in using AI to suggest new theorems”
 Article in Independent (2 Dec): “Scientists make huge breakthrough to give AI mathematical capabilities never seen before”
 Article in The Times (2 Dec): “DeepMind’s artificial intelligence software helps mathematicians pinpoint patterns”

Kazhdan-Lusztig polynomial G2 (v equals 1). Simulation by Joel Gibson

This video shows a visualisation from representation theory. Although it does not show machine learning data, it illustrates the kind of insight one can hope to learn. Representation theory studies abstract algebraic structures by representing their elements as linear transformations, which makes it easier to identify symmetries and other patterns deep within those structures.

Geordie’s AI-assisted research focused on Kazhdan-Lusztig (KL) polynomials, which are important measurements within representation theory. A chemical analogy is to describe representation theory as atoms, and KL polynomials as the atomic numbers of mathematical structure. This video shows the development of certain KL polynomials: they quickly develop complicated patterns of symmetry.

Artificial intelligence and machine learning can assist in the discovery of patterns in higher dimensions, revealing them faster than human methods alone or revealing patterns that would otherwise go unseen. In the project, Geordie focused on a particular conjecture in representation theory (the combinatorial invariance conjecture), which involves associating a KL polynomial with an abstract object called a Bruhat graph. Geordie and DeepMind colleagues trained a neural network to take a Bruhat graph as input and predict the corresponding KL polynomial, with impressively accurate results.
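To make the idea of a Bruhat graph a little more concrete, here is a minimal sketch that builds the full Bruhat graph of the symmetric group S3 in plain Python. It assumes the standard convention that there is an edge from x to y whenever y is obtained from x by a transposition and y has more inversions than x. The function names are illustrative only, and this is not the data pipeline used in the project, which worked with much larger examples.

```python
from itertools import combinations, permutations

def length(perm):
    """Coxeter length of a permutation = its number of inversions."""
    return sum(1 for i, j in combinations(range(len(perm)), 2)
               if perm[i] > perm[j])

def bruhat_graph(n):
    """Vertices and directed edges of the Bruhat graph of S_n.

    Assumed convention: x -> y when y is x with two entries swapped
    (multiplication by a transposition) and length(y) > length(x).
    """
    vertices = list(permutations(range(1, n + 1)))
    edges = []
    for x in vertices:
        for i, j in combinations(range(n), 2):
            y = list(x)
            y[i], y[j] = y[j], y[i]  # swap the entries in positions i and j
            y = tuple(y)
            if length(y) > length(x):
                edges.append((x, y))
    return vertices, edges

vertices, edges = bruhat_graph(3)
for x, y in edges:
    print(x, "->", y)
```

A graph like this (in practice, an interval inside it) is the kind of input a network would receive, with the KL polynomial as the target to predict.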

Machine learning for the working mathematician (SMRI course: Semester One, 2022)

The Machine Learning for the Working Mathematician course is an introduction to ways in which machine learning (and in particular deep learning) has been used to solve problems in mathematics. The seminar series was organised by Joel Gibson, Georg Gottwald, and Geordie Williamson.

We aim for a toolbox of simple examples, where one can get a grasp on what machine learning can and cannot do. We want to emphasise techniques in machine learning as tools that can be used in mathematics research, rather than a source of problems in themselves. The first six weeks or so will be introductory, and the second six weeks will feature talks from experts on applications.

Two nice examples of recent work that give the ‘flavour’ of the seminar are:

The lectures can be viewed on the YouTube playlist above. You can download the Jupyter notebooks to accompany the lecture notes below.


The main reference for the first half of the course will be Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville, freely available online.

Additional colabs

Playing with parity : Here we train a simple neural net to learn the parity function, and look at how well it generalizes.

Classifying descents in Sn : Here we train a simple neural network to classify the descent sets of a permutation.
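For readers who want to see the two target functions before opening the colabs, the sketch below writes them out in plain Python. The first part is a hand-built threshold network that computes parity exactly, which shows that a small network can represent the function (whether training finds it, and how it generalizes, is what the colab explores). The second part computes descent sets, the labels for the classification task. Neither piece is taken from the colabs; the names and the 1-indexed convention for descents are assumptions made here.

```python
from itertools import permutations, product

def step(z):
    """Threshold activation: 1 if z > 0, else 0."""
    return 1 if z > 0 else 0

def parity_network(bits):
    """One-hidden-layer threshold network computing the parity of the bits.

    Hidden unit k (k = 1..n) fires when at least k inputs are 1.
    Output weights alternate +1, -1, +1, ... so the sum is 1 exactly
    when an odd number of inputs are 1.
    """
    n = len(bits)
    total = sum(bits)  # every hidden unit has weight 1 on each input
    hidden = [step(total - k + 0.5) for k in range(1, n + 1)]
    return sum((-1) ** k * h for k, h in enumerate(hidden))

def descent_set(perm):
    """Positions i (1-indexed) where perm[i] > perm[i+1]."""
    return {i + 1 for i in range(len(perm) - 1) if perm[i] > perm[i + 1]}

# Parity agrees with the direct definition on all 4-bit inputs.
parity_ok = all(parity_network(b) == sum(b) % 2
                for b in product([0, 1], repeat=4))

# Labelled data for S4: each permutation paired with its descent set.
dataset = [(p, descent_set(p)) for p in permutations(range(1, 5))]
```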

MLWM Jupyter files

Week 1 Workshop - Introduction to PyTorch.ipynb

Week 2 Workshop - The Möbius Function.ipynb

Week 3 Workshop - Continue with week 2!.ipynb

Week 4 Workshop - RNNs and LSTM.ipynb

Week 5 Workshop_ First steps with GNNs

Week 6 Workshop_ More fun with GNNs

Week 7 V2 - Using reinforcement learning to solve simple math problems.ipynb

Week 7 Workshop_ Using reinforcement learning to solve simple math problems.ipynb

Week 9 - Classifying semigroups.ipynb

Week 10 - Deep Sets.ipynb

Week 11 - RL with Buchberger_s algorithm.ipynb

Week 12 - CNNs, the MNIST database, and normalisation.ipynb

Week 1 - Lecture notes.pdf

Week 2 - Lecture notes.pdf

Week 3 - Lecture notes.pdf

Week 4 - Lecture notes - Part 1.pdf

Week 4 - Lecture notes - Part 2.pdf

Week 5 - Lecture notes.pdf

Week 6 - Lecture Notes - Part 1.pdf

Week 6 - Lecture Notes - Part 2.pdf

Week 7 - Lecture notes.pdf

Week 9 - Lecture notes.pdf

Week 13 - Lecture notes.pdf

Week 14 - Lecture notes.pdf


Week 1, Geordie Williamson

Basics of Machine Learning

Classic problems in machine learning, kernel methods, deep neural networks, supervised learning, and basic examples.

Week 2, Joel Gibson

What can and can’t neural networks do

Universal approximation theorem and convolutional neural networks

Lecture links: Tensorflow playground, approximation by bumps/ridges, convolution filters
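The idea behind universal approximation can be sketched in one dimension without any training: a one-hidden-layer ReLU network can reproduce the piecewise-linear interpolant of a function, and the error shrinks as hidden units are added. The code below is an illustrative construction written for this page, not material from the lecture; the function names are hypothetical.

```python
import math

def relu(z):
    return max(0.0, z)

def fit_relu_interpolant(f, a, b, pieces):
    """Build a one-hidden-layer ReLU net that linearly interpolates f.

    Returns (bias, knots, weights) so that the network output is
    bias + sum(w * relu(x - k)) over the hidden units.
    """
    h = (b - a) / pieces
    knots = [a + i * h for i in range(pieces)]
    values = [f(a + i * h) for i in range(pieces + 1)]
    slopes = [(values[i + 1] - values[i]) / h for i in range(pieces)]
    # Each hidden unit changes the slope by (new slope - old slope) at its knot.
    weights = [slopes[0]] + [slopes[i] - slopes[i - 1]
                             for i in range(1, pieces)]
    return values[0], knots, weights

def evaluate(net, x):
    bias, knots, weights = net
    return bias + sum(w * relu(x - k) for k, w in zip(knots, weights))

# Approximate sin on [0, pi] with 50 hidden units and measure the error.
net = fit_relu_interpolant(math.sin, 0.0, math.pi, 50)
max_error = max(abs(evaluate(net, x) - math.sin(x))
                for x in (i * math.pi / 1000 for i in range(1001)))
```

For a smooth function the interpolation error is bounded by a constant times the square of the spacing between knots, so doubling the number of hidden units cuts the bound by about a factor of four.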

Week 3, Georg Gottwald

How to think about machine learning

Borrowing from statistical mechanics, dynamical systems and numerical analysis to better understand deep learning.

Week 4, Georg Gottwald


Week 4, Joel Gibson

Recurrent Neural Nets

Week 5, Geordie Williamson

Geometric Deep Learning; or never underestimate symmetry

Week 6, Georg Gottwald

Geometric Deep Learning II

Week 6, Geordie Williamson

Saliency and Combinatorial Invariance

Week 7, Adam Zsolt Wagner

A simple RL setup to find counterexamples to conjectures in mathematics

Abstract: In this talk we will leverage a reinforcement learning method, specifically the cross-entropy method, to search for counterexamples to several conjectures in graph theory and combinatorics. We will present a very simplistic setup, in which only minimal changes need to be made (namely the reward function used for RL) in order to successfully attack a wide variety of problems. As a result we will resolve several open problems, and find more elegant counterexamples to previously disproved ones.

Related paper: Constructions in combinatorics via neural networks.
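As a rough illustration of the cross-entropy method described in the abstract, the sketch below searches for a triangle-free graph on six vertices with as many edges as possible. This is a deliberately simplified variant: it samples each edge from an independent probability rather than from a neural network, and the reward function, parameter values and names are all assumptions chosen for this toy problem, not the setup from the talk or paper.

```python
import itertools
import random

N = 6
EDGES = list(itertools.combinations(range(N), 2))  # the 15 possible edges

def reward(bits):
    """Toy reward (an assumption): edges present minus 3 per triangle."""
    present = {e for e, b in zip(EDGES, bits) if b}
    triangles = sum(
        1 for a, b, c in itertools.combinations(range(N), 3)
        if (a, b) in present and (a, c) in present and (b, c) in present
    )
    return len(present) - 3 * triangles

def cross_entropy_search(iterations=40, population=200, elite=20,
                         lr=0.7, seed=0):
    """Cross-entropy method with one independent probability per edge."""
    rng = random.Random(seed)
    probs = [0.5] * len(EDGES)
    best, best_score = None, float("-inf")
    for _ in range(iterations):
        # Sample a population of graphs from the current distribution.
        samples = [[1 if rng.random() < p else 0 for p in probs]
                   for _ in range(population)]
        samples.sort(key=reward, reverse=True)
        top = samples[:elite]
        if reward(top[0]) > best_score:
            best, best_score = top[0], reward(top[0])
        # Move each probability towards its frequency among the elite.
        for i in range(len(probs)):
            freq = sum(s[i] for s in top) / elite
            probs[i] = (1 - lr) * probs[i] + lr * freq
    return best, best_score

best_graph, best_score = cross_entropy_search()
```

Changing the problem only requires changing `reward`, which mirrors the point made in the abstract. For this toy reward, Mantel's theorem implies the score can never exceed 9.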

Week 8, Bamdad Hosseini

Perspectives on graphical semi-supervised learning [no recording, see lecture notes]

Abstract: Semi-supervised learning (SSL) is the problem of extending label information from a small subset of a data set to the entire set. In low-label regimes the geometry of the unlabelled set is a crucial aspect that should be leveraged in order to obtain algorithms that outperform standard supervised learning. In this talk I will introduce graphical SSL algorithms that rely on manifold regularization in order to incorporate this geometric information. I will discuss interesting connections to linear algebra and matrix perturbations, kernel methods, and theory of elliptic partial differential equations.
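One of the simplest graph-based approaches to this problem is harmonic label propagation, where each unlabelled node repeatedly takes the average of its neighbours' values while labelled nodes stay fixed. The sketch below is offered only as a minimal example of the general idea; it is an assumption that it resembles the algorithms covered in the talk, and the function and variable names are hypothetical.

```python
def harmonic_labels(neighbours, known, sweeps=200):
    """Extend labels from a few nodes to the whole graph.

    neighbours: dict mapping each node to a list of adjacent nodes.
    known: dict mapping labelled nodes to their (fixed) values.
    Unlabelled nodes converge to the average of their neighbours.
    """
    values = {v: known.get(v, 0.5) for v in neighbours}
    for _ in range(sweeps):
        for v in neighbours:
            if v not in known:
                values[v] = (sum(values[u] for u in neighbours[v])
                             / len(neighbours[v]))
    return values

# A path graph 0 - 1 - 2 - 3 - 4 with only the two end points labelled.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
scores = harmonic_labels(path, known={0: 0.0, 4: 1.0})
```

On the path graph the scores interpolate linearly between the two labelled ends, which is the discrete analogue of solving Laplace's equation with boundary conditions.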

Week 9, Carlos Simpson

Machine learning for optimizing certain kinds of classification proofs for finite structures

Abstract: We’ll start by looking at the structure of classification proofs for finite semigroups and how to program these in PyTorch. (That could be the subject of the tutorial.) A proof by cuts generates a proof tree (think of solving Sudoku). Its size depends on the choice of cut locations at each stage. This leads to the question of how to choose the cuts in an optimal way. We’ll discuss the Value-Policy approach to RL for this, and discuss some of the difficulties, notably in sampling. Then we’ll look at another approach, somewhat more heuristic, that aims to provide a faster learning process with the goal of obtaining an overall gain in time when the training plus the proof are counted together.

Related paper: Learning proofs for the classification of nilpotent semigroups

Week 10, Alex Davies

A technical history of AlphaZero

Abstract: In 2016 AlphaGo defeated the world champion Go player Lee Sedol in a historic five-game match. In this lecture we will discuss the research behind this system and the innovations that ultimately led to AlphaZero, which can learn to play multiple board games, including Go, from scratch without human knowledge.

Week 11, Daniel Halpern-Leistner

Learning selection strategies in Buchberger’s algorithm 

Abstract: Studying the set of exact solutions of a system of polynomial equations largely depends on a single iterative algorithm, known as Buchberger’s algorithm. Optimized versions of this algorithm are crucial for many computer algebra systems (e.g., Mathematica, Maple, Sage). After discussing the problem and what makes it challenging, I will discuss a new approach to Buchberger’s algorithm that uses reinforcement learning agents to perform S-pair selection, a key step in the algorithm. In certain domains, the trained model outperforms state-of-the-art selection heuristics in total number of polynomial additions performed, which provides a proof-of-concept that recent developments in machine learning have the potential to improve performance of algorithms in symbolic computation.

Related paper: Learning selection strategies in Buchberger’s algorithm
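To show what "S-pair selection" means in practice, here is a minimal sketch of one hand-written heuristic: among the pending pairs, choose the one whose leading monomials have a least common multiple of smallest total degree. A learned agent would replace this rule with a trained policy. The representation of monomials as exponent tuples and the function names are assumptions made for this example, and the heuristic is shown only as a simple baseline, not as the strategy studied in the paper.

```python
from itertools import combinations

def lcm(m1, m2):
    """Least common multiple of two monomials given as exponent tuples."""
    return tuple(max(a, b) for a, b in zip(m1, m2))

def degree_selection(leading_monomials, pairs):
    """Pick the pair whose lcm of leading monomials has least total degree."""
    return min(
        pairs,
        key=lambda p: sum(lcm(leading_monomials[p[0]],
                              leading_monomials[p[1]])),
    )

# Leading monomials x^2*y, x*y^3 and y^2 in the variables (x, y).
leading = [(2, 1), (1, 3), (0, 2)]
pending = list(combinations(range(len(leading)), 2))
chosen = degree_selection(leading, pending)
```

Ties are broken by the order in which pairs appear, so the result depends on how the pending list is arranged.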

Week 12, Gitta Kutyniok

Deep Learning meets Shearlets: Explainable Hybrid Solvers for Inverse Problems in Imaging Science 

Abstract: Pure model-based approaches are today often insufficient for solving complex inverse problems in medical imaging. At the same time, methods based on artificial intelligence, in particular, deep neural networks, are extremely successful, often quickly leading to state-of-the-art algorithms. However, pure deep learning approaches often neglect known and valuable information from the modeling world and suffer from a lack of interpretability. 

In this talk, we will develop a conceptual approach towards inverse problems in imaging sciences by combining the model-based method of sparse regularization by shearlets with the data-driven method of deep learning. Our solvers pay particular attention to the singularity structures of the data. Focussing then on the inverse problem of (limited-angle) computed tomography, we will show that our algorithms significantly outperform previous methodologies, including methods entirely based on deep learning. Finally, we will also touch upon the issue of how to interpret the results of such algorithms, and present a novel, state-of-the-art explainability method based on information theory.

Week 13, Qianxiao Li

Deep learning for sequence modelling

Abstract: In this talk, we introduce some deep learning based approaches for modelling sequence to sequence relationships that are gaining popularity in many applied fields, such as time-series analysis, natural language processing, and data-driven science and engineering. We will also discuss some interesting mathematical issues underlying these methodologies, including approximation theory and optimization dynamics.

Qianxiao has provided some notebooks on GitHub.

Week 14, Lars Buesing

Searching for Formulas and Algorithms: Symbolic Regression and Program Induction

Abstract: In spite of their enormous success as black box function approximators in many fields such as computer vision, natural language processing and automated decision making, Deep Neural Networks often fall short of providing interpretable models of data. In applications where aiding human understanding is the main goal, describing regularities in data with compact formulas promises improved interpretability and better generalization. In this talk I will introduce the resulting problem of Symbolic Regression and its generalization to Program Induction, highlight some learning methods from the literature and discuss challenges and limitations of searching for algorithmic descriptions of data.
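In its most basic form, symbolic regression can be sketched as a brute-force search: enumerate every formula in a small grammar and keep the shortest one that fits the data best. The example below does this for expressions built from x, 1, 2, addition and multiplication. The grammar, the depth limit and all names are assumptions chosen to keep the search tiny; practical methods use far smarter search than exhaustive enumeration.

```python
import itertools

LEAVES = {"x": lambda x: x, "1": lambda x: 1.0, "2": lambda x: 2.0}
OPS = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}

def expressions(depth):
    """All (text, function) pairs for expressions of at most this depth."""
    if depth == 0:
        return list(LEAVES.items())
    smaller = expressions(depth - 1)
    found = list(smaller)
    for (ta, fa), (tb, fb) in itertools.product(smaller, repeat=2):
        for symbol, op in OPS.items():
            found.append((
                f"({ta} {symbol} {tb})",
                lambda x, fa=fa, fb=fb, op=op: op(fa(x), fb(x)),
            ))
    return found

def squared_error(fn, xs, ys):
    return sum((fn(x) - y) ** 2 for x, y in zip(xs, ys))

def symbolic_regression(xs, ys, depth=2):
    """Return the best-fitting expression, preferring shorter formulas."""
    return min(expressions(depth),
               key=lambda e: (squared_error(e[1], xs, ys), len(e[0])))

# Data generated from x^2 + x; the search should find an equivalent formula.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [x * x + x for x in xs]
formula, fn = symbolic_regression(xs, ys)
print(formula)
```

Several equivalent formulas fit this data exactly, so which one is printed depends on enumeration order. The number of candidates grows very quickly with depth, which is one of the challenges the abstract refers to.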

Larissa Fedunik-Hofman