## Formalising mathematics: an introduction.

As part of the EPSRC Taught Course Centre I am giving a course on formalising mathematics. This is a course for mathematics PhD students enrolled at Imperial College London, Bristol, Bath, Oxford, or Warwick. No formalisation experience will be assumed. Details of the course are available at the second link above. I have been timetabled eight 2-hour lectures on Thursdays 4-6pm UK time, starting this coming Thursday, 21st Jan 2021.

My instinct in the first lecture would be to start by listing a bunch of reasons why learning how to formalise pure mathematics is interesting/useful/important/whatever, and perhaps also explaining how I got involved with it. But I could probably spend about 30 minutes on this, and I don’t want to waste valuable lecture time on it. In fact I won’t actually be giving lectures at all — the 2-hour slots will be mini Lean workshops, where beginners formalise mathematics they know, with me watching, and I cannot see the point of making the students listen to me waffle on about my opinions/history when, after all, they have chosen to come to the course anyway. So I’ve just decided to write the introduction here, and then students can choose to read it at their leisure (or not read it at all).

Note that I will also be posting notes for the eight lectures here on this blog, explaining the tactics and lemmas which students will need to do the problem sheets associated to each of the eight workshops. I am hoping that the course material itself can be used by other people if they want, e.g. by people wanting to teach Lean courses in other universities, or students wanting to self-study. But without more ado, let’s get on with the stuff which I have now decided not to say in lecture 1.

# Formalising mathematics: Why?

Hello, and welcome to my course! I’m Kevin Buzzard, a lecturer at Imperial. For 20+ years I was a algebraic number theorist in the traditional sense, and my research work involved trying to prove new theorems in, broadly speaking, the Langlands Philosophy. Nowadays I work on something completely different — I am laying the foundations for teaching the Langlands Philosophy, and other aspects of serious modern mathematics, to computers. To make this switch, all I had to do was to learn a new language (a computer language) which was rich enough for me to be able to express my mathematical ideas in, and then it simply became a matter of explaining the mathematics I know and love to the computer; the computer checks the details of the proofs. In this course, we will be jumping right in. You will be actively learning this language, because you will be spending the vast majority of every lecture teaching basic undergraduate level mathematics (stuff you understand very well) to your own computer, and I or, hopefully, other students, will be helping you out when you get stuck. As the term progresses we might move onto harder mathematics — for example maybe we’ll do some MSc level algebra or algebraic geometry, maybe we’ll take a look at Patrick Massot’s sphere inversion project, or maybe we’ll explore the foothills of recent work of Clausen and Scholze. How far we get and what we’ll do in the last few lectures will depend very much on who is still attending the workshops after the first month.

A natural way to start such a course would be to spend half the first lecture explaining why I am investing my time giving such a course — what the point of it all is, and why I think this area is worth teaching. But such an explanation will eat into our precious time formalising time, so I decided to write it up here instead. In this introduction, which I won’t be formally reading out in the first lecture, I will explain the following:

• What happened to make me change area, and some mistakes I made at that time;
• Why I think it’s important to teach young people (i.e. you lot) about formalisation of mathematics;

## The nature of research mathematics

As I explained, I used to be a research algebraic number theorist, working in a technical area. More and more, my work would rely on results of other people, and it was becoming increasingly the case that I did not know the proofs of these results. This is the normal procedure in many modern areas of pure mathematics, as many of you will come to realise (if you didn’t realise already) during your time as PhD students.

Around 2017-18 I had some kind of a midlife crisis and decided that I did not want to proceed in this manner any more. I was seeing more and more evidence that some of these results that my colleagues and I were using were either not completely proved, or were completely proved but the complete proofs were not in the literature and may well never be in the literature. I was becoming increasingly concerned that the literature was incomplete and, in places, incorrect. Worse, I had begun to feel that there was a certain amount of failure to uphold what I thought were reasonable academic standards in number theory, and talking to experts in other areas I discovered that similar things were happening in several other branches of pure mathematics. By “failure to uphold academic standards” I certainly do not mean to imply that people are intentionally cheating the system. What I mean is that I felt, at that time, that the system was not scaling. Papers are getting longer. Areas are becoming highly technical. The fact that a journal is sitting on a technical paper with only a small pool of potential expert referees, and needs a report from a busy person — and that busy person has to make a judgement call on the correctness of a paper despite not having time to check the details correctly — and this person was sometimes me — means that some stuff gets through the net. I know papers, in my area, which have errors in. This is not a big deal, because I know, for the most part. how to work around the errors. Traditionally this is regarded as “to be expected”, but by 2017 I had became unhappy with the status quo. Did you know that there are two papers (this one and this one) in the Annals of Mathematics, the most prestigious maths journal, which contain results that directly contradict each other, for example? No erratum was ever published for either paper, and if you chase up the web pages of the authors involved you will see that (at least at the time of writing) both Annals papers are still being proudly displayed on authors’ publication lists. The reasons for this are complex. The incorrect paper has got some groud-breaking ideas in, even if the main theorem is wrong. Journals seem to be unhappy to waste their time/pages publishing errata, and authors sometimes seem to be unhappy to waste their time writing them. The experts know what’s going on anyway, so why bother fixing stuff? Contrast this with the world of computer science, where bugs in programs are discovered and fixed, and the fixes are pushed so that future users of the programs will not run into the same problems.

Mathematicians tend to be smart people. How come they’re making errors?

## Mathematicians think in pictures

I have a picture of the real numbers in my head. It’s a straight line. This picture provides a great intuition as to how the real numbers work. I also have a picture of what the graph of a differentiable function looks like. It’s a wobbly line with no kinks in. This is by no means a perfect picture, but it will do in many cases. For example: If someone asked me to prove or disprove the existence of a strictly increasing infinitely differentiable function $f:\mathbb{R}\to\mathbb{R}$ such that $f'(37)=0$ and $f''(37)<0$ then I would start by considering a picture of a graph of a strictly increasing function (monotonically increasing as we move from left to right), and a second picture of a function whose derivative at $x=37$ is zero and whose second derivative is negative (a function with a local maximum). I then note that there are features in these pictures which make them incompatible with each other. Working with these pictures in mind, I can now follow my intuition and write down on paper a picture-free proof that such a function cannot exist, and this proof would be acceptable as a model solution to an exam question. My perception is that other working mathematicians have the same pictures in their head when presented with the same problem, and would go through roughly the same process if they were asked to write down a sketch proof of this theorem.

I also have a picture in my head of an overconvergent modular form defined on a neighbourhood of the ordinary locus on a $p$-adic modular curve. This picture informed several papers I wrote earlier this century with Richard Taylor, Frank Calegari, and others. I was once privileged to be invited to speak in the number theory seminar at Orsay in Paris, and Jean-Pierre Serre was in the audience. I drew one of these pictures of mine on the board and Serre interrupted! He asked what the picture meant. I had drawn a picture of a compact Riemann surface of genus 3 and was drawing discs and annuli on the Riemann surface. The picture was however supposed to represent a 1-dimensional $p$-adic manifold (a rigid analytic space in the sense of Tate). It was a representation of the argument I was explaining, but because the object I was actually working with was $p$-adic, the drawing in some sense bore essentially no relation to the actual mathematical object I was working with. However, my Annals of Mathematics paper with Taylor and my follow-up Journal of the AMS single-author paper (which I was lecturing on at the time) were all evidence that my way of thinking about things, the pictures in my head, really could be translated down into rigorous mathematics, even though they were in some sense meaningless. They were effective guides. My picture came with caveats, which I had a mental note of (for example there are all sorts of subtleties with the “topology” on a rigid analytic space, issues which were solved initially by Tate in the 60s using Grothendieck topologies, and nowadays there are other solutions). These subtleties were not displayed in the picture I’d drawn on the board in Orsay, but I was aware of them. In short, I knew “how far one could push the picture” in some sense — which bits of it to take seriously.

I once found what I thought was an error, when refereeing a paper (the author of which was an expert). I could see where the error was coming from. There was a certain object being studied where my picture of the object was more sophisticated than that of the expert writing the paper. The author claimed that something was obvious. I could not see why it was obvious, so I consulted another expert. This expert I consulted (whose picture, or perhaps I should say “understanding”, of the situation was the most sophisticated out of all three of us) said that the result which the author claimed to be obvious was almost certainly true, they said they knew of no reference, and sketched an argument which they were confident could be turned into a proof with a fair amount of work. I relayed a brief summary of these ideas back to the author in my report (requesting revision) and in the second version of the paper they still claimed the result was obvious and attached a note saying that they could not see what the fuss was about. I then sent a much longer report explaining the problems in far more detail. In the next version of the paper which I was sent, the entire section containing the problematic argument had been rewritten and the difficult-but-probably-true result was no longer needed. This is a great example of the unreasonable resilience of mathematics. The author knew several proofs of the main result in that section; I had spotted an issue with one, so they simply supplied another one. In my experience, this is the way human-written mathematics routes its way round most errors in preprints and papers. But occasionally we’re not so lucky, and unfixable errors make it through the system.

## The system is not broken!

However, I now believe that the system is not broken at all. Rather, this is just the nature of mathematics as done by humans. What I did not realise in 2017 was that mathematics has always been like this. Humans are fallable. Philosophers and historians of mathematics have put me right on this point. What I also did not realise in 2017 is that I was going to have to come to terms with the fact that perhaps 20 years of administration and childcare meant that I was no longer a top expert in the Langlands philosophy, and the fact that I now was having problems with the literature certainly did not mean that the top experts were. The top experts know where the problems are, and why some things which seem like problems are not problems. Furthermore, they would probably be happy to explain things to you, if you raise specific issues of concern, and thus give them a reason to take you seriously. Thank you very much to those that did this for me.

## So…why formalise?

So, if formalising mathematics in a computer proof checker is not going to save pure mathematics, and if indeed pure mathematics does not even need saving, then why should a pure mathematician bother with using computer proof checkers at all? Well, the thing is, in stark contrast to my naive 2017 self, I have now seen what these systems can do, and it is now manifestly clear to me that these systems can change mathematics for the better. These systems can digitise mathematics. Why should we want to digitise mathematics? Is it not obvious to you? It will enable us to use it in different ways. It has the potential to change mathematics completely. Let me give you an analogy.

Digitising mathematics is not just useful for the AI researcher. Digitising mathematics guarantees a basic level of accuracy, which is extremely helpful for the learning experience. I have still not forgotten the example sheet question I spent five hours on as an undergraduate, which asked to prove that if a topological space had various nice properties then it was metrizable. I finally gave up, went to my supervision, showed my supervisor the problem, and they instantly responded “Oh that’s not true at all! The Stone-Cech compactification of the natural numbers is a counterexample”. Having never heard of this object at the time, I felt rather cheated. I have been formalising my undergraduate problem sheets over the last couple of years, and “edge cases” such as how a claimed result might become untrue/meaningless when some of the variables take extreme values such as zero are now completely eliminated. I am not saying that this is a reason to learn how to formalise. I am however saying that once more mathematics is formalised, people will take what we have and begin to consider doing things like creating online interactive textbooks and problem sheets, where students will be able to solve problems perhaps in some “front end” language. These tools and utilities will begin to appear as more and more mathematics is digitised and the software becomes more and more normalised in mathematics departments.

Let me finish by going back to pictures. Thurston’s ideas of the importance of humans turning mathematics into pictures and then using their intuition to manipulate these pictures can be interpreted as an argument for the importance of fluidity in thought, and hence an argument against formalisation. But imagine arguing that coordinate geometry should not be used to study Platonic solids because making Platonic solids out of clay clearly teaches you more about them. For example, testing the hypothesis that unit tetrahedra can be used to tessellate 3-space is far more easily done with physical tetrahedra than with a pile of formulae satisfied by the coordinates of the vertices in 3-space. However, without 3-dimensional coordinates there would be no 3d computer modelling software Blender, and using Blender (if you know what you’re doing) it is also very easy to check that unit tetrahedra do not fit together in a neat way to fill 3-space. When Descartes came up with his coordinates, he was in no position to imagine Blender. Maybe we are in no position in 2021 to imagine the kind of ways which a computer can be used to show us how to visualise various mathematical objects. But if people like me start teaching people like you the basics of how this stuff works, perhaps you will start having your own insights about what can be achieved in the future.

You can’t stop progress. Undergraduate and MSc mathematics is going to be formalised, and then who knows what will happen. Maybe the statements of the theorems in the Stacks project will be formalised — we’re working on it. Maybe the proofs of some of the theorems in the Clausen-Scholze work on a new variant of topological spaces will be formalised — we have made a start, and every day (at least at the time of writing) people are talking about formalising condensed mathematics on the Lean Zulip chat. These projects indicate that these systems are capable of understanding modern mathematics. What will they do with this understanding? I don’t think any of us are in a position to guess. But I want to find out sooner rather than later, and this is why I want to show young people like you what is currently possible — so you can prepare to dream.

The Xena Project aims to get mathematics undergraduates (at Imperial College and beyond) trained in the art of formalising mathematics on a computer. Why? Because I have this feeling that digitising mathematics will be really important one day.
This entry was posted in formalising mathematics course, Uncategorized. Bookmark the permalink.

### 19 Responses to Formalising mathematics: an introduction.

1. Matt Baker says:

Amazingly insightful post, Kevin.

Liked by 1 person

2. John Doe says:

So, if I understand you correctly, we don’t actually need to formally verify mathematics to make sure it’s correct? And there should be no doubt that experts will find any mistakes eventually? Does that mean that Peter Scholze’s doubts which he expressed here earlier are completely unfounded then?

I’m just as provocative as I can imagine 🙂

Like

• xenaproject says:

I don’t think formalisation is necessarily the solution, but there is definitely a problem. I know plenty of nightmare examples. Coleman (Matt Baker’s advisor!) found some error in an important work of Manin about 25 years after it had been accepted by the community (see e.g. page 1 of https://www.math.arizona.edu/~swc/aws/1999/99Tzermias.pdf). There will be more errors in the literature, perhaps some of which are serious — we cannot really tell. There is, however, just far too much to formalise. There is historical evidence to suggest that the vast majority of what we mathematicians believe is fine. Scholze’s doubts are not at all unfounded though. Indeed Scholze posted a new version of analytic.pdf on his website today, fixing up some minor slip which our formalisation procedure has already uncovered (and which experts having study groups on the topic had not). Perhaps one can summarise the situation as saying that the different approaches (expert reading, computer formalising) might find different kinds of problems (high-level objections, technical slips in complex arguments).

Liked by 1 person

3. Yemon Choi says:

Hi Kevin, at the risk of being one of these people criticizing the picture that is only a schematic representation of what both you and I actually know is going on, I don’t think you really meant to say that “there is no finite solvable group of odd order”. Even without using Lean I think I can do that one 🙂

Liked by 1 person

4. Andre says:

When I first read this I mentally parsed your quotation as belonging to the Milne who has papers on Shimura varieties. I thought hearing that must have felt bad.

Like

5. Dear Kevin,

Together with colleagues, I recently created a machine learning dataset based on the Coq code which states and proves the Feit-Thompson theorem, and libraries related to that code (https://github.com/EngineeringSoftware/math-comp-corpus).

Can you elaborate more on why you believe our dataset is “deeply flawed” and “of rather less value” (than other datasets?) for AI research?

I acknowledge that our dataset has many limitations and does not contain results about schemes and other mathematical structures of interest to mathematicians, but I have a difficult time seeing how this in itself would warrant your conclusion.

Like

• xenaproject says:

Oh hi! Oh my goodness! I certainly do not mean to put down your work and I thank you for bringing this up!

I am just saying that if you want to teach a computer mathematics, you *surely* want a nice bunch of theorems which are *useful to know*. The Gonthier et al work contains many many theorems about a group which doesn’t exist, and they are all trivially implied by the last theorem in the work, which says it does not exist. This, *to me*, seems to indicate that it is waste of time teaching all these theorems to a computer, because the final theorem somehow makes every other one of the theorems about the object worthless.

This is *my interpretation* of that dataset for machine learning. But I do not know much about machine learning.

I would be _extremely_ interested to hear your views on how this simple issue, that the “heart” of the dataset is many theorems about an object which one can ultimately prove does not exist, affects what one can use the dataset for.

I would like to apologise, and stress that in no way do I want to put your work down.

I was trying to make the general remark that current datasets seem *to me* to be very poor for training computers to be good mathematicians. I personally have a problem with the general methodology used in essentially every machine learning study — “train on 80% of the theorems in the database and then try and prove the other 20% despite the fact that we already know how to prove them”. This for me *personally* seems to be a very “weak” goal, but this seems to be what essentially *every* ML person is doing in mathematics. I would rather see a machine training on *100%* of the theorems in a database, and then trying to prove new ones. Perhaps the situation right now is that ML people are not ready to tackle this problem. However this is what humans are doing right now.

I personally know of *no dataset* which looks appropriate, to me, for training a computer to be a good mathematician. You guys are forced to work with what you have, and I believe that it is the fault of the *mathematics* community for not giving you a better dataset. With Lean’s maths library we are attempting to do this, but given that we still do not have the contents of an undergraduate mathematics degree I think that this dataset is still very poor. The analogue would be IBM trying to train Deep Blue on a dataset of chess games by club players rather than a database of grandmaster games. You are trying to train on the analogue of *one* grandmaster game, and it’s a game which had very peculiar properties and is arguably not representative of chess games in general.

I would be very happy to make some space for you to respond to this more fully.

Liked by 1 person

• xenaproject says:

Karl, I have rewritten the paragraph, trying to make it clearer that I think it’s the “proper mathematicians” who are the problem. Let me know what you think. I am happy to change more if you like. I am most definitely not tying to criticise either the Gonthier work or the work of AI/machine learning people who are trying to use it. The problem is that we mathematicians didn’t give you a good database yet, and this is our fault. Beyond some point AI people just seem to be happy with any data at all, as long as there’s a lot of it. This is not what I feel about this subject at all. I (and many others) am working to try and make you a better database, in part by advertising the software we use to mathematicians.

Like

6. Enrico Tassi says:

> Hundreds of theorems, some with quite technical proofs, about a group which does not exist.

Kevin, the largest output of that Feit-Thompson project was not the theorem itself, but rather the Mathematical Components library it sits on (which does not focus on a group that does not exist).

For some numbers you can look at these slides: http://videos.rennes.inria.fr/Conference-ITP/slidesMathematicalComponents.pdf , in particular page 6 has a picture of a bookshelf, and the famous proof is in the last two volumes. Training AI on all but the last two books is surely relevant.

Liked by 1 person

• xenaproject says:

Yes I absolutely agree. The Gonthier project has taught us many things — not least that formalising hundreds of pages of research level mathematics is now within reach.

I think that your claim “training AI on seven MSc level books is surely relevant” is analogous to saying “training a chess computer on a few hundred games played between average players is surely relevant”. I think that the entire area of machine learning in mathematics is in its infancy, and right now the AI people are doing the best they can with what the formalising community has currently provided for them.

This statement is not supposed to be interpreted as a criticism of the AI community. They can only use what the formalisation community has put in front of them.

This statement is not supposed to be interpreted as a criticism of the output of the formalisation community. They are not primarily focussed on applications to AI.

I am merely pointing out that, according to Wikipedia, IBM used *700,000 grandmaster games* to train their evaluation function, and so perhaps we should be asking where the 700,000 grandmaster-level formalised mathematical theorems are that the AI people can use to train their database to make a computer mathematician.

And, because the people who understand the grandmaster mathematical theorems are typically the smart research mathematicians, this is why we need to concentrate on teaching people involved in modern mathematical research how to use these systems.

Liked by 1 person

7. A database of mathematics as a basis for an AI system to train itself is probably not a bad idea. However, it is noteworthy that AlphaZero now completely works from first principles when it plays chess. It only knows the rules of the game and then learns completely on its own. The reinforcement learning algorithms, coupled to deep neural networks for policy and evaluation make this possible. Perhaps the right approach for automation of mathematics would be the creation of a kind of foundational axiomatics that is better suited for the computer to start from. I envision a more computer friendly axiomatic system, formulated in a mixture of topos theory, categorical algebra and type theory (these theories are strongly related in any case) already capturing enough geometric structure to be rich enough for the computer to start from and from there develop its own mathematics. Combined with the right ideas how to evaluate non-triviality and value of results, among many other wide open technical problems to be solved, this could be a basis for a system build entirely on first principles.

Like

8. Thomas Eckl says:

Your description of how mathematical research proceeds these days has striking similarities to the workings of 19th century mathematics, as summarized by Frank Quinn in his AMS Notices article “A Revolution in Mathematics?”, https://www.ams.org/notices/201201/rtx120100031p.pdf : “Ambiguities and mistakes [resulting from the use of] intuitive formulations … [were] resolved … [by] testing against accepted conclusions, feedback from authorities and comparison with physical reality.” So now we have “formal” foundations, but following the set rules gets too complicated, so we rely – again – on authorities and their “pictures” (a “mental reality”?). I am afraid you may be too optimistic that this way, in the longer run the correctness or falseness of results will turn out; for example in Algebraic Geometry it took almost 30 years, from the 1950s to the 1980s, until all the claims of the Italian School from the beginning of the 20th century were properly assessed, and many of them refuted.
The main difference to the past may be that nowadays the authorities are no longer the giants of mathematics like Poincare and Klein, but the many experts of specific areas. If they want to use each other’s results they have to ask each other, but they have enormous difficulties to understand both the questions and the answers. The problem with the Annals papers is rooted in such a failure of communication: The first one heavily relies on analytic methods, whereas the second uses an algebraic approach, and it is almost impossible for both sides to evaluate the arguments of their counterparts. I know about these issues from first hand, as I was involved in rewriting another paper of one of the authors in more algebraic terms … And I can testify that the two communities did not get closer to each other in the past 20 years, even when they continued to work on exactly the same problems.
All in all, I would say that the system is not broken but just works less and less effectively. And maybe “formalizing” maths (that is, digitising it, as you point out) is the right way forward to rely better on results from areas outside your expertise, as formalizing mathematics 100 years ago made it easier for mathematicians to become an expert in any area.

Like

9. xenaproject says:

Now I have better understood the scale of the task before us, I am more reluctant to claim that formalisation will help to fix human mathematics in the short term. There is too much of it and it is growing at too fast a rate, as anyone who is signed up to the ArXiv algebraic geometry daily mailings will know. Some days I think of it as one of humanity’s greatest theoretical achievements, other days I think of it as a corrupted virus-ridden pool of semi-broken ideas many of which do not actually work, or cannot be made to work without extra information which is not publically documented and is at risk of being lost as people die. Computers cannot yet read human text, this is the main obstruction to the obvious solution, and although there are a bunch of groups working on this, I know from experience that even if I were to take a random perfectly-written paper in my area and formalise precisely the statements of the mathematical assertions occurring in it (including the claims in each line in the proofs), the computer is still a long way from being able to fill in the “It is now easily seen that…” parts, so then you have to rely on AI trying to fill in the gaps. This all seems to me to be a very long way away.

On the other hand I think it is completely reasonable to imagine that, given enough funding, we can make some kind of a tool which can try to be a useful aid to, say, PhD students working in algebraic geometry. Databases like the stacks project are essentially blueprints for formalisation, and even formalising enough of the definitions and statements of the theorems will give us a query-based tool which is way more powerful than googling the answer / paging through the textbooks. But you’re probably not going to get computer scientists making the databases because they can’t read the stacks project automatically and they are probably not experts in advanced algebraic geometry. Substantial input has to come from the mathematians, at least in 2021.

Like

10. Apurva says:

Another use of digitizing mathematics would be in education. Programming is (arguably) easier to learn than proof writing. Right now, when you learn to code in Lean, you learn just that. However, if we could create a theorem prover (possibly a wrapper around Lean) that can understand some form of a controlled natural language (instead of cryptic words like intro, exact, etc.) and is fun and easy to use, then it is not inconceivable that it could be used to teach proof-writing to non-mathematicians.

Liked by 1 person

• xenaproject says:

I could not agree more about the applications to education. Right now, my experiments seem to indicate that it is still too hard, but this is because I am teaching my undergraduates the mathematics and how to use Lean at the same time, and this is just too much for most people. Right now because I’m teaching PhD students undergraduate mathematics, I can just concentrate on showing them how to do the mathematics in Lean, and then all the intro exact apply stuff is perhaps not too much to ask.

Like

11. “Humans are fallable” should read “Humans are fallible”

If intended, your point extremely well made. And if not, leave as is.

Like

• xenaproject says:

Ha ha! It was not intended, but I like it so much that I’ll leave it 😀

Like

12. Bernard Tatin says:

I love your text because I bought my first CD player in 1990, and I thought as you at that time. Today, we share the same analysis about the fact that “Digitising music was a game-changer”. And I love your idea to prepare the next generations to digitise mathematics. In early 80’s, my generation was not prepared to the explosion of personal computing, we invented it, with a lot of flaws.

Like