Lean is better for proper maths than all the other theorem provers

I apologise for the moronic title. I have learnt over the past year or so that the clever thing to do is to have stupid click-baity titles. That’s not the real title. Maybe a more sensible title is:

Checking “proper maths” on a computer. Which computer prover can do it best?

But actually even that is not a great title: I have had people expressing their concern with my use of the phrase “proper maths” — suggesting that I am perhaps implying that the kind of mathematics that some people do is not “valid” in some way. I am actually using the phrase with my British tongue firmly in my cheek, but I appreciate that this is not always clear so I should be careful.

I could just go with something completely well-defined:

Checking EGA IV sections 8 to 11. Which computer provers can be made to do this?

Grothendieck’s EGA IV sections 8 to 11 is a great example of a “magic wand” in mathematics. It means that you can say “WLOG everything is Noetherian” and press on, the moment you realise you need a finiteness condition to proceed. You might even be able to reduce to the excellent case. Plenty of experts in arithmetic geometry have read the proofs, although the details are quite dry. I myself have bought the book, which of course is the next best thing. I have no idea whether young people have read the proofs. Some of the earlier ones are quite easy, they rely on little more than the observation that a finitely-presented map is the pull-back of a finitely-presented map between finitely-generated (and hence Noetherian, and also excellent) rings. But later on it gets thorny — the theorem which descends the property of flatness (Theorem 11.2.6) is hard won — but also very very useful. It should really be a tactic. Brian Conrad tells me that knowing the proofs (and not just the statements) enabled him to give alternative proofs of the algebro-geometric results in appendix A of Breuillard–Green–Tao.

This challenge is still perhaps too hard. But the correct challenge is much harder. The correct title is something like

Understanding the questions of modern mathematics. Which computer provers can be made to do this?

Think of any piece of what my “proper maths” colleagues would call respectable mathematics, e.g. formalising the definition of an algebraic stack, formalising the statement of the Poincaré conjecture, stating the Riemann Hypothesis. Formalise that in your prover, and then formalise a whole bunch of other pieces of respectable mathematics, and keep going until you have learnt what the limits of your prover are when it comes to modern mathematics. All these systems (Lean, Coq, Isabelle/HOL, Mizar, the other HOL systems, UniMath, Arend, and all the others): which of them can understand even the statements of modern mathematics? Can any of them? This is important to know! I’m not worried about the proofs; these systems are still missing definitions. Which of these systems can actually keep up with the “proper mathematicians”? This question is still really poorly understood, because (the penny has finally dropped) many of the users of these computer proof systems are simply not interested in this kind of mathematics. They may want to check that computer code has no bugs, for example, or do things which mathematicians were doing hundreds of years ago. This is why we have no real understanding of which systems mathematicians should be taking seriously. An MSc student at Imperial, Ramon Fernandez Mir, formalised the definition of schemes in Lean (pdf) (following work of myself, Hughes and Lau) but when we then looked to the other provers to see how they had done it, it turned out that none of them had. The communities are just doing other things.

Freek Wiedijk keeps track of 100 theorems to be formalised — but 95 of them are done now and FLT is just silly. We need new challenges. Here are ten off the top of my head:

  1. Formalise the statement of the Riemann Hypothesis.
  2. Formalise the statement of the Poincaré conjecture.
  3. Formalise the definition of an algebraic stack.
  4. Formalise the definition of a reductive algebraic group.
  5. Formalise the definition of an adic space.
  6. State Deligne’s theorem attaching a Galois representation to a weight k eigenform.
  7. Do the sheaf-gluing exercise in Hartshorne (chapter 2, exercise 1.22).
  8. Prove sphere eversion.
  9. Do exercise 1.1.i in Elements of infinity-category theory by Riehl and Verity (note that infinity categories are used in section 5 of Scholze’s new paper with Cesnavicius so they’re probably here to stay).
  10. Define singular cohomology of a topological space.

Some are easy, some are hard, I just made the list up on the spot and tomorrow I could come up with ten different things. The list is also to a certain extent a reflection of my interests. I guess 1 is done in Isabelle and perhaps Coq, 2 and 3 are not done in anything as far as I know, although hopefully 2 can be done in Lean soon; 4 may be done in some system but it’s not the sort of thing computer people are typically interested in, 5 is done in Lean, 6 is I think a million miles away from anything in any theorem prover, 7 Kenny Lau did in Lean but it’s not in the maths library, it’s just bitrotting, 8 Patrick Massot mentioned in his talk in Pittsburgh, 9 because we have to start somewhere with infinity categories, and 10 because it is really hard to find any “proper” examples of any cohomology theories at all in any of the theorem provers. I have an MSc student, Shenyang Wu, working on group cohomology in Lean, he’s just defined cocycles and coboundaries this week, and it has been an interesting challenge. But unlike Freek’s list, I don’t think it’s good enough to get all ten done by the union of the theorem provers. The question is: which one prover can do all ten? I think all ten are within reach of Lean, even though it will take quite some time before we get there. But I am not convinced that the mathematical community in general is going to be taking any of these systems seriously until we show them that we can actually express modern mathematics like this in these systems. It is only when this starts to happen that mathematicians will be tempted in. And computer scientists do not seem to be motivated by these questions in general. That’s one reason that the area is moving far too slowly for my liking.

How many of these ten things can your favourite system do? Don’t you think it’s important that we find out the limits of your favourite system when it comes to modern mathematics? I’m not saying these things will be easy in Lean, but Lean’s maths (and more) library mathlib is somehow slowly steamrollering towards all of them in this slightly crazy/interesting way that open source projects work. The reason I’m backing Lean is that when it comes to the kind of questions like those above — accessible questions for theorem provers as they stand currently — I have seen Lean go from nothing to beginning to devour such questions, extremely quickly.

I am aware that many current users of these systems might not be interested in these questions. This is why we need new users. This is why we need documentation for mathematicians for all of these systems. Computer proof verification people — write documentation for mathematicians. Help us to learn your systems!

Mathematicians and theorem provers.

I have now watched so many mathematicians struggling with Lean, and sometimes not struggling with Lean, and it is now becoming clear to me what mathematicians can be expected to pick up quickly and what they will struggle with. One thing I have learnt about mathematics in the last two and a half years is that the formal mathematics that mathematicians do can be broken up into three kinds:

  1. Definitions. Things like the real numbers, or specific real numbers like pi. Also, abstract definitions like rings, homomorphisms of rings, schemes, and so on.
  2. Theorem statements. Things like Fermat’s Last Theorem, the irrationality of the square root of 2, the Birch and Swinnerton-Dyer conjecture, and so on. Just the statements.
  3. Theorem proofs. Proofs of the theorem statements.

All of these things of course have counterparts in the computer proof systems. Let me say something about what these things look like in a computer proof system. I’ll go through them backwards.

Theorem Proofs: These are the same as computer games. This is the mathematician’s favourite part. The natural number game is what happens if you take all the definitions and theorem statements out of the hands of the user, and just ask them to fill in the proofs, and give them enough of an API to make it fun. Bhavik Mehta made an impressive combinatorics repo and he told me that he never once had to think about how finite sets were actually implemented in Lean — he could just use the interface. Note that if you find yourself “grinding” (to use a computer game phrase), doing the same sort of stuff over and over again because you need to do it to make progress, then you can try to persuade a computer scientist to write a tactic to do it for you (or even write your own tactic if you’re brave enough to write meta Lean code).
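To give a flavour of what such a level looks like, here is a minimal sketch in Lean 3 using only the core library. The name my_zero_add is hypothetical, chosen for this illustration, and the proof is one of many possible routes rather than the official solution to any level of the natural number game.

```lean
-- Lean 3, core library only: no imports needed.
-- `my_zero_add` is a hypothetical name chosen for this sketch.
theorem my_zero_add (n : ℕ) : 0 + n = n :=
begin
  induction n with d hd,
  { refl },                   -- base case: 0 + 0 = 0 holds by computation
  { rw [nat.add_succ, hd] },  -- inductive step: push `succ` outwards, then use `hd`
end
```

The statement and the definition of addition are given to the player; all that is left is the game of choosing which tactic to apply next.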

But I have been formalising my undergraduate algebraic geometry course in Lean and occasionally running into missing API. This is just like some level in a computer game when you have to battle a big boss but you haven’t got the right equipment to make it easy, so you have to spend a lot of time hiding in corners and occasionally getting a hit in. For example I wanted the trivial fact today that if A, B and C are k-algebras, and A → B is a k-algebra surjection with kernel I, and A → C is a k-algebra homomorphism with kernel containing I, then there’s a k-algebra homomorphism B → C making the diagram commute. But this fact isn’t there, because the k-algebra API is not yet mature. This is frustrating. We will come back to this.
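For concreteness, the statement of that fact might be written in Lean 3 roughly as follows. This is only a sketch under assumptions: the import path is a guess that depends on the mathlib version (the file has moved over time), the hypothesis names are made up for this illustration, and the proof is deliberately left as sorry, since the point is what the missing lemma would say.

```lean
import algebra.algebra.basic  -- assumed import path; it differs between mathlib versions

-- Statement only: `sorry` marks the proof that the API should supply.
example (k A B C : Type*)
  [comm_ring k] [comm_ring A] [comm_ring B] [comm_ring C]
  [algebra k A] [algebra k B] [algebra k C]
  (f : A →ₐ[k] B) (hf : function.surjective f)       -- the surjection A → B
  (g : A →ₐ[k] C) (hg : ∀ a : A, f a = 0 → g a = 0)  -- ker f is contained in ker g
  : ∃ φ : B →ₐ[k] C, φ.comp f = g :=                 -- the triangle commutes
sorry
```

Here the ideal I never appears explicitly: saying that the kernel of A → C contains the kernel of A → B is phrased directly as a condition on elements.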

Theorem Statements: This is basically a simple translation problem, if all the definitions are there. For example if you want to say that a compact metric space is complete in Lean you just need to know how the people designing Lean’s maths library have formalised the concepts of metric space, compactness and completeness. Anyone who knows the relevant part of Lean’s maths library can do this, mathematician or computer scientist. Basically you have to know whether what you’re talking about is a predicate or a typeclass. For example, the line example (X : Type*) [metric_space X] [compact_space X] : complete_space X is how one would say that a compact metric space is complete in Lean, and the proof is by apply_instance.
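Laid out as it would appear in a Lean 3 file, the statement and its one-line proof look like this. The import line is an assumption about where mathlib keeps metric spaces and may need adjusting for a given mathlib version.

```lean
import topology.metric_space.basic  -- assumed mathlib (Lean 3) import path

-- "A compact metric space is complete."
-- All three concepts are typeclasses, so the proof is found by instance search.
example (X : Type*) [metric_space X] [compact_space X] : complete_space X :=
by apply_instance
```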

Of course one problem with theorem statements is that if the definitions are not there, you can’t make the statements. For example, as far as I know none of the systems can even state the theorem that the class group of a number field is finite, because although all the systems have groups, and the concept of finiteness of a group, none of them have the class group of a number field. This is completely standard “proper” undergraduate level mathematics and it just isn’t there in any of the systems, at least as far as I know. So of course this brings us on to

Definitions. This is the hard part. One reason it’s the hard part is that mathematicians are really good at defining a mathematical object to be three things at once, and then saying that a bunch of things are “true by definition”. An actual formalised definition is one thing, and then all the other ways of looking at it might correspond to little functions which pass from one type to another (perhaps equivalent) type. These little functions are of no importance to mathematicians, who cannot tell the difference between a subset and a subtype, or between the integer 3 and the real number 3, or between k^n and affine n-space, or between R[1/f][1/g] and R[1/fg], because in all these cases the two concepts are modelled in the same way in our brain. But the differences are unfortunately of great importance to computer scientists. The job of the Lean programmer is to take a concept such as the real number 3 or an equivalence class of valuations or a perfectoid space, and to turn that concept into a string of 0’s and 1’s in a computer file in such a way that this string of 0’s and 1’s can be manipulated in the way a mathematician wants to manipulate it. This is where set theory/type theory comes in. This is like a programming language, saving us from having to actually play with 0’s and 1’s, or write machine code or whatever. We can turn a mathematical object either into a set or into a type, and then let the program do the rest of the job of turning it into 0’s and 1’s.

A group G really is a four-tuple consisting of a set (also called G), a multiplication, inverse and identity, and some axioms. Or maybe just the multiplication, because we can work out the rest from the axioms. Mathematicians move freely from the group G to the set G, but to the computer scientists this might be a function application (or it might not, depending on how you set things up). An R-algebra homomorphism is also a group homomorphism and a map between sets. These things are not actually the same thing in some sense, however they are confused by mathematicians and this confusion causes no problems for us at all. Computer scientists on the other hand have got to fix one definition of an R-algebra homomorphism and then get some function coercion system going before they can get this kind of thing working seamlessly. There is this whole host of tedious implementation issues in which I really do not feel like an expert. Furthermore, this part is the hardest for mathematics undergraduates to learn, I think. Whether to make something a predicate or a typeclass, to extend another class or not, to use the old structure command or not: these are delicate questions in Lean and highly system-dependent in the sense that the same sorts of questions will come up in the other systems and the answers might be radically different. Our proofs are their programs, and if our definitions are bad then their programs will take a long time to run, their prover will become sluggish, and nobody likes lag in a computer game. Definitions are hard, and the computer scientists don’t even always know the right way to do them; sometimes I have been told to “try both ways and see what works best”. These fundamental problems do not seem to have been solved yet, and probably the best solution depends on which prover you’re using.
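Two of the “invisible” functions mentioned above can be made visible in a few lines of Lean 3. This is a minimal sketch, assuming mathlib’s data.real.basic is available; the name even_set is hypothetical and exists only for this illustration.

```lean
import data.real.basic

-- The integer 3 and the real number 3 live in different types.
-- The type ascription inserts the coercion ℤ → ℝ that mathematicians never write.
example : ((3 : ℤ) : ℝ) = 3 := by norm_num

-- A subset of ℕ is a term of type `set ℕ` ...
def even_set : set ℕ := {n | n % 2 = 0}

-- ... whereas the corresponding subtype `{n : ℕ // n % 2 = 0}` is a type in its own right.
-- Getting back to ℕ from the subtype is a genuine function application.
example (x : {n : ℕ // n % 2 = 0}) : ℕ := x.val

-- An element of the subtype carries a proof that its value lies in the subset.
example (x : {n : ℕ // n % 2 = 0}) : x.val ∈ even_set := x.property
```

Whether such a function is written explicitly, inserted automatically as a coercion, or avoided altogether by a different choice of definition is exactly the kind of design decision described above.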

What is worse about definitions is that you can’t just make the definition — you have to make the API too. When I started with Lean, there were no complex numbers. It was a real freebie — the complex numbers could be defined as the type of ordered pairs of real numbers. I made the definition! And then I was told that I hadn’t finished yet — I needed to prove that they were a field, at the very least. I needed to define i, complex conjugation, the norm, and ideally polar coordinates as well, although this would need trigonometric functions and we didn’t have those at the time either. This sort of API-making is hard work and not particularly well-defined — you think you’ve done a solid job and then someone comes along and tries to use your definition and all of a sudden they are asking you to make ten more definitions or theorems, including perhaps some theorems which are “trivial in mathematics” and hence cannot be proved by mathematicians because the mathematicians have no clue what needs to be proved. For example after making the complex numbers, one has to prove things like: two complex numbers with the same real and imaginary parts are equal. The proof of this depends very much on the implementation — it might be true by definition, it might be true because of a lemma, or actually it might either be true by definition or true because of a lemma depending on the precise definition of how the assertion is phrased in the system. Mathematicians are not so well-placed to deal with these issues. Computer scientists moreover want these hidden little functions which we don’t notice to be fast and efficient, and this introduces some extra layer of run time analysis or whatever which is just a million miles from the mathematician’s model of formal mathematics — our model seems to me to run at infinite speed.
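The following Lean 3 sketch shows the shape of the problem. It is not how mathlib’s complex numbers are set up in detail: the name my_complex is hypothetical, chosen to avoid clashing with the real thing, and only a tiny fragment of the API is shown.

```lean
import data.real.basic

noncomputable theory  -- harmless if unnecessary; real-number definitions often need it

-- The definition itself: an ordered pair of real numbers.
structure my_complex : Type :=
(re : ℝ) (im : ℝ)

namespace my_complex

/-- The imaginary unit. -/
def I : my_complex := ⟨0, 1⟩

/-- Complex conjugation. -/
def conj (z : my_complex) : my_complex := ⟨z.re, -z.im⟩

/-- "Trivial in mathematics": equal real and imaginary parts give equal numbers.
For this implementation the proof is by taking both numbers apart. -/
theorem ext : ∀ {z w : my_complex}, z.re = w.re → z.im = w.im → z = w
| ⟨zr, zi⟩ ⟨_, _⟩ rfl rfl := rfl

/-- A first piece of API built on `ext`. -/
theorem conj_conj (z : my_complex) : conj (conj z) = z :=
ext rfl (neg_neg z.im)

end my_complex
```

Even at this size the pattern is visible: the structure is two lines, and everything after it is API which a user will expect to find already there. The field structure, the norm and polar coordinates are all still missing.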

Finally, definitions might need theorems. To define the class group of a number field, it’s not at all hard to make the definition of the underlying set, but proving that it is a group has some content. If the mathematician cannot make the definition because they don’t understand the subtleties of what the programmer needs for the definition to run efficiently, but the computer programmer does not know the proofs of the theorems needed to make the definition, then we are at an impasse. This is one of the situations where the Zulip Lean chat room comes into its own — mathematicians come with questions about how to do things, and computer scientists learn enough about the question to start to be able to interpret it in their own terms and make useful comments about implementation issues. I have seen this happen time and time again.

Of course there is one thing I have forgotten, which mathematicians do: they have ideas and furthermore they reuse ideas which they or others have had. This part is the hardest to formalise. However I think it has something to do with tactics. I have realised that certain quite complex tactics seem to do things which human mathematicians find “trivial”. Lean now has several tactics which do what a “proper mathematician” would describe as “follow your nose”, or “exercise for the reader”. As we go on, and isolate more sophisticated “ideas” or “principles” or whatever they are, will we be able to articulate these ideas in our favourite theorem prover? I don’t really see why not. It’s just hard to do right now because so much API and so many definitions are missing in all the provers.
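As a small illustration of the “exercise for the reader” flavour, here are two goals closed by a single tactic each in Lean 3 with mathlib. The choice of ring and linarith is mine, made for this sketch; they are examples of such tactics, not a list of the ones referred to above.

```lean
import tactic  -- mathlib's bundle of standard tactics (Lean 3)

-- An identity in a commutative ring: `ring` does the expanding and collecting.
example (a b : ℤ) : (a + b)^2 = a^2 + 2*a*b + b^2 := by ring

-- A consequence of linear inequalities: `linarith` finds the combination.
example (x y : ℤ) (h₁ : x < y) (h₂ : 0 < x) : 0 < y := by linarith
```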


I think we need to try to do all the maths, in all the computer proof systems. This is a simple way to find out where the weak points of the systems are when it comes to modern mathematics. We don’t have to worry that the proof of FLT is inaccessible. There is so much accessible mathematics to do. But each system has its own quirks when implementing new concepts and definitions. Mathematicians cannot be expected to choose the best implementations of a mathematical concept unless they have been trained by computer scientists. The Lean chat room fulfils a very important role here, helping mathematicians to get definitions right in Lean. The question is far more delicate than a mathematician might think. Which system is best for understanding the statements of modern research mathematics? Nobody knows. This has to change. Let’s find out.

About xenaproject

The Xena Project aims to get mathematics undergraduates (at Imperial College and beyond) trained in the art of formalising mathematics on a computer. Why? Because I have this feeling that digitising mathematics will be really important one day.