In December 2020, Fields Medallist Peter Scholze challenged the formal computer proof community to verify a theorem of his and Clausen's. By June 2021 a key technical lemma had been formalised by the Lean community and there was a certain amount of media attention (e.g. here and here), to the extent that many people assumed that the challenge was over. It was not. There was the small matter of developing the theory of abelian categories, derived functors, Ext groups and various other standard tools in research mathematics. This took another year. The project was finished at the Lean for the Curious Mathematician workshop at ICERM, where Johan Commelin got the formalisation of the full Clausen–Scholze result compiling. Adam Topaz was another big player here. See the Lean community blog post announcing the result for more details, and a follow-up post discussing the question of how someone who doesn’t want to read much Lean code might attempt to check that the theorem has been correctly stated. There is also the blueprint, which describes the precise mathematical details of the route to the goal.

People who know some mathematics but have not used proof assistants might have a mental model of the following form: “discrete stuff should be easy to formalise, continuous stuff is probably harder because there are pictures involved”. Patrick Massot, Floris van Doorn and Oliver Nash ably demonstrated that continuous picture-based mathematics — in this case differential topology — was now in scope for theorem provers, with their paper “Formalising the h-principle and sphere eversion”. The work formalises a modern proof of an instance of Gromov’s h-principle, and deduces Smale’s result that you can turn a sphere inside-out without creasing or snapping it, if you allow the sphere to pass through itself. The work also comes with a blueprint which explains the mathematical route the authors took. The work was accepted for publication at CPP 2023 and there will be a talk on the project on 16th January 2023 at CPP.

In late 2021, Thomas Bloom announced a proof of an old conjecture of Erdős and Graham on unit fractions: any positive density subset of the positive integers contains a finite subset the sum of whose reciprocals is 1. This result still surprises me — for example can you *really* find a finite set of distinct positive integers all of which end in 7 and such that the sum of their reciprocals is 1? Apparently you can. Bhavik Mehta and Bloom started work on a full formalisation of Bloom’s result (including the prerequisites, such as an instance of the Hardy–Littlewood circle method). They finished in June 2022, before Bloom had received a referee’s report for the paper. Here’s a link to the blueprint.

Fontaine kick-started modern p-adic Hodge theory with his observation that certain p-adic period rings played a key role. María Inés de Frutos Fernández has developed enough of the theory of local fields to define the first two rings in the theory (`ℂₚ` and a further period ring), together with the Galois action. An attack on local class field theory is now viable.

In 2021, Gardam disproved an old conjecture of Kaplansky. In 2022 Anand Rao Tadipatri and Siddhartha Gadgil formalised Gardam’s disproof in Lean (and they’re speaking about the work in the London Learning Lean Zoom seminar on Feb 9th), but in contrast to the three projects above, they used Lean 4, meaning that they had no access to Lean 3’s mathematics library mathlib. Talking of which…

The thing powering mathematics in Lean is the Lean 3 mathematics library mathlib. The problem is that we now have a viable Lean 4, so the library needs to be ported, and Lean 4 is not backwards compatible with Lean 3; it’s *better*. A first pass at the problem was made by computer but it seems that we need humans to finish it off. The port started in earnest in December, and right now we are 12 percent of the way through. Will it get harder or easier as we proceed? Nobody knows! That’s one of the fun things about doing research. Anyone interested in learning Lean 4 by porting some basic mathematics from Lean 3 (finite sets, subgroups, topology etc all still need to be done) is welcome to take a look at the mathlib 4 porting wiki. With the advent of cached compiled olean files the job of porting got much easier in the last week or so, as you no longer have to compile anyone else’s branch manually. I am very much hoping that the port will be complete by the summer, and the way to make this happen is to get more people on board. Please consider joining us if you want to make a contribution to open source software in mathematics!
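For a flavour of what porting involves, here is a toy lemma in Lean 3 `mathlib` style next to a plausible Lean 4 `mathlib4` rendering (two separate snippets, shown together only for comparison; the real port is started by an automated tool and then fixed up by hand, and naming conventions may differ in detail):

```lean
-- Lean 3 (mathlib):
lemma mul_comm' (a b : ℕ) : a * b = b * a :=
begin
  exact nat.mul_comm a b,
end

-- Lean 4 (mathlib4): `begin … end` blocks become `by`, and declaration
-- names are upper-camel-cased, e.g. `nat.mul_comm` becomes `Nat.mul_comm`.
theorem mul_comm' (a b : ℕ) : a * b = b * a := by
  exact Nat.mul_comm a b
```

Much of the porting work is mechanical changes of this kind, which is why the automated first pass gets so far; the remaining fraction is where the humans come in.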

In December 2020, Fields Medallist Peter Scholze challenged the computer proof community to formally verify an important new theorem of his and Dustin Clausen's. Scholze issued the challenge here, and the challenge itself is now published in Experimental Mathematics, one of the very few mathematical journals currently showing an interest in the area of formalisation. The name of the challenge was the Liquid Tensor Experiment (LTE), and yes, it was a homage to Liquid Tension Experiment, a band which Scholze is fond of. The theorem behind the challenge is a highly technical result, and I will not go into the details; check out the links I just mentioned for more on the mathematics involved. It would not be unreasonable to describe the theorem as a complex result about complex mathematical objects.

I have a bucket list of milestones for the area of mathematical formalisation, and formalising a complex theorem about complex objects was on this list. Why? One could argue that many of the major computer formalisations (the Kepler conjecture, four colour theorem and odd order theorem) were complex theorems about “simple objects” (spheres, planar graphs and finite groups, respectively). A formalisation which should not be placed in the same category as any of these milestone results, but which could perhaps be regarded as complementary to these major achievements, is my formalisation with Commelin and Massot of the definition and very basic properties of perfectoid spaces — this amounted to very trivial theorems about complex objects. For me, the formalisation of a proof of a complex theorem about complex objects was a missing part of the puzzle.

But not any more, because Johan Commelin, assisted by Adam Topaz and a team of other mathematicians, announced earlier this month that their Lean formalisation of Scholze’s challenge was complete. The final touches of the formalisation were put together at ICERM in Providence RI, USA, during the Lean for the Curious Mathematician 2022 workshop in July. These workshops are for working mathematicians with little or no Lean experience and who are interested in finding out what’s going on and how to get involved. If this sounds like you, then the videos for the ICERM workshop are now online at the above link. If you’re interested in seeing Commelin’s announcement of the completion of the project, it’s just over 40 minutes into my talk on the status of algebraic geometry in Lean.

The reference for the proof which we followed was Scholze’s lectures on analytic geometry, the notes for a course he gave in 2019/20 in Bonn. The theorem we formalised was Theorem 9.1 of these notes (see also Remark 8.15). On p59 of the pdf there is a five line argument reducing Theorem 9.1 to Theorem 9.4, and then the next six pages contain a proof of Theorem 9.4. I’ve claimed above that Theorem 9.1 is a complex theorem about complex objects; however Theorem 9.4 is a more low-level statement, which unravels to a technically messy but essentially completely low-level claim. It took around one month to formalise the *statement* of Theorem 9.4, and a further five months to formalise its proof. This averages out at about one page of mathematics per month. You can see a map of the formalisation of the proof of Theorem 9.4 here. Clicking on a rectangle (definition) or an oval (theorem) will pop up a LaTeX statement of the result and will offer you a link to the corresponding Lean code. We finished the proof of 9.4 about a year ago, and to my mild surprise the media got interested; we were even featured in Nature. What probably caused the interest was that one of Scholze’s motivations to encourage a formalisation was that he was not convinced that too many humans had, or would, plough through the proof of Theorem 9.4; the formalisation gave him the assurance that his work was correct.

All that remained for the challenge, then, was to formalise the five lines in the paper which showed that Theorem 9.4 implied Theorem 9.1. I’ll reproduce them here:

“By the preceding discussion, one can compute as the derived inverse limit of over all ; equivalently, all . Theorem 9.4 implies that

for any the pro-system of cohomology groups is pro-zero (as is zero). Thus, the derived inverse limit vanishes, as desired.”

These five lines took a further year to formalise, and that’s what has just been completed. The reason it took so long was that the Lean community had to build a working theory of abelian categories, cohomology, Ext groups, derived functors etc etc as prerequisites, and whilst some of these subjects had been approached before in theorem provers, we had to get everything formalised in one system at once and everything playing well with each other. It is worth noting however that even though I’m claiming that the proof of the theorem is “complex”, the actual length of the paper proof is nowhere near as long as, for example, the proof of the Kepler conjecture. However I do regard the result as an indication that we’re going in the right direction, pushing these systems to do things they’ve never done before (although of course I was involved in the formalisation, so one should bear this in mind when evaluating my claims here).

There’s something called the “de Bruijn factor” of a formalisation, which is the ratio of the number of lines of computer code to the number of lines of human mathematics. One major goal in the area of computer formalisation is to “get the de Bruijn factor down to 1”, that is, to make formalising mathematics in a computer no more difficult than formalising it in LaTeX. However the Commelin–Topaz–… formalisation of the Liquid Tensor Experiment seems to make a mockery of this concept, because around 30,000 lines of code went into formalising the five page proof of Theorem 9.4, and around 60,000 lines of code went into formalising the five lines above. At the very least it indicates that humans can squeeze a huge amount of mathematics into one line of text. Had the required basics (abelian categories etc) been in Lean’s mathematics library `mathlib` already, the formalisation would have taken far less time and far fewer lines. One thing we did benefit from was Scott Morrison’s work — Scott has built a huge amount of category theory in `mathlib` over the previous six years, providing us with a stable foundation to build on.
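To make the scale concrete, here is the back-of-envelope arithmetic for the two stages, using the line counts quoted above (the "lines per LaTeX page" figure is my assumption, not from the project):

```python
# Rough de Bruijn factors for the two stages of the Liquid Tensor Experiment.
# Code line counts (30,000 and 60,000) are the figures quoted in the post;
# the density of a printed page is an assumed 40 lines.
lines_per_page = 40  # assumption

stage1 = 30_000 / (5 * lines_per_page)  # the ~5-page proof of Theorem 9.4
stage2 = 60_000 / 5                     # the 5-line reduction of 9.1 to 9.4

print(stage1)  # → 150.0
print(stage2)  # → 12000.0
```

So the factor for the "five lines" stage is two orders of magnitude worse than for the technical heart of the proof: almost all of that effort went into missing library prerequisites rather than the argument itself.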

One question which a professional mathematician with no Lean experience could ask when trying to evaluate the work is: how can one be sure that the Lean formalisation of the challenge actually corresponds to the mathematical version of the challenge? For example, how can we tell that the symbol `ℝ` in the Lean code actually corresponds to the real numbers, and that there is no cheating in the background? Commelin and Topaz have attempted to deal with this issue by making an examples directory in the repository, where they prove basic theorems about the objects involved (for example they show that `ℝ` is a complete ordered field and that any complete ordered field is uniquely isomorphic to `ℝ`, as evidence that Lean’s `ℝ` really does correspond to the real numbers and there is no trickery going on). They also created evidence for Lean’s Ext groups, profinite spaces, condensed abelian groups and so on; if you understand the mathematical definitions of these objects then you can see an indication that the project’s definitions coincide with the usual mathematical definitions, even if you cannot directly read the Lean code.

In a second blog post, Peter Scholze explains two quite interesting things that we learnt during the process. Firstly, we learnt that the stable homotopy theory used in the original proof (via the Breen–Deligne resolution) can be eliminated. The way this was discovered was quite interesting: the formalisers had been putting off formalising this part of the proof, but we were forced to start formalising the *statements* of what we needed, and once these were clarified Commelin and Scholze realised that they could get away with a weaker theorem. Initially we were calling the trick the “Commelin complex”, but later on we discovered that the construction (though not the theorem we needed about it) had been made by Mac Lane many decades ago. Secondly, Scholze seemed to learn something conceptual about the proof — he described it as “what makes the proof work”. You can see his thoughts on the matter at this mathoverflow question. But in some sense this is old news (the second blog post was published over a year ago). What else did we learn?

In my mind, one of the most important things we learnt is that if you get a group of mathematicians who are focussed on formalising an arbitrary piece of modern mathematics, then it can get done, however complex the theory is. In other words, we have a substantial piece of evidence showing that the software that computer scientists have developed for doing this task has now “caught up with humans”. One could regard this as a triumph. One could also regard it as a pretty weak statement. The computer did not “do” the mathematics — the humans did it; first Scholze and Clausen did it, and then Scholze wrote the argument down (in pdf notes for the course in Bonn, linked to above) in a form suitable for mathematicians, and then Commelin and Massot wrote a more detailed mathematical blueprint in a “hybrid” part-Lean part-LaTeX format (you can view the blueprint here; it goes into much of the detail of the proof), and then finally a team of people transcribed what they had written from a human language into a computer language. One thing we learn is that the computer language (Lean’s dependent type theory, in this case) is rich and expressive enough to be able to formalise recent and complex work of a Fields Medallist. One could ask whether it is feasible to formalise this work in a weaker logic, for example in Higher Order Logic, the logic used by Isabelle/HOL and the other HOL systems. I don’t think anyone actually knows the answer to this question; I have conjectured that the answer is “no” but there are expert HOL users who seem to conjecture that the answer is “yes”. My real fear is that we will never find out. Manuel Eberl is making fabulous progress formalising what I would call “standard MSc mathematics” such as the basic analytic theory of modular forms in a HOL system, but what we are missing is a HOL evangelist from the mathematical community (a kind of HOL version of me) and I don’t know how to create such a person.
Indeed in some sense I am making the problem harder, because my constant evangelism for Lean to mathematicians has the consequence that research mathematicians who get interested in formalisation tend to come to Lean first, and the situation is becoming more and more like a monopoly. There are plus and minus points to this; one minus point is that nobody seems to be working on getting HOL to do things that “it wasn’t designed to do” (for example a modern treatment of sheaf cohomology, etale cohomology etc via derived functors).

We also learn that formalisation can be, at times, very slow going. However one should not lose faith here. The reason it took 1 year to formalise five lines of mathematics was that a huge amount of theory-building had to be embarked upon beforehand. That theory only needs to be built once. Lean now has a working theory of short and long exact sequences, derived functors, abelian categories, complexes and so on; furthermore there is evidence that the formalisation of these definitions is *usable in practice*, because we used them. Indeed this is another thing that we are learning in the Lean community; instead of thinking “what MSc course shall we formalise next” we are thinking “what major result shall we start working towards” because this not only drives development but it ensures that developments of theories are actually usable.

We learn that formalisation can be parallelised, although we already knew this long ago from the work of Gonthier et al on the odd order theorem, and the work of Hales et al on the Kepler conjecture (both papers have 10+ authors). Different people can be put to work on different areas of a formalisation project and do not even have to know about, or understand, what others are working on. The forthcoming paper on the formalisation will have a nice long list of authors, and this is something I’m very happy about; furthermore anyone interested in seeing the precise details of what everyone did can just look at the list of commits and committers on github — the details are completely open to everybody. This is open source mathematics.

So where are we going and what is the point? These are very reasonable questions. One possible reaction that a mathematician might have to the work is that it is pointless, because it is a computer verification of a result which was “basically known to be true anyway”. This is missing the point. Why are we teaching computers modern mathematics? Because I believe that one day future computer systems will be *helping* us, and whilst I cannot see into the future, consider the following question. Scholze and Clausen are arguing that condensed abelian groups are an important new definition in mathematics. How can AI developers create a chatbot which helps graduate students to learn about the category of condensed abelian groups, until professional mathematicians have taught computers what a condensed abelian group *is*? I know that some people in the machine learning crowd would answer things like “use language models, train on the papers about condensed abelian groups” but I am not yet convinced that this alone will turn into a useful tool. Lean (and other theorem provers) offer a framework where one can create a library of useful theorems, examples, and counterexamples. *We do not even have to type in all the proofs*. If “we” (the mathematical community, for example PhD students in arithmetic geometry) start translating important statements, examples, counterexamples, references and so on, and somehow combine this with language models, we will at least make some kind of interesting thing, which I imagine will get more helpful to humans over time. Some dream of a time where computers are better at proving theorems than humans. For me this is still complete science fiction; I have seen no evidence that we are anywhere near this milestone, and have no real vision as to how we are ever going to get there. 
But one can instead dream of a time where computers are *helping* humans to do research level mathematics, and work like this makes me believe that this goal is something I might see in my lifetime.

The history of interactive theorem proving can be told as a story of allowing the user to interact with the system at gradually higher levels of abstraction, getting further away from the axioms and closer to informal mathematics. Automath, the first interactive prover, floated very close to the axioms indeed. Mizar was the first attempt to make formal proofs comprehensible to mainstream mathematicians. Contemporary systems such as Coq and Isabelle add many useful features such as type classes and tactics, and most users don’t have to think about what the axioms even are. Today, there is still a learning curve to Lean, but once one learns the syntax, mathlib proofs are not terribly far from an informal version—at least compared to Automath. If we extrapolate this trend to its logical conclusion, we might imagine an interactive theorem prover where every single line of code is hidden behind the interface, and the user interacts with the machine purely in natural language. Recent advances in machine learning suggest such a system is not as far in the future as it may sound.

As a very small step towards realizing this vision, Edward Ayers and I have developed Lean Chat, a VS Code extension that provides an interface for autoformalizing natural language theorem statements in Lean. Under the hood, these translations are generated by OpenAI’s Codex, a text generation model trained on GitHub source code.

The interface is straightforward: enter a theorem statement in good old LaTeX, and the app will have a go at formalizing it. If the generated formal statement is correct, fantastic! You can copy and paste the statement into your Lean source file. If it is incorrect, you can give the app feedback on what to fix in a natural, conversational way. Let’s see a few examples.

Let’s begin with a simple statement about groups.

Lean would be happy with this, and we could accept this if we’re feeling lenient. But the predicate `is_group_hom` is deprecated, so let’s tell the app to use the up-to-date syntax.

I’m happy with that.

Let’s give the app a slightly harder example, this time the statement of an exercise from James Munkres’ famous topology textbook.

Perfect on the first try!

The next example is due to Johan Commelin: let’s ask for a statement of Burnside’s theorem, an important result in finite group theory which is not in `mathlib`.

Quite a lot is going on here. The app is able to figure out that the correct way of stating that a group is finite is `[group G] [fintype G]`, and the correct way of denoting the order of a group is `fintype.card G`. Moreover, it is able to interpret quite high-level instructions such as “no need for the existential”, suggesting the app has at least a partial grasp of the semantic connection between formal and informal mathematics. Slipping up on minor syntactic conventions, like writing `solvable` instead of `is_solvable`, is one of Lean Chat’s most common mistakes. These errors can almost always be fixed with dialogue.

Unfortunately, Lean Chat is far from perfect, and often fails in unpredictable and somewhat comical ways. This next example was also discovered by Johan Commelin:

For some completely unclear reason, the app decides to consider a Lie algebra over the wrong base ring, ignoring my explicit instruction to use a field, and that’s even before addressing the fact that `tangent_space_at_identity` is not a mathlib definition. I tried a few different responses to the app’s initial attempt, and couldn’t get it to do anything helpful.

However, the app knows at least something about Lie algebras. The next example is due to Oliver Nash.

One particular class of problem Lean Chat struggles with is Olympiad-style questions. The next example was discovered by David Renshaw:

This example is particularly embarrassing, because despite my specific instructions, the app steadfastly sticks by its initial guess.

Under the hood, Lean Chat’s formalizations are generated by OpenAI’s Codex language model. The Codex model is architecturally the same as the famous GPT-3, and has a simple objective: given a string of text, predict the next word. Iterating this next-word prediction allows the model to generate coherent text. Both Codex and GPT-3 are neural networks, which means they are a function parametrized by billions of real numbers, and these parameters are tuned by gradient descent to maximize next-word prediction performance on a corpus of training data. GPT-3’s training data is virtually all the text on the internet, and Codex is trained on all the source code on Github. This explains why Codex has some modest ability at writing Lean, although Lean code is a very tiny fraction of its training data.

If you take seriously the idea that neural networks are a lot like the brain, as I do, it is unsurprising that there exists some configuration of parameters that allows the network to competently use language. However, it is almost entirely a mystery why gradient descent is able to find this configuration of weights.

These language models are pattern matchers: if you ask one to complete the sentence “Not a creature was stirring, not even a _”, it will finish with the only pattern it’s seen in its training data: “mouse”. We can exploit this pattern matching ability to steer our language model into doing a desired task using a technique called *few-shot prompting*. Suppose we tell GPT-3 to complete the following text

English: How are you this morning?
French: Comment vas tu ce matin?
English: I arrived in London late because I missed my connecting flight.
French: Je suis arrivé à Londres avec un jour de retard car j'ai raté mon vol de correspondance.
English: Grothendieck was an algebraic geometer
French:

Because the language model was trained to complete patterns, given this text it will generate “Grothendieck était un géomètre algébrique”. What Lean Chat does under the hood is exactly the same idea. Given the user’s input, we wrap it in the following prompt:

Natural language version: "If $z_1, \dots, z_n$ are complex, then $|z_1 + z_2 + \dots + z_n|\leq |z_1| + |z_2| + \dots + |z_n|$."
Translate the natural language version to a Lean mathlib version:
theorem abs_sum_leq_sum_abs (n : ℕ) (f : ℕ → ℂ) : abs (∑ i in finset.range n, f i) ≤ ∑ i in finset.range n, abs (f i) :=

... (two more examples go here)...

Natural language version: "Let $X$ be a topological space; let $A$ be a subset of $X$. Suppose that for each $x\in A$ there is an open set $U$ containing $x$ such that $U\subset A$. Show that $A$ is open in $X$."
Translate the natural language version to a Lean mathlib version:
theorem subset_of_open_subset_is_open (X : Type*) [topological_space X] (A : set X) (hA : ∀ x ∈ A, ∃ U : set X, is_open U ∧ x ∈ U ∧ U ⊆ A): is_open A :=

Natural language: [Your input goes here]. Translate the natural language version to a Lean mathlib version:

The Codex language model completes the pattern to generate the response you see in the app. To incorporate the user’s feedback, we simply concatenate the initial few-shot prompt, the model’s incorrect response, and the instruction “[user’s feedback goes here]. Try again:”, then feed the text back into the neural network.
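The prompt-and-feedback loop just described can be sketched in a few lines of Python. This is an illustration of the idea, not Lean Chat's actual source (which is a TypeScript extension); the function names and the abbreviated few-shot text are mine.

```python
# Sketch of the prompt assembly described above (illustrative names only).

# An abbreviated stand-in for the few-shot examples; the real prompt
# contains three full natural-language/Lean pairs.
FEW_SHOT = (
    'Natural language version: "If $z_1, \\dots, z_n$ are complex, then '
    '$|z_1 + \\dots + z_n| \\leq |z_1| + \\dots + |z_n|$." '
    "Translate the natural language version to a Lean mathlib version:\n"
    "theorem abs_sum_leq_sum_abs (n : ℕ) (f : ℕ → ℂ) : ...\n\n"
)

def initial_prompt(user_statement: str) -> str:
    """Wrap the user's statement in the few-shot prompt."""
    return (FEW_SHOT
            + f'Natural language version: "{user_statement}". '
            "Translate the natural language version to a Lean mathlib version:\n")

def feedback_prompt(previous_prompt: str, model_response: str, feedback: str) -> str:
    """Concatenate the old prompt, the model's attempt, and the user's
    correction, then the whole thing is fed back to the model."""
    return previous_prompt + model_response + f"\n{feedback}. Try again:\n"
```

The entire "conversation" is therefore just one growing string that the model repeatedly completes; there is no memory or state beyond the text itself.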

Note that we don’t provide the model with any examples of failed and subsequently corrected statements, since then the model might start intentionally getting its first try wrong!

Lean Chat is only the very beginning for autoformalization. Codex is an off-the-shelf model available through OpenAI’s API. Its creators likely never imagined it would be used for formal mathematics, and as such it is not at all optimized for that use case. The few-shot autoformalization capabilities of Codex were only noticed quite recently, and the community working at the intersection of machine learning and formal math has a lot of ideas for how to improve these systems. I expect that in the coming months my examples of Lean Chat’s failures will become completely outdated.

A natural next step after progress on autoformalizing theorem statements is to work on autoformalizing proofs. This is much more daunting than autoformalizing statements, but is definitely within reach. Current language model-based automatic theorem provers take as input the tactic state, and predict the next proof step. One could imagine building a prover that takes as input both a natural language proof and the tactic state to predict the next proof step, but otherwise works in exactly the same way.
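A single training example for such a prover might look something like the following. This is a purely hypothetical format of my own devising, to make the idea concrete; it is not taken from any existing system.

```python
# Hypothetical training example for a natural-language-conditioned tactic
# predictor: the model sees the informal proof and the current tactic
# state, and must predict the next tactic (all field names are invented).
example = {
    "informal_proof": "Both sides satisfy the same recurrence, so induct on n.",
    "tactic_state": "n : ℕ\n⊢ ∑ i in finset.range (n + 1), i = n * (n + 1) / 2",
    "next_tactic": "induction n with n ih",
}
```

The only change from existing tactic predictors is the extra `informal_proof` field in the input; the prediction target and the search loop around it stay the same.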

Armed with these concrete ideas of how one might approach autoformalizing theorem statements and proofs, building a natural language interface to interactive theorem provers is still an ambitious project, but it is not science fiction either.

In this post I want to talk about the undergraduate course “Formalising Mathematics”, which I gave in January to March 2022 at Imperial College London. The course notes are here. The course was for 3rd and 4th year undergraduates and also for MSc students, and it was specifically about formalising mathematics in a theorem prover (in this case, the Lean theorem prover). I’m certainly not the first person to give such a course in a mathematics department — Patrick Massot has been teaching a formalisation course in Orsay to 1st year undergraduates for several years now — but probably I’m one of the first, so perhaps it’s worth recording what happened.

In short, the aim was to get students formalising undergraduate level mathematics in Lean. In stark contrast to all the other courses offered by pure mathematicians at Imperial, there was no exam. Students taking the course were asked to submit three projects. The projects were completely open-ended; the first was “formalise some mathematics you learnt in your first year”, the second was “formalise some mathematics you learnt in your second year” and the third was “formalise some mathematics you learnt this year” (note in particular that the MSc students are thus forced to formalise harder material than the 3rd year undergraduates). The first project was due 4 weeks after the course started, the second 4 weeks after that, and the third 4 weeks after that. I’m extremely grateful to Imperial College for letting me run the course in such an unconventional way. Because resources for this area (i.e., information about computer formalisation written with mathematicians in mind) are hard to come by, I spent a lot of time online helping students. One of the most exciting things for me about teaching the course was that even though I was under some kind of contractual obligation to give these people a mark out of 100 at the end of term, my primary motivation was to *teach undergraduate mathematicians how to formalise undergraduate mathematics* and with the set-up I’d created I found myself in a really good position to do this. Students would ask question after question about their projects and I would simply help them, or get them to help each other, or they would ask on Discord and get help there. Nobody could copy from other people because all students chose different projects to work on, and I made it absolutely clear to the students that it was fine for them to ask other students, or me, if they were stuck.
There are other courses (typically involving writing code, and run by applied mathematicians or statisticians) in my department which are assessed by projects, but typically in these courses everybody would be given the same project and the lecturer would have to think very hard about how to prevent plagiarism. I had no such problems. The course was a joy to run and a joy to mark; thank you to all 23 students who attended.

So what does a “project” consist of? Well, this cohort of students had no prior examples of projects, and of course I also didn’t know what a project was supposed to look like, because in some sense I was making the whole thing up as I went along. The students were told that a project consisted of a few hundred lines of Lean code, containing comments, backed up by a small pdf write-up of 5 or more pages, explaining what they’d done, what they’d learnt, and what they found hard. Some students were slightly thrown by this rather informal description but I assured them that it was not really possible to say anything more precise because we were all on a journey together. One concrete question, which they stumbled onto after a while, was “what will you be giving marks for?”. I thought this was eminently reasonable, so I told them that each project would be graded out of 100, with 30 marks for presentation, 30 marks for content, and 40 marks for “bonus”. This description seemed to placate them (it’s precisely the description of the mark scheme they’re given for other projects we run in the department, for example MSc projects) and we pressed on from there.

What I got was glorious. Of course there was a huge range in quality. Remember that this is a course for 3rd and 4th year students, but some of those students have been coming to my formalising club, the Xena Project, since they were 1st years, and one or two of them know the software and the mathematics library better than I do. Kexing Ying, a regular at the club, asked if he could formalise Egorov’s theorem in measure theory for his first project despite it not being taught in the first year; I said that this was fine and he went on to formalise the theorem and proof, write up 5 pdf pages on how he did it, and then made a pull request containing the proof to `mathlib`, Lean’s maths library. One could argue that students such as Kexing had an unfair advantage. I would respond that actually they had a fair advantage. If I had instead been teaching a 3rd year Galois theory course and was running a weekly Galois theory club, and a student had been coming along since their first year and learning Galois theory, then of course after over 2 years of this they are going to get a good mark in a Galois theory course, and they deserve to, because they’ve put in the work.

At the other extreme, I had some students who after 4 weeks were still struggling to get to grips with the system and who needed a lot of support from me and others, and who formalised some very basic stuff, sometimes quite poorly (in a stylistic sense, at least: the code compiled so the proofs were certainly correct!). Their write-ups were however very informative, as they explained what they’d found hard, or how their initial plans of formalising [something complicated] had to be hugely scaled back as they began to understand that sometimes even the simplest-looking things on paper can be difficult to teach a computer. Computers are relentlessly pedantic and also highly non-geometric; our intuition is that `x + y = y + x` is “obvious”, but this is a theorem, and you cannot make a formal proof from “draw a picture”. Of course the students did not need to worry about this, because the proof that `x + y = y + x` is already in the maths library so they can just use it.
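For readers who have never seen the system, here is a minimal sketch of what this looks like in Lean 4 syntax (the lemma name `Nat.add_comm` is as in the current Lean library):

```lean
-- `x + y = y + x` is not true "by definition" for natural numbers:
-- it is a theorem, proved by induction. The proof is already in the
-- library (as `Nat.add_comm` in Lean 4), so students just invoke it:
example (x y : Nat) : x + y = y + x := Nat.add_comm x y
```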

I had told the students that each project would have an accompanying oral exam, where I would spend 15 minutes discussing the Lean code which the students had handed in, just to check that they had written it and had understood it. I (intentionally) gave few details about how the orals would run. Because we were still nominally in some kind of Covid situation I decided to run the first set of orals online. I will be honest and say that actually part of the idea here was so that I could get to know each student individually and make sure that they understood that for me the important thing was the learning objective: making sure they came away from the course being moderately good at formalising mathematics in a theorem prover. It was manifestly clear that each student understood the code they’d written; I gave encouragement to the weaker students and observed that now they clearly had “got the hang of it”, the second project would surely be much easier.

Examples of what students proved are below. I leave it to you to guess which students had used the software before the course had started.

- All cyclic groups are abelian.
- The first isomorphism theorem.
- The Schroeder-Bernstein theorem.
- `(1+p)^n >= 1+np` for natural numbers `n` and `p`, and other similar results.
- Egorov’s theorem.
- The category of small categories is complete (note: also not taught to our first years).
- A sequence of real numbers is Cauchy iff it’s convergent.
- The Bolzano-Weierstrass theorem.
- The theory of nets and its relationship to the theory of filters (not on the first year syllabus).
- If a product of two positive coprime naturals is a k’th power then each natural is a k’th power.

NB in case you didn’t guess, the last one was done by a seasoned Lean user and is rather fiddly to formalise.

One mistake I made was being too generous with marks for the first project. I certainly didn’t go over the top, but I wish I’d left myself some more room. There were some students who had clearly tried hard but had ultimately produced some very average work for their first project, and I rewarded them with a reasonable grade, which meant that later on when they were becoming more competent I was not able to say things like “you have improved vastly and this is reflected in the vast improvement in your grade”. However something I had not mentioned before was that the relative weights of the three projects were 20:30:50, so really everyone knew that the important project was the final one.

I haven’t mentioned the lectures yet! I was given two hours with these poor souls, once a week for 11 weeks. This is the only course I have ever given where attendance at the lectures went *up* as the term progressed, although this might be partly due to the fact that Covid was much less of a thing in March than it was in January in the UK. The lectures were extremely informal. I would typically take some topic, e.g. one of the earlier ones was “the theory of sets, subsets, unions and intersections (both finite and infinite)” and I would just formalise some basic results from first principles, and then typically explain that they were already in Lean’s maths library, and then go on to some more difficult stuff. I was very open to questions. I would occasionally write on the board but it was mostly me live coding and taking questions. The hardest part was deciding what topics to cover, but given that I was generating the course notes on the fly and the students were thinking about what topics their projects would be on, there was no shortage of suggestions. For most topics (there ended up being about 15) I would set some example sheets for the students on the course GitHub repository and video myself solving the problem sheets live and then dump the results on a YouTube playlist. The results are not always pretty to watch, but Talia Ringer mentioned once on Twitter that she thought it was a good idea to let students see you thinking in real time about course material (not least because they’ll see that you sometimes struggle just like them), and I agree with her, so that was what the students got.

The lectures, and in particular the topics of the lectures, were really the one part of the course where I ended up making big changes to my plans. Last academic year, as preparation for this course, I had given a multi-center graduate level course for students at Bath, Bristol, Imperial, Oxford and Warwick, and you can see the topics I covered at that link. The material covered there was basically what I was planning to do with the undergraduates. However when it came to doing things like filters the students *actively discouraged me* from covering the material. Why? Well, it’s obvious why, when you think about it. We don’t teach them filters in our undergraduate degree, and their task was to formalise stuff from their undergraduate degree, so why take a detour through filters when I could be doing things like basic ring theory and other stuff which they’d in practice find much more useful? This was a shock to me and involved me having to generate a bunch of material at very short notice, but given that I am so obsessed with the software and am happy to have any excuse to formalise anything in it, I played along. In particular, we have a popular graph theory course at Imperial so I was forced to learn how to use the graph theory part of `mathlib`, something I’d never looked at before.

The second project was due 8 weeks after the start of the course, and I really hoped that by this point every student would be on top of the use of the software, but of course it’s very easy to forget (at least if you’re as naive as I am) that (a) the students are actually also studying other courses and (b) theorem proving software is really hard to learn. Some students were still struggling. I still had all the time in the world to help them out though, and of course students were also developing little communities (often online) where they’d help each other out.

Something I haven’t mentioned before: I was the victim of a loophole in the rules. Some of the students doing the course were doing joint mathematics and computer science degrees, and some of those people chose to formalise results which they perceived as “mathematics” but which were beyond my pay grade. For their second project, one student formalised frame definability for modal logic and basically I had to learn the theory from their write-up and the references, and then judge the project afterwards. That wasn’t supposed to happen. Another student proved completeness and compactness of a system of logic I’d never heard of. Next year I am unfortunately going to have to ban formalisation of material which was not taught in the mathematics department, for my own sanity.

Again I got a huge variation of projects. A student formalised topology from first principles and developed a bunch of theory. Topology was quite popular in fact, with other students proving facts such as “continuous image of compact is compact” and so on. A student formalised facts about the arithmetic of the Gaussian integers `Z[i]` (for example that it was a UFD). Some students simply took problem sheet questions from other classes and formalised these: I got worked examples to problem sheet questions from analysis and ring theory, and I quite liked those projects, not least because for some “prove a standard theorem” projects the student may well produce a proof which is somehow much worse than the proof which is already in the maths library, whereas for worked solutions to problem sheet questions you can be almost guaranteed that the results will not be in the library already. The students who were already fluent in Lean (and typically tended to be more advanced mathematically too) again took the option to do some fancy abstract stuff (more category theory, for example). Some people were too ambitious and realised that their plans needed to be curtailed; students began to ask whether they could assume results without proof and then build on them. I said yes, so some of the projects were incomplete in the sense that they assumed various mathematical theorems as axioms and then built on them. One student assumed something false as an axiom (it was nearly true but they made a slip) and this presented me with quite a conundrum; I had to really think hard about how they had used the false statement and whether the related true statement would have sufficed. This project was hard to mark.
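Mathlib already knows a lot about the Gaussian integers, so part of the work for such a project is locating what is there. A sketch, assuming the `GaussianInt` API and instance names as I understand present-day Lean 4 Mathlib:

```lean
import Mathlib

-- The Gaussian integers are `GaussianInt` (notation `ℤ[i]`) in Mathlib.
-- They carry a Euclidean domain instance, and unique factorisation then
-- follows automatically via the instance chain
-- EuclideanDomain → IsPrincipalIdealRing → UniqueFactorizationMonoid:
example : EuclideanDomain GaussianInt := inferInstance
example : UniqueFactorizationMonoid GaussianInt := inferInstance
```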

By this point I was getting tired, and I really didn’t want to have to lose an entire day doing orals for 23 projects when I knew that students were engaged with the course and were not cheating (indeed by this point I was on first name terms with most of the students and had talked to most of them about their projects several times before the deadline), so I told the students to each find another student and ask them to give them an oral and then to report back to me. I thus had to come up with guidelines for being an oral examiner. The guidelines said “get the other student to show you some of their code, look through it, choose a part you don’t understand and then get them to explain it to you. Once you understand it, they’ve passed; let me know this”. Note that the orals were not worth any points so I felt academically happy to do this.

As part of the marking process I would write feedback for the students. By the second project I had understood that this was an important part of the process and I spent a huge amount of time writing long feedback for most of the students (which kind of made up for the fact that I’d dodged doing the second orals, at least in my mind). Conversely, I had noticed that some of the write-ups were twice the length I had recommended, because students were keen to explain what they had done and would ask if it was OK to write more than 5-7 pages; I said “sure”.

By the last few weeks of the course, we finally all knew what we were doing. I would spend time on material which students had requested, showing the class how to formalise it in Lean. Students came up with ideas about what to formalise very early in the four week cycle; they had four weeks to prepare each project but by this stage it seemed to me that many of them started the four weeks knowing exactly what they were going to be doing. Even the weaker students had figured out a good strategy — don’t bite off more than you can chew, and ask if you need help. Students were completely open about getting help — they would openly say in their write-ups “I got totally stuck when trying to do X, so I asked student Y who explained a really nice trick; lines 153-163 of my code were written by them”. All this was fine. Remember that I didn’t care at all about the marks or how the student had got there — as far as I was concerned the goal was to achieve the learning objective, which was that by the end of the course the students should know something about formalising mathematics, and ideally mathematics at the level they were currently at academically, in a theorem prover. Here’s some of what I got:

- The Vitali convergence theorem.
- Box product of two graphs and proof that the product was connected iff both graphs were connected.
- Chinese remainder theorem for commutative rings.
- Lagrange’s 4 square theorem (incomplete proof but only assuming true things).
- If `P` is a presheaf on a category then the category of presheaves over the category of elements of `P` is equivalent to the over-category of `P`.
- Categorical semantics of the simply-typed lambda calculus (gaargh).
- Classification of integer solutions to `y^2=x^3-1`.
- Theorems about economic models involving quasi-concavity (gaargh).
- Hensel’s Lemma (fully proved, for complete nonarchimedean fields).
- Solution of a number theory exam question about which rings `Z/nZ` have unit group with exponent 2.
- Existence of nontrivial integer solutions to Pell’s equation.

By this stage we had somehow all got the hang of things. Students in general asked for much less help, and had a much better feeling for what was feasible. In contrast to the first project, where everyone proved everything, here several people assumed some facts as axioms in their projects (this time all the facts were true though). Marking stuff which students had learnt this year in courses from departments other than the mathematics department was very hard and as I’ve said, will be banned next year. Again I gave copious feedback (even though in some sense it was irrelevant because the course was now over; however I found that it was very easy to find things to say). The “oral” this time was replaced by an internal mini-conference which was going to be hybrid but at the last minute was forced online because of Covid; all the students showed up and each one gave a three minute presentation on their work to the rest of the class. Some were great, some were disastrous (when a student says “wow is that 3 minutes already?” this can be interpreted as “I didn’t practice”), but the course was over by this point and the orals were worth nothing; I was just super-pleased that everyone showed up!

So that was it. As is probably clear, I loved it. I got very positive feedback from the students too. The course notes are still kind of incomplete, and I am not particularly motivated to complete them because Lean 4 is on the horizon and I’ll have to rewrite everything when the port of the mathematics library happens. In the mean time Jeremy Avigad is leading a team which is working on a much more mature document — Mathematics In Lean — and when I’ve finished up writing this and a couple more blog posts I’ll start writing a chapter on linear algebra for that.

The course is running again next year.

26th to 30th September 2022 (the week before term starts for many people in the UK).

A week-long **in-person** workshop including a good deal of work on group projects in Lean, and some special guest lectures. The goal is ultimately to make some “outputs”, e.g. projects on GitHub and/or publications in journals/conference proceedings.

If you’re on, or have recently finished, a maths undergraduate or masters level degree, and haven’t started on a PhD yet, you’re eligible to apply. There are around 30 slots. For those offered a place on the workshop: I can pay for accommodation for the week in London, and travel to/from locations **within the UK only**. If you’re interested but based outside the UK then you can certainly apply, but you’ll have to get to and from the UK using your own resources; once you’re in the UK I can pay to get you to/from London and for the hotel.

Send an email to `xenaprojectimperial@gmail.com` with a brief cover letter saying something about any past Lean experience you have, or Lean projects you’ve been involved in. Preliminary indications are that I’ll get more than 30 applications so probably not everyone who applies will get a slot. Deadline for applications is 24th July 2022. Anyone who wants to get some Lean experience quickly to beef up their CV can take a look at my recent undergraduate course notes and see if they can knock something up. Or just play the natural number game and tell me how it went!

Yeah, why not. Several people from the Lean community have suggested projects which they are interested in running, based on questions in combinatorics, probability theory, number theory and other fields. There will be projects at all levels, from undergraduate-accessible to more challenging.

I take this seriously. This will not be a room full of cis white males. Applications from anyone who is not a cis white male will be very very welcome, and thanks to the people I’ve already heard from who are not cis white males but who have told me they’re going to apply. Remark: looks like the majority of the project leaders will not be cis white males.

Email to `k.buzzard@imperial.ac.uk` works.

Thanks go to Imperial College and Jalex Stark for funding.

I’ve spent a few years now trying to figure out how best to teach maths undergraduates how to get up to speed quickly with Lean, a theorem prover which uses dependent type theory and has a large mathematics library. In theory students can do all of the problem sheets I give to them in Lean. In practice things aren’t nearly so simple. As well as a long list of tactics to internalise, it helps if students get a feeling for what’s in Lean’s maths library already, and perhaps they should also know the difference between definitional and non-definitional (i.e. propositional) equality. Unfortunately definitional equality is a non-mathematical concept, in the following sense: if you define addition on the naturals recursively by `n+0:=n` and `n+(succ m):=succ (n+m)` then `n+0=n` is true by definition and `0+n=n` is not. This is an asymmetry which is of no relevance in real-world mathematics.
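In Lean 4 syntax (where `Nat` addition recurses on the second argument) the asymmetry looks like this; a minimal sketch:

```lean
-- `n + 0 = n` holds by definition, so `rfl` closes the goal:
example (n : Nat) : n + 0 = n := rfl

-- `0 + n = n` is *not* definitionally true; it needs induction on `n`,
-- packaged in the library as `Nat.zero_add`:
example (n : Nat) : 0 + n = n := Nat.zero_add n
```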

This year I’m still avoiding talking about definitional equality, and I’m also avoiding talking about the example sheet questions which I’m giving to my class. A typical example sheet question, even if easy from a mathematical point of view, may only yield to someone who is armed with a whole bunch of Lean tactics. So this year, instead of encouraging the students to work on the example sheets, I’m encouraging them to work on easier problems, so that we can build up to the example sheets later.

The 2021 Lean course home page is here, and I’m making an associated collection of short videos here. I’m going through basic tactics one by one, making what I hope is a more comprehensible introduction to doing mathematics in Lean. All the Lean problem sheets I’ve written so far can be tried online without having to install Lean, but installation instructions for those who want a slicker experience are here. 1st years are busy with their coursework right now, but when it’s over hopefully I will be able to get some feedback from them and others about this new approach. At the time of writing, I have finished the logic sheets, and I’ve just started pushing the sheets on sets. Still to come: functions and binary relations.

Next term things are getting much more serious. I’m actually teaching an official Lean course as part of our undergraduate program. In contrast to what I’m doing this term (evangelising), next term (Jan to March 2022) I will actually be figuring out how to get students to engage with some more serious undergraduate mathematics. Students will be examined by a series of projects, in contrast to the usual approach here for final year courses (a closed book exam under timed conditions). I’m going to cover standard topics like basic analysis and topology, and also more esoteric ideas like filters and perhaps uniform spaces. Should be interesting! This will involve yet another repository, which I suspect will be to a certain extent based on this one. But more on that later.

The students behind four of the projects managed to get PRs accepted into mathlib, and for two of them this was their first mathlib PR (and possibly even their first contribution to open source). These students had to use git and GitHub (some for the first time), but these are skills which I personally now value as important and worth teaching (it is commonplace to teach this sort of thing to computer scientists, but mathematicians seem to miss out here). In 2022 I will be teaching a formalisation course to undergraduates at Imperial and we will be using git and GitHub for this too.

I think the video titles are pretty self-explanatory, and perhaps now isn’t the time to be going through the technicalities of exactly what the students achieved. However for the most part we stuck to the mantra of: learn the maths first, and once you think you understand it, then try to formalise it. That way, of course, you find out whether you *do* actually understand it.

We ran the entire thing on the Xena Project Discord server, a server for undergraduate mathematicians interested in formalisation of mathematics. This worked very well for me and, I think, for them. Students could share their screen if they had technical questions. Everything was done in voice or text channels, and in particular there are some students whom I supervised but whom I would not recognise if I met them in the street. Not that this bothered me in the slightest.

Timetabling: I basically promised to be online every Tuesday in July and August from 10am until about 6pm, so the entire thing functioned as a drop-in. Some students talked to me a lot, some students barely talked to me at all, and of course some students talked to other people. Creating a community was, I felt, a bit harder than previous in-person summer projects which I’ve run in the past (where you can just go to lunch with a bunch of UGs and let them talk to each other) but of course these are extraordinary times and we have to make do. One big advantage of running stuff online was that students could be in different countries and still participate, and more generally students could move around (e.g. to and from London) without it disrupting their supervision. I live and work in London and for some students it’s problematic to stay without a serious source of funding, and working online also solved that problem. Going forward, assuming things are far more normal, I might be tempted to run summer projects in a hybrid way next year.

Thank you to all students who participated. If you are a *mathematics undergraduate* who is interested in formalising some mathematics in Lean (probably Lean 4 next year I guess!) over the summer of 2022, then get in touch with me at some point and we’ll see what we can do.

Exactly half a year ago I wrote the Liquid Tensor Experiment blog post, challenging the formalization of a difficult foundational theorem from my Analytic Geometry lecture notes on joint work with Dustin Clausen. While this challenge has not been completed yet, I am excited to announce that the Experiment has verified the entire part of the argument that I was unsure about. I find it absolutely insane that interactive proof assistants are now at the level that within a very reasonable time span they can formally verify difficult original research. Congratulations to everyone involved in the formalization!!

In this Q&A-style blog post, I want to reflect on my experience watching this experiment.

Answer: It was formalized in the Lean Proof Assistant, mostly written by Leonardo de Moura from Microsoft Research, and used the extensive mathematical library (mathlib) written by the Lean community over the last four years. Immediately after the blog post, the Lean prover/mathlib community discussed the feasibility of the experiment on the Lean Prover Zulip Chat. Reid Barton did some critical early work, but then Johan Commelin has taken the leading role in this. In outline, Johan made an attack along the path of steepest ascent towards the proof, and handed off all required self-contained lemmas to the community. In particular, to get the project started, by January 14 he had formalized the statement of Theorem 9.4 of [Analytic], whose proof became the first target, and has now been completed on May 28, with the help of the Lean community, including (mathematicians) Riccardo Brasca, Kevin Buzzard, Heather Macbeth, Patrick Massot, Bhavik Mehta, Scott Morrison, Filippo Nuccio, Damiano Testa, Adam Topaz and many others, but also with occasional help from computer scientists like Mario Carneiro. Here is a link to the repository containing the formalised proof of Theorem 9.4, and you can also view its dependency graph, now fully green and completed.

Answer: I joined the Zulip chat to answer any mathematical questions that may arise, but also as an interested spectator.

Answer: Theorem 9.4 is an extremely technical statement, whose proof is however the heart of the challenge, and is the only result I was worried about. So with its formal verification, I have no remaining doubts about the correctness of the main proof. Thus, to me the experiment is already successful; but the challenge of my blog post has not been completed. It is probably fair to guess that the experiment is about half-way done. Note that Theorem 9.4 abstracts away from any actual condensed mathematics, so the remaining half will involve a lot of formalization of things like condensed abelian groups, Ext groups in abelian categories, and surrounding machinery. The basics for this have already been built, but much work remains to be done.

Answer: Initially, I imagined that the first step would be that a group of people study the whole proof in detail and write up a heavily digested version, broken up into many, many small lemmas, and only afterwards start the formalization of each individual lemma. This is not what happened. Instead, the formalization followed quite closely the original lecture notes, and directly attacked Lemma after Lemma there. It did seem that the process was to directly type the proofs into Lean. Lean actually gives the user a very clear summary of what the current goal is, so one always needs to get a very clear sense of what the next few steps really are. Sometimes it was then realized that even on paper it does not seem clear how to proceed, and the issue was brought to attention in the chat, where it was usually quickly resolved. Only after a lemma was entirely formalized was the proof, now thoroughly digested, written up again in the Blueprint in human-readable form.

Answer: Right — it’s not the blueprint from which the Lean code was formed, but (largely) the other way around! The Lean Proof Assistant was really that: An assistant in navigating through the thick jungle that this proof is. Really, one key problem I had when I was trying to find this proof was that I was essentially unable to keep all the objects in my “RAM”, and I think the same problem occurs when trying to read the proof. Lean always gives you a clear formulation of the current goal, and Johan confirmed to me that when he formalized the proof of Theorem 9.4, he could — with the help of Lean — really only see one or two steps ahead, formalize those, and then proceed to the next step. So I think here we have witnessed an experiment where the proof assistant has actually assisted in understanding the proof.

Answer: Yes, up to some usual slight imprecisions.

Answer: One day I was sweating a little bit. Basically, the proof uses a variant of “exactness of complexes” that is on the one hand more precise as it involves a quantitative control of norms of elements, and on the other hand weaker as it is only some kind of pro-exactness of a pro-complex. It was implicitly used that this variant notion behaves sufficiently well, and in particular that many well-known results about exact complexes adapt to this context. There was one subtlety related to quotient norms — that the infimum need not be a minimum (this would likely have been overlooked in an informal verification) — that was causing some unexpected headaches. But the issues were quickly resolved, and required only very minor changes to the argument. Still, this was precisely the kind of oversight I was worried about when I asked for the formal verification.
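To spell the subtlety out: for a normed group $M$ with a subgroup $N$, the quotient norm on $M/N$ is defined by an infimum rather than a minimum,

$$\|\bar{x}\|_{M/N} \;=\; \inf_{n \in N} \|x - n\|,$$

so a coset $\bar{x}$ need not contain a representative of norm exactly $\|\bar{x}\|$; one can only pick a representative $x$ with $\|x\| \le \|\bar{x}\| + \varepsilon$, and quantitative arguments must carry that $\varepsilon$ along.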

Answer: There was another issue with the third hypothesis in Lemma 9.6 (and some imprecision around Proposition 8.17); it could quickly be corrected, but again was the kind of thing I was worried about. The proof walks a fine line, so if some argument needs constants that are quite a bit different from what I claimed, it might have collapsed.

Answer: I guess the computer does, as does Johan Commelin.

Answer: Yes! The first is a beautiful realization of Johan Commelin. Basically, the computation of the Ext-groups in the Liquid Tensor Experiment is done via a certain non-explicit resolution known as a Breen-Deligne resolution. Although constructed in the 1970s, this seems to have not been in much use until it was used for a couple of computations in condensed mathematics. The Breen-Deligne resolution has certain beautiful structural properties, but is not explicit, and the existence relies on some facts from stable homotopy theory. In order to formalize Theorem 9.4, the Breen-Deligne resolution was axiomatized, formalizing only the key structural properties used for the proof. What Johan realized is that one can actually give a nice and completely explicit object satisfying those axioms, and this is good enough for all the intended applications. This makes the rest of the proof of the Liquid Tensor Experiment considerably more explicit and more elementary, removing any use of stable homotopy theory. I expect that Commelin’s complex may become a standard tool in the coming years.

Answer: What actually makes the proof work! When I wrote the blog post half a year ago, I did not understand why the argument worked, and why we had to move from the reals to a certain ring of arithmetic Laurent series. But during the formalization, a significant amount of convex geometry had to be formalized (in order to prove a well-known lemma known as Gordan’s lemma), and this made me realize that actually the key thing happening is a reduction from a non-convex problem over the reals to a convex problem over the integers. This led me to ask my MathOverflow question whether such a reduction was known before; unfortunately, it did not really receive a satisfactory answer yet.

Answer: Yes, it did, Question 9.9 on the growth of certain constants. There are now explicit recursive definitions of these constants that are formally verified to work, and using this one can verify that indeed they grow roughly doubly-exponentially.

Answer: I learnt that it can now be possible to take a research paper and just start to explain lemma after lemma to a proof assistant, until you’ve formalized it all! I think this is a landmark achievement.

Answer: You know this old joke where a professor gets asked whether some step really is obvious, and then he sits down for half an hour, after which he says “Yes, it is obvious”. It turns out that computers can be like that, too! Sometimes the computer asks you to prove that , and the argument is “That’s obvious — it’s true by definition of and .” And then the computer works for quite some time until it confirms. I found that really surprising.

Answer: The definitions and theorems are surprisingly readable, although I did not receive any training in Lean. But I cannot read the proofs at all — they are analogous to referring to theorems only via their LaTeX labels, together with a specification of the variables to which they get applied, plus the names of some random proof-finding routines. Still, I have the feeling that it should be possible to create a completely normal mathematical manuscript that is cross-linked with the Lean code in a way that makes it possible to navigate the Lean code seamlessly — I think the creation of such an interface has also become a goal of the experiment.

Answer: Definitely! Currently, the Lean code leading up to the proof of Theorem 9.4 is not well-documented, and some parts of the proof could definitely be streamlined. Moreover, large parts of it are basic material that should become part of mathlib. It should be noted that because mathlib is constantly evolving, any project that uses it has to continually make small changes so that it will still compile with the newest version of mathlib. So it is vital that the parts of the proof of general interest are moved into mathlib, where they will be maintained.

Answer: It depends on the method of calculation, but somewhere around 20. I think this is amazingly small! I had expected that the first step of taking the lecture notes and turning them into a properly digested human proof — which as I said didn’t actually happen — would already introduce a factor of ~5. But the blueprint is actually only a factor of ~2.5 longer than the relevant part of the lecture notes right now.

Answer: Good question! Usually the verification of a proof involves trying small variations of the argument and seeing whether they break or not, whether they lead to statements that are too strong etc., in order to get a sense of what is happening. Basically a proof is like a map of how to get up a mountain, say; it may be a nice, slightly winding path with a broad view, or it may lead through the jungle and up some steep wall, requiring climbing skills. Usually there’s not just one way up, and one may try whether taking a left turn the view is nicer, or taking a right turn one can take a shortcut.

In the case at hand, it feels like the main theorem is some high plateau with a wonderful view, but the way there leads through a large detour, to attack the mountain from the other side, where it is dark and slippery, and one has to climb up a certain steep wall; and no other pathways are seen left or right. Answering the questions in the Zulip chat felt like I would give instructions of the form “put your foot here, then your hand there, then pull yourself up this way” at the more difficult passages.

So I have gained the reassurance that it is possible to climb the mountain along this route, but I still have no sense of the terrain.

I got interested in trying to understand if this question even has a meaning. Here are some thoughts.

When we learn linear algebra at high school, we typically first learn the “concrete” theory, where vectors are columns of numbers, and we can multiply them by matrices and thus get a conceptual understanding of systems of linear equations. Then at university we go on to the “abstract” theory, where a real vector space is something defined by a list of axioms, and spaces like $\mathbb{R}^n$ are now *examples* of these abstract objects.

We then learn about the fundamental notion of a *basis* of a vector space. Say we have an abstract finite-dimensional vector space. By picking a basis, the vectors in our vector space suddenly transform back into columns of numbers. Not only that, but linear maps between vector-spaces-with-a-basis turn back into matrices. By the time we’ve learnt that every vector space has a basis, we can see that our new theory is in some sense “the same as” our old theory. It took me a long time to get on top of this principle as an undergraduate; perhaps the key concepts were not emphasized to me enough, or maybe I just got lost in all the new (to me, at the time) ideas. Nowadays I think about it like this: initially we learn that $\mathbb{R}^n$ is an *example* of a finite-dimensional vector space, but after learning about bases we can conclude that *every* finite-dimensional real vector space is isomorphic to $\mathbb{R}^n$ for some $n$, so in fact $\mathbb{R}^n$ can be thought of as a *model* for a finite-dimensional real vector space, just like the collection of equivalence classes can be thought of as a model for a quotient by an equivalence relation. Every vector space has a basis, so one can prove theorems about finite-dimensional vector spaces by checking them on models, i.e. by picking a basis.

After a couple of years at university the following idea had somehow sunk in: if possible, one “should not choose a basis”. The canonical example shows up when we learn about the *dual* of a vector space. The dual of a real vector space $V$ is just the space of linear maps from $V$ to $\mathbb{R}$; this space has a natural vector space structure and is called the *dual space* of $V$, with notation $V^*$. Confusing example: if $V = \mathbb{R}^n$, then an element of $V$ is represented by a column of $n$ numbers. Give $V$ its canonical basis $e_1, e_2, \dots, e_n$. Then any linear map from $V$ to $\mathbb{R}$ is uniquely determined by where it sends the $e_i$, and in particular an element of $V^*$ is also uniquely determined by $n$ numbers, so we can represent it as a vector of length $n$ and we have proved that the dual of $\mathbb{R}^n$ is $\mathbb{R}^n$ again. Great! Furthermore, every finite-dimensional vector space equals $\mathbb{R}^n$ (proof: pick a basis) so we’ve proved that duality is just the identity map!

Except that we haven’t, because we have been a bit too liberal with equality here (and as some computer scientists are very quick to point out, many mathematicians don’t understand equality properly, and hence they might accidentally end up teaching this point badly). This argument proves that if $V$ is a finite-dimensional vector space, then it is *isomorphic* to its dual $V^*$; however it is not in general *canonically isomorphic* to its dual, whatever that is supposed to mean. In this instance, it means that different bases in general produce different isomorphisms between $V$ (identified with $\mathbb{R}^n$) and $V^*$ (also identified with $\mathbb{R}^n$). This is a bit confusing because in the group theory course running at the same time as the linear algebra course, a student is simultaneously being taught that when we say “how many groups are there of order 4” we *obviously* mean “…up to isomorphism”, because isomorphic groups *are* equal for that question.

However, if we do this trick twice, we can identify $V$ with its double-dual $V^{**}$ and it turns out that this identification *is* canonical, whatever that means. What it appears to mean in this case is that there is a really cool way of writing down the isomorphism from $V$ to $V^{**}$ which *doesn’t ever pick a basis*!

[Technical note/reminder: Here’s the explicit map from $V$ to its double dual. Say $v \in V$. We need to write down a linear map $V^* \to \mathbb{R}$ associated to $v$. So take $\phi \in V^*$ and we need to construct a real number somehow. Well, what about $\phi(v)$? That works a treat! One can check that this map sending $\phi$ to $\phi(v)$ is indeed linear, and induces a map from $V$ to $V^{**}$ which can be checked to be an injection and hence, by dimension counting, an isomorphism (NB: some magic just happened there). I suspect that this argument was my initiation into the mysterious word *canonical*, a word I now rail against, not least because in my opinion the Wikipedia page about canonical maps contains a definition which is full of fluff (“In general, it is the map which preserves the widest amount of structure, and it tends to be unique” — this is not a definition — this is waffle).]
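In Lean this map really can be written down with no basis anywhere in sight. Here is a Lean 3 / mathlib-style sketch (a sketch only — mathlib itself packages this map, under a name like `module.dual.eval`, so the definition below is for illustration):

```
import linear_algebra.dual

variables {V : Type*} [add_comm_group V] [module ℝ V]

-- Send v to the functional "evaluate at v" on the dual space.
def eval_at (v : V) : module.dual ℝ (module.dual ℝ V) :=
{ to_fun    := λ φ, φ v,   -- φ ↦ φ(v)
  map_add'  := λ φ ψ, rfl, -- (φ + ψ)(v) = φ(v) + ψ(v), by definition
  map_smul' := λ c φ, rfl  -- (c • φ)(v) = c • φ(v), by definition
}
```

Note that the two proof obligations are `rfl`: linearity of evaluation holds by definition, which is exactly the sense in which the map is “canonical”.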

The moral: all the canonical kids are too cool to pick a basis.

PS here is a funny way of thinking about it: if we identify $V$ with $\mathbb{R}^n$ as column vectors, then perhaps we should identify the dual of $V$ with $\mathbb{R}^n$ as *row* vectors, because multiplication on the left by a row vector sends a column vector to a number, which is what we want a dual vector to do. So identifying $V^*$ with $V$ is some high-falutin’ version of *transpose*, and if you transpose once you don’t quite get the same thing, but if you transpose twice then you’re back where you started. Canonically.

OK so here’s a pretty cool theorem about traces (although any formaliser would tell you that it is actually a *definition*, not a theorem). If $M$ is an $n \times n$ square matrix then it has a *trace* $\mathrm{tr}(M)$, the sum of the elements on the leading diagonal. Now say $V$ is a finite-dimensional real vector space, and $\phi : V \to V$ is a linear map, crucially from $V$ to itself. If we choose a basis of $V$ and use the same basis for the source and the target $V$, then $\phi$ becomes a matrix $M$, and we can take its trace. If we change our basis of $V$ then the matrix representing $\phi$ changes. But, miraculously, its trace does not! This can be proved by an explicit calculation: changing our basis for $V$ changes the matrix representing $\phi$ from $M$ to $P^{-1}MP$ for a certain invertible “change of basis” matrix $P$ (here was where it was key that the source and the target of the endomorphism were the same, otherwise we would have got $Q^{-1}MP$ for two different invertible matrices $P$ and $Q$), and the traces of $M$ and $P^{-1}MP$ are equal because of the general fact that the traces of $AB$ and $BA$ are equal if $A$ and $B$ are square matrices of the same size (apply this with $A = P^{-1}M$ and $B = P$).
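In symbols, the computation underlying all of this is short enough to display (standard linear algebra, spelled out for convenience):

```
\mathrm{tr}(AB) = \sum_i (AB)_{ii} = \sum_{i,j} A_{ij} B_{ji} = \sum_{j,i} B_{ji} A_{ij} = \mathrm{tr}(BA)
```

and hence, taking $A = P^{-1}M$ and $B = P$, we get $\mathrm{tr}(P^{-1}MP) = \mathrm{tr}(M)$.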

As a consequence, this means that if $V$ is a finite-dimensional vector space then we can unambiguously talk about the *trace* of a linear map $\phi : V \to V$, in the following way. First do the slightly distasteful choose-a-basis thing, then take the trace of the corresponding matrix, and then prove that the calculation was independent of the basis you chose, so we can pretend you never did it. Similarly one can talk about the determinant and characteristic polynomial of $\phi$, because these are also invariants of matrices which are constant on conjugacy classes.

However, you *did* do it — you chose a basis. Something is a little different to the map from $V$ to its double dual — the map from $V$ to its double dual really was defined without choosing a basis *at all*. Here we did something slightly different — we chose a basis, and then proved that it didn’t matter. Can we go one better, and define the trace of a linear map from $V$ to $V$ without choosing a basis *at all*?

In the process of discussing this question on Twitter and on the Lean chat, I established firstly that this very much depends on what the question actually *means*, and secondly I managed to learn something new about Lean, and when I learn something new about Lean I tend to blog about it, so here we are. First I’ll talk about the failed attempts to define the trace of an endomorphism, which led to some other questions and clarifications, and then I’ll talk about `trunc`, something which looked to me like a completely pointless operation in Lean and which up until now I’ve just ignored, but which somehow might be at the bottom of this.

In characteristic $p$ there are some problems with the ideas below, but these are not relevant to what I want to talk about, so let’s just stick to vector spaces over the reals (or more generally any field of characteristic zero). The first observation is that computing the trace, determinant and characteristic polynomial all seem to be pretty much the same question: for example, if you can compute the char poly of $\phi$ then you can compute its trace and det because you can read these things off from the coefficients. Conversely, if you can compute traces then applying this to some exterior powers of $\phi$ you can read off the coefficients of the characteristic polynomial, including the det. So computing any of these invariants without choosing a basis somehow boils down to the same thing.
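Concretely, for an endomorphism $\phi$ of an $n$-dimensional space, the characteristic polynomial packages all of these invariants at once:

```
\chi_\phi(t) = \det(t \cdot \mathrm{id} - \phi) = t^n - \mathrm{tr}(\phi)\, t^{n-1} + \dots + (-1)^n \det(\phi)
```

and more generally the coefficient of $t^{n-k}$ is $(-1)^k \mathrm{tr}(\Lambda^k \phi)$, which is the “exterior powers” direction just mentioned.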

Next let’s turn to Wikipedia, where we are informed of a basis-free way to compute the trace of an endomorphism! Here’s the trick. There’s an obvious bilinear map $V^* \times V \to \mathbb{R}$, sending $(\phi, v)$ to $\phi(v)$, and by the universal property of the tensor product this induces a linear map $V^* \otimes V \to \mathbb{R}$. There is also an obvious bilinear map $V^* \times V \to \mathrm{End}(V)$ sending $(\phi, v)$ to the linear map sending $w$ to $\phi(w)v$, and this induces a linear map $V^* \otimes V \to \mathrm{End}(V)$, which is easily checked to be an isomorphism if $V$ is finite-dimensional. Composing the inverse of this isomorphism with the first map gives us a linear map $\mathrm{End}(V) \to \mathbb{R}$ which we can check to be the trace (e.g. by picking a basis). So we’re done, right?
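Schematically, writing $\mathrm{ev}$ for the first induced map and $\Phi$ for the second, the construction is:

```
\mathrm{ev} : V^* \otimes V \to \mathbb{R}, \qquad \phi \otimes v \mapsto \phi(v)
\Phi : V^* \otimes V \to \mathrm{End}(V), \qquad \phi \otimes v \mapsto (w \mapsto \phi(w)\, v)
\mathrm{tr} := \mathrm{ev} \circ \Phi^{-1} : \mathrm{End}(V) \to \mathbb{R}
```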

Well, the thing about this construction is that whilst the map $V^* \otimes V \to \mathrm{End}(V)$ is canonical (indeed, it even exists in the infinite-dimensional case), to prove that it’s surjective in the finite-dimensional case the natural thing to do is to pick a basis and to make the corresponding matrix by taking an appropriate linear combination of tensor products of elements of the basis and the dual basis. I would argue that we have made some progress here — we still picked a basis, but we used it to fill in a proof, rather than to construct data. However, we still picked a basis. Note also that the inverse of a computable bijection might not be computable, if you’re interested in that sort of thing, and I suspect that this might be a situation where this annoyance kicks in.

One might instead be tempted to argue that the map $V^* \otimes V \to \mathrm{End}(V)$ is surjective because it is an injective map between vector spaces of the same dimension (I’m not entirely sure how to prove it’s injective without picking a basis, but it might well be possible; however I do not know how to prove that the dimension of $V^* \otimes V$ is the product of the dimensions of $V^*$ and $V$ without picking a basis). Anyway, talking of dimensions, here is the other “basis-free” method I learnt to do these sorts of things: the canonical method to work out the determinant of an endomorphism without choosing a basis. If $\phi : V \to V$ is a linear map, and if $n$ is the dimension of $V$, then $\phi$ induces a linear map on top wedge powers $\Lambda^n V \to \Lambda^n V$, and an endomorphism of a 1-dimensional space is canonically a number (proof: pick a basis and check it’s independent of the choice) which can be checked to be the determinant of $\phi$, and if you can do determinants then you can do char polys and hence traces.
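For the record, on top wedge powers the induced map is multiplication by a scalar, and that scalar is the determinant:

```
\Lambda^n \phi : \Lambda^n V \to \Lambda^n V, \qquad v_1 \wedge \dots \wedge v_n \mapsto \phi(v_1) \wedge \dots \wedge \phi(v_n) = \det(\phi)\, (v_1 \wedge \dots \wedge v_n)
```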

The problem with this approach is that it relies on you knowing what $n$ is, the dimension of $V$, and if all you know is that $V$ is finite-dimensional, then how do you get $n$, the dimension? Well obviously you pick a basis, count it, and then prove that your answer is independent of the choice of basis. So this “top exterior power” argument has moved us closer to the heart of the problem: forget defining the trace — how do we define the *dimension* of a finite-dimensional vector space without picking a basis? Note that the dimension is just the trace of the identity function, so we can add dimension to our list of things which we cannot “compute” without picking a basis. And now we’re getting somewhere — what does it *mean* to say a vector space is finite-dimensional? Did we pick a basis even to make that statement?

I am not a big fan of constructivism, as many of you will know. I think that the emphasis placed on it by the computer theorem proving community historically has held the area back; it puts off mathematicians (e.g. the 2012 version of me) who have been indoctrinated by a traditional mathematics course which assumes the law of the excluded middle from day one. One problem I have with constructivism, as someone who was classically trained, is that it turns out that sometimes there is more than one way of doing something constructively, and all these ways are the same classically. For example very early on in my Lean career I was surprised to learn that there was `function.bijective`, the predicate that a function was a bijection, but also there is the concept of a function with a two-sided inverse. As far as I’m concerned these are the same thing, but constructively they’re not, because given a bijection there might not be a “formula” for its inverse. The existence of a two-sided inverse is a true/false statement — but actually *having* the two-sided inverse is **data**, and, in constructive mathematics, data can sometimes be hard to come by. The function which takes as input a set/type `X` and a proof that `X` is nonempty, and outputs an element of `X`, is a noncomputable function, and its existence, if you think about it, is closely related to the axiom of choice, which is something that the constructivists are not big fans of.

So it turns out that this whole “$V$ is finite-dimensional” thing, which this entire post has assumed all the way through, is a victim of this subtlety. What does it *mean* for a vector space to be finite-dimensional? The following answers are all the same classically (i.e. in “normal maths”), but constructively they’re all different:

- A *proof* that $V$ has a finite basis;
- An actual *choice* of a finite basis;
- An element of the *truncation* of the set of all pairs $(n, e)$, where $n \in \mathbb{N}$ and $e : \mathbb{R}^n \to V$ is an isomorphism. Here by the truncation of a set we mean the quotient of the set by the “always true” equivalence relation. Yes, you read that right.

OK so we know what the first two things are. The first statement is just a proof. If your proof is nonconstructive that’s fine, I don’t care. The second thing is data. For me this is a problematic definition of finite-dimensional, precisely because it’s *not* a proof of a true-false statement. If I am working with a finite-dimensional vector space $V$ in Lean then I might end up having to deal with the fact that if some argument changed $V$’s basis whilst leaving $V$ alone, I might not have $V = V$ (as finite-dimensional vector spaces) any more, because the data going on behind the scenes saying that $V$ is finite-dimensional might not match up. I have enough problems formalising mathematics in type theory without having to deal with this too.

This brings us to the third definition, involving `trunc`. OK so if `X` is a type or set or however you want to set up your mathematics, then, as I mentioned above, `trunc X` is the quotient of `X` by the equivalence relation which is always true. In particular if `X` is nonempty then `trunc X` has one element/term, and if `X` is empty then it has no elements/terms. If you’re happy about the concept that propositions can be types, with the idea that a true proposition has one term (its proof), and a false proposition has no terms, then `trunc X` seems to be basically the proposition that `X` is nonempty. However it is more than that, because it is *data*. It is the missing link in our story.
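In Lean 3 the definition really is as advertised: a quotient by the always-true relation. Here is a self-contained sketch (using the name `my_trunc` so as not to clash with the built-in `trunc`):

```
-- `my_trunc X` is the quotient of `X` by the relation identifying everything.
def my_trunc (X : Type) : Type := quot (λ (a b : X), true)

-- Any element of `X` gives a term of `my_trunc X`...
def my_trunc.mk {X : Type} (x : X) : my_trunc X := quot.mk _ x

-- ...and any two such terms are equal, because the relation always holds.
theorem my_trunc.mk_eq {X : Type} (x y : X) :
  my_trunc.mk x = my_trunc.mk y :=
quot.sound trivial
```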

Let’s say that `V` is a real vector space, and we have a “proof” that it is finite-dimensional in the sense that we have a term `t` of type `trunc X(V)`, where `X(V)` is the space of all pairs $(n, e)$ as above. Here’s how to define the trace of an endomorphism $\phi : V \to V$. We’re going to define a map from `trunc X(V)` to the reals, and the idea is that if you evaluate this map at `t` then you’ll get the trace of $\phi$. Now to define a map from a quotient, we use the universal property of quotients. First of all we have to define a map from the space we’re quotienting by to the reals. This is easy: given an isomorphism $e : \mathbb{R}^n \to V$ we get an induced endomorphism $e^{-1} \circ \phi \circ e$ of $\mathbb{R}^n$, and we just take its trace with respect to the standard basis. And secondly, we need to prove that this map descends to the quotient, which boils down to proving a theorem, and the theorem of course turns out to be precisely the statement that the trace of an endomorphism is independent of choice of basis.
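In Lean 3 pseudocode this looks something like the following sketch (the names `XV` and `trace_aux` and the `sorry`s are hypothetical placeholders, not mathlib API; the second `sorry` is exactly the basis-independence theorem):

```
variables {V : Type} [add_comm_group V] [module ℝ V]

-- The data of an `n` together with an identification of `V` with ℝ^n.
def XV (V : Type) [add_comm_group V] [module ℝ V] : Type :=
Σ n : ℕ, ((fin n → ℝ) ≃ₗ[ℝ] V)

-- Transport φ over to ℝ^n via the isomorphism; take the matrix trace there.
def trace_aux (φ : V →ₗ[ℝ] V) : XV V → ℝ := sorry

-- The trace: a map out of the quotient, built via `trunc.lift`.
-- The proof obligation is independence of the choice of basis.
def trace (φ : V →ₗ[ℝ] V) (t : trunc (XV V)) : ℝ :=
trunc.lift (trace_aux φ) (λ p q, sorry) t
```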

Similarly we can define a function from `trunc X(V)` to the naturals, such that if `V` is finite-dimensional in the sense that we have `t : trunc X(V)` then its dimension is what you get by evaluating this function. And det and char poly etc.

Note that for *any* vector space `V`, the type `trunc X(V)` has at most one term — it’s a so-called *subsingleton* type. Lean’s `exact` command will be fussy about subsingleton types because it’s a *theorem* that two terms of a subsingleton type are equal, rather than it being a definitional equality. Hence Lean’s `exact` tactic still might not work if we’re carrying around `t : trunc X(V)` as our definition of finite-dimensionality. However the `convert` tactic will operate fine, because it looks out for subsingletons and applies the appropriate theorem.

We now seem to have got to the bottom of this. To do this kind of metamathematics — “did we pick a basis?” — we need to think really carefully about what we even *mean* by the assumption of `V` being finite-dimensional. My preferred approach, and the one which makes mathematics easier to do in Lean, is to just use the propositional definition “there exists a basis”. This way one never runs into problems with equality, and if one needs a basis one just applies Lean’s version of the axiom of choice. Constructivists would complain that it breaks computation, but I want to prove theorems, and I prefer to do computations using tactics rather than trying to persuade the kernel to reduce things. The other approach is to carry around some extra data, and this can lead to problems for beginners with equality being a bit broken; however the experts have learnt ways to deal with this. Ultimately the best choice in a theorem prover will depend on what you actually are *doing* with your objects, and given that I just want to prove theorems, for me, the bare existence definition of finite-dimensional is enough. However to move from the `Prop` world to the `Type` world, e.g. when defining the trace of an endomorphism, one has to at some point do some magic, and the moment where this happens is the moment you picked a basis.

I used to teach a course where I defined the notion of what it meant for two integers to be *congruent modulo N*. Here *N* is an integer, and two integers *a* and *b* are said to be congruent modulo *N* if their difference is a multiple of *N*. For example, 37 and 57 are congruent modulo 10.

I would go on to prove that congruence modulo *N* is an equivalence relation on the integers. Reminder: an equivalence relation on `X` is a binary relation on `X` which is reflexive, symmetric and transitive. The proof I gave never assumed that *N* was non-zero, and congruence modulo 0 is the same relation as equality, so you might like to deduce from this that equality on the integers is an equivalence relation. Well, tough darts. In the proof that congruence modulo *N* is an equivalence relation, it turns out that we *assume* that equality is an equivalence relation, as you can readily check if you type it into a theorem prover.
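You can see this for yourself with a minimal Lean 3 sketch (the names `cong` and `cong_refl` are just for illustration):

```
import tactic

-- a ≡ b mod N  iff  a - b is a multiple of N
def cong (N a b : ℤ) : Prop := ∃ k, a - b = N * k

-- e.g. 37 ≡ 57 mod 10, witnessed by k = -2
example : cong 10 37 57 := ⟨-2, by norm_num⟩

-- Reflexivity: a - a = N * 0. Even this step is a statement
-- about *equality* of integers, so equality is being assumed.
theorem cong_refl (N a : ℤ) : cong N a a := ⟨0, by ring⟩
```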

So how do we prove that equality is an equivalence relation? In my 1st year undergraduate lecture notes this is stated as an “obvious example” of an equivalence relation. Euclid was more cautious — he explicitly noted that he would be assuming transitivity and reflexivity of equality in his common notions 1 and 4, and symmetry followed from his use of language — he would say “this collection of things are equal” e.g. “all right angles are equal”: so equality of two things was a predicate on *unordered* pairs for him, and symmetry was hence built in.

I also looked at my 3rd year logic and set theory notes, to try and figure out the definition of =, but they were also no help. There is some waffle about “the interpretation of = in a model $M$ is the diagonal of $M \times M$” (which I think might be circular) and how `x = y` means that the two terms `x` and `y` are actually *the same term*, but don’t get me started about what mathematicians mean when they say two things are “the same”, e.g. “the same proof” or “canonical isomorphism is denoted by =” or all the other highly useful and yet irritatingly non-rigorous stuff we say. Anyway, just “defining” equality to mean another word or phrase which is synonymous with equality isn’t going to get us anywhere. Ultimately my impression is that you might just *assume* that equality is an equivalence relation, when setting up mathematics via first order logic and ZFC set theory. I’m no logician though — feel free to correct me in the comments. This post is about a different way to approach things, which part of me finds a whole lot more satisfactory.

Lean’s type theory contains a **definition** of equality! This is great because it means we know where we stand. Here’s the definition:

```
inductive eq {X : Type} : X → X → Prop
| refl (a : X) : eq a a
infix ` = `:50 := eq
```

What does all that baloney mean? Well, this is an inductive definition of a binary relation. Let `X` be a type or a set or however you want to think of a collection of stuff. The binary relation `eq` on `X` takes as input two things in `X`, call them `a` and `b`, and spits out a true-false statement `eq a b`. We’ll get to the definition in a second, but that last line means “the notation `a = b` is defined to mean `eq a b`, with BIDMAS-power 50”. Let’s use the `=` notation from now on instead of `eq`, because it’s familiar-looking.

So, let’s get to the heart of the matter: how do we define `a = b`? Well, Lean’s definition says the following: “there is only one tool we have to prove the equality of two things, and it’s the theorem `refl a`, which is a proof of the statement `a = a`. That’s it.” Thanks to Andrej Bauer, who on Twitter pointed out that a neat way to think about this definition is: equality is the smallest binary relation which is reflexive. Andrej also tells me that this is Martin-Löf’s definition of equality.

OK great. We have now apparently defined the true-false statement `a = b`, and we can prove that `a = a` is true, so this is a good start. But how the heck are we going to go from this to symmetry and transitivity? We’re going to use induction!

In my last post, I waffled on about inductive types. I was going to talk about equality in that post, but it had already got quite long, so I thought I’d deal with equality separately, and so here we are. The take home message from the last post was that if you define something as an inductive type in a type theory like Lean’s, then Lean automatically generates a new axiom in the system called the *recursor* for that type. This axiom is generated automatically by the rules of the calculus of inductive constructions. For example, if you define the natural numbers inductively using Peano’s axioms, then the recursor is just the statement that induction and recursion are true (propositions are types in Lean, so one function can be interpreted in two ways depending on whether you apply it in the `Prop` universe or the `Type` universe). The recursor attached to an inductive type is a way of saying “if you want to do something to every instance of the thing you’re defining, it suffices to do it for all the constructors”, and a special case of it is the “inductor”, which says, in our case, that if you want to deduce something from `a = b` then all you have to do is make sure you can prove it in the special case `a = a`. Formally, the inductor for `=` is the following statement:

```
eq.inductor (X : Type) (R : X → X → Prop)
(h : ∀ x, R x x) (a b : X) : a = b → R a b
```

Note the fancy type theory `R : X → X → Prop` for what a mathematician would call “*R* is a binary relation on *X*” (reason: `R : X → X → Prop` means $R : X \times X \to \{\text{true}, \text{false}\}$ in maths speak, i.e., $R(x, y)$ is a true-false statement attached to every pair of elements of $X$). So, in words, the inductor says that if you have any binary relation `R` on `X` such that `R x x` is true for all `x`, and if you know that `a = b`, then `R a b` is true. That’s the tool we have. Again let’s go back to Andrej’s observation: the definition of equality can be thought of as saying that it is the smallest (or the initial) binary relation which is reflexive, and the inductor then makes this rigorous by saying that any other reflexive binary relation must contain equality.
Summary: we are armed with two things, and two things only. (1) A proof of `∀ a, a = a`, and (2) a proof that if `R` is a binary relation on `X` and `∀ x, R x x` is true, then `a = b` implies `R a b`.

The game now is to prove that `=` is symmetric and transitive, without accidentally assuming it! And what better framework to play that game in, than Lean! (Because trust me, if you play it on paper, you are so going to mess this one up.)

Insert coin. That’s hard mode, i.e. spoiler-free. If you have Lean installed locally and want a much speedier experience, copy and paste the code at the other end of that link into VS Code. The game is to prove `symm` and `trans` from `refl` and `ind`. I would work out the maths proofs first.

If you want no spoilers *at all*, stop reading now. I usually post some art generated by my children in my blog posts so perhaps now is a good time to do this. If you want to hear more about equality, and in particular have one or more hints and spoilers, read on.

`refine` is a great tactic. It’s like `apply` on steroids. You can do everything with `intros`, `apply`, `refine` and `exact`. You can read about what these tactics do here, in Lean’s documentation. All the Lean proofs can be done in just two or three lines.

Life is easier if we have more tools than just `ind`. The thing about `ind` is it wants as input a binary relation. A simpler object than a binary relation on `X` is a subset of `X`, or equivalently its characteristic function `P : X → Prop` (sending the elements of the subset to `true` and the other elements to `false`). Here’s something which you believe about equality: if `a = b` and `P a` is true, then `P b` is true. This is a much cleaner statement than `ind` — it’s just as powerful, and more flexible. Logicians call it the *substitution property* of equality. Applying it is the way Lean’s `rewrite` or `rw` tactic works. Why not try to prove it first? It makes the proof of `trans` much easier — it’s basically a master sword. Here’s the statement. Just cut and paste it before `symm`. Of course, you’ve got to prove it (using `ind` and `refl`).

```
theorem subst (hab : a ∼ b) (P : X → Prop) : P a → P b :=
begin
sorry
end
```