[This is a guest post by Peter Scholze.]

Exactly half a year ago I wrote the Liquid Tensor Experiment blog post, challenging the formalization of a difficult foundational theorem from my Analytic Geometry lecture notes on joint work with Dustin Clausen. While this challenge has not been completed yet, I am excited to announce that the Experiment has verified the entire part of the argument that I was unsure about. I find it absolutely insane that interactive proof assistants are now at the level that within a very reasonable time span they can formally verify difficult original research. Congratulations to everyone involved in the formalization!!

In this Q&A-style blog post, I want to reflect on my experience watching this experiment.

## Question: Who is behind the formalization?

Answer: It was formalized in the Lean Proof Assistant, mostly written by Leonardo de Moura from Microsoft Research, and used the extensive mathematical library (mathlib) written by the Lean community over the last four years. Immediately after the blog post, the Lean prover/mathlib community discussed the feasibility of the experiment on the Lean Prover Zulip Chat. Reid Barton did some critical early work, but then Johan Commelin has taken the leading role in this. In outline, Johan made an attack along the path of steepest ascent towards the proof, and handed off all required self-contained lemmas to the community. In particular, to get the project started, by January 14 he had formalized the statement of Theorem 9.4 of [Analytic], whose proof became the first target, and has now been completed on May 28, with the help of the Lean community, including (mathematicians) Riccardo Brasca, Kevin Buzzard, Heather Macbeth, Patrick Massot, Bhavik Mehta, Scott Morrison, Filippo Nuccio, Damiano Testa, Adam Topaz and many others, but also with occasional help from computer scientists like Mario Carneiro. Here is a link to the repository containing the formalised proof of Theorem 9.4, and you can also view its dependency graph, now fully green and completed.

## Question: How did you follow the project?

Answer: I joined the Zulip chat to answer any mathematical questions that may arise, but also as an interested spectator.

## Question: What is the significance of Theorem 9.4? Is the Liquid Tensor Experiment completed?

Answer: Theorem 9.4 is an extremely technical statement, whose proof is however the heart of the challenge, and is the only result I was worried about. So with its formal verification, I have no remaining doubts about the correctness of the main proof. Thus, to me the experiment is already successful; but the challenge of my blog post has not been completed. It is probably fair to guess that the experiment is about half-way done. Note that Theorem 9.4 abstracts away from any actual condensed mathematics, so the remaining half will involve a lot of formalization of things like condensed abelian groups, Ext groups in abelian categories, and surrounding machinery. The basics for this have already been built, but much work remains to be done.

## Question: How did the formalization proceed?

Answer: Initially, I imagined that the first step would be that a group of people study the whole proof in detail and write up a heavily digested version, broken up into many many small lemmas, and only afterwards start the formalization of each individual lemma. This is not what happened. Instead, the formalization followed quite closely the original lecture notes, and directly attacked Lemma after Lemma there. It did seem that the process was to directly type the proofs into Lean. Lean actually gives the user a very clear summary of what the current goal is, so one always needs to get a very clear sense of what the next few steps really are. Sometimes it was then realized that even on paper it does not seem clear how to proceed, and the issue was brought to attention in the chat, where it was usually quickly resolved. Only after a lemma was entirely formalized, the proof, now thoroughly digested, was again written up in the Blueprint in human readable form.

## Question: So it’s not actually a blueprint, is it?

Answer: Right — it’s not the blueprint from which the Lean code was formed, but (largely) the other way around! The Lean Proof Assistant was really that: An assistant in navigating through the thick jungle that this proof is. Really, one key problem I had when I was trying to find this proof was that I was essentially unable to keep all the objects in my “RAM”, and I think the same problem occurs when trying to read the proof. Lean always gives you a clear formulation of the current goal, and Johan confirmed to me that when he formalized the proof of Theorem 9.4, he could — with the help of Lean — really only see one or two steps ahead, formalize those, and then proceed to the next step. So I think here we have witnessed an experiment where the proof assistant has actually assisted in understanding the proof.

## Question: Was the proof in [Analytic] found to be correct?

Answer: Yes, up to some usual slight imprecisions.

## Question: Were any of these imprecisions severe enough to get you worried about the veracity of the argument?

Answer: One day I was sweating a little bit. Basically, the proof uses a variant of “exactness of complexes” that is on the one hand more precise as it involves a quantitative control of norms of elements, and on the other hand weaker as it is only some kind of pro-exactness of a pro-complex. It was implicitly used that this variant notion behaves sufficiently well, and in particular that many well-known results about exact complexes adapt to this context. There was one subtlety related to quotient norms — that the infimum need not be a minimum (this would likely have been overlooked in an informal verification) — that was causing some unexpected headaches. But the issues were quickly resolved, and required only very minor changes to the argument. Still, this was precisely the kind of oversight I was worried about when I asked for the formal verification.

## Question: Were there any other issues?

Answer: There was another issue with the third hypothesis in Lemma 9.6 (and some imprecision around Proposition 8.17); it could quickly be corrected, but again was the kind of thing I was worried about. The proof walks a fine line, so if some argument needs constants that are quite a bit different from what I claimed, it might have collapsed.

## Question: So, besides the authors of course, who understands the proof now?

Answer: I guess the computer does, as does Johan Commelin.

## Question: Did you learn anything about mathematics during the formalization?

Answer: Yes! The first is a beautiful realization of Johan Commelin. Basically, the computation of the Ext-groups in the Liquid Tensor Experiment is done via a certain non-explicit resolution known as a Breen-Deligne resolution. Although constructed in the 70’s, this seems to have not been in much use until it was used for a couple of computations in condensed mathematics. The Breen-Deligne resolution has certain beautiful structural properties, but is not explicit, and the existence relies on some facts from stable homotopy theory. In order to formalize Theorem 9.4, the Breen-Deligne resolution was axiomatized, formalizing only the key structural properties used for the proof. What Johan realized is that one can actually give a nice and completely explicit object satisfying those axioms, and this is good enough for all the intended applications. This makes the rest of the proof of the Liquid Tensor Experiment considerably more explicit and more elementary, removing any use of stable homotopy theory. I expect that Commelin’s complex may become a standard tool in the coming years.

## Question: Interesting! What else did you learn?

Answer: What actually makes the proof work! When I wrote the blog post half a year ago, I did not understand why the argument worked, and why we had to move from the reals to a certain ring of arithmetic Laurent series. But during the formalization, a significant amount of convex geometry had to be formalized (in order to prove a well-known lemma known as Gordan’s lemma), and this made me realize that actually the key thing happening is a reduction from a non-convex problem over the reals to a convex problem over the integers. This led me to ask my MathOverflow question whether such a reduction was known before; unfortunately, it did not really receive a satisfactory answer yet.

## Question: Did the formalization answer any questions from the lecture notes?

Answer: Yes, it did, Question 9.9 on the growth of certain constants. There are now explicit recursive definitions of these constants that are formally verified to work, and using this one can verify that indeed they grow roughly doubly-exponentially.

## Question: What did you learn about the process of formalization?

Answer: I learnt that it can now be possible to take a research paper and just start to explain lemma after lemma to a proof assistant, until you’ve formalized it all! I think this is a landmark achievement.

## Question: And about the details of it?

Answer: You know this old joke where a professor gets asked whether some step really is obvious, and then he sits down for half an hour, after which he says “Yes, it is obvious”. It turns out that computers can be like that, too! Sometimes the computer asks you to prove that , and the argument is “That’s obvious — it’s true by definition of and .” And then the computer works for quite some time until it confirms. I found that really surprising.

## Question: Can you read the Lean code?

Answer: The definitions and theorems are surprisingly readable, although I did not receive any training in Lean. But I cannot read the proofs at all — they are analogous to referring to theorems only via their LaTeX labels, together with a specification of the variables to which it gets applied; plus the names of some random proof finding routines. Still, I have the feeling that it should be possible to create a completely normal mathematical manuscript that is cross-linked with the Lean code that makes it possible to navigate the Lean code seamlessly — I think the creation of such an interface has also become a goal of the experiment.

## Question: So there will be more documentation work to do, also for the part that has just been completed?

Answer: Definitely! Currently, the Lean code leading up to the proof of Theorem 9.4 is not well-documented, and some parts of the proof could definitely be streamlined. Moreover, large parts of it are basic material that should become part of mathlib. It should be noted that because mathlib is constantly evolving, any project that uses it has to continually make small changes so that it will still compile with the newest version of mathlib. So it is vital that the parts of the proof of general interest are moved into mathlib, where they will be maintained.

## Question: What is the de Bruijn factor, i.e. the ratio between the length of the computer-verified proof and the length of the human proof?

Answer: It depends on the method of calculation, but somewhere around 20. I think this is amazingly small! I had expected that the first step of taking the lecture notes and turning them into a properly digested human proof — which as I said didn’t actually happen — would already introduce a factor of ~5. But the blueprint is actually only a factor of ~2.5 longer than the relevant part of the lecture notes right now.

## Question: How do you feel now that the proof has been broken down to the axioms of mathematics?

Answer: Good question! Usually the verification of a proof involves trying small variations of the argument and seeing whether they break or not, whether they lead to statements that are too strong etc., in order to get a sense of what is happening. Basically a proof is like a map of how to get up a mountain, say; it may be a nice, slightly winding path with a broad view, or it may lead through the jungle and up some steep wall, requiring climbing skills. Usually there’s not just one way up, and one may try whether taking a left turn the view is nicer, or taking a right turn one can take a shortcut.

In the case at hand, it feels like the main theorem is some high plateau with a wonderful view, but the way there leads through a large detour, to attack the mountain from the other side, where it is dark and slippery, and one has to climb up a certain steep wall; and no other pathways are seen left or right. Answering the questions in the Zulip chat felt like I would give instructions of the form “put your foot here, then your hand there, then pull yourself up this way” at the more difficult passages.

So I have gained the reassurance that it is possible to climb the mountain along this route, but I still have no sense of the terrain.

Congratulations on what constitutes an amazing achievement for all involved.

Let’s hope such landmark events will have lasting effect on the social process of understanding what constitutes good mathematics.

LikeLike

(just to be clear — this is Kevin Buzzard, not Peter Scholze). I really do not know what kind of effect these things will have. I regard this sort of thing as evidence of “the coming of age” of theorem provers, however I am still very unclear about the future of these things. Forget about the “computer proving the Riemann hypothesis using AI” stuff — even the far more reasonable “computer proof systems being used by researchers to check tedious lemmas / generalisations of results in the literature automatically” is still quite a way away, because the databases are still way too small right now.

LikeLike

Pingback: Various Math Items | Not Even Wrong

This is a question for Peter Scholze: What is it about this process that now convinces you, you have a correct and complete proof of this result? After all, you admitted that the details of the proof of each step are obscure to you. From reading your responses, my guess would be that it is the interactivity (via Zulip chat) that is persuasive — the right kinds of questions were asked of you, the responses to your proposed lemmas often gave what you expected and sometimes suggested a slight amendment that seemed plausible, etc. , all in line with the best kind of mathematical interactions. It appears that your interactions seemed to approximately imitate how one would enjoy a pedantic (and very capable) colleague working with you to make sure you got things correct (albeit communicating in a rather strange way). Is this how you feel? Let’s suppose that Lean had not pointed out any minor flaws at all but just said your original plan had been correct throughout– would you then feel as confident of the end product? Is it the mathematics or the interaction that makes you confident?

LikeLike

Sorry for the late answer, I’m only reading this comment now!

You are absolutely right, it was exactly the interactions via the Zulip chat that convinced me that a proper verification was going on. To me the computers didn’t matter at all, it could also have been Ofer Gabber. If I simply got a blanket “This is all correct” stamp, I would have been extremely suspicious, as I was completely sure that I made *some* slips.

Actually, there would be a very clear thing that might have gone wrong in this: Namely, that the formalized theorem statement isn’t quite what you think it is. Checking that it is indeed the correct statement is quite tedious — as I said Theorem 9.4 is really quite a technical statement, and to check that all definitions that enter it are exactly correct is still something that a human must do, and still a tedious task. The confidence that it is correct comes from looking at this in some detail, but also very much from seeing that during the process, they seemed to run into all the little nasty details that I expected (and sometimes not). When it is later proved that Theorem 9.4 implies the result of the challenge, these problems will also become less important, as the statement of the main theorem is less technical, so it should be somewhat easier to verify that all definitions are correct.

Maybe I didn’t stress this enough in the blog post above, but for me this whole experiment was a genuinely human experience, very similar to going through this with a very careful colleague.

LikeLike

[it’s me who should apologise — for some reason WordPress put Andrew’s message in spam and it took me a week to spot it. Sorry Andrew]

LikeLike

Congratulations to all involved!

Once well documented, is there a plan to publish this in a mathematics journal? If so :

A) how would the code be taken care of?

B) what kind of leeway would the referees have : if they suggest simplifications should they have checked them themselves in Lean (to avoid burdening the authors with further programming, which might cause headaches), or not necessarily?

LikeLike

I guess Clausen and Scholze can submit the “traditional” exposition of the work where they want — no doubt a traditional journal or book series. As for the formalisation, this sort of thing usually seems to end up in a computer science conference proceedings (for example this is where my work with Commelin and Massot on perfectoid spaces ended up, and also Commelin’s work with Lewis on Witt vectors). I have in the past refereed papers in this general area where I have suggested simplifications to formal proofs (via a paper sketch of an argument) which the authors themselves have implemented.

LikeLike

We haven’t really discussed yet how to publish this. I think it makes sense to first finish the complete challenge. We’ve reached a major milestone, but it’s a very technical intermediate step towards a standalone end result.

I imagine that we might have two publications (there is certainly enough material for that), one in a maths journal and another in a CS conference proceedings.

LikeLike

This is an impressive achievement – congratulations to all! I’ve been already very impressed at the variety of mathematical structures which LEAN can handle, but have seen only a few attempts at using Lean to check current research, which for the most part have been complicated for combinatorial or computational reasons, rather than conceptual.

I am curious to know just how many person-days (maybe person-years?) the verification took, for the following reason. A long time ago, professors would have their own secretaries to type their papers (I know that a few still do). Except for the extremely prolific, one secretary was usually enough. Now that almost everyone types their own papers, perhaps we should instead be looking for a “Lean-secretary” to check our proofs instead. But how many will we need? I’m not being frivolous – Kevin, I quite believe your expectation that in not such a long time, mathematicians will need to provide formal verifications of their results, at least in some contexts. But learning to code Lean (as I discovered to my chagrin at last year’s LFTCM workshop) is way harder than learning to write LaTeX, and I don’t imagine many mathematicians will have the time to master it; so there is a new and possibly very large sub-profession opening up – unless, that is, Lean tactics become sufficiently developed that ordinary mathematicians can read and write Lean proofs with about the same facility as they can handle TeX.

LikeLiked by 1 person

It’s difficult to guess the person-years. Probably under one? There was usually Johan working and often someone else.

Hales suggests in his 2017 Newton Institute Big Proof talk that formalisers should be part of “first response” team checking on manuscripts before they’re published. That’s what’s happened here (the project fixed some non-serious slips in the presentation as well as confirming that the argument worked).

I agree that lean is hard to learn. It’s fun to learn but there’s a certain knack to it too. Maybe better teaching tools would be helpful here.

LikeLike

This is very impressive, congratulations to everyone who is involved! I’m happy to see that this “experiment” (I feel this description undersells the project!) is going so well, and the blog post was an enjoyable read.

(1) Remark on the paragraph “Can you read the Lean code”:

I think the reason why Peter says he cannot read the proofs is that they are written using tactics. Each tactic essentially manipulates the current goal (or an assumption); e.g. if we want to prove X, then the tactic “induction” will make Lean say “okay, you now have to prove Y, but you can assume the induction hypothesis Z”. But the final proof script doesn’t tell you about X, Y, and Z. It only contains the word “induction”. Thus, to read the proof script on its own, you need to keep track of what X, Y, Z are yourself, and you can only do this if you know exactly what Lean actually does in each step (in the very beginning, X is the statement of the lemma). For me, the only way to read such a proof script is to go over each step together with Lean, i.e. make Lean tell me what X, Y, Z are. Is there a tool that takes a Lean proof and prints a human-readable version with assumptions and goals in each step, which one can then give to someone who hasn’t installed Lean?

The alternative to tactic-style proofs is proof terms, which I personally find easier to read. Actually, it’s probably entirely possible already with existing tools to write proofs that are understandable for anyone who understands the paper proofs. I remember seeing a formalisation that was written in natural language, and it really looked like a manuscript by a mathematician with limited vocabulary. This was not done using Lean, it was in Agda, and Agda allows lots of cute notational tricks which made this possible. Unfortunately, I can’t find it anymore and don’t remember who the author was – maybe Martín Escardó? In any case, I think it’s fair to say that “not being readable for non-experts” is not a reason to not have formalisations in the future.

(2) Comment on the paragraph “And about the details of it”:

I laughed when reading this! Indeed, it’s tempting to call something “obvious” when it’s by definition, but “by definition” means that unfolding definitions and simplifying is allowed. And this is hard work when done on paper. Proof assistants such as Lean or Agda can simplify a lot on their own (which is why they are *assisting*!).

Here (link) is an attempt to formalise results on homotopy groups of spheres. It’s “obvious” that the number “brunerie” in line 251 is just 2, in the sense that we know that a sufficiently powerful computer will confirm this after unfolding definitions and simplifying terms. However, nobody knows what “sufficiently powerful” means. People have tried it on very powerful machines but always ran out of time or memory before the calculation was complete. That’s why it says “don’t run me” in line 249.

LikeLike