I visited Microsoft Research in September and then came back to the chaos of a new academic year, a new cohort, and a new curriculum. Sorry it has taken me so long to write this post.

So at Microsoft I spent most of my time talking to Leo De Moura and Daniel Selsam. Leo is the main Lean developer, currently working very hard on Lean 4. Daniel is working on the IMO Grand Challenge. 30 years ago I think that was manifestly clear that no computer theorem prover from that time could ever get a gold medal on the IMO. Daniel is using Lean 4 and investigating how it could be integrated with AI. Daniel is adamant that he wants nothing less than a gold. My attitude was “if a computer manages to solve *just one IMO problem* by itself, by running the `IMO`

tactic on a formalised version of the question, then that would already be pretty awesome. But take a look at Machine Proofs In Geometry by Chou, Gao and Zhang and then look at these talk slides — computer scientists are developing tools which solve geometry problems, and remember that IMO geometry, unlike number theory, is a decidable domain (Tarski showed that basically bashing out the coordinates always works). Daniel grilled me about inequalities and I don’t think I did very well — I was never any good at those (or, for that matter, at the geometry questions). Fortunately for Daniel, Reid Barton visited Microsoft Research a few weeks later.

And I finally got to meet Leo properly. One interesting thing about Leo is that whilst he is giving me code which enables me to express all my mathematical ideas completely rigorously, Leo himself has no interest in learning any of the advanced mathematics which we have been doing in Lean. Leo is trying to write a programming language which runs on a certain kind of dependent type theory and which will compute quickly in a way which is provably bug-free. For example I’m sure Leo would be very interested in talking to people who wanted to use Lean to develop a bug-free operating system for a computer.

Having said that, I’m sure that my visit would have been a great opportunity to learn something about the Lean source code, to take a look at all those C++ files and try and find out what the computer actually does with its C++ objects when I’m telling it to apply induction. But I have no idea about C++ and I don’t think we talked about the code at all, even though he wrote most of it.

But Leo and I actually had loads of other things to say to each other. I gave a talk, he asked me what mathematicians needed and I told him some problems we’d had with getting our ideas into Lean. The boundary of my interest in the Lean Theorem Prover is usually the Tedious Implementation Issue . For example, there are about three ways of doing homomorphisms of rings in Lean, and they’re all what I as a mathematician would call *the same*. On the other hand, the computer scientists have technical conversations on the Lean chat about which implementation will suit our needs best, now they are beginning to understand them better. This is a key point. We had a ring morphism and an -module and we wanted to consider it as an -module. They said “Oh, we didn’t know you needed that functionality, and this is unfortunate because we’ve set it up so that modules could only be modules over one ring, and changing it will result in having to create a new copy of the module, which wouldn’t be very good. Hang on, we’ll fix it”. And they did, and it took Mario Carneiro a very long time to fix it, but fix it he did. And now using modules in Lean is quite similar to using modules in mathematics, as long as you know the module library. One day we’ll have automation which means you won’t even need to know the library that well, but sledgehammers in Lean are very much a work in progress right now.

What I want to do is *mathematics*, without ever caring about tedious implementation issues. You could say that the Xena project is trying to figure out how much these issues can be avoided. By encouraging undergraduate and PhD level mathematicians how to formalise their ideas in Lean we are getting better and better at telling computer scientists what modern mathematics looks like, and then people like Leo and others will be better informed when they are making big design decisions about how to write Lean 4. This is one of the reasons why what goes on at the Xena project is important.

Inequalities are an interesting case, because about 15 years ago well-trained IMO students became good at bashing certain kinds of inequalities using techniques such as Lagrange multipliers or Muirhead’s inequality, and as a result such bashable inequalities rather went out of fashion at the IMO and are much less likely to appear than they were in our day as contestants. Inequalities are thus less common now in IMO problems, and more likely to be of a kind that isn’t so straightforward by standard methods (though occasionally something might get through before anyone notices how to solve it by a standard technique, e.g. IMO 2006 problem 3 where after it was selected the Iranian observer found a short solution by Lagrange multipliers). On the IMO 2019 Problem Selection Committee, a problem being too routine by standard techniques, such as Lagrange multipliers, was one of the classes of reasons we gave to submitting countries for why their problems weren’t shortlisted, along with such reasons as “this result appeared in a journal in the 1930s” and “this problem or solution is wrong” (and the very common “other problems were more popular with the PSC”, inevitable given over 200 problem submissions).

It would be interesting to see whether computers becoming good at bashing IMO geometry problems affects the popularity of certain kinds of problems at the IMO. (I once gave a short talk about Tarski’s theorem at a training camp entitled “Why Geometry is Boring”, so my views on such clearly-algorithmically-decidable problems should be clear. I prefer combinatorial geometry and think IMO 2006 problem 6 is an excellent example of a good IMO geometry problem.)

Teaching computers the statements and proofs of standard inequalities, and how to prove other inequalities using them, feels useful (to mathematicians) in a way that teaching them how to solve Euclidean geometry problems isn’t (except where it provides more generically useful tools such as Gröbner bases): inequalities are of routine use in various areas of modern research mathematics in a way that synthetic Euclidean geometry problems and theorems aren’t. (I used both Hölder’s and Jensen’s inequalities, which I’d originally learned for olympiads, in my PhD. I also had an ugly inequality there that I failed to prove by olympiad methods and ended up proving by splitting the domain into regions and bounding the function in each region using numerical computation of values of the function and its derivatives, trusting that the underlying floating-point C library functions were sufficiently accurate for that to be valid. Such a proof is of no mathematical interest and clearly something best left to computers. And now that I know much more about floating-point, maintain all those floating-point functions in glibc and have found and fixed a great many bugs in them, I’d be wary of placing so much trust in them as a basis for a proof any more.)

LikeLike

As I’m sure you know Joseph, people have done formally verified case bashes now, where for each region you come up with a numerical argument and then formally verify that this argument is rigorous for this region. Hales/Ferguson’s proof of Kepler had around 70,000 inequalities which needed checking on various regions, proved using the kind of methods you’re talking about, and the Annals were very uncomfortable about accepting it for a long time (I heard recently that they tried to get the referees to re-implement the code, but the referees wouldn’t). Hales then formalised the proof completely.

LikeLiked by 1 person