Teaching formalisation to mathematics undergraduates

It’s been a hectic 2022 so far, but August is looking a lot calmer; this is the first of hopefully a few blog posts this month catching up on various things.

In this post I want to talk about the undergraduate course “Formalising Mathematics”, which I gave in January to March 2022 at Imperial College London. The course notes are here. The course was for 3rd and 4th year undergraduates and also for MSc students, and it was specifically about formalising mathematics in a theorem prover (in this case, the Lean theorem prover). I’m certainly not the first person to give such a course in a mathematics department — Patrick Massot has been teaching a formalisation course in Orsay to 1st year undergraduates for several years now — but probably I’m one of the first, so perhaps it’s worth recording what happened.

Overview of the course

In short, the aim was to get students formalising undergraduate level mathematics in Lean. In stark contrast to all the other courses offered by pure mathematicians at Imperial, there was no exam. Students taking the course were asked to submit three projects. The projects were completely open-ended; the first was “formalise some mathematics you learnt in your first year”, the second was “formalise some mathematics you learnt in your second year” and the third was “formalise some mathematics you learnt this year” (note in particular that the MSc students are thus forced to formalise harder material than the 3rd year undergraduates). The first project was due 4 weeks after the course started, the second 4 weeks after that, and the third 4 weeks after that. I’m extremely grateful to Imperial College for letting me run the course in such an unconventional way.

Because resources for this area (i.e., information about computer formalisation written with mathematicians in mind) are hard to come by, I spent a lot of time online helping students. One of the most exciting things for me about teaching the course was that even though I was under some kind of contractual obligation to give these people a mark out of 100 at the end of term, my primary motivation was to teach undergraduate mathematicians how to formalise undergraduate mathematics and with the set-up I’d created I found myself in a really good position to do this. Students would ask question after question about their projects and I would simply help them, or get them to help each other, or they would ask on Discord and get help there. Nobody could copy from other people because all students chose different projects to work on, and I made it absolutely clear to the students that it was fine for them to ask other students, or me, if they were stuck.

There are other courses (typically involving writing code, and run by applied mathematicians or statisticians) in my department which are assessed by projects, but typically in these courses everybody would be given the same project and the lecturer would have to think very hard about how to prevent plagiarism. I had no such problems. The course was a joy to run and a joy to mark; thank you to all 23 students who attended.

The projects

So what does a “project” consist of? Well, this cohort of students had no prior examples of projects, and of course I also didn’t know what a project was supposed to look like, because in some sense I was making the whole thing up as I went along. The students were told that a project consisted of a few hundred lines of Lean code, containing comments, backed up by a small pdf write-up of 5 or more pages, explaining what they’d done, what they’d learnt, and what they found hard. Some students were slightly thrown by this rather informal description but I assured them that it was not really possible to say anything more precise because we were all on a journey together. One concrete question, which they stumbled onto after a while, was “what will you be giving marks for?”. I thought this was eminently reasonable, so I told them that each project would be graded out of 100, with 30 marks for presentation, 30 marks for content, and 40 marks for “bonus”. This description seemed to placate them (it’s precisely the description of the mark scheme they’re given for other projects we run in the department, for example MSc projects) and we pressed on from there.

The first project

What I got was glorious. Of course there was a huge range in quality. Remember that this is a course for 3rd and 4th year students, but some of those students have been coming to my formalising club, the Xena Project, since they were 1st years, and one or two of them know the software and the mathematics library better than I do. Kexing Ying, a regular at the club, asked if he could formalise Egorov’s theorem in measure theory for his first project despite it not being taught in the first year; I said that this was fine and he went on to formalise the theorem and proof, wrote up 5 pdf pages on how he did it, and then made a pull request containing the proof to mathlib, Lean’s maths library. One could argue that students such as Kexing had an unfair advantage. I would respond that actually they had a fair advantage. If I had instead been teaching a 3rd year Galois theory course and was running a weekly Galois theory club, and a student had been coming along since their first year and learning Galois theory, then of course after over 2 years of this they are going to get a good mark in a Galois theory course, and they deserve to, because they’ve put in the work.

At the other extreme, I had some students who after 4 weeks were still struggling to get to grips with the system and who needed a lot of support from me and others, and who formalised some very basic stuff, sometimes quite poorly (in a stylistic sense, at least: the code compiled so the proofs were certainly correct!). Their write-ups were however very informative, as they explained what they’d found hard, or how their initial plans of formalising [something complicated] had to be hugely scaled back as they began to understand that sometimes even the simplest-looking things on paper can be difficult to teach a computer. Computers are relentlessly pedantic and also highly non-geometric; our intuition is that x + y = y + x is “obvious”, but this is a theorem, and you cannot make a formal proof from “draw a picture”. Of course the students did not need to worry about this, because the proof that x + y = y + x is already in the maths library so they can just use it.
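For readers who have never seen what "just use it" looks like, here is a minimal sketch. It assumes Lean 3 syntax and the Lean 3 mathlib names (`data.real.basic`, `add_comm`); it is an illustration written for this post, not something taken from a student project.

```lean
import data.real.basic

-- Commutativity of addition for real numbers: one line, because the
-- library already contains the proof under the name `add_comm`.
example (x y : ℝ) : x + y = y + x :=
add_comm x y

-- The same lemma applies to natural numbers; here it is used in tactic mode.
example (a b : ℕ) : a + b = b + a :=
begin
  exact add_comm a b,
end
```

The point is that the student never has to reprove the fact; they only need to know (or search for) the name of the lemma.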

I had told the students that each project would have an accompanying oral exam, where I would spend 15 minutes discussing the Lean code which the students had handed in, just to check that they had written it and had understood it. I (intentionally) gave few details about how the orals would run. Because we were still nominally in some kind of Covid situation I decided to run the first set of orals online. I will be honest and say that actually part of the idea here was so that I could get to know each student individually and make sure that they understood that for me the important thing was the learning objective: making sure they came away from the course being moderately good at formalising mathematics in a theorem prover. It was manifestly clear that each student understood the code they’d written; I gave encouragement to the weaker students and observed that now they clearly had “got the hang of it”, the second project would surely be much easier.

Examples of what students proved are below. I leave it to you to guess which students had used the software before the course had started.

  • All cyclic groups are abelian.
  • The first isomorphism theorem.
  • The Schroeder-Bernstein theorem.
  • (1+p)^n >= 1+np for natural numbers n and p, and other similar results.
  • Egorov’s Theorem.
  • The category of small categories is complete (note: also not taught to our first years).
  • A sequence of real numbers is Cauchy iff it’s convergent.
  • The Bolzano-Weierstrass theorem.
  • The theory of nets and its relationship to the theory of filters (not on the first year syllabus).
  • If a product of two positive coprime naturals is a k’th power then each natural is a k’th power.

NB in case you didn’t guess, the last one was done by a seasoned Lean user and is rather fiddly to formalise.

Marking the first project

One mistake I made was being too generous with marks for the first project. I certainly didn’t go over the top, but I wish I’d left myself some more room. There were some students who had clearly tried hard but had ultimately produced some very average work for their first project, and I rewarded them with a reasonable grade, which meant that later on when they were becoming more competent I was not able to say things like “you have improved vastly and this is reflected in the vast improvement in your grade”. However something I had not mentioned before was that the relative weights of the three projects were 20:30:50, so really everyone knew that the important project was the final one.

The lectures

I haven’t mentioned the lectures yet! I was given two hours with these poor souls, once a week for 11 weeks. This is the only course I have ever given where attendance at the lectures went up as the term progressed, although this might be partly due to the fact that Covid was much less of a thing in March than it was in January in the UK. The lectures were extremely informal. I would typically take some topic, e.g. one of the earlier ones was “the theory of sets, subsets, unions and intersections (both finite and infinite)” and I would just formalise some basic results from first principles, and then typically explain that they were already in Lean’s maths library, and then go on to some more difficult stuff. I was very open to questions. I would occasionally write on the board but it was mostly me live coding and taking questions. The hardest part was deciding what topics to cover, but given that I was generating the course notes on the fly and the students were thinking about what topics their projects would be on, there was no shortage of suggestions. For most topics (there ended up being about 15) I would set some example sheets for the students at the course Github repository and video myself solving the problem sheets live and then dump the results on a YouTube playlist. The results are not always pretty to watch, but Talia Ringer mentioned once on Twitter that she thought it was a good idea to let students see you thinking in real time about course material (not least because they’ll see that you sometimes struggle just like them), and I agree with her, so that was what the students got.
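To give a flavour of the "first principles, then the library" pattern, here is a small sketch in the style of the sets lecture. It assumes Lean 3 with mathlib; the names `set.inter_subset_left` and `set.inter_comm` are the Lean 3 mathlib names as I remember them, and the examples themselves are illustrative rather than copied from the course repository.

```lean
import tactic
import data.set.basic

variables {X : Type} (A B : set X)

-- From first principles: a subset statement is a "for all x" statement,
-- and membership of an intersection is an "and" statement.
example : A ∩ B ⊆ A :=
begin
  intros x hx,  -- hx : x ∈ A ∩ B
  exact hx.1,   -- the first component of hx is a proof of x ∈ A
end

-- The same fact, quoted from the library.
example : A ∩ B ⊆ A :=
set.inter_subset_left A B

-- Something slightly longer: intersection is commutative.
example : A ∩ B = B ∩ A :=
begin
  ext x,        -- two sets are equal if they have the same elements
  split,
  { rintro ⟨hA, hB⟩, exact ⟨hB, hA⟩ },
  { rintro ⟨hB, hA⟩, exact ⟨hA, hB⟩ },
end

-- And again from the library.
example : A ∩ B = B ∩ A :=
set.inter_comm A B
```

Doing the proof by hand first shows what the definitions unfold to; quoting the library afterwards shows what one would actually write in a project.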

The lectures, and in particular the topics of the lectures, were really the one part of the course where I ended up making big changes to my plans. Last academic year, as preparation for this course, I had given a multi-center graduate level course for students at Bath, Bristol, Imperial, Oxford and Warwick, and you can see the topics I covered at that link. The material covered there was basically what I was planning to do with the undergraduates. However when it came to doing things like filters the students actively discouraged me from covering the material. Why? Well, it’s obvious why, when you think about it. We don’t teach them filters in our undergraduate degree, and their task was to formalise stuff from their undergraduate degree, so why take a detour through filters when I could be doing things like basic ring theory and other stuff which they’d in practice find much more useful? This was a shock to me and involved me having to generate a bunch of material at very short notice, but given that I am so obsessed with the software and am happy to have any excuse to formalise anything in it, I played along. In particular, we have a popular graph theory course at Imperial so I was forced to learn how to use the graph theory part of mathlib, something I’d never looked at all before.

The second project

The second project was due 8 weeks after the start of the course, and I really hoped that by this point every student would be on top of the use of the software, but of course it’s very easy to forget (at least if you’re as naive as I am) that (a) the students are actually also studying other courses and (b) theorem proving software is really hard to learn. Some students were still struggling. I still had all the time in the world to help them out though, and of course students were also developing little communities (often online) where they’d help each other out.

Something I haven’t mentioned before: I was the victim of a loophole in the rules. Some of the students doing the course were doing joint mathematics and computer science degrees, and some of those people chose to formalise results which they perceived as “mathematics” but which were getting beyond my pay grade. For their second project, one student formalised frame definability for modal logic and basically I had to learn the theory from their write-up and the references, and then judge the project afterwards. That wasn’t supposed to happen. Another student proved completeness and compactness of a system of logic I’d never heard of. Next year I am unfortunately going to have to ban formalisation of material which was not taught in the mathematics department, for my own sanity.

Again I got a huge variation of projects. A student formalised topology from first principles and developed a bunch of theory. Topology was quite popular in fact, with other students proving facts such as “continuous image of compact is compact” and so on. A student formalised facts about the arithmetic of the Gaussian integers Z[i] (for example that it was a UFD). Some students simply took problem sheet questions from other classes and formalised these: I got worked examples to problem sheet questions from analysis and ring theory, and I quite liked those projects, not least because for some “prove a standard theorem” projects the student may well produce a proof which is somehow much worse than the proof which is already in the maths library, whereas for worked solutions to problem sheet questions you can be almost guaranteed that the results will not be in the library already. The students who were already fluent in Lean (and typically tended to be more advanced mathematically too) again took the option to do some fancy abstract stuff (more category theory, for example). Some people were too ambitious and realised that their plans needed to be curtailed; students began to ask whether they could assume results without proof and then build on them. I said yes, so some of the projects were incomplete in the sense that they assumed various mathematical theorems as axioms and then built on them. One student assumed something false as an axiom (it was nearly true but they made a slip) and this presented me with quite a conundrum; I had to really think hard about how they had used the false statement and whether the related true statement would have sufficed. This project was hard to mark.
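To see why a false axiom is such a headache for a marker, here is a minimal sketch in Lean 3. Both axioms below are hypothetical and invented purely for illustration (they are not the statements any student actually assumed): one is true and harmless, the other contains a small slip.

```lean
-- Hypothetical: a true statement assumed without proof, to be built on later.
axiom assumed_fact : ∀ n : ℕ, n < n + 1

-- Hypothetical: a "nearly true" statement. It holds for every n except 0,
-- so as written it is false.
axiom slipped_fact : ∀ n : ℕ, 0 < n

-- Lean accepts anything that follows from the axioms, and from a false
-- axiom everything follows, including nonsense.
example : (2 : ℕ) + 2 = 5 :=
(nat.lt_irrefl 0 (slipped_fact 0)).elim
```

The code compiling is therefore no longer evidence that the mathematics is right: one has to read how the assumed statement was actually used and check whether the corrected version would have been enough.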

By this point I was getting tired, and I really didn’t want to have to lose an entire day doing orals for 23 projects when I knew that students were engaged with the course and were not cheating (indeed by this point I was on first name terms with most of the students and had talked to most of them about their projects several times before the deadline), so I told the students to each find another student and ask them to give them an oral and then to report back to me. I thus had to come up with guidelines for being an oral examiner. The guidelines said “get the other student to show you some of their code, look through it, choose a part you don’t understand and then get them to explain it to you. Once you understand it, they’ve passed; let me know this”. Note that the orals were not worth any points so I felt academically happy to do this.

As part of the marking process I would write feedback for the students. By the second project I had understood that this was an important part of the process and I spent a huge amount of time writing long feedback for most of the students (which kind of made up for the fact that I’d dodged doing the second orals, at least in my mind). Conversely, I had noticed that some of the write-ups were twice the length I had recommended, because students were keen to explain what they had done and would ask if it was OK to write more than 5-7 pages; I said “sure”.

The final project

By the last few weeks of the course, we finally all knew what we were doing. I would spend time on material which students had requested, showing the class how to formalise it in Lean. Students came up with ideas about what to formalise very early in the four week cycle; they had four weeks to prepare each project but by this stage it seemed to me that many of them started the four weeks knowing exactly what they were going to be doing. Even the weaker students had figured out a good strategy — don’t bite off more than you can chew, and ask if you need help. Students were completely open about getting help — they would openly say in their write-ups “I got totally stuck when trying to do X, so I asked student Y who explained a really nice trick; lines 153-163 of my code were written by them”. All this was fine. Remember that I didn’t care at all about the marks or how the student had got there — as far as I was concerned the goal was to achieve the learning objective, which was that by the end of the course the students should know something about formalising mathematics, and ideally mathematics at the level they were currently at academically, in a theorem prover. Here’s some of what I got:

  • The Vitali convergence theorem.
  • Box product of two graphs and proof that the product was connected iff both graphs were connected.
  • Chinese remainder theorem for commutative rings.
  • Lagrange’s 4 square theorem (incomplete proof but only assuming true things).
  • If P is a presheaf on a category then the category of presheaves over the category of elements of P is equivalent to the over-category of P.
  • Categorical semantics of the simply-typed lambda calculus (gaargh).
  • Classification of integer solutions to y^2=x^3-1.
  • Theorems about economic models involving quasi-concavity (gaargh).
  • Hensel’s Lemma (fully proved, for complete nonarchimedean fields).
  • Solution of a number theory exam question about which rings Z/nZ have unit group with exponent 2.
  • Existence of nontrivial integer solutions to Pell’s equation.

By this stage we had somehow all got the hang of things. Students in general asked for much less help, and had a much better feeling for what was feasible. In contrast to the first project, where everyone proved everything, here several people assumed some facts as axioms in their projects (this time all the facts were true though). Marking stuff which students had learnt this year in courses from departments other than the mathematics department was very hard and as I’ve said, will be banned next year. Again I gave copious feedback (even though in some sense it was irrelevant because the course was now over; however I found that it was very easy to find things to say). The “oral” this time was replaced by an internal mini-conference which was going to be hybrid but at the last minute was forced online because of Covid; all the students showed up and each one gave a three minute presentation on their work to the rest of the class. Some were great, some were disastrous (when a student says “wow is that 3 minutes already?” this can be interpreted as “I didn’t practice”), but the course was over by this point and the orals were worth nothing; I was just super-pleased that everyone showed up!

So that was it. As is probably clear, I loved it. I got very positive feedback from the students too. The course notes are still kind of incomplete, and I am not particularly motivated to complete them because Lean 4 is on the horizon and I’ll have to rewrite everything when the port of the mathematics library happens. In the meantime Jeremy Avigad is leading a team which is working on a much more mature document, Mathematics In Lean, and when I’ve finished up writing this and a couple more blog posts I’ll start writing a chapter on linear algebra for that.

The course is running again next year 😀


About xenaproject

The Xena Project aims to get mathematics undergraduates (at Imperial College and beyond) trained in the art of formalising mathematics on a computer. Why? Because I have this feeling that digitising mathematics will be really important one day.

2 Responses to Teaching formalisation to mathematics undergraduates

  1. Filippo says:

    Great post, and great report: thank you. I hope I will be able to teach something similar in the future. As an aside, the scene where you are forced to study frame definability for modal logic is really hilarious! But reassure me: you made it on purpose to have 23 students, knowing that 23 is the first prime with non-trivial relative class group, right? Not like 59… 😉


  2. kkytola says:

    Nice report and a fantastic course!

    At our university, we started a (currently very very small) formalization club during the second half of the Spring semester 2022 (the COVID situation in the early half of the semester prevented live meetings, so we postponed the originally planned launch). The Natural Number Game remains our main recommendation to new members, simply because the web interface lets you avoid the possibly frustrating experience of installation. But for those who would do the installation, this course immediately became the favorite recommended material.

    We offer a lot less help and instruction than the students in your course would have had, but the material is good enough even for such use, and the students do help each other. Section 1 on logic is a great start, because everyone gets through it, and yet the puzzles there provide the right game feeling. Already section 2 on real numbers has turned out rather difficult, however. I wish I had a clear idea why. It must be partially because formalization just is hard and this is the first part that addresses actual math, but I believe there must also be something else at play (since many people feel the subsequent sections are easier). To be clear, this is absolutely not a suggestion to change anything: I think having real number sequences before groups is no doubt the better order for almost all of our participants, and everything is really well designed.

    I of course also thoroughly enjoyed playing the course myself :). I marvel at how you managed to make each section, including even the first one on logic, (a) playable and (b) containing some genuine insights into math and/or formalization.

    This seems like the best existing introduction to formalization for math students (and mathematicians). Thanks so much for it!

