What *is* maths? I think it can basically be classified into four types of thing. There are definitions, true/false statements, proofs, and ideas.

**Definitions** (for example the real numbers, or pi) and **true/false statements** (for example the statement of Fermat’s Last Theorem or the statement of the Riemann Hypothesis) are part of the *science* of mathematics: these are black and white things which have a completely rigorous meaning within some foundational system.

**Proofs** are in some sense the currency of mathematics: proofs win prizes. Constructing them is an art, checking them is a science. This explains, very simply, why computer proof verification systems such as Lean, Coq, Isabelle/HOL, Agda… are much better at checking proofs than constructing them.

And **ideas** are the purely artistic part of mathematics. That “lightbulb” moment, the insight which enables you to solve a problem — this is the elusive mathematical idea.

Ideas are the part of mathematics that I understand the least, in a formal sense. Here are two questions:

- What is a group?
- How do you think about groups?

The first one is a precise “scientific” question. A group is a set equipped with some extra structure, and which satisfies some axioms. The formal answer is on Wikipedia’s page on groups. A group is a definition. But the second question is a different kind of question. Different people think about groups in different ways. Say is a group generated by an element satisfying . What can you say about ? If you are a mathematics undergraduate who has just seen the formal definition of a group, you can probably say nothing. If you have a more mature understanding of group theory, you instantly know that this group is trivial, because you have a far more sophisticated model of what is going on. Ideas are complicated, and human-dependent. A computer’s idea of what a group is, is literally a copy of the definition in Wikipedia, and this is one of the reasons that computers are currently bad at proving new theorems by themselves. You can develop a computer’s intuition by teaching it theorems about groups, or teaching it examples of groups, or trying to write AI’s which figure out group theory theorems or examples of groups automatically. But intuition is a very subtle thing, and I do not understand it at all well, so I will say no more about these ideas here. I think that the concept of a map being “canonical” is an idea rather than a definition — I think different mathematicians have different ways of thinking about this weasel word. In this post I’m going to talk about how the three other concepts are implemented in type theory, in the Lean theorem prover.

# Definitions, true/false statements, and proofs

In contrast to ideas, the other parts of mathematics (the definitions, theorems/conjectures, and proofs) can be formalised in a foundational system, and hence can be created and stored on a computer in a precise way. By this, I don’t mean a pdf file! Pdf files are exactly what I want to move away from! I mean that people have designed computer programming languages which understand one of the various foundations of mathematics (set theory, type theory, category theory) and then mathematicians can write code in this language which represents the definition, true/false statement or proof in question.

I am certainly not qualified to explain how all this works in category theory. In set theory, let me just make one observation. A definition in set theory, for example the definition of the real numbers, or , is a *set*. And a proof is a sequence of steps in *logic*. A definition and a proof seem to me to be two completely different things in set theory. A group is a mixture of these things — a group is an ordered quadruple satisfying some axioms, so it’s a set with some logic attached.

In type theory however, things are surprisingly different. All three things — definitions, true/false statements, and proofs — are *all the same kind of thing!* They are all **terms**. A group, a proof, the real numbers — they are all terms. This unification of definitions and proofs — of sets and logic — are what seems to make type theory a practical foundational system for teaching all undergraduate level mathematics to computers.

# Universes, types, and terms.

In type theory, *everything* is a term. But some terms are types. Not every term *is* a type, but every term *has* a type. A colon is used to express the type of a term in Lean — the notation `x : T`

means that `x`

is a term of type `T`

. For example, the real number `π`

(pi) is a term in Lean, and the real numbers `ℝ`

is a type, and we have `π : ℝ`

, that is, `π`

is a term of type `ℝ`

. In set theory one writes `π ∈ ℝ`

, in type theory we write `π : ℝ`

. They both express the same mathematical concept — “`π`

is a real number”.

Now `π`

is a term but it’s not a type. In Lean, `x : π`

*makes no sense*. In set theory, `x ∈ π`

does happen to make sense, but this is a weird coincidence because everything is a set. Furthermore, the actual elements of `π`

will depend on how the real numbers are implemented (as Dedekind cuts or Cauchy sequences, for example), and hence in set theory `x ∈ π`

has no *mathematical* meaning; it happens to make sense, but this is a quirk of the system.

I claimed above that every term has a type. So what is the type of `ℝ`

? It turns out that `ℝ : Type`

. The real numbers are a term of a “universe” type called `Type`

— the type theory analogue of the class of all sets.

Many of the mathematical objects which mathematicians think of as definitions either have type `Type`

, or have type `T`

where `T : Type`

. As a vague rule of thumb, things we write using capital letters (a group, a ring,…) or fancy letters (the reals, the rationals) have type `Type`

, and things we write using small letters (an element `g`

of a group, a real number `r`

or an integer `n`

) have got type `T`

where `T`

is what we think of as the set which contains these elements. For example `2 : ℕ`

and `ℕ : Type`

, or if is an element of the group then in Lean we have `g : G`

and `G : Type`

. You can see that there is a three-layer hiearchy here — terms at the bottom, types above them, and the universe at the top.

- Universe :
`Type`

- Examples of types :
`ℝ`

,`ℕ`

,`G`

(a group),`R`

(a ring),`X`

(something a set theorist would call a set), a Banach space, etc. Formally, we say`ℝ : Type`

. - Examples of terms:
`π`

(a term of type`ℝ`

),`g`

(an element of the group`G`

, so a term of type`G`

),`x`

(an element of`X`

, so a term of type`X`

). Formally, we say`g : G`

.

This hierarchy is more expressive than the hierarchy in set theory, where there are only two levels: classes (e.g. the class of all sets), and sets.

There is a standard use of the colon in mathematics — it’s in the notation for functions. If `X`

and `Y`

are sets (if you’re doing set theory) or types (if you’re doing type theory), then the notation for a function from `X`

to `Y`

is `f : X → Y`

. This is actually consistent with Lean’s usage of the colon; Lean’s notation for the collection of functions from `X`

to `Y`

is `X → Y`

, which is a type (i.e. `X → Y : Type`

, corresponding to the fact that set theorists think of as a set), and `f : X → Y`

means that `f`

is a term of type `X → Y`

, the type-theoretic version of , and the way to say that `f`

is a function from `X`

to `Y`

in type theory.

(Not for exam) Strictly speaking, universes are types, and types are terms, but this is a linguistic issue: often when people speak of types, they mean types which are not universes, and when people speak of terms they mean terms which are not types. But not always. This confused me when I was a beginner.

# Theorems and proofs

This is where the fun starts. So far, it just looks like a type is what a type theorist calls a set, and a term is what they call an element. But let’s now look at another universe in Lean’s type theory, the universe `Prop`

of true/false statements, where our traditional mental model of what’s going on is quite different. We will see how theorems and proofs can be modelled in the same way as types and terms.

So, how does this all work? As well as the universe `Type`

, there is a second universe in Lean’s type theory called `Prop`

. The terms of type `Prop`

are true/false statements. There is an unfortunate notation clash here. In mathematics, the word proposition is often used to mean a baby theorem, and theorems are true (or else they would be conjectures or counterexamples or something). Here we are using the the word Proposition in the same way as the logicians do — a Proposition is a generic true/false statement, whose truth value is of no relevance.

This will all be clearer with examples. `2 + 2 = 4`

is a Proposition, so we can write `2 + 2 = 4 : Prop`

. But `2 + 2 = 5`

is also a Proposition, so `2 + 2 = 5 : Prop`

as well. I’ll say it again — Propositions do not have to be true! Propositions are true/false statements. Let’s see some more complex examples.

The true/false statement that for all natural numbers is a Proposition: in Lean this can be expressed as `(∀ x : ℕ, x + 0 = x) : Prop`

. A Proposition is a term of type `Prop`

(just like the types we saw earlier were terms of type `Type`

). Let `RH`

denote the statement of the Riemann Hypothesis. Then `RH : Prop`

. We don’t care if it’s true, false, independent of the axioms of mathematics, undecidable, whatever. A Proposition is a true/false statement. Let’s look at the part of Lean’s type theory hierarchy which lives in the `Prop`

universe.

- Universe:
`Prop`

- Examples of types :
`2 + 2 = 4`

,`2 + 2 = 5`

, the statement of Fermat’s Last Theorem, the statement of the Riemann Hypothesis. - Examples of terms: ??

So what are the terms in this three-layer `Prop`

hierarchy? They are the proofs!

# Propositions are types, proofs are terms.

This is where the world of type theory seriously diverges from the way things are set up in set theory, and also the way things were set up in my brain up until three years ago. In trying to understand what was going on here, I even realised that mathematicians take some liberties with their language here. Before we start, consider this. The Bolzano-Weierstrass theorem is some statement in analysis about a bounded sequence having a convergent subsequence. I want to talk a little bit about how mathematicians use the phrase “Bolzano-Weierstrass theorem” in practice. A mathematician would say that the Bolzano-Weierstrass theorem is this statement about sequences having convergent subsequences. But if they are in the middle of a proof and need to apply it in order to continue with their proof, they say “by the Bolzano-Weierstrass theorem we deduce that there’s a convergent subsequence”. Nothing seems at all funny about any of this. But what I want to point out is that mathematicians are using the phrase “the Bolzano-Weierstrass theorem” in two different ways. When they say what it *is*, they are referring to the *statement* of the theorem. But when they say they’re *using* the Bolzano Weierstrass theorem, what they are actually using is its *proof*. The Birch and Swinnerton-Dyer conjecture is a perfectly well-formed true/false statement, you can certainly say what it is. But you can’t *use* the Birch and Swinnerton-Dyer conjecture in the middle of a proof of something else if you want your proof to be complete, because at the time of writing the conjecture is an unsolved problem. Making a clear distinction between the statement of a theorem, and the proof of a theorem, is important here. A mathematician might use the phrase “the Bolzano-Weierstrass theorem” to mean either concept. This informal abuse of notation can confuse beginners, because in the below it’s really important to be able to distinguish between a theorem statement, and a theorem proof; they are two very different things.

In the natural number game, I use this abuse of notation because I am trying to communicate to mathematicians. The statement `∀ x : ℕ, x + 0 = x`

is a true statement, and I say things like “this is called `add_zero`

in Lean”. In the natural number game I write statements such as `add_zero : ∀ x : ℕ, x + 0 = x`

. But what this *means* is that the term called `add_zero`

in Lean is a *proof* of `∀ x : ℕ, x + 0 = x`

! The colon is being used in the type theory way. I am intentionally vague about this concept in the natural number game. I let mathematicians believe that `add_zero`

is somehow equal to the “idea” that for all . But what is going on under the hood is that `∀ x : ℕ, x + 0 = x`

is a Proposition, which is a type, and `add_zero`

is its proof, which is a term. Making a clear distinction between the statement of a theorem, and its proof, is important here. The statements are the types, the proofs are the terms.

- Universe:
`Prop`

- Examples of types:
`2 + 2 = 4`

,`2 + 2 = 37`

, the statement of Fermat’s Last Theorem —`∀ x y z : ℕ, n > 2 ∧ x^n + y^n = z^n → x*y = 0`

. - Examples of terms: the proof that
`2 + 2 = 4`

(a term of type`2 + 2 = 4`

), the proof of Fermat’s Last Theorem (a term of type`∀ x y z : ℕ, n > 2 ∧ x^n + y^n = z^n → x*y = 0`

)

# Elements of a theorem

So our mental model of the claim `π : ℝ`

is that `ℝ`

, the type, is “a collection of stuff”, and `π`

, the term, is a member of that collection. If we continue with this analogy, it says that the statement `2 + 2 = 4`

is some kind of collection, and a proof of `2 + 2 = 4`

is a member of that collection. In other words, Lean is suggesting that we model the true/false statement `2 + 2 = 4`

as being some sort of a set, and a proof of `2 + 2 = 4`

is an element of that set. Now in Lean, it is an inbuilt axiom that all proofs of a proposition are *equal*. So if `a : 2 + 2 = 4`

and `b : 2 + 2 = 4`

then `a = b`

. This is because we’re working in the `Prop`

universe — this is how Propositions behave in Lean. In the `Type`

universe the analogue is not *remotely* true — we have `π : ℝ`

and `37 : ℝ`

and certainly `π ≠ 37`

. This special quirk of the `Prop`

universe is called “proof irrelevance”. Formally we could say that if `P : Prop`

, if `a : P`

and if `b : P`

then `a = b`

. Of course if a Proposition is false, then it has no proofs at all! It’s like the empty set. So Lean’s model of Propositions is that the true ones are like sets with 1 element, and the false ones are like sets with 0 elements.

Recall that if `f : X → Y`

then this means that `f`

is a function from `X`

to `Y`

. Now say and are Propositions, and let’s say that we know . What does this *mean*? It means that implies . It means that if is true, then is true. It means that if we have a proof of we can make a proof of . It is a function from the proofs of to the proofs of . It is a function sending an element of to an element of . It is a term of type `P → Q`

. Again: a proof of is a term `h : P → Q`

. This is why in the natural number game we use the `→`

symbol to denote implication.

Let `false`

denote a generic false statement (thought of as a set with 0 elements), and let `true`

denote a generic true statement (thought of as a set with 1 element). Can we construct a term of type `false → false`

or a term of type `true → true`

? Sure — just use the identity function. In fact, in both cases there is a unique function — the hom sets have size 1. Can we construct a term of type `false → true`

? Sure, there is a function from the set with 0 elements to a set with 1 element, and again this function is unique. But can we construct a term of type `true → false`

? No we can’t, because where do we send a proof of `true`

? There are no proofs of `false`

to send it to. So `true → false`

is a set of size 0. This corresponds to the standard truth table for `→`

, where the first three statements we analysed were true and the last was false.

# The proof of Fermat’s Last Theorem is a function

So what does a proof of `∀ x y z : ℕ, n > 2 ∧ x^n + y^n = z^n → x*y = 0`

look like? Well, there is an arrow involved in that Proposition, so the statement of Fermat’s Last Theorem is some kind of set of the form , which means that in Lean, a proof of Fermat’s Last Theorem is actually a function! And here is what that function does. It has four inputs. The first three inputs are natural numbers `x`

, `y`

and `z`

. The fourth input is a proof: it is a proof of the Proposition `n > 2 ∧ x^n + y^n = z^n`

. And the output of this function is a proof of the Proposition `x*y = 0`

. This is quite an unconventional way to think about what the proof of Fermat’s Last Theorem is, and let me stress that it does not help at all with actually trying to understand the proof — but it is a completely consistent mental model for how mathematics works. Unifying the concept of a number and a proof — thinking of them both as terms — enables you to think of proofs as functions. Lean is a functional programming language, and in particular it is designed with functions at its heart. This, I believe, is why theorem provers such as Lean, Coq and Isabelle/HOL, which use type theory, are now moving ahead of provers such as Metamath and Mizar, which use set theory.

Prove a theorem! Write a function!

Universe | `Type` | `Prop` |

Type examples | ℝ | 2 + 2 = 5, `(∀ a : ℕ, a + 0 = a)` |

Term examples | `37` , `π` | `add_zero` , `rfl` |

Great post 😁

LikeLike

Metamath is completely agnostic to logic and axioms that were used. It doesn’t have set theory or type theory embedded in its kernel. It just happened that most of its current proofs were developed in ZFC. There is alternative Metamath foundation based on HOL which derives and proves ZFC axioms from HOL. BTW, this was the work of Mario Carneiro. AFAIC this is very similar to Isabelle/ZFC and Isabelle/HOL distinction.

LikeLiked by 1 person

Thanks! I should be more careful blundering in and talking about things I don’t understand well.

LikeLike

Could you explain how type theory deals with “extra structure” (“signatures” in first order logic or universal algebra)? What I am thinking about is that a group has a multiplication operation as well as elements. In set theory, a group is an ordered pair (G, *) where * is a function G x G -> G (or maybe you represent it as an ordered triple or quadruple also including the identity element and/or inverse function) – so for a set theorist it is an abuse of notation to say “Let G be a group and g \in G”. You explained that in type theory, g is a term of type G, but where does * fit in?

LikeLike

I didn’t talk about ways to make types, but Lean allows you to make so-called structures, which are types which bundle together terms (in particular data and axioms can get bundled together in a structure). So if `G : Type` then I can make the type of group structures on G, called `group G`, to be a new type; to give a term of type `group G` is precisely to give a multiplication, inverse, identity, and the proofs of the axioms for a group. The problem with that as it stands is that I need to find a way to carry this term around in an invisible way, and Lean does this with something called type class inference — the same trick which Haskell uses to carry around notation, but beefed up. Type class inference is hard to get right — and it’s not something users can mess with in Lean 3 because it’s hard coded into the C++. It does the job of generating proofs that a field is a monoid etc, all behind the scenes. It uses a prolog-like search to solve problems. It’s being completely rewritten in Lean 4. You can see me making groups here: https://github.com/ImperialCollegeLondon/group-theory-game/blob/master/src/group/definitions.lean and you can see how the user interacts with them here: https://github.com/ImperialCollegeLondon/group-theory-game/blob/master/src/group/theorems.lean

LikeLike

Thanks! So if I understand correctly, it’s not really true that “a group G is a type”. Rather the underlying set G of a group (G, *) is a type, and the group (G, *) is a term of type `group G`.

Am I right to think that type class inference in Lean does something similar to the abuse of notation which we do all the time in maths of writing G to mean both the group and the underlying set? I don’t know much about Haskell, but this seems consistent with my vague idea about what type classes do there.

Actually now I wonder if the same thing happens with sets: is there a type `set S`(where S is a type) which is non-empty if and only if S is a set? Is there at most one term of type `set S`, or is there some way you can have more than one?

LikeLike

Well there are two ways to set it up. The link I showed you is the way to do it so that the total amount of data involved in saying “let G be a group” is two terms, one being the type G (which is just the set), and the other being a term of type `group G` (which is the structure and the proofs of the axioms). However we also have the category of groups, and this is called `Group`, and so you can make a group by just giving a term of type Group (and this one term then contains all the data, the set/type and the structure and the proofs). There are invisible functions which take you between the two definitions.

Speaking about the method corresponding to the links I posted, what’s going on is that in practice we let G be a type, and then if G is a group we insert the term of type `group G` into the type class inference system and then forget about it. And then if we have `g : G` and `h : G` and write `g * h`, Lean thinks “hmm, where is this multiplication coming from” and asks the typeclass inference system if it can find a term of type `has_mul G` (to create a term of type `has_mul G` you just have to give a map G x G -> G, that’s the only constructor; the multiplication notation `*` is attached to this typeclass). The typeclass inference system can find a term of type `group G`, and it extracts the term of type `has_mul G` from that. As you can imagine, there’s a gigantic graph of these typeclasses — a perfectoid ring is a topological space and a multiplicative monoid and a Huber ring and… — and this is where the prolog comes in I think (in particular when you treat the product of two topological rings as a topological ring without comment — the typeclass system can handle this; I guess that’s when you start having to be clever about these things).

There’s no such thing as sets in type theory. What someone like you thinks of as a set, I guess I now think of as a term of type `Type`. Coincidentally, there is a type `set S`, if `S : Type`, but it is the type of subsets of S (I often thought it was poorly named). If G is a group then `subgroup G` is the type of subgroups of G.

LikeLike

Ah! I had assumed that some types are sets (like the natural numbers) while others are not (like propositions). But maybe that’s not a useful notion.

This made me wonder: what is `set S` when S is a type which I don’t think of as a set? So I tried typing `#reduce (set true)` into the Lean web editor and got an error I don’t understand.

LikeLike

`true : Prop` and `set : Type -> Type` so `set` won’t eat `true`. Some functions are unfussed about whether they eat Props or Types, but apparently `set` is fussy. `Sort u` is Lean’s way of saying “I don’t care if it’s Prop or Type”

LikeLike

Coq has a Set universe. Lean just has Prop and Type (and then the extra universes on top, because Type : Type 1, and Type 1 : Type 2 etc; I don’t usually go up there).

LikeLike

Thanks! I had incorrectly thought that every type is a Type. Now that I write that down, I can see why it’s undesirable due to Russell’s paradox (even apart from Prop). Now that I go back and read that Type is “the type theory analogue of the class of all sets” it makes more sense. Basically I should think of a type as being “like a set” if it is in the Type universe but not otherwise.

LikeLike

Pingback: Mathematics in type theory | Hacker News

Great post! What would an example proof look like? You use add_zero and rfl as examples, but how would these be represented? Could you give an example of a super simple proof?

LikeLike

In Lean you can `#print` a proof and see how it’s stored internally, but a `rfl` proof would really be stored as `rfl`, and `add_zero` is an axiom so would be stored as `add_zero`. For `zero_add` on the other hand, this is proved by induction so there would be a `nat.rec` in there, the recursor for the naturals. In NNG finish a proof, type `end` and then `#print ` and you’ll probably see for yourself!

LikeLike

Great post. I would recommend it just for how clear and simply it explains “proof irrelevance”, which took me quite a long time to understand, having received my computer mathematics education via Isabelle/HOL, where the correspondence between proofs and terms, and propositions and types, is not relevant.

Isabelle/HOL is based on simple types ala Church, and does not make use of this correspondence. Proofs are not formal objects in Isabelle (except in optional add-ons like Berghofer’s proof terms). In this respect it is much closer to how mathematicians actually think about mathematics than full type theoretic systems like Lean.

Personally, I consider the correspondence a mere technical device, and one that obscures things more than it illuminates. You CAN press mathematics into this form, if you want to, but why would you want to?

I think instead formal computer mathematics should look more towards the linguistic origins of mathematics. There exist projects which take natural language as input and produce formal output, like Hales tries to do with Lean. I think one should go even further here and consider a more high-level formal mechanism as first class much closer to actual mathematical language than set theory or type theory.

LikeLike

I totally agree that the fact that propositions/proofs and sets/elements can be bundled together in the form types/terms is a mere technical device. I would not really say that it makes things worse — it’s a different way of looking at things. I would definitely say that understanding it does not give a human an insight into how to do mathematics better. The answer to the question “you can do it like this, but why would you want to” is “maybe computers prefer doing it like this. Maybe it will help to make databases of mathematics. Maybe it will help AI to prove theorems. Maybe. Let’s see.”

I am super-positive about the Hales/Koepke effort to get Lean understanding formal language. Note this enigmatic recent PR to the Lean 4 repo: https://github.com/leanprover/lean4/commit/e6b988a10a1d670327e78a59dd158af54fc50c52 . Exciting times are ahead. But my gut feeling is this potential natural language bridge will be a step towards solving the problem of getting humans and computers to communicate with each other, rather than the problem of getting computers to do mathematics by themselves.

LikeLike

Very interesting commit, exciting times indeed!

So after your huge Lean perfectoid spaces effort, would you say that now you can express yourself in Lean as you would wish? Or do you feel restricted by it in some ways, apart from the need to be absolutely formal? In other words, given the task to formalise the definition of perfectoid spaces, would you say that Lean allows you to express it in the way you wish you could express it?

LikeLike

That’s a really interesting question! I would say that I feel slightly restricted sometimes, but that in the past when I’ve felt restricted it has been because I’ve not been thinking about things in the right way. The fact that a term can only have one type is very jarring when you’re doing algebra and you have a ring contained within a bigger ring. I want to do more algebra in Lean over the summer to try and figure out whether the problem is with me or with type theory. We wrote the definition of a perfectoid space in a natural way, but I’d learnt a lot from formalising schemes, where we had not understood the fact that R[1/fg] and R[1/f][1/g], whilst being canonically isomorphic, were not equal, and the issues which this would raise.

LikeLike

Yes, I think the basic problem in type theory is that two types are either equal or completely disjoint. Hence your trouble with extending rings, etc.

I think it is possible to arrive at a logic which is completely natural for a mathematician. The ingredients are:

1. A powerful set theory embedded in simply-typed higher-order logic, as in : https://drops.dagstuhl.de/opus/volltexte/2019/11064/pdf/LIPIcs-ITP-2019-9.pdf

2. A way to express undefinedness. This can be done as part of the simply-typed higher-order logic, and would therefore also be available then in the set theory: https://arxiv.org/abs/1406.7492

3. A way for data abstraction. I hope that this is maybe possible like this: https://mathoverflow.net/questions/366014/data-abstraction-in-set-theory-via-urelements

And of course, you would need to combine these ingredients in an easy to use system.

LikeLike

`theorem easy : 37 = 37 := rfl`. Now `37=37` is a Proposition, and `easy` is a proof of it. If you try the natural number game http://wwwf.imperial.ac.uk/~buzzard/xena/natural_number_game/ you’ll be able to learn how to make your own proofs.

LikeLike

Where is the art from? Looks really interesting

LikeLike

My daughter makes it

LikeLike