Equality is such a dull topic of conversation to mathematicians. Equality is completely intrinsic to mathematics, it behaves very well, and if you asked a mathematician to prove that equality of real numbers is an equivalence relation then they would probably struggle to say anything of content; it’s just obviously true. Euclid included reflexivity and transitivity of equality as two of his “common notions”, and symmetry was equally clear from his language — he talks about two things being “equal to each other” rather than distinguishing between
a = b and
b = a.
One thing that has helped me start to understand why computer scientists make a fuss about equality is the observation that if you follow Peano’s development of the natural numbers (as I do in the natural number game) then you come to the following realisation: if you define addition by recursion in the second variable (i.e.
a + 0 := a and
a + succ(n) := succ(a + n) ) then life temporarily becomes asymmetric. The fact that
a + 0 = a is an axiom. However the fact that
0 + a = a needs to be proved by induction. Now induction is also an axiom, so a mathematician would just say that despite the fact that the proofs have different lengths,
0 + a = a and
a + 0 = a are both theorems, so the fact that digging down to the foundations shows that the proofs are of a different nature is really of no relevance.
To a computer scientist however, there is a difference in these proofs. This difference seems to be of no relevance to mathematics. But the difference is that, if you set the natural numbers up in this way, then
a + 0 = a is true by definition, and
0 + a = a is not. Indeed, in Lean’s type theory (and probably in many others) there are three types of equality that you have to be aware of:
- Propositional equality;
- Definitional equality;
- Syntactic equality.
Let’s start by getting one thing straight: to a mathematician, all of these things are just called equality. In fact, even more is true: definitional equality is not a mathematical concept. All of these kinds of equalities are easy to explain, so let me go through them now.
a = b is a propositional equality if you can prove it.
a = b is a definitional equality if it’s true by definition.
a = b is a syntactic equality if
b are literally the same string of characters.
For example, let’s go back to Peano’s definition of the natural numbers and the conventions for addition above. Then
a + 0 = a is a definitional equality but not a syntactic one,
0 + a = a is a propositional equality but not a definitional one, and
a = a is a syntactic equality. To show that
0 + a = a is not definitionally true, you have to ask yourself what the definition of
0 + a is; and because we don’t know whether
a = 0 or not,
0 + a cannot be definitionally simplified any further (the definition of
x + y depends on whether
y = 0 or not).
[Technical note: syntactic equality does allow for renaming of bound variables, so is syntactically equal to . If you understand that idea that notation is syntax sugar then you’ll probably know that syntactic equality can see through notation too, which means to say that
add(a,0) = a + 0 is also a syntactic equality. But that’s it.]
2 + 2 = 4 is not a syntactic equality; removing notation and writing
S for “successor”, and working under the assumption that
2 is syntax sugar for
S(S(S(S(0)))), we see that the left hand side is syntactically
add(S(S(0),S(S(0)) and the right hand side is
S(S(S(S(0)))). However it is a definitional equality!
add(x,S(y))=S(add(x,y)) is true by definition, as is
add(x,0)=x, and it’s fun to check that applying these rules a few times will reduce
2 + 2 to
The reason that it’s important to understand the differences if you are writing Lean code, is that different tactics work with different kinds of equality. Lean’s
refl tactic attempts to close the goal by showing that one side of an equality is definitionally equal to the other side. If your goal is
change Y will work if and only if
Y is definitionally equal to
X. On the other hand Lean’s
rw tactic works at the level of syntactic equality: if
h : A = B then
rw h will change everything syntactically equal to
A in the goal, into a
h : a + 0 = b and your goal is
a = b then
rw h will fail because the equality
a + 0 = a is not syntactic. However
exact h will work fine, because
exact works up to definitional equality.
Specification v implementation
The fact that
a + 0 = a is a definitional equality in the natural number game, but
0 + a = a isn’t, is as far as I am concerned a proof that definitional equality is not a mathematical concept. Indeed one can clearly just define addition on the natural numbers by recursion on the first variable instead of the second, and then
0 + a = a would be definitional and
a + 0 = a would not be. What is going on here was made clear to me after a conversation I had with Steve Vickers after a seminar I gave to the University of Birmingham CS theory group. Mathematicians have a specification of the natural numbers. We know what we want: we want it to satisfy induction and recursion, we want it to be a totally ordered commutative semiring (i.e. all the ring axioms other than those involving minus) and we can take it from there thank you very much. If you present me with an object which satisfies these theorems I can use it to prove quadratic reciprocity. I don’t care what the actual definition of addition is, indeed I know several definitions of addition and I can prove they’re all equivalent.
If you’re doing to do mathematics in a theorem prover, you have to make one definition. Mathematicians know that all the definitions of the natural numbers are the same. If you want to set up mathematics in set theory for example, then it doesn’t matter whether you decide to let or : any system which ensures that 3 isn’t any of the sets you’ve already made is clearly going to work. But in a computer theorem prover you need to make choices — you need to make implementations of
3 and of
add — and the moment that choice is made you now have a dichotomy: some stuff is true by definition, and some stuff needs an argument like induction and is not true by definition.
The first time I really noticed the specification / implementation difference was when it dawned on me that Lean’s maths library had to choose a definition of the reals, and it went with the Cauchy sequence definition: a real number in Lean is an equivalence class of Cauchy sequences. An alternative approach would have been to define it as Dedekind cuts. As mathematicians, we don’t care which one is used, because we are well brought up and we promise to only ever access the real numbers via its interface, or its API to borrow a computing term. The interface is the specification. We mathematicians have a list of properties which we want the real numbers to satisfy: we want it to be a complete archimedean ordered field. Furthermore we have proved a theorem that says that any two complete archimedean ordered fields are uniquely isomorphic, and this is why we do not care one jot about whether we are using Cauchy sequences or Dedekind cuts. Lean gives me access to an interface for the real numbers which knows these things, and it’s all I need to build a theory of Banach spaces. As mathematicians we somehow know this fact implicitly. If I am trying to prove a theorem about Banach spaces, and I have a real number , I never say “Ok now for the next part it’s really important that is under the hood defined to be a Dedekind cut”. If I want the Dedekind cut associated to , I can just make it. I don’t care whether it equals by definition or not, because definitional equality is not a mathematical concept. All I care about is access to the interface — I’m proving a theorem about Banach spaces here, and I just need to have access to stuff which is true. The job of Lean’s file
data.real.basic is to give me access to that interface, and I can build mathematics from there.
Computer scientists on the other hand — they have to care about definitional equality, because it’s often their job to make the interface! If two things are definitionally equal then the proof they’re equal is
refl, which is pretty handy. Different definitions — different implementations of the same specification — might give you a very different experience when you are making an interface for the specification. If you really have too much time on your hands this lockdown, why not go and try proving that addition on the real numbers is associative, using both the Dedekind cuts definition and the Cauchy sequences definition? For Cauchy sequences it just boils down to the fact that addition is associative on the rationals. But you’ll find that it’s a real bore with Dedekind cuts, because Dedekind cuts have this annoying property that you need a convention for the cuts corresponding to rational numbers: whether to put the rational number itself into the lower or upper cut. Neither convention gives a nice definition of addition. You can’t just add the lower cuts and the upper cuts, because the sum of two irrationals can be a rational. Multiplication is even worse, because multiplication by a negative number switches the lower and upper cut, so you have to move a boundary rational between cuts. You can see why Lean went for the Cauchy sequences definition.
I ran into this “which implementation to use for my specification” issue myself recently. I notice that I have been encouraging students at Imperial to formalise courses which I have been lecturing, which recently have been algebra courses such as Galois theory. I am by training an algebraic number theorist, and really by now I should have turned my attention the arithmetic of number fields and their integers: as far as I can see, finiteness of the class group of a number field has been formalised in no theorem provers at all, so it is probably low-hanging fruit for anyone interested in a publication, and we surely have all the required prerequisites in Lean now. I thought that I would try and get this area moving by formalising the definitions of a Dedekind Domain and a discrete valuation ring (DVR). I looked up the definition of discrete valuation ring on Wikipedia and discovered to my amusement that there are (at the time of writing) ten definitions 🙂
Now here I am trying to be a theory builder: I want to make a basic API for DVRs so that students can use them to prove results about local and global fields. So now I have to decide upon a definition, and then prove that it is equivalent to some of the other definitions — I need to make enough interface to make it easy for someone else to take over. As far as I could see though, what the actual definition of a DVR is, is of no real importance, because it doesn’t change the contents of the theorems, it only changes the way you state them. So I just chose a random one 😛 and it’s going fine!
Equality of terms, equality of types
When talking about propositional and definitional equality above, my main examples were equality between natural numbers:
0 + a = a and what have you. Set-theoretically, we can happily think about
0 + a = a as equality between two elements of a set — the set of natural numbers. In type theory we are talking about equality between two terms of type
T : Type . But one can take this a level up. Say
A : Type and
B : Type (for example say
A is the Cauchy reals, and
B is the Dedekind reals). What does
A = B mean? These are now not elements of a set, but objects of a category. Certainly the Cauchy reals and the Dedekind reals are not going to be definitionally equal. Can we prove they are propositionally equal though? No — of course not! Becuase they are not actually equal! They are, however canonically isomorphic. A fourth type of equality!
All this equality naval-gazing is important to understand if you are talking about equality of terms. This, we have nailed. For mathematicians there is one kind of equality, namely “it is a theorem that
a = b“. For computer scientists there are three kinds, and understanding the distinction might be important for interface extraction.
But for equality of types, something funny is going on. Which is the most accurate? or ? This is just more notational naval-gazing, who cares whether these things are isomorphic or equal – they are manifestly the same, and we can always replace one by the other in any reasonable situation because they both satisfy the same universal property. However, I am realising that as a Lean user I need to say very carefully what I mean by “a reasonable situation”, and actually the safest way to do that is to not prove any theorems at all about other than the fact that it satisfies its universal property, and then instead prove theorems about all objects which satisfy that universal property. Mathematicians do not use this technique. They write their papers as if they are proving theorems about the concrete object , but their proofs are sensible and hence apply to any object satisfying its universal property, thus can be translated without too much problem, once one has extracted enough API from the universal property. There is an art to making this painless. I learnt from Strickland the three key facts that one should prove about a localisation of rings: the kernel is the elements annihilated by an element of , the image of every element of is invertible, and the map from to the target sending to is surjective. These three facts together are equivalent to the universal property of and now you can throw it away and build the rest of your localisation API on top of it. Indeed, when my former MSc student Ramon Fernández Mir formalised schemes in Lean, he needed this result from the stacks project to prove that affine schemes were schemes, but more generally for the case of rings only satisfying the same universal properties as the rings in the lemma. At the time he needed it (about 18 months ago) the proof only used the three facts above isolated by Strickland, and so was easy to prove in the generality he required.
However, in Milne’s book on etale cohomology, it is decreed in the Notation section that
= means “canonical isomorphism”, and so there will be a lot more of this nonsense which we will have to understand properly if we want to formalise some serious arithmetic geometry in Lean.