Equality, specifications, and implementations

Equality is such a dull topic of conversation to mathematicians. Equality is completely intrinsic to mathematics, it behaves very well, and if you asked a mathematician to prove that equality of real numbers is an equivalence relation then they would probably struggle to say anything of content; it’s just obviously true. Euclid included reflexivity and transitivity of equality as two of his “common notions”, and symmetry was equally clear from his language — he talks about two things being “equal to each other” rather than distinguishing between a = b and b = a.

One thing that has helped me start to understand why computer scientists make a fuss about equality is the observation that if you follow Peano’s development of the natural numbers (as I do in the natural number game) then you come to the following realisation: if you define addition by recursion in the second variable (i.e. a + 0 := a and a + succ(n) := succ(a + n) ) then life temporarily becomes asymmetric. The fact that a + 0 = a is an axiom. However the fact that 0 + a = a needs to be proved by induction. Now induction is also an axiom, so a mathematician would just say that despite the fact that the proofs have different lengths, 0 + a = a and a + 0 = a are both theorems, so the fact that digging down to the foundations shows that the proofs are of a different nature is really of no relevance.

To a computer scientist however, there is a difference in these proofs. This difference seems to be of no relevance to mathematics. But the difference is that, if you set the natural numbers up in this way, then a + 0 = a is true by definition, and 0 + a = a is not. Indeed, in Lean’s type theory (and probably in many others) there are three types of equality that you have to be aware of:

  1. Propositional equality;
  2. Definitional equality;
  3. Syntactic equality.

Let’s start by getting one thing straight: to a mathematician, all of these things are just called equality. In fact, even more is true: definitional equality is not a mathematical concept. All of these kinds of equalities are easy to explain, so let me go through them now.

Propositional equality: a = b is a propositional equality if you can prove it.

Definitional equality: a = b is a definitional equality if it’s true by definition.

Syntactic equality: a = b is a syntactic equality if a and b are literally the same string of characters.

For example, let’s go back to Peano’s definition of the natural numbers and the conventions for addition above. Then a + 0 = a is a definitional equality but not a syntactic one, 0 + a = a is a propositional equality but not a definitional one, and a = a is a syntactic equality. To show that 0 + a = a is not definitionally true, you have to ask yourself what the definition of 0 + a is; and because we don’t know whether a = 0 or not, 0 + a cannot be definitionally simplified any further (the definition of x + y depends on whether y = 0 or not).

[Technical note: syntactic equality does allow for renaming of bound variables, so \{a \in \mathbb{R}\, |\, a^2 = 2\} is syntactically equal to \{b \in \mathbb{R}\, |\, b^2=2\}. If you understand that idea that notation is syntax sugar then you’ll probably know that syntactic equality can see through notation too, which means to say that add(a,0) = a + 0 is also a syntactic equality. But that’s it.]

Of course 2 + 2 = 4 is not a syntactic equality; removing notation and writing S for “successor”, and working under the assumption that 2 is syntax sugar for S(S(0)) and 4 for S(S(S(S(0)))), we see that the left hand side is syntactically add(S(S(0),S(S(0)) and the right hand side is S(S(S(S(0)))). However it is a definitional equality! add(x,S(y))=S(add(x,y)) is true by definition, as is add(x,0)=x, and it’s fun to check that applying these rules a few times will reduce 2 + 2 to 4.

The reason that it’s important to understand the differences if you are writing Lean code, is that different tactics work with different kinds of equality. Lean’s refl tactic attempts to close the goal by showing that one side of an equality is definitionally equal to the other side. If your goal is X then change Y will work if and only if Y is definitionally equal to X. On the other hand Lean’s rw tactic works at the level of syntactic equality: if h : A = B then rw h will change everything syntactically equal to A in the goal, into a B. If h : a + 0 = b and your goal is a = b then rw h will fail because the equality a + 0 = a is not syntactic. However exact h will work fine, because exact works up to definitional equality.

Specification v implementation

The fact that a + 0 = a is a definitional equality in the natural number game, but 0 + a = a isn’t, is as far as I am concerned a proof that definitional equality is not a mathematical concept. Indeed one can clearly just define addition on the natural numbers by recursion on the first variable instead of the second, and then 0 + a = a would be definitional and a + 0 = a would not be. What is going on here was made clear to me after a conversation I had with Steve Vickers after a seminar I gave to the University of Birmingham CS theory group. Mathematicians have a specification of the natural numbers. We know what we want: we want it to satisfy induction and recursion, we want it to be a totally ordered commutative semiring (i.e. all the ring axioms other than those involving minus) and we can take it from there thank you very much. If you present me with an object which satisfies these theorems I can use it to prove quadratic reciprocity. I don’t care what the actual definition of addition is, indeed I know several definitions of addition and I can prove they’re all equivalent.

If you’re doing to do mathematics in a theorem prover, you have to make one definition. Mathematicians know that all the definitions of the natural numbers are the same. If you want to set up mathematics in set theory for example, then it doesn’t matter whether you decide to let 3 = \{2\} or 3 = \{0,1,2\}: any system which ensures that 3 isn’t any of the sets you’ve already made is clearly going to work. But in a computer theorem prover you need to make choices — you need to make implementations of 3 and of add — and the moment that choice is made you now have a dichotomy: some stuff is true by definition, and some stuff needs an argument like induction and is not true by definition.

The first time I really noticed the specification / implementation difference was when it dawned on me that Lean’s maths library had to choose a definition of the reals, and it went with the Cauchy sequence definition: a real number in Lean is an equivalence class of Cauchy sequences. An alternative approach would have been to define it as Dedekind cuts. As mathematicians, we don’t care which one is used, because we are well brought up and we promise to only ever access the real numbers via its interface, or its API to borrow a computing term. The interface is the specification. We mathematicians have a list of properties which we want the real numbers to satisfy: we want it to be a complete archimedean ordered field. Furthermore we have proved a theorem that says that any two complete archimedean ordered fields are uniquely isomorphic, and this is why we do not care one jot about whether we are using Cauchy sequences or Dedekind cuts. Lean gives me access to an interface for the real numbers which knows these things, and it’s all I need to build a theory of Banach spaces. As mathematicians we somehow know this fact implicitly. If I am trying to prove a theorem about Banach spaces, and I have a real number \lambda, I never say “Ok now for the next part it’s really important that \lambda is under the hood defined to be a Dedekind cut”. If I want the Dedekind cut associated to \lambda, I can just make it. I don’t care whether it equals \lambda by definition or not, because definitional equality is not a mathematical concept. All I care about is access to the interface — I’m proving a theorem about Banach spaces here, and I just need to have access to stuff which is true. The job of Lean’s file data.real.basic is to give me access to that interface, and I can build mathematics from there.

Computer scientists on the other hand — they have to care about definitional equality, because it’s often their job to make the interface! If two things are definitionally equal then the proof they’re equal is refl, which is pretty handy. Different definitions — different implementations of the same specification — might give you a very different experience when you are making an interface for the specification. If you really have too much time on your hands this lockdown, why not go and try proving that addition on the real numbers is associative, using both the Dedekind cuts definition and the Cauchy sequences definition? For Cauchy sequences it just boils down to the fact that addition is associative on the rationals. But you’ll find that it’s a real bore with Dedekind cuts, because Dedekind cuts have this annoying property that you need a convention for the cuts corresponding to rational numbers: whether to put the rational number itself into the lower or upper cut. Neither convention gives a nice definition of addition. You can’t just add the lower cuts and the upper cuts, because the sum of two irrationals can be a rational. Multiplication is even worse, because multiplication by a negative number switches the lower and upper cut, so you have to move a boundary rational between cuts. You can see why Lean went for the Cauchy sequences definition.

I ran into this “which implementation to use for my specification” issue myself recently. I notice that I have been encouraging students at Imperial to formalise courses which I have been lecturing, which recently have been algebra courses such as Galois theory. I am by training an algebraic number theorist, and really by now I should have turned my attention the arithmetic of number fields and their integers: as far as I can see, finiteness of the class group of a number field has been formalised in no theorem provers at all, so it is probably low-hanging fruit for anyone interested in a publication, and we surely have all the required prerequisites in Lean now. I thought that I would try and get this area moving by formalising the definitions of a Dedekind Domain and a discrete valuation ring (DVR). I looked up the definition of discrete valuation ring on Wikipedia and discovered to my amusement that there are (at the time of writing) ten definitions 🙂

Now here I am trying to be a theory builder: I want to make a basic API for DVRs so that students can use them to prove results about local and global fields. So now I have to decide upon a definition, and then prove that it is equivalent to some of the other definitions — I need to make enough interface to make it easy for someone else to take over. As far as I could see though, what the actual definition of a DVR is, is of no real importance, because it doesn’t change the contents of the theorems, it only changes the way you state them. So I just chose a random one 😛 and it’s going fine!

Equality of terms, equality of types

When talking about propositional and definitional equality above, my main examples were equality between natural numbers: 0 + a = a and what have you. Set-theoretically, we can happily think about 0 + a = a as equality between two elements of a set — the set of natural numbers. In type theory we are talking about equality between two terms of type T, where T : Type . But one can take this a level up. Say A : Type and B : Type (for example say A is the Cauchy reals, and B is the Dedekind reals). What does A = B mean? These are now not elements of a set, but objects of a category. Certainly the Cauchy reals and the Dedekind reals are not going to be definitionally equal. Can we prove they are propositionally equal though? No — of course not! Becuase they are not actually equal! They are, however canonically isomorphic. A fourth type of equality!

All this equality naval-gazing is important to understand if you are talking about equality of terms. This, we have nailed. For mathematicians there is one kind of equality, namely “it is a theorem that a = b“. For computer scientists there are three kinds, and understanding the distinction might be important for interface extraction.

But for equality of types, something funny is going on. Which is the most accurate? (A \otimes B) \otimes C \cong A \otimes (B \otimes C) or (A \otimes B) \otimes C = A \otimes (B \otimes C)? This is just more notational naval-gazing, who cares whether these things are isomorphic or equal – they are manifestly the same, and we can always replace one by the other in any reasonable situation because they both satisfy the same universal property. However, I am realising that as a Lean user I need to say very carefully what I mean by “a reasonable situation”, and actually the safest way to do that is to not prove any theorems at all about (A \otimes B) \otimes C other than the fact that it satisfies its universal property, and then instead prove theorems about all objects which satisfy that universal property. Mathematicians do not use this technique. They write their papers as if they are proving theorems about the concrete object (A \otimes B) \otimes C, but their proofs are sensible and hence apply to any object satisfying its universal property, thus can be translated without too much problem, once one has extracted enough API from the universal property. There is an art to making this painless. I learnt from Strickland the three key facts that one should prove about a localisation R \to R[1/S] of rings: the kernel is the elements annihilated by an element of S, the image of every element of S is invertible, and the map from R\times S to the target sending (r,s) to r/s is surjective. These three facts together are equivalent to the universal property of R[1/S] and now you can throw it away and build the rest of your localisation API on top of it. Indeed, when my former MSc student Ramon Fernández Mir formalised schemes in Lean, he needed this result from the stacks project to prove that affine schemes were schemes, but more generally for the case of rings only satisfying the same universal properties as the rings in the lemma. At the time he needed it (about 18 months ago) the proof only used the three facts above isolated by Strickland, and so was easy to prove in the generality he required.

However, in Milne’s book on etale cohomology, it is decreed in the Notation section that = means “canonical isomorphism”, and so there will be a lot more of this nonsense which we will have to understand properly if we want to formalise some serious arithmetic geometry in Lean.

Advertisement

About xenaproject

The Xena Project aims to get mathematics undergraduates (at Imperial College and beyond) trained in the art of formalising mathematics on a computer. Why? Because I have this feeling that digitising mathematics will be really important one day.
This entry was posted in Algebraic Geometry, General, Type theory and tagged , , . Bookmark the permalink.

6 Responses to Equality, specifications, and implementations

  1. Robert Brijder says:

    Object-oriented programming languages like C++ and Java have a construct called “private” to hide implementation details. You then implement an interface that can access the internals (this code resides inside of the class that you are building), but the outside world (outside of the class) cannot access these internals and can access it only indirectly, through the interface. This way you are free to later completely change the implementation without breaking anything as long as you keep the interface to the outside world intact. In your example, you could then move from Cauchy sequences to Dedekind cuts without anybody noticing this. Nobody in the outside world can be naughty and look how the reals are actually defined. Does Lean have such a “private” construct or is this just fundamentally impossible somehow?

    Like

    • xenaproject says:

      This is a great question and one that I asked on the lean chat a while ago. Apparently it’s a bit like python — you can strongly discourage people from looking in the box, but you can’t actually stop them from looking in the box

      Like

      • xenaproject says:

        It turns out that in Lean 4 there might be a way to do this. You can “break” Lean 3 by just postulating the existence of an object which doesn’t exist with the `constant` and `axiom` commands; then you can prove anything you like. Hence there is no usage of these commands at all in Lean’s maths library. However in Lean 4 one will have to prove that something satisfying the properties of a `constant` does exist, and this might be a way to implement the language feature you’re asking about. We will have to wait and see 🙂

        Like

      • I don’t think Kevin’s answer here is complete. The `@[irreducible]` attribute goes a long way towards making a definition opaque. It is completely true that it is not a perfect mechanism, but perfect is the enemy of good. As Kevin says Lean 4 may be even better.

        Deciding exactly what should be `irreducible` is an art requiring mathematical experience, and we don’t use it nearly often enough in the Lean mathlib library. Pull requests welcome!

        Like

      • Robert Brijder says:

        Thanks for all the replies. That is very interesting.

        Related to this, polymorphism is also a well-known concept in object-oriented programming. This allows one to make multiple classes/implementations for a single interface. Code can be made against the interface and would work for any class that implements that interface. So, there is no fixed implementation of the interface. From the notion of definitional equality in the blog post, it seems to me that Lean does require a fixed definition/implementation. Is it true that Lean does not have polymorphism? Since mathematicians do not seem to understand the notion of definitional equality, it seems to me that polymorphism would be very natural here. A proof using the reals would then be correct if there is at least one implementation of its interface — doesn’t matter how many.

        Like

      • Lean certainly has polymorphism, primarily via the typeclass mechanism.

        (There’s also allowance for structures with multiple inheritance.)

        The typeclass mechanism is broadly that like in Haskell or in Coq, but my understanding is that Lean is perhaps the most ambitious in this direction. Lean 4 is introducing new algorithms to handle large typeclass inference problems, but several improvements have also been backported to Lean 3, in response to the needs of mathematicians writing mathlib.

        It would certainly be possible, anytime one wanted to use the real numbers, to not write

        `import data.real.basic`

        but instead to write something like

        “`
        variables (R : Type) [complete_archimedean_ordered_field R]
        “`

        (not real Lean code; that’s probably meant to be several separate instances)

        This would then completely insulate one from the actual definition.

        Nevertheless, because there’s only one way (up to isomorphism!) to satisfy these hypotheses, people so far have tended not to introduce this indirection. It’s certainly available however, as an alternative to “maintaining hygiene” by informally promising not to accidentally use an underlying definition (or using the potential new features in Lean 4 which would do this formally).

        Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s