At university I was told about how to quotient by an equivalence relation. But I was not told the right things about it. This workshop is an attempt to put things right.
Let me start by talking about things I learnt in my second and third year as an undergraduate. I went to a course called Further Topics in Algebra, lectured by Jim Roseblade, and in it I learnt how to take the tensor product of two finite-dimensional -vector spaces. Jim explained to us the universal property of the tensor product, and I saw for the first time in my life the abstract nonsense argument which explains that objects which satisfy the universal property for tensor products are unique up to unique isomorphism. He also explained that writing down the universal property did not count as a definition. The abstract nonsense argument shows that tensor products are unique if they exist. To prove that they exist, Jim wrote down an explicit model, involving taking a quotient of a gigantic free abelian group with basis the pairs , modulo the relations saying that it is a vector space satisfying the universal property. I came out of these lectures with a good understanding of this technique. Later on in my undergraduate education I met things such as the localisation of a commutative ring at a multiplicative subset, and again I understood that these things were unique up to unique isomorphism if they existed, and that one could write down an explicit model to show they existed.
I came away with the impression that the key fact was the universal property, from which everything else could be proved, and that the model was basically irrelevant. To my surprise, I have learnt more recently that this is not exactly the whole truth. Here is an example, due to Patrick Massot. Let be a commutative ring, and let be a multiplicative subset. I claim that the kernel of the map is precisely the annihilator of . Using the universal property and elementary arguments we can reduce to the statement that if is in the kernel of every ring homomorphism sending to units, then is annihilated by an element of , but as far as I can see, to prove this we have to come up with a cunning , and letting be the explicit model of constructed as a quotient of does the job. In particular the model appears again in the argument! Not equipped with the proof that it is the initial object in the category of -algebras in which is invertible, but equipped with the proof that it is an -algebra in which is invertible. My MSc student Amelia Livingston came up with other examples of this as part of her work on Koszul complexes in Lean. But I digress. Let’s get on to what we’ll talk about today.
What is a quotient?
At university, in my first year, I was taught the following construction. If is a set and is an equivalence relation on , then one can define the set of equivalence classes for this equivalence relation. There is a natural map from to sending to its equivalence class. We write .
All the way through my undergraduate career, when taking quotients, I imagined that this was what was going on. For example when forming quotient groups, or quotient vector spaces, or later on in my life quotient rings and quotient modules, I imagined the elements of the quotient to be sets. I would occasionally look at elements of elements of a quotient set, something rarely done in other situations. I would define functions from a quotient set to somewhere else by choosing a random representative, saying where the representative went, and then proving that the construction was ultimately independent of the choice of representative and hence “well-defined”. This was always the point of view presented to me.
I have only relatively recently learnt that actually, this model of a quotient as a set of equivalence classes is nothing more than that — it’s just a model.
The universal property of quotients
Here’s the universal property. Say is a set equipped with an equivalence relation. A quotient for this data is a pair consisting of a set and a function which is constant on equivalence classes (i.e. ) and which is furthermore initial with respect to that property. In other words, if is any other set and is any function which is constant on equivalence classes, there exists a unique such that . The usual abstract nonsense argument shows that quotients are unique up to unique isomorphism.
Example 1) Let be a set equipped with an equivalence relation , let be the set of equivalence classes of , equipped with the map sending an element of to its equivalence class. Then is a quotient of by .
Example 2) Let and be any sets, and say is a surjection. Define an equivalence relation on by . Then is a quotient of by .
Example 2 shows that this construction of quotients using equivalence classes is nothing more than a model, and that there are plenty of other sets which show up naturally and which are not sets of equivalence classes but which are quotients anyway. The important point is the universal property. In contrast to localisation of rings, I know of no theorem about quotients for which the “equivalence class” model helps in the proof. The only purposes I can see for this “equivalence class” model now are (1) it supplies a proof that quotients do actually exist and (2) a psychological one, providing a “model” for the quotient.
I have had to teach quotients before, and students find them hard. I think they find them hard because some just basically find it hard to handle this whole “set whose elements are sets” thing. Hence even though the psychological reason was ultimately useful for me, and eventually I “got the hang of” quotients, I do wonder what we should be doing about the people who never master them. An alternative approach in our teaching is to push the universal property angle. I have never tried this. It might turn out even worse!
A worked example : maps from a quotient.
Here is the mantra which we hear as undergraduates and thus go on to feed our own undergraduates. The situation is this: we have some quotient object (e.g. a quotient ring or a quotient group or whatever) and we want to define a map from this quotient to some other thing .
“Set of sets” variant:
The argument goes something like this:
“Recall that is a quotient, so its elements are really equivalence classes. We want to define a map from to , so let’s choose . Now remember that really is actually an equivalence class. Choose an element of this equivalence class, and now apply a construction which seems to depend on , giving us an element of [Note that this construction is just a function from to , so let’s call it ]. That is our map from to . But now we need to check that this map is well-defined. This means the following: during this construction we did something “non-canonical”, namely choosing a random element of . We need to check that our “function” from to is in fact independent of this choice. So say that instead we had chosen . Then and are in the same equivalence class, so they are equivalent. Now an explicit computation shows that and hence we’re OK — our function is well-defined.”
Is the student left wondering what the heck it means for a function to be “not well-defined”? How come nobody ever talks about any other kind of function before being “well-defined”? I thought the axiom of choice said that we can choose an element in each equivalence class all at the same time. How come we can’t just define by taking , using our axiom of choice element and sending to ? Is that “well-defined”?
Universal property variant:
The argument now looks like this.
“Recall that is a quotient, so it satisfies the universal property of quotients. Recall that this says that to give a map from to another set , all we have to do is to give a function which is constant on equivalence classes; the universal property then gives us a unique such that . So let’s define like this [and now define ], and now let’s check it’s constant on equivalence classes [the same calculation as before]. The universal property thus gives us the function which we require.”
Is that the way to teach this stuff to undergraduates? Is it more confusing, less confusing, or maybe just differently confusing?
Lean has quotients of equivalence relations built in. This is not particularly necessary; it is an implementation decision which did not have to be done like this. One can certainly make quotients as types of equivalence classes (and indeed this has been done in mathlib, with the theory of partitions). However Lean also has an opaque
quotient function, which creates another model of a quotient; we don’t know what the elements are, but we know the universal property and this is all we need.
Today we will learn about quotients by working through the explicit construction of the integers as the quotient of by the equivalence relation , and a proof that it is a commutative ring. We will go on to play around a bit with the universal property of quotients, and finish by using abstract nonsense to construct a bijection from our quotient to Lean’s .
There is far far too much material to do in one 2-hour workshop, but I was on a roll. As ever, the material is here, in
Appendix: why not just use equivalence classes?
Mathematicians use quotients everywhere, so it’s kind of interesting that they have their own type, but why not just use the type of equivalence classes? Lean can make that type, and it’s possible to prove all the API for it — I’ve done it. So why explicitly extend Lean’s type theory to add a new quotient type? The answer seems to be this. The universal property for quotients is that if is a quotient of by an equivalence relation , then we have a bijection between the functions and the -equivariant functions . To build from is easy — just compose with . To go the other way Lean has a function called
quotient.lift f h, which spits out
f and a proof
f is -equivariant (i.e. constant on equivalence classes). The claim that these constructions are inverse bijections boils down to the assertion that
f = (quotient.lift f h) ∘ p, and the proof of this, remarkably, is
rfl — it’s true by definition. This is what the baked-in quotient construction buys you. I used to think that this was really important (and indeed in the past I have claimed that this is a key advantage which Lean has over Coq). Now I am not so sure. It probably makes proofs a bit slicker occasionally — but nowadays I am less convinced by the idea of definitional equality in general — I’m happy to rewrite.