Formalising mathematics : Workshop 2 — groups and subgroups

This is some notes on the second workshop in my Formalising Mathematics course, running as part of the EPSRC TCC. The Lean github repo is here.

Groups and subgroups

I start with an apology — there was far too much material last week. I’m still getting the hang of this. Hopefully this week there is a more manageable amount. There is probably still more than can be done in two hours.

In workshop 1 we filled in sorrys, and most of the time the definitions we worked with were things like injective, which is a straightforward “one-liner” definition. Here is the definition of injective from core Lean:

/-- A function `f : α → β` is called injective if `f x = f y` implies `x = y`. -/
@[reducible] def injective (f : α → β) : Prop := ∀ ⦃a₁ a₂⦄, f a₁ = f a₂ → a₁ = a₂

Apart from the @[reducible] tag and the slightly weird brackets, there are no surprises here: it is a one-line Lean definition, with a docstring (a string of text describing in human terms what the definition is about).

There was one more complex definition last week, which was the definition of a partition on a set, but I don’t think anyone got that far anyway, so let’s ignore it and talk about what we’re doing this week, which is to:

define the concept of a group.
make a basic API for groups.
learn how to train Lean’s simplifier simp to prove simple theorems about groups.
define the concept of a subgroup of a group.
make a basic API for subgroups (specifically focussing on the subgroup generated by a subset).

Again I have no feeling for how far we will get. I’d like to think that most people will make it as far as 3 in the two hours we have. I had been using Lean for about a year before it dawned on me what the simplifier was for, so 3 is a nice target.

The material is in the src/week2 directory of the formalising-mathematics github repository.

Defining a group.

We start by discussing the material in src/week_2/part_A_groups.lean. The mathematical definition we will formalise is the following. A group structure on a type G (a type is just Lean’s word for a set) is two collections of things. Firstly, it is the following three pieces of data:

A function mul taking two elements of G and returning an element of G, with notation g * h.
A function inv from G to G, with notation g⁻¹.
A constant one in G, with notation 1.

Secondly, it is the following three assumptions, or axioms, or proofs, or functions, or however you want to think about them:

mul_assoc : ∀ (a b c : G), a * b * c = a * (b * c))
one_mul : ∀ (a : G), 1 * a = a
mul_left_inv : ∀ (a : G), a⁻¹ * a = 1

Internally these are represented in Lean as functions. For example one_mul is a function which takes as input an element a of G (or, as Lean would call it, a term a of type G — it’s the same thing) and spits out a proof that 1 * a = a.

In Lean the full definition in the repo looks like this:

class group (G : Type) extends has_mul G, has_one G, has_inv G :=
(mul_assoc : ∀ (a b c : G), a * b * c = a * (b * c))
(one_mul : ∀ (a : G), 1 * a = a)
(mul_left_inv : ∀ (a : G), a⁻¹ * a = 1)

Now mathematicians themselves don’t really know what the definition of a group is. Some people say that there are two more axioms mul_one : ∀ (a : G), a * 1 = a and mul_right_inv : ∀ (a : G), a * a⁻¹ = 1. Others say that these last two “axioms” can be deduced from the three we have already, so they are not axioms, they are theorems. It’s clear that the issue is not a mathematical one, but an infrastructure one. If we put more axioms in, then it will be easier to prove things about groups, but harder to prove that various things are groups. Given that I am assuming my audience is mathematically mature (this is not supposed to be a first introduction to group theory!) I am going to go for a minimal axiom approach, and the first thing we shall do is to prove mul_one and mul_right_inv. To do this we will need to prove a couple more lemmas first — or, at least, this is the way I have set things up. Feel free to experiment if you think you know a slicker way of proving these things — all that matters is that we get the two “missing axioms”. Here is the route I have taken in the Lean files:

mul_left_cancel : ∀ (a b c : G), a * b = a * c → b = c
mul_eq_of_eq_inv_mul {a x y : G} : x = a⁻¹ * y → a * x = y
mul_one (a : G) : a * 1 = a
mul_right_inv (a : G) : a * a⁻¹ = 1

In some sense, this is mathematically the hardest part of setting up a basic theory of groups, because not being allowed to use mul_one and mul_right_inv until we’ve proved them is rather unlike the way mathematicians usually deal with groups. It rather reminds me of when the Tokay pinch all of Link’s belongings on Crescent Island in Oracle of Ages.

Making an API

Once we have achieved the obvious goal of proving mul_one and mul_right_inv, the question is: what to do next. For reasons which will become apparent later, our next goal is to prove the following theorems (none of which are too hard at this point):

inv_mul_cancel_left : a⁻¹ * (a * b) = b
mul_inv_cancel_left : a * (a⁻¹ * b) = b
inv_mul : (a * b)⁻¹ = b⁻¹ * a⁻¹
one_inv : (1 : G)⁻¹ = 1
inv_inv : (a⁻¹)⁻¹ = a

You might want to take some time staring at the names of these results and noting that they are not just random computer science gobbledegook, there is method in the madness (in particular you should be able to figure out what mul_one and one_mul say without looking!) There are also a bunch of other random things which you can prove if you want, such as a⁻¹ = b⁻¹ ↔ a = b and so on — Lean’s maths library mathlib has many many basic statements about groups. The reason we have focussed on these five results above is that, if you put them together with mul_one, mul_right_inv and the axioms defining a group, then you get a confluent rewrite system! What is that, you ask?

The simplifier

Lean’s simplifier simp has a database of so-called “simp lemmas”, lemmas of the form A = B or A ↔ B. Given a random goal, simp will attempt to spot A‘s in it, and will proceed to replace them with B‘s. Note that order matters! Equality is not symmetric in this syntactic game (this is computer science, not mathematics). If you put the wrong lemmas in the database, this can be a disaster. For example if you put a = 1 * a in the database then the simplifier will start to replace elements g in goals with 1 * g, and then with 1 * (1 * g) and so on. However, putting 1 * a = a into the database is a really good idea, because cancelling a “1” is typically a sensible move in the game of mathematics. Tagging all of the ten lemmas above with @[simp] adds them to the simplifier’s database, and a theorem of Knuth and Bendix then tells us that the simplifier will then put every element of every group into some kind of unique canonical form (if you know about the theory of reduced words in free groups, it’s this, but with added care about exactly where the brackets are). In practice this means that the simp tactic is now guaranteed to solve silly questions like showing (a * b) * 1⁻¹⁻¹ * b⁻¹ * (a⁻¹ * a⁻¹⁻¹⁻¹) * a = 1 — you don’t have to mindlessly do all the rewrites manually.

Subgroups

If G is a type with a group structure (i.e., a group), then there is a type subgroup G of subgroups of G. The Lean definition is in Part_B_subgroups.lean in src/week_2 and it looks like this:

/-- A subgroup of a group G is a subset containing 1
and closed under multiplication and inverse. -/
structure subgroup (G : Type) [group G] :=
(carrier : set G)
(one_mem' : (1 : G) ∈ carrier)
(mul_mem' {x y} : x ∈ carrier → y ∈ carrier → x * y ∈ carrier)
(inv_mem' {x} : x ∈ carrier → x⁻¹ ∈ carrier)

In words, to give a subgroup of G is to give four pieces of information: a subset of G (a.k.a. a term of type set G) called the carrier, and then three proofs that it is closed under the three pieces of data used to define a group — i.e., proofs that the subset contains one and is closed under mul and inv.

Optional paragraph: One slightly weird thing to keep in mind, if you’re into the whole foundational stuff, is that in contrast to the way things are usually set up in type theory, a subgroup of G is a term, not a type. This is a bit weird because usually the idea is that types are sets, and terms are elements of the sets. So how come a subgroup of G is a term? It’s because in type theory every term has exactly one type, so if we have a subset X ⊆ G then we can’t have the same 1 being of type X and of type G. The way we have set things up, we will have 1 : G (so 1 is still a term of type G) but 1 ∈ X (so we will use the set-theoretic notation — ∈ is a relation which takes a term g of type G (an element of G) and a term X of type set G (a subset of G) and spits out a true-false statement (the statement g ∈ X).

We start with some slightly tedious infrastructure (for example extending ∈ so that it applies to subgroups). I was tempted to skip this, but doing it gives you some idea of the “cost” of a definition. We prove random trivial stuff such as two subgroups are equal if and only if they have the same elements, and that subgroups are partially ordered by inclusion.

Closure of a subset

The main mathematical work we do in the file is to set up the theory of the subgroup generated by a subset. There are two ways to do this — a “bottom-up” way where we start with a set S and then throw in the identity and close up under multiplication and inverse. This is certainly possible in Lean, using inductive propositions, but I decided to do the “top-down” approach, where the subgroup generated by S is defined to be the intersection of the subgroups of G which contain S. We prove the following theorems (here ↑H is the underlying subset of G corresponding to the subgroup H of G):

subset_closure (S : set G) : S ⊆ closure S
closure_mono {S T : set G} (hST : S ⊆ T) : closure S ≤ closure T
closure_le (S : set G) (H : subgroup G) : closure S ≤ H ↔ S ⊆ ↑H
closure_closure (S : set G) : closure S = closure (closure S)
lemma closure_self {H : subgroup G} : closure ↑H = H

These results are enough to prove that closure is a closure operator, that closure S is indeed the smallest subset of S containing S and closed under the group structure maps (1, * and ⁻¹), and furthermore that closure and ↑ form a Galois connection. From these few lemmas the “machine can take over” — for example we can deduce for free that subgroups form a complete lattice and hence the 100 or so theorems proved in Lean about complete lattices can all now be applied to subgroups for free.

Again I suspect that there is far far too much for a 2-hour workshop.