Induction, and inductive types.

Here’s a conundrum: why does a schoolkid think recursion is a triviality (“the Fibonacci sequence is defined like this: 1,1,2,3,5,8,…, with each number in the sequence the sum of the two before it” — this is a definition very likely to be accepted by a schoolkid) but they might struggle with the principle of mathematical induction (a procedure for generating a sequence of proofs, each one a consequence of the one before it)? I was taught at university that induction and recursion were different things, but Lean has a common abstraction which makes them both instances of the same thing. In this post I’ll talk about induction, and recursion, and also about how something called the Calculus of Inductive Constructions (the way Lean’s inductive command works) provides some rather surprising proofs of basic mathematical facts.

Defining a type.

It is not unreasonable to think of what Lean calls a type as what a mathematician would usually call a set. For example a mathematician might say “let S be a set with three elements; call the elements a, b and c“. Here’s how to make that set, or type, in Lean:

inductive S : Type
| a : S
| b : S
| c : S

In fact the full names of the elements of S, or the terms of type S as we call them in type theory, are S.a, S.b and S.c, and we might want to open the S namespace so that we can just refer to them as a, b and c. An undergraduate might want to make this kind of definition when they are answering the following kind of question in Lean: “Let $f : X \to Y$ and $g : Y \to Z$ be functions. True or false: if the composition $g \circ f$ is injective, then $g$ is injective”. This is false, and to prove it’s false one can either use the sets we mathematicians have lying around (such as the naturals, integers or reals), or one can just build some explicit sets of small size like S above, and some explicit functions between those sets.

Defining functions between types.

So here’s what we’re going to do. Let’s make a type X with one term p, a type Y with two terms q and r, and a type Z with one term s. This is easy given what we’ve already seen:

inductive X : Type
| p : X

inductive Y : Type
| q : Y
| r : Y

inductive Z : Type
| s : Z

[By the way, if you want to play along but you haven’t got Lean installed on your computer, you can do all this within a web browser by clicking here (although you could instead click here to find out how to install Lean and the community tools, which give you a far slicker experience).]

Our counterexample is going to be the following: We define f : X → Y by f(p)=q and g : Y → Z by g(q)=g(r)=s. Let’s do this.

open X Y Z

def f : X → Y
| p := q

def g : Y → Z
| q := s
| r := s

As a mathematician I find the use of the | symbol quite intimidating (especially that we are now using it in a different way), but given that I’ve told you what we’re doing and now I’m telling you how we are doing it, you can probably guess what it all means. One can now go ahead and state the result:

open function

example : ¬ (∀ (X Y Z : Type) (f : X → Y) (g : Y → Z),
  injective (g ∘ f) → injective g) :=
begin
  sorry
end

and now if you fancy proving a mathematical theorem by playing a puzzle game, you can click here to get all the code at once, and have a go. Instead of talking about the proof though, I want to talk about the rather surprising (at least to me) fact that Lean is defining f and g by recursion.

What happens when we define an inductive type?

What happens under the hood when Lean sees this code

inductive X : Type
| p : X

is quite surprising (at least to me as a mathematician). I’ve been arguing above that we should think of this code as saying “Let $X := \{p\}$ be a set with one element”. But here’s what’s really going on when Lean sees this code. Unsurprisingly, Lean defines a new type X and a new term p (or more precisely X.p) of type X. It also defines one more new thing, which expresses that p is the only element of X. But the way it does this is surprising: it defines the so-called recursor for X, which is the following statement:

X.rec : ∀ {C : X → Sort u}, C p → (∀ (x : X), C x)

Whatever does that mean? Well first I think I’d better explain what this Sort u thing is. I’ve written in the past an explanation of how sets and their elements, and theorems and their proofs, are unified in Lean’s type theory as types and their terms. The sets/elements story goes on in the Type universe, and the theorems/proofs story goes on in the Prop universe. When Lean says Sort u it means “either of these universes”. So we can rewrite X.rec as two statements:

X.recursor : ∀ {C : X → Type}, C p → (∀ (x : X), C x)
X.inductor : ∀ {C : X → Prop}, C p → (∀ (x : X), C x)

The first statement is the principle of recursion for X. In set-theoretic language it says this. “Let’s say that for every element $x$ of $X$ we have a set $C(x)$ , and let’s say we have an element of $C(p)$ . Then we have a method of constructing an element of $C(x)$ for all $x \in X$ .” This looks like a rather long-winded way of saying that $p$ is the only element of $X$ . In fact it is worth looking at the special case of X.recursor where C is the constant function sending every element of X to the set S:

X.recursor_constant : ∀ S, S → (X → S)

This says that if S is any set, and we have an element a of S, then we can get a function from X to S. What is unsaid here, but true by definition, is that it’s the function that sends p to S, as can be checked thus:

-- give X.rec the constant function C sending everything to S
def X.recursor_constant : ∀ S, S → (X → S) := λ S, @X.rec (λ x, S) 

example (S : Type) (a : S) : (X.recursor_constant S a) p = a :=
begin
  -- true by definition
  refl
end

Do you remember our definition of f above?

def f : X → Y
| p := q

This function f is defined using X.recursor_constant, letting S be Y and letting the element of S be q. The notation Lean uses is short, but under the hood this is how f is constructed.

So much for recursion. The second statement coming from X.rec is X.inductor, the principle of induction for X. I should probably say that I made up the word “inductor”, but inductor is to induction as recursor is to recursion. In more mathematical language the inductor says this. “Let’s say that for every element x of X we have a true-false statement $C(x)$ , and let’s say that $C(p)$ is true. Then $C(x)$ is true for every element x of X.” So again it is just a rather long-winded way of saying that p is the only element of X.

Why have computer scientists isolated these rather elaborate statements as the fundamental way to say that p is the only element of X? It’s actually because of a fundamental symmetry. We have defined a new type X, in a functional programming language, and now the fundamental thing we need to do next is to explain how to define functions into X, and how to define functions out of X. To define functions into X, we need to have access to terms of type X, or in computer science lingo to constructors of X. This is exactly what X.p is — a way to construct a term of type X. To define functions out of X, we need access to eliminators for X, that is, some kind of gadget whose output is a function from X to somewhere else. Because X only has one term, namely p, we need a way of saying “to give a function out of X, we only need to say what happens to p“, and this is exactly what the recursor is doing. Between them, the constructor and recursor say in a formal way that the elements of X are “at least p, and at most p, so are exactly p.”

Recursors for general inductive types.

Lean automatically generates constructors and a recursor for every type defined with the inductive command. There is a general rule for how to do this, but informally it’s pretty clear. We define inductive types using this | symbol, and you get a constructor for each line with a | in. The eliminator or recursor simply says that to define a function from the new type you’re defining, all you have to do is to make sure you’ve defined it on each constructor.

The rest of this post is the fun part. I will go through a bunch of inductive types defined in Lean, we can look at the definition, figure out the recursor attached to each of the types, and then see what this corresponds to mathematically. We will see some familiar things popping up in surprising ways.

Example: a set with two elements.

Recall our inductive type Y:

inductive Y : Type
| q : Y
| r : Y

The recursor for Y tells us that if S is a set, then to get a map from Y to S we have to give two elements of S, one corresponding to where q goes and one corresponding to where r goes.

def Y.recursor_constant : ∀ S, S → S → (Y → S) := λ S, @Y.rec (λ y, S)

The full recursor can even be used (with non-constant C) to define a function from Y which sends q into one type and r into a different type, but when defining the function g above we do not need this level of generality. If you want to see what Y‘s recursor looks like, just type #check @Y.rec in a Lean session after the definition of Y, and remember that Π is just computer science for ∀ (in Lean 4 they will be using ∀ instead of Π in fact).

Example: the natural numbers.

Mathematicians who have seen the development of mathematics in ZFC set theory know that a complex number is defined to be a pair of real numbers, a real number is defined to be an equivalence class of Cauchy sequences of rational numbers, a rational number is defined to be a multiplicative localisation of the integers at the nonzero integers, an integer is defined to be an additive localisation of the naturals at the naturals, and the naturals are defined by the ZFC axiom of infinity. In Lean’s type theory a complex number is defined to be a pair of real numbers, a real number is defined to be an equivalence class of Cauchy sequences of rational numbers etc etc etc, and it’s all just the same up to the very end, because in Lean the naturals are defined by Peano’s axioms:

inductive nat : Type
| zero : nat
| succ (n : nat) : nat

This means that we have two ways to make natural numbers. First, zero is a natural number. Second, if n is a natural number, then succ n (usually called n+1 by mathematicians) is a natural number. Now we need a way of expressing the idea that this is the only way to make natural numbers, and this is the recursor, which is automatically generated by Lean, and says a precise version of the following informal thought: “If you want to do something for all naturals, then you need to tell me how to do it for both constructors”. In other words, “…you need to tell me how to do it for zero, and then you have to tell me a way to do it for n+1 assuming we’ve already done it for n“. Sounds familiar?

The recursor in general involves a map to Sort u. Let’s just specialise to the two universes we’re interested in, and take a look at the constant recursor, and the inductor (and let’s use Lean’s notation ℕ for nat):

nat.recursor_constant : ∀ (S : Type), S → (∀ (n : ℕ), S → S) → (ℕ → S)
nat.inductor : ∀ (C : ℕ → Prop), C 0 → (∀ (n : ℕ), C n → C (succ n)) → ∀ (n : ℕ), C n

[The proof of nat.recursor_constant is λ S, @nat.rec (λ n, S) and the proof of nat.inductor is just @nat.rec. ]

The constant recursor says this: if S is a set, and we want to make a function f : ℕ → S, here’s a way of doing it. First we need an element of S (namely f(0)) and second, for each natural number we need a map from S to S (telling us how to make f(n+1) given that we know f(n)).

The inductor says this. Say we have a family C(n) of true-false statements, and that C(0) is true, and that for all n we have a proof that C(n) implies C(n+1). Then we can deduce that C(n) is true for all n.

What I think is really cute about this example is that Peano’s definition of the natural numbers makes it immediately clear why the principle of mathematical induction works. In the natural number game we use the recursor in the background to define addition and multiplication on the naturals. We also use it to prove things which I call “axioms” in the natural number game — for example the proof that 0 is not equal to succ n for any natural number n uses the recursor to define a function from ℕ to $\{0,1\}$ sending 0 to 0 and succ n to 1, and using this function it’s easy to prove the “axiom” zero_ne_succ by contradiction. If you want an exercise, try using nat.recursor_constant to prove injectivity of the succ function, something else I also claimed was an axiom in the natural number game (as Peano did) but which was actually proved using the recursor.

Example : false

false is a true-false statement, and you can probably guess which one it is. In Lean false is defined as an inductive type! Here’s the full definition:

inductive false : Prop

This time there are no | s at all! Every constructor of false would be a proof of a false statement, so this design decision is not surprising. The recursor is

false.rec : Π (C : Sort u), false → C

In other words, to give a map from false to C you have to define it on all constructors, of which there are none. Let’s take a look at the inductor then, by changing Sort u to Prop:

false.inductor : ∀ (P : Prop), false → P

It says that if P is any true-false statement, then false implies P. This logical tautology has been automatically generated by Lean, because Lean’s model of an implication $Q\implies P$ is a function from proofs of Q to proofs of P, and false has no terms, i.e., no proofs.

There is a similar story with inductive empty : Type, Lean’s definition of the empty type. The recursor for empty says that to give a map from the empty type to any type S, you don’t have to do anything other than feed in S.

Example : or

The logical or on propositions is defined as an inductive type in Lean!

inductive or (P Q : Prop) : Prop
| inl (hp : P) : or
| inr (hq : Q) : or

There are two constructors for P ∨ Q, where now I’m using the usual logicians’ notation for or. In other words, there are two ways to prove P ∨ Q. First you can prove P, and second you can prove Q. Lean’s auto-generated inductor for this is

or.inductor : ∀ (P Q R : Prop), (P → R) → (Q → R) → (P ∨ Q → R)

In other words, if you can prove $P\implies R$ and you can prove $Q\implies R$ , then you can deduce $(P\lor Q)\implies R$ . Again no mathematician is surprised that this statement is true, but perhaps some are surprised by the fact that a computer scientist might claim that this is true by induction on or.

Example : equality.

The = symbol in Lean is defined as an inductive type! But I think I’m going to save the topic of what induction on equality is until the next post, where we will prove, by induction, that equality is an equivalence relation.

Some final notes.

I was very surprised when I realised that every inductive type came with a principle of induction. In fact one can even define the reals as an inductive type, which means that there will be an inductor for reals meaning that you can do induction on the reals! But when I figured out what the induction principle said I was disappointed — it says “if you can prove it for every real which is an equivalence class of Cauchy sequences of rationals, you can prove it for every real”. Remember that the idea of the recursor is that it is a way of saying “every term of your type can be made using the constructors”, so if your only constructor for a real is an equivalence class of Cauchy sequences of rationals then this is what you get. However these other examples, and in particular these examples coming from logic, are quite funky. An example I didn’t talk about: and is an inductive type and its inductor is ∀ (P Q R : Prop), (P → (Q → R)) → (P ∧ Q → R), which is some propositional version of uncurrying (indeed the constant recursor for prod, the product of two types, is uncurrying on the nose). The basic facts in propositional logic about and and or and are proved constructively in Lean using recursors rather than by truth tables, because directly constructing the functions corresponding to the proofs is more appealing than a case split.

Not everything is an inductive type in Lean — there are two other kinds of types. There are quotient types, which are there for some kind of computer science efficiency reasons and which could be constructed using inductive types, and then there are function types, which are a different kind of thing. I don’t think it’s of mathematical interest whether they type you’re looking at is an inductive type or a function type, but here’s an example of a function type: logical not. In Lean, ¬ P is defined to be P → false. On the other hand most of the structures used by mathematicians (groups, subgroups, rings, fields, perfectoid spaces and so on) are defined as inductive types (often however with one constructor, so their induction principle is boring). An inductive type with one constructor is known as a structure in Lean. You can read more about inductive types and structures in the very wonderful Theorem Proving In Lean, in sections 7 to 9.

In my next post I’ll talk about induction on equality.

My daughter is busy with exam revision, so here’s some old digital art by one of my sons.

9 Responses to Induction, and inductive types.

ComFreek says:

April 3, 2021 at 4:41 pm

I’ve understood that the recursion/induction principle allows you to define functions by pattern-matching and recursion. I am wondering however what allows us to call functions and to insert function values later. Consider:

> inductive X : Type
> | p : X
>
> X.recursor_constant : ∀ S, S → (X → S)

Let’s say I define a function function that maps `p` to `42` by writing `f: X -> nat := X.recursor_constant nat 42` (using some informal syntax that hopefully translates to Lean).

Now if I later write `f p` (e.g. in a proof), what principle or fundamental of Lean allows me to rewrite that to `42`?

PS: I find it very cool that the usual connectives like `false`, `or`, and `and` can be readily defined by inductive definitions. It seems there is an innate connection between inductive definitions (as in Lean and Coq) and natural deduction.

LikeLike

- xenaproject says:
  
  April 3, 2021 at 4:48 pm
  
  Lean has a very formal definition of “definitional equality” here: https://leanprover.github.io/reference/expressions.html#computation and it includes something called “ι-reduction”, which is the principle that if you evaluate a function defined by a recursor at one of the constructors, then what you get is *by definition* equal to what you fed into the recursor to make the function.
  
  LikeLiked by 1 person
  
David Corfield says:

April 3, 2021 at 7:00 pm

You describe some of this as surprising for you as a mathematician, e.g., “I was very surprised when I realised that every inductive type came with a principle of induction.” But if you look on things from a category-theoretic angle, i.e., understand inductive types as initial algebras for an endofunctor, doesn’t this remove the surprise?

LikeLike

- xenaproject says:
  
  April 3, 2021 at 8:43 pm
  
  Some of these things are propositions, objects which I had no tools for reasoning about until I learnt type theory. And although every initial object has a universal property, is that the same as an induction principle?
  
  LikeLike
  
David Corfield says:

April 4, 2021 at 7:19 am

We seem to be using the induction/recursion distinction differently on the nLab page here:
https://ncatlab.org/nlab/show/inductive+type#CategoricalSemantics.

What we have for induction is saying that for any F-algebra with a F-morphism to the initial F-algebra there’s a unique section. That’s your recursion.

Similar reasoning shows there can’t be a proper sub-F-algebra of the initial one. That’s your induction.

So ‘induction’ there covers both your ‘induction’ and ‘recursion’. ‘Recursion’ there is your constant recursion.

LikeLike

- xenaproject says:
  
  April 4, 2021 at 8:21 am
  
  Oh interesting! I was just using the language the type theorists use. The one recursor generated by an inductive definition is enough to do everything, but it typically “eliminates to sort” meaning that as a mathematician who likes to distinguish between Prop and Type I sometimes think of it as these two different things, at least when trying to explain it to students
  
  LikeLike
  
A Marx says:

April 4, 2021 at 8:07 am

You mentioned that quotient types would be definable via inductives etc. How would that work, exactly? I know it’s possible if there is some kind of “normal form” for every element (like (0,n) and (n,0) for the integers) as well as a function assigning every value to its normal form, but I don’t know how it’d work otherwise.

LikeLike

- xenaproject says:
  
  April 4, 2021 at 8:23 am
  
  This is the way it’s done in Coq, which doesn’t have inbuilt quotients. If I have a type and an equivalence relation on it then I can define a new type such that to make a term of that type I have to give a subset plus a proof that it’s an equivalence class. This is the “standard mathematical model” for a quotient. So it’s a subtype of the type of subsets of the original type.
  
  LikeLike
  
- xenaproject says:
  
  May 19, 2021 at 12:29 pm
  
  You define equivalence classes and make a type of equivalence classes.
  
  LikeLike