## Formalising mathematics : workshop 3 — sequences and limits

This week we’re going to do limits of sequences, of the kind you see in a 1st year analysis course. These are great fun to do in Lean. Because of Rob Lewis’ linarith tactic (which does the “and now this inequality clearly follows” parts of the arguments), the arguments feel to me like they run very close to what we would write on paper as undergraduates, which is a sign that the system is mature enough for use (for example in teaching). The problem sheet is now in the repo or I guess you could even try it online (although it will take a while to start up — you’re better off installing Lean). This week there are 11 sorrys. I was initially planning on doing a bunch more stuff too, however I realised when preparing this class that there are a whole bunch of tricks which I know, which makes the arguments come out smoothly, but I didn’t teach them yet, so I am envisaging students finding these questions harder than I found them! I will start by going through the maths we’re doing, and will then explain a bunch of the tricks.

## Limits of sequences

A sequence $a_0, a_1, a_2,\ldots$ in Lean can simply be encoded as a function a : ℕ → ℝ. You can think of a : ℕ → ℝ as saying “a is a function from the naturals to the reals” (the idea being that $a(37)$ in the function notation represents $a_{37}$ in the sequence). Here’s what’s actually going on with this notation. In Lean, the type ℕ → ℝ as the type of all functions from the naturals to the reals! This point didn’t dawn on me for a long time so let me spell it out: when Lean says ℕ → ℝ it is talking about a type, and it’s the type which mathematicians would call Hom$(\mathbb{N},\mathbb{R})$, the set of all functions from the naturals to the reals. Then the notation a : ℕ → ℝ just means that a is a term of this type, i.e. an element of the hom set. Note that this is one of the few places in traditional mathematics where it is common to use a colon to denote element of a set, or term of a type, or however you want to think about it.

Lean must have the definition of the limit of a sequence, right? Sure! In fact it has the definitions of limits of a sequence, limit of a function $f(x)$ as $x$ tends to $a$, as $x$ tends to $a$ from above, and also as $x$ tends to $+\infty$ and to $-\infty$. In fact it has such a general notion of a limit that we’re going to need an entire workshop to understand it properly — it’s the predicate of a filter tending towards another filter along a map. But to understand filters, you have to understand unions and intersections of sets, and I have not covered these properly. So I propose doing sets and filters next week, and this week let’s just roll our own definition of a limit.

-- Don't ask me why this doesn't come as standard in Lean
notation | x | := abs x

/-- l is the limit of the sequence a of reals -/
definition is_limit (a : ℕ → ℝ) (l : ℝ) : Prop :=
∀ ε > 0, ∃ N, ∀ n ≥ N, | a n - l | < ε


I do not need to tell you what this definition says, you can read it yourselves. This is one of the advantages of Lean being compatible with unicode. Time and again I have heard computer scientists telling me that this whole “Lean does unicode” thing is an irrelevant gimmick. Time and again I have heard mathematicians telling me that it really makes a difference to them. It’s making a difference right now — I don’t have to tell you what that definition says. Lean even figures out the types of all the variables automatically. In order for everything to make sense, a n (the functional programming way of writing $a(n)$) must make sense, and because a takes a natural number as input, n must be a natural number. Because n ≥ N must make sense, N must be a natural number too. And a n - l is a real number, so | a n - l | must be a real number, so ε must be a real number.

In the worksheet we’ll work through the following proofs:

1. Limit of a constant sequence is the constant.
2. Limits are unique (if they exist).
3. Sum of two limits is limit of sums.
4. Product of two limits is limit of products.
5. If $a_n\leq b_n$ for all $n$ then the same is true for the limits (assuming they exist)
6. The sandwich theorem: if $a_n \leq b_n \leq c_n$ and the limits of both $a_n$ and $c_n$ exist and equal $\ell$, then the limit of $b_n$ is also $\ell$.

I will do the first two, so you can see the techniques. The remaining four, and also seven other things, are your job. The proofs are just the proofs which you saw as undergraduate mathematicians, and you will hopefully find the formalisation relatively straightforward once you’ve internalised the tricks, which I explain within the copious comments in the lean file. Let me go through some of them here.

## λ notation for functions

This sometimes intimidates mathematicians. Here’s how it works. We mathematicians say “Let $f$ be the squaring function, sending $x$ to $x^2$. If we don’t want to name $f$ explicitly we might say “consider the squaring function $x\mapsto x^2$“. A computer scientist might say instead “consider the squaring function λ x, x ^ 2“. That’s all the lambda is — it’s just “\mapsto” but written at the beginning instead of the middle.

The one thing you need to know is how to deal with things like (λ x, x ^ 2) 37. Here we have made this anonymously-named function, and then we’re evaluating it at 37. How do you get Lean to simplify this to 37 ^ 2? The answer is dsimp only, a tactic which does some basic definitional tidying up.

I drone on about this notation at great length in the comments in this week’s problem sheet.

## Squiggly brackets {a} in function inputs

In Lean, a proof is a function. Let’s say we have managed to prove my_cool_theorem : ∀ (a : ℝ), 0 < a → 0 < 2 * a. This proof is a function. It takes two inputs, firstly a real number a and secondly a proof that 0 < a, and then it spits out a proof that 0 < 2 * a. If you’re prepared to believe that theorems can be thought of as sets, and proofs as their elements, or more precisely that theorems can be thought of as types, and proofs as their terms, then it makes sense to think about proofs involving implications as functions, and this is an example.

But let’s think for a minute: where does this function go from and to? It’s a function of two variables so it should be of the form $X \times Y \to Z$. Clearly $X$ is the real numbers. And then $Y$ is the set of proofs that $0 < a$. But wait a minute, what is $a$ here? It’s the element of $X$ which we chose as our first input! So actually something slightly fishy is going on — $Y$ (the type of our second input) actually depends on the element of the set $X$ which we chose (or more precisely the term of type X). If you know some geometry or topology, you can see that actually the source of this function is not a product $\mathbb{R}\times Y$, it’s more like the total space of a bundle on $\mathbb{R}$, where the fibre above $a$ is the type of proofs that $0 < a$, and this fibre moves as we move around $\mathbb{R}$. If you don’t want to think about it in this fancy way, just understand that the source of the function is sort of $X\times Y$, but the type Y depends the term of type X.

This is not a big deal foundationally, of course we can make the source space as the disjoint union of the types $0 < a$ as $a$ varies and then think of our proof as a function on this space. But here’s another funny consequence. If we know the second input of the function, i.e. the element of $Y$, then by looking at its type (namely the true-false statement $0 < a$) we can actually figure out the first input of the function (namely $a$). In particular, we are kind of wasting the user’s time asking them to give the first input, when we can just work it out from the second input. Here’s a concrete example. If h : 0 < 37 and we want to use my_cool_theorem to prove that 0 < 2 * 37 then right now we’re going to write my_cool_theorem 37 h. But that 37 input could have been worked out by the system because, given that the second input is h, the number 37 is the only possible first input that makes everything make sense — or, to use a technical term — makes everything typecheck. Lean’s type unification system, or unifier, is the system which checks all of this (it’s a big chunk of C++ code in core Lean which I have never looked at — I just know what it does rather than how it does it), and the trick we can do here is to make the unifier fill in that first input for us. So instead of my_cool_theorem we can define

my_cooler_theorem : ∀ {a : ℝ}, 0 < a → 0 < 2 * a

The squiggly {a} bracket input means “Ok so this a is an actual input to the function, but the unifier is going to supply it, so the user doesn’t have to”. And now if h : 0 < 37 then to prove 0 < 2 * 37 we can just use the term my_cooler_theorem h.

While we’re here, I’ll note that the square bracket inputs [] that you sometimes see mean “OK so this is an actual input to the function, but the type class inference system is going to supply it, so the user doesn’t have to”. The type class inference system is another system whose job it is to supply inputs to functions, but let’s not talk about this here (it’s another big chunk of C++ code and again I’ve never looked at it, I know nothing about C++). The thing you need to remember is that, when hovering over a function and trying to figure out what inputs it needs, your job as a user is to supply the ones in () round brackets, and Lean will supply the other inputs using logic that you don’t need to worry too much about.

## Some tactics you’ll find helpful: specialize, linarith, ring, convert

specialize: A function f : X → Y → Z can be thought of as a function which takes two inputs (a term of type X and a term of type Y) and spits out a term of type Z. This is because X → Y → Z means X → (Y → Z). If you have a term x : X and are pretty sure that you only want to ever evaluate f at x then you may as well fix x as the first input and let f just denote the resulting function Y → Z. This can be done with specialize f x.

linarith: If you have hypotheses hab : a ≤ b and hbc : b < c and your goal is a < c then sure there will be some function in the library called something like lt_of_le_of_lt which does it for you, but you can just type linarith and it will all happen automatically. The linarith tactic should be able to deal with anything involving linear algebra and arithmetic, but note that it will not do non-linear goals like proving a*b>0 from a>0 and b>0. For this you need the nonlinear version nlinarith.

ring: If your goal is something which can be proved from the axioms of a commutative ring (or semiring), e.g. like (x+y)^2=x^2+2*x*y+y^2, then the ring tactic will solve it. Note that the ring tactic does not look at any hypotheses — if you need them, you’re going to have to rewrite them manually first (or write a groebner_basis tactic).

convert: If you have a hypothesis which is pretty much equal to your goal, but there’s just some random small subterm which is a bit different, then convert might be a good way to proceed. For example if your goal is ⊢ a ≤ b ∧ b ^ 2 + 3 ≤ c and you have a hypothesis h : a ≤ b ∧ b * b + 3 ≤ c but you don’t know the name of the theorem saying b^2=b*b so you can’t apply rewrite which will change b^2 to b*b then you can just convert h, and the goal will change to ⊢ b ^ 2 = b * b, which can be solved with ring.

## Random API which will be useful

abs is a definition, so has an API. Three lemmas in it which might be useful:

abs_pos : 0 < |a| ↔ a ≠ 0
abs_mul x y : |x * y| = |x| * |y|
abs_add x y : |x + y| ≤ |x| + |y|



And ring does not deal with division. A couple of things from the division API:

div_pos : 0 < a → 0 < b → 0 < a / b
lt_div_iff : 0 < c → (a < b / c ↔ a * c < b)
lt_div_iff' : 0 < c → (a < b / c ↔ c * a < b)


Oh, and while we’re here, the moment you start on division you have to start splitting into cases depending on whether the denominator is zero or not: Lean has made a design decision to allow x / 0 to make sense, but there are no theorems about it, so rather than division by zero giving an error it just gives a term you can’t use (think of it as “a random real about which we don’t know anything”). So knowing by_cases hc : c = 0 is a handy tactic trick — this splits into two cases depending on whether $c=0$ or $c\not=0$.

## Appendix: do like Insta and use filters

Every definition in Lean comes with a cost. Last week we saw some of this cost. We didn’t use Lean’s inbuilt group definition, we rolled our own, and then we had to write a bunch of lemmas before it was usable.

This week we’ve done the same — we’ve rolled our own is_limit definition and have had to prove a bunch of theorems about it. However this week it’s possible to link our definition to Lean’s own far more high-powered definition of a limit, using its tendsto predicate, which is a predicate on two filters and a map. Here are a bunch of two-line proofs of things we’ve been doing today, using Lean’s filter API (so the content of the proofs has not magically disappeared, it is just being done by invoking general theorems from mathlib and using things like the fact that addition is a continuous function on the reals):

import week_3.Part_A_limits

import topology.instances.real

open filter

open_locale topological_space

namespace xena

-- is_limit is equivalent to a filter.tendsto
lemma is_limit_iff_tendsto (a : ℕ → ℝ) (l : ℝ) :
is_limit a l ↔ tendsto a at_top (𝓝 l) :=
begin
rw metric.tendsto_at_top,
congr',
end

-- this is is_limit_add
example (a b : ℕ → ℝ) (l m : ℝ) : is_limit a l → is_limit b m → is_limit (a + b) (l + m) :=
begin
repeat {rw is_limit_iff_tendsto},
end

-- this is is_limit_mul
example (a b : ℕ → ℝ) (l m : ℝ) : is_limit a l → is_limit b m → is_limit (a * b) (l * m) :=
begin
repeat {rw is_limit_iff_tendsto},
exact tendsto.mul,
end

end xena


I will talk more about these filters at_top (neighbourhoods of infinity on the naturals) and 𝓝 l (neighbourhoods of l in the reals) next time. I write about filters here and will say more about them next week.