Thoughts on the Pythagorean theorem

I’m sure I’m saying nothing new here. I’m just explaining another example of how thinking about how to formalise things has taught me stuff about what mathematics is.

What is the Pythagorean theorem?

The Pythagorean theorem, a.k.a. Pythagoras’ theorem, comes in two parts. Firstly there is the theorem statement, which says that in a right angled triangle (like the dark blue one below), the square of the hypotenuse equals the sum of the squares of the other two sides. And then there is the proof, which originally is either due to Pythagoras or not depending on who you believe. Let’s start with the statement.

What does the statement actually mean?

Let’s take a look at the picture of the squares on the hypotenuse and the other two sides.

Some squares

The dark blue triangle is right-angled. The claim is that the square C is equal to the sums of the squares A and B. On the face of it, this is nonsense. If you take squares A and B, you have a picture containing two squares; but square C is just one square. How can one square equal two squares? But of course the claim is not that the pictures are equal, the claim is that the areas are equal.

But what is area? To find out, let’s go back to Euclid.

Euclid’s take on the theorem

Euclid’s Elements contains a proof of the Pythagorean theorem, right at the end of book 1. The proof involves drawing some triangles and arguing that various things are “equal”. This approach is valid because Euclid has explicitly stated as his Common Notion 1 that equality, whatever it is, is transitive.

One can chase this concept of equality back to Proposition 35, which claims that two parallelograms with the same base and the same height are “equal”. In fact this seems to be the first time that the word “equal” is used to mean “having equal area” in the Elements. Halving the parallelograms we deduce the more familiar Proposition 37, that two triangles with the same base and the same height are also “equal”. So what goes into the proof of Proposition 35, that two parallelograms with the same base and height are “equal” in the sense of having equal area?

The key ideas in the proof are Euclid’s second and third common notions: that “equals added to equals are equal”, and “equals subtracted from equals are equal”. In high-level terms, these common notions imply that equality is not just an equivalence relation, but a congruence relation. But let’s see how Euclid uses these notions in his proofs.

Equals added to equals are equal.

The two orange regions have equal areas, because they are both “equals added to equals”: the small triangles and the big triangles are both congruent.

Equals subtracted from equals are equal

Here, the two larger triangles are congruent, so the two orange areas are equal, because they are equals (the dark blue triangle) subtracted from equals (the larger triangles). For Euclid, the equality of the areas of the two orange regions in these examples is axiomatic. Take a look at the proof of Proposition 35 to see how these facts are used to prove that two parallelograms with the same base and height are “equal”.

Area in Euclid book 1

So, what Euclid does mean by the “area” of a shape? Well this is the funny thing — he never says, throughout book 1! He only says what it means for two shapes to have “equal area”!

This is exactly what an equivalence relation is. An equivalence relation on a type is a concept of equality on terms of that type. It can be thought of as focussing on a particular attribute of the terms you are considering (for example the area of a shape, or the value of an integer modulo 10) and saying that two terms are equivalent if the values of those attributes are equal. Euclid is putting an equivalence relation on shapes. His definition of the relation involves cutting and pasting in geometry, and at the end of the day the proof of the Pythagorean theorem in Euclid is essentially a jigsaw puzzle. Here is an even simpler jigsaw puzzle proof:

A proof of the Pythagorean theorem

Euclid and type theory

When Euclid did mathematics, he was doing type theory. For Euclid, points were terms of the Point type, lines were terms of the Line type, and planes were terms of the Plane type. Euclid wrote down the axioms his types satisfied (for example there was a unique line between any two distinct points) and proceeded to work from there. He has a definition of a 2-dimensional shape, and assuming that a plane exists, his shapes exist too. He defined an equivalence relation on 2D shapes, and proved that the 2D shape corresponding to the square on the hypotenuse was related to the 2D shape corresponding to the union of the squares on the other two sides, using properties of this relation which he has earlier axiomatised.

The proof of Pythagoras’ theorem in Euclid is what is known as a synthetic proof. We assume that a Euclidean Plane exists and satisfies a list of axioms, which Euclid attempted to write down and which most of us never even contemplate. We then formulate the theorem, and prove it using the axioms.

Numbers from geometry?

Note that Euclid is in some kind of a position to define real numbers at this point, or at least the positive real numbers. For example, Euclid knows what it means for two line segments to have equal length — it means that you can translate and rotate one line until it coincides with the other. He could hence define the positive reals to be equivalence classes of line segments, under the equivalence relation of being the same length. However one runs into problems when it comes to completeness, something Euclid’s axioms were not really designed for.

Geometry from numbers: Enter Descartes.

Descartes suggested doing things the other way around, using numbers to do geometry rather than using geometry to define numbers. Descartes observed that one could label a point in the plane with an x and y coordinate. This changed everything. All of a sudden “the plane” (a term whose existence is never talked about in Euclid) becomes modelled by \mathbb{R}^2. Euclid’s definitions, common notions, and axioms now need to be revisited. We need to check that this more refined model satisfies the rules of Euclid’s game (a bit like checking that Einstein’s theory turns into Newton’s in the limit).

We model a point as an ordered pair of real numbers, we can define lines as solutions to linear equations because the reals are a field so we have that language available. We can prove the parallel postulate no problem. The theory of integration gives us a way to measure lines (length), angles (measure), curves (length) and 2-dimensional shapes (area), using the natural (Euclidean) Riemannian metric on the plane. We can now completely rephrase Pythagoras’ theorem: it is now an equality of numbers. We can re-interpret the “jigsaw puzzle” proof in Euclid as a consequence of finite additivity of Lebesgue measure on the plane. We can also give a completely new proof, using the theorem that the distance from (a,b) to (c,d) is \sqrt{(a-c)^2+(b-d)^2}, as one can check using a line integral (modulo the theorem that the shortest distance between two points is a straight line, which needs proving in this context).

I saw measure theory developed as an undergraduate, and probably a few years ago I would have argued that this is now the “proper” proof — but now I realise that this proof still has some synthetic element to it: namely, the real numbers. We have a non-synthetic plane, but it is made from synthetic numbers.

What are the real numbers?

I was told as an undergraduate that it was an axiom that the reals existed, and that they were a complete ordered field. All of the analysis I learnt as an undergraduate was built upon this assumption, or rather, this structure (addition, multiplication, inequality) and these axioms (associativity and commutativity of addition and multiplication etc) on this type (the reals). In some sense it is no different to Euclid, who also had types (e.g. points), structures (e.g. triangles) and axioms (e.g. the common notions, or the parallel postulate), but who was modelling the Pythagorean theorem in a different, arguably more primitive, way.

Enter the analysts, bearing sequences

Descartes solved the problem of how to represent points in a plane with real numbers, but for Descartes, the reals were a type. Many years later, Cauchy and Dedekind gave two ways to represent the reals using simpler objects. Indeed, Cauchy sequences and Dedekind cuts are (different) ways of building the reals from the rationals. Similarly, the rationals can be built from the integers, the integers from the naturals, and the naturals are…well, they’re just some synthetic thing satisfying Peano’s axioms, right? At this point one could argue that Pythagoras’ theorem has become a statement about sequences of pairs of natural numbers (or however else one is modelling positive rationals), and the natural numbers have no definition — they are synthetic. But we can go further.

Enter the logicians, bearing sets.

One thing that ZFC set theory (the usual set-theoretic foundations of 20th century mathematics) has going for it, is that it gives a very unambiguous answer to the question: “Why are the natural numbers a set?”. The answer is “It’s one of the axioms”. One thing against it is that in ZFC set theory, everything is a set, even stuff which you don’t want to be a set. For example, real numbers (like the area of the square on the hypotenuse) are now sets, and the Pythagorean Theorem is now a theorem about the equality of two sets, although we don’t know exactly what the sets are, because we can never be sure whether the real numbers in the Platonic universe (or whichever universe we’re operating in) use Cauchy sequences or Dedekind cuts. [Pro tip: if your universe offers good quotienting facilities, use Cauchy sequences: they’re cleaner.]

The fact that we don’t know whether the reals being used in Pythagoras’ theorem are Cauchy sequences or Dedekind cuts is an indication that we have unfolded things too far, as far as Pythagoras’ theorem goes. Most mathematicians regard the real numbers as a type. A real number is not a set — Gauss or Riemann could certainly have informed you of that.

It is interesting that we can keep unfolding this way — but we can never get to the bottom of things. We can’t define everything, we always have to start somewhere — a synthetic beginning. Euclid started with points, lines and planes. Descartes started with the reals. Cauchy and Dedekind observed that you could start with the naturals. Set theorists start with sets. There is no definition of a set in ZFC set theory — a set is just a term in a model of ZFC set theory. The model can be thought of as the type, and its elements (or whatever you want to call them — they are not elements of a set in the internal logic of the model) as the terms. The behaviour of sets is developed using the axioms.

Pythagoras’ theorem : refinements of equality

So what is Pythagoras’ theorem? Is it that two shapes are “equal”, two numbers are equal, or two sets are equal? In some sense, it’s all of these things. In some sense this story reminds me of chemistry at school. The joke in the chemistry class was that they were always telling you lies — models of atoms which were very naive, and then more careful models which were more accurate, culminating (for me) with an undergraduate quantum mechanics course which told me that an electron was actually some kind of a complex-valued function. It feels very similar here. Euclid had a perfectly adequate notion of geometry, he axiomatised what he needed and argued from there. Later on we found a way to model the plane from more fundamental objects such as the real numbers. After that we found a way of modelling the real numbers using the natural numbers, and in some sense this is where we stop; using either type theory or set theory, the natural numbers have no definition — their existence is asserted, and we use them via their API (addition, multiplication, the principle of induction). Maybe someone will come along in future with a theory of mathematical quarks, which can be used together with the quark axioms to build the natural numbers in a natural way, and then our understanding of what Pythagoras’ theorem is might change again.

About xenaproject

The Xena Project aims to get mathematics undergraduates (at Imperial College and beyond) trained in the art of formalising mathematics on a computer. Why? Because I have this feeling that digitising mathematics will be really important one day.
This entry was posted in General, Olympiad stuff, Type theory, undergrad maths and tagged , , . Bookmark the permalink.

8 Responses to Thoughts on the Pythagorean theorem

  1. ComFreek says:

    I like the API analogy!
    Indeed, programming against interfaces is standard practice in software engineering. And the existence of classes implementing those interfaces actually ensures such code is not useless.

    Analogously, proving against abstract definitions is standard in math. And the existence of objects fulfilling those definitions ensures the math is not useless.

    But from what I’ve heard, refactoring of formalizations in theorem prover assistants is much more cumbersome than in programming IDEs.

    Like

    • xenaproject says:

      We prove against API’s in Lean. We try to avoid using the definition and instead use the interface. So someone could replace Cauchy reals with Dedekind reals and in theory nothing would break.

      Like

  2. Erm, Cantor introduced the Cauchy sequence construction of the reals, but named it after Cauchy. Cauchy was still using infinitesimals when doing calculus, and, famously, couldn’t get his quantifiers in the right order.

    Liked by 1 person

  3. grasp yo hat says:

    The Euclid approach is algebra in the tensor or exterior square of the plane. The area dissection arguments fit directly in that framework and Pythagoras as a statement about areas is straightforward. But to get Cartesian distance out of that is to identify the exterior square of the plane with a line (as vector or affine spaces with inner product) and prove all kinds of basic properties of the distance. There are a lot of buried technicalities in this and it would be interesting if the need for the duality rather than natural identification of the spaces is in some precise sense equivalent to all the difficulties.

    Like

Leave a comment