The inverse of a bijection.

In my “introduction to proof” course I teach the undergraduates about equivalence relations. What the students do not know, is that earlier on in the course I have secretly inserted examples of equivalence relations without ever letting on. For example, in the number theory section I prove that that if N is a fixed positive integer, and a, b, c are integers, then

a \equiv a \pmod N;

a \equiv b \pmod N \implies b \equiv a \pmod N;

a \equiv b \pmod N and b \equiv c\pmod N \implies a \equiv c\pmod N.

And when we do equivalence relations I can then say “oh look! Congruence mod N is an equivalence relation!”

I do the same trick with bijections. I prove that the identity function is a bijection, the inverse of a bijection is a bijection, and that the composite of two bijections is a bijection. Then later on I can say “oh look! ‘I biject with you’ is an equivalence relation!”

Coding up congruences mod N in Lean is a joy to do, and one of my example files on this stuff even made it into mathlib!

But coding up the bijection equivalence relation into mathlib, one runs into an issue which is hard to explain to mathematicians. The problem is with symmetry — proving that the inverse of a bijection is a bijection. Say f: X \to Y is a bijection, and y\in Y. We’re trying to define the inverse function to f and we want to figure out its value on y. Lean has no trouble proving that there exists a unique element of X such that applying f to it gives y. But when you try to access this element of X while defining the inverse of f, you get an intimidating error:

induction tactic failed, recursor 'Exists.dcases_on'
can only eliminate into Prop

Now Lean 3 does not have particularly friendly error messages, but as a Lean beginner I remember finding this one completely baffling. As far as type theory is concerned, there is a universe issue; the user just tried to define a function to Type but the only function Lean could find was one into Prop. A mathematician using Lean might know about Lean’s two kinds of universes — Prop is the universe of true-false statements, where things like the statement of Fermat’s Last Theorem lives, and Type is the universe of data, where things like the real numbers live. So what is going on? The problem is that \exists x \in X, f(x)=y is in Prop, but the actual x which exists is in Type, and Lean is complaining that it does not have an algorithm to get hold of x, and hence to move from the weak Prop universe to the stronger Type one.

Now of course to a mathematician this is just all ridiculous — the algorithm is “there is a unique x — take that one”. The Lean beginner will learn that instead of applying the cases tactic to ∃ (x : X), f x = y to get x, they can apply the choose tactic, and then one gets a new error

definition 'g' is noncomputable, it depends on 'classical.some'

which one fixes by writing noncomputable theory at the top of the file. One is left with a feeling that Lean is kind of bewildering, and why was there so much fuss about applying the obvious algorithm “take the unique x, duh”?

Recently on the Lean Zulip chat there has been talk amongst computationally-minded people about fancy things such as domain theory, topologies, the difference between option and roption, and difference lists. All of this has been slowly sinking into my head in some sort of way, and it culminated this morning in my actually seeing an explicit example (needs Zulip login) of a bijection between two sets where even a mathematician can see that there is some sort of issue with defining the inverse bijection. Thanks to Mario Carneiro for coming up with the example. The rest of this post explains it.

The example.

The sets in question are as follows. Let X=\{-1,0,1,2,3,\ldots\}=\mathbb{Z}_{\geq-1}. That’s easy enough.

Let Y be the set of infinite sequences P(0), P(1), P(2), \ldots where each P(i)\in\{T,F\} (true or false), and with the property that for all m\geq0 we have P(m)\implies P(m+1). In words — if at some point in the sequence we see T, then everything after that point is also T.

Examples of elements of Y:

FFFFTTTTTTTTTTTTTTT...
FFFFFFFFFTTTTTTTTTT...
FFFFFFFFFFFFFFFFFFF...
TTTTTTTTTTTTTTTTTTT...

It is not hard to convince yourself that sequences in Y come in two flavours: either the sequence has some finite number n\geq0 of Fs and then everything afterwards is T, or the sequence can be all Fs.

The bijection f: X\to Y sends -1 to the “all F” sequence, and n\geq0 to the sequence which contains n Fs and then reverts to T forevermore. This is easily checked to be a bijection, and one even feels that it’s possible to write down the inverse. But how easy is it in practice?

Here’s the issue. Say I now give you a “black box” sequence P in Y. It’s a function, I can guarantee that it’s in Y (I even have a proof!), and if you give me any i\in\mathbb{Z}_{\geq0} I will return P(i). We know that there is a unique n\in X with f(n)=P and it’s now your job to find it. What is your algorithm?

Initially you can start off by evaluating the function at some values. It turns out that P(3) is F, that P(100) is F and then that P(10^{18}) is F. You realise that the n you’re looking for is either a very large integer, or -1. But which one? You have what looks like complete access to P\in Y, but you cannot work out which n\in X it corresponds to — indeed you cannot even say whether n\geq0 or not!

After a while, you get sick of this and say that actually you haven’t got complete access to P and I am suppressing information. You complain that you don’t just want to evaluate P at lots of numbers, you want to see the actual definition. So I reveal the definition: P(i) is defined to be T if there is a counterexample to the Goldbach conjecture which is at most i, and F otherwise. To put it another way — if the Goldbach conjecture is true then the sequence is all Fs and n=-1, but if it fails at some point then n is the positive integer where it fails. I even show you the one-line proof that P is an element of Y: if there’s a counterexample which is at most m then clearly there’s a counterexample which is at most m+1. Even with the definition, you are none the wiser!

Conclusion.

When a function is marked as noncomputable in Lean it means that Lean has not been told a method for working it out for every input. The innocuous-looking statement “there exists a unique x\in X with f(x)=y, so choose that x” is clearly something which can be done abstractly, but there is no method for finding x, even if we can prove that it must exist. There can’t be a method for finding it in general — any such method would translate into a method for proving Goldbach’s conjecture, the 3n+1 problem, Fermat’s Last Theorem, the twin primes conjecture and so on; these are hard problems. Indeed, one can prove using a diagonalisation argument that no algorithm can exist to invert this bijection. The inverse of our computable function f: X\to Y certainly exists, but it is a noncomputable function. Lean can handle such functions, no problem, but it has to be told explicitly that it should be ignoring computability issues (e.g. with the line noncomputable theory at the top of a file).

About xenaproject

The Xena Project aims to get mathematics undergraduates (at Imperial College and beyond) trained in the art of formalising mathematics on a computer. Why? Because I have this feeling that digitising mathematics will be really important one day.
This entry was posted in Learning Lean, M1F and tagged , . Bookmark the permalink.

10 Responses to The inverse of a bijection.

  1. The fun thing is that one knows that there is the inverse of the relation arising from the bijection, and that this relation is total and functional, but can’t conclude that it contains the graph of a function.

    Like

  2. DT says:

    Your example reminds me of Ed Nelson. What is the set of positive integers, to Lean? Is it closer to “finite length sequences of decimals” (call that D) or to “finite length sequences of tally marks” (call that T)?

    I know nothing about Lean but I assume it would not return any error messages if you tried to define the bijection s:T –> D that counted up all the tally marks as a decimal. (e.g. 1111 –> 4), nor if you tried to define the inverse to s. I bet it would not even complain that this inverse is not “computable,” am I mistaken?

    But it could never successfully compute the composite function s^{-1}(e(5000)), where e:D –> D is the function given by the formula e(d) = 10^d. Note there’s plenty of space in this text box to write down the answer to e(5000), though it would be a big eyesore. Nelson suspected that one could use this asymmetry between T and D to derive a contradiction from the Peano axioms.

    Like

    • xenaproject says:

      In Lean we have both, but `nat` is the tally marks. The decimals are called `num` (and in fact they’re binary). The bijection between `nat` and `num` is indeed computable in both directions. “computable” in Lean has nothing to do with feasibility or practicality (a la P=NP), it’s just about whether an algorithm exists (Turing, Halting problem etc).

      Like

  3. Pingback: Proofs are not programs | Xena

  4. Andrej Bauer says:

    Why do you feel mislead by proofs-as-programs? Has someone tried to oversell the case to you? It is fairly well understood which kinds of proofs correspond to programs, and it’s not all of them, of course. There are researchers who push the boundaries of the slogan and try to extend it in various directions. This has brought a great deal of success to understanding the relationship between deduction and computation, but where did you hear claims on the basis of which you feel mislead?

    Like

  5. xenaproject says:

    This is a fair question. I guess I say a bit more about this in my follow-up post https://xenaproject.wordpress.com/2019/06/15/proofs-are-not-programs/ . Maybe the case was over-sold; but perhaps I just misunderstood the claims initially. I have been watching Xavier Leroy’s lectures https://www.college-de-france.fr/site/xavier-leroy/course-2018-11-21-10h00.htm here and the “proofs = programs” slogan comes up a lot. Of course it will be very well-known to people who properly understand the correspondence what the actual limitations of the claim are. I have read your work Andrej on constructivism https://www.ams.org/journals/bull/2017-54-03/S0273-0979-2016-01556-4/ and I must say that I found it far easier to understand fully once I had used Lean for a while (Lean forced me to think more carefully about the logic behind doing mathematics). What I want to argue is that the kind of mathematicians I hang out with (mostly number theorists and geometers) are doing mathematics which in areas which are highly respected by the community and which are resolutely and inherently non-constructive. I have no proof of this, but in my work formalising perfectoid spaces I had to switch Lean into “mathematician mode — no code generated but feel free to use excluded middle and the axiom of choice” sufficiently often for me to become convinced that, at least in my area, nonconstructivism is all over the place. I might be wrong. As far as I know, nobody has ever thought carefully about this, and I conjecture that this is because most people thinking about perfectoid spaces don’t know too much about constructive maths, and most people thinking about constructive maths don’t know too much about perfectoid spaces.

    Like

  6. I think there are *two* things going on here:

    (1) Constructive vs. non-constructive reasoning.

    (2) Whether the axiom of unique choice is available constructively.

    For both Lean and Coq, with propositions modeled by the built-in type Prop, the axiom of unique choice is not available when one is under the constructive regime.

    However, this is not to say that unique choice is inherently non-constructive. If instead of using the type Prop one uses what Voevodsky calls hProp, which is simply the type of types with at most one element (as is done in HoTT/UF), then the axiom of choice just holds, even under the constructive regime, and doesn’t need to be postulated. And then you do get that a function has an inverse if and only if it is a bijection (I presume you are assuming the definition that says that a bijection is a function which is both and injection and a surjection).

    But this wouldn’t help with the example you attribute to Carneiro, because the fact that the proposed function *is* a bijection requires non-constructive reasoning, *before* you attempt to invert it by applying unique choice. Technically, the fact that your function is a bijection is equivalent to what constructive mathematicians call LPO, which means that we can tell whether a sequence of the kind you are considering is eventuallyT or not, which, as you say, decides the Goldbach conjecture and many other things like it.

    The obstacle to constructivity here is not unique choice but the the fact that the proof of bijectivity of your is not constructive in the first place.

    Like

  7. xenaproject says:

    Thanks for this response! Yes, in Lean at least, a bijection is defined to be a function which is both an injection and a surjection.

    Like

  8. Lambda Fairy says:

    Nice post!

    I studied constructive analysis in undergrad – we made a very similar argument to Mario’s, but with the execution of a Turing machine.

    By the way, if you know that there is a T somewhere, just not exactly where, then you can search for the index with Markov’s principle. From a computational point of view, that represents an algorithm that takes an unknown but finite number of steps.

    https://en.wikipedia.org/wiki/Markov%27s_principle

    Like

Leave a comment