Category Archives: Solutions

Exercise 3.5.12

Exercise statement

This exercise will establish a rigorous version of Proposition 2.1.16. Let X be an arbitrary set. Let f : \mathbf N \times X \to X be a function, and let c \in X be an element of X. Show that there exists a function a : \mathbf N \to X such that

a(0) = c

and

a(n{{+}\!{+}}) = f(n, a(n)) for all n \in \mathbf N,

and furthermore that this function is unique. For an additional challenge, prove this result without using any properties of the natural numbers other than the Peano axioms directly (in particular, without using the ordering of the natural numbers, and without appealing to Proposition 2.1.16).

Hints

  1. First show inductively, by a modification of the proof of Lemma 3.5.12, that for every natural number N \in \mathbf N, there exists a unique function a_N : \{n \in \mathbf N : n \leq N\} \to X such that a_N(0) = c and a_N(n{{+}\!{+}}) = f(n, a_N(n)) for all n \in \mathbf N such that n < N.
  2. For the “additional challenge” part: first show inductively, using only the Peano axioms and basic set theory, that for every natural number N \in \mathbf N, there exists a unique pair A_N, B_N of subsets of \mathbf N which obeys the following properties: (a) A_N \cap B_N = \emptyset, (b) A_N \cup B_N = \mathbf N, (c) 0 \in A_N, (d) N{{+}\!{+}} \in B_N, (e) Whenever n \in B_N, we have n{{+}\!{+}} \in B_N. (f) Whenever n \in A_N and n \ne N, we have n{{+}\!{+}} \in A_N. Once one obtains these sets, use A_N as a substitute for \{n \in \mathbf N : n \leq N\} in the previous argument.

How to think about the exercise

This exercise takes quite a bit of work, especially if you do both the standard version and the “additional challenge” version. It’s a real struggle because there’s just so much to do in the proof, many separate induction proofs, so many variables to keep track of, etc. In the challenge version, you don’t even get to use the properties of order, so you’re essentially re-proving all those properties using bare set theory.

In my opinion, this exercise doesn’t actually have any “deep” ideas, in the sense that the proof is basically all automatic: you just keep doing the next obvious thing, until you’re (eventually, after a very long time) done with the proof. There are I think three non-automatic parts to the proof: (1) figuring out that instead of defining a directly, you should build it up in stages by defining the functions a_N (this was already done for us in the hint); (2) figuring out that a(n) should be defined as a_n(n); and (3) coming up with the properties (a) through (f) (like (1), we didn’t actually have to do this, since the book just gave these to us).

My honest opinion is that it’s fine to skip this exercise unless you’re dying to verify that recursive definitions really do work. If you’re not dying to know, one option would be to skip this exercise for now; then, once you’ve finished the rest of the book, you hopefully have more mathematical chops to push through this exercise without it being utter torture. I’m not sure that working through this exercise gives you a better understanding of recursion.

Some tips for thinking about this proof: in the challenge version, A_N is supposed to contain the numbers at most N, so n \in A_N is the rigorous way to say n \leq N (alternatively we could also say A_n \subseteq A_N). Similarly, to say n < N you can write n \in A_N \setminus \{N\} (alternatively, A_n \subsetneq A_N).

You might wonder if there’s some short, elegant way to do this exercise. I don’t think there’s a way to shorten the proof unless you’re willing to skip some parts by saying e.g. “this follows from a routine induction”.

In the challenge version, at one point we define A_{N{{+}\!{+}}} := A_N \cup \{N{{+}\!{+}}\} and B_{N{{+}\!{+}}} := B_N \setminus \{N{{+}\!{+}}\}. You might be nervous about this, since it looks like we are defining the sets A_N, B_N recursively. But the whole point of the exercise is to show that recursive definitions work! Isn’t this circular? It turns out that this is fine (hopefully, unless I’m horribly mistaken), and here is why. For each N, the sets A_N, B_N are described using properties (a) through (f), which are not recursive. So our property P(N) pertaining to N that we use in the inductive proof can be “there exist sets A,B with properties (a) through (f)”. Notice the missing subscripts in A,B; because we did this, in our inductive step, we will define some sets C:=A \cup \{N{{+}\!{+}}\} and D:=B \setminus \{N{{+}\!{+}}\}. Again, no subscripts, and this makes a difference. If we tried to do this when defining a, it wouldn’t work because we can’t define a(n{{+}\!{+}}) without reference to a(n). (If you think you can do this, try to write a property P(n) pertaining to n such that proving P(n) for all n would give us the theorem. And of course, P(n) can’t mention a at all.)

There is a paper by Leon Henkin titled “On Mathematical Induction” which proves this same theorem using a very similar method. What Henkin calls “segments” are the sets A_N in the proof below, and what he calls “partial functions” are the functions a_N. I haven’t read his proof in detail, but the paper seems to be very readable (i.e. if you know enough math to read Analysis I, then you shouldn’t have a problem with Henkin’s proof).

Some things I haven’t yet figured out (and am too tired to):

  • Do we actually ever use the uniqueness of A_N,B_N and a_N in the proof? If so, where? Actually for A_N,B_N we do use uniqueness when we use the fact that A_{N{{+}\!{+}}} = A_N \cup \{N{{+}\!{+}}\}.

Model solution (standard version)

(This version is circular, since even defining the sets \{n \in \mathbf N : n \leq N\} requires the notion of order, which was defined using addition, and addition was defined using a recursive definition. Also just as in the proof of Lemma 3.5.12, we make use of the property that if n is a number such that n \leq N{{+}\!{+}}, then exactly one of n \leq N or n = N{{+}\!{+}} is true. This again requires the properties of order.)

Let P(N) be the statement “there exists a function a_N : \{n \in \mathbf N : n \leq N\} \to X such that a_N(0) = c and a_N(n{{+}\!{+}}) = f(n, a_N(n)) for all n < N”. We shall prove that P(N) is true for all natural numbers N using induction. For the base case we must show P(0), which states that there is a function a_0 : \{0\} \to X such that a_0(0) = c. (The second condition is vacuously true, since there are no natural numbers n such that n < 0.) We can simply define a_0(0) := c to complete the base case.

Now suppose inductively that P(N) is true, so that there exists a function a_N : \{n \in \mathbf N : n \leq N\} \to X such that a_N(0) = c and a_N(n{{+}\!{+}}) = f(n, a_N(n)) for all n < N. We must show that P(N{{+}\!{+}}) is true, i.e. there exists a function a_{N{{+}\!{+}}} : \{n \in \mathbf N : n \leq N{{+}\!{+}}\} \to X such that a_{N{{+}\!{+}}}(0) = c and a_{N{{+}\!{+}}}(n{{+}\!{+}}) = f(n, a_{N{{+}\!{+}}}(n)) for all n < N{{+}\!{+}}. We will define a_{N{{+}\!{+}}} as follows. We define a_{N{{+}\!{+}}}(n) := a_N(n) when n \leq N and a_{N{{+}\!{+}}}(n) = a_{N{{+}\!{+}}}(N{{+}\!{+}}) := f(N, a_N(N)) when n = N{{+}\!{+}}. We verify that this definition satisfies the required properties. We have a_{N{{+}\!{+}}}(0) = a_N(0) = c since 0 \leq N (here we used the inductive hypothesis a_N(0) = c). Now suppose n < N{{+}\!{+}}. Then n{{+}\!{+}} \leq N{{+}\!{+}}, so either n{{+}\!{+}} \leq N or n{{+}\!{+}} = N{{+}\!{+}}.

  • Suppose first that n{{+}\!{+}} \leq N. By definition of a_{N{{+}\!{+}}} we have a_{N{{+}\!{+}}}(n{{+}\!{+}}) = a_N(n{{+}\!{+}}). Also, since n < N, we have by the inductive hypothesis that a_N(n{{+}\!{+}}) = f(n, a_N(n)). Finally because n \leq N, we have a_N(n) = a_{N{{+}\!{+}}}(n) by definition of a_{N{{+}\!{+}}}. Putting these three equalities together, we have a_{N{{+}\!{+}}}(n{{+}\!{+}}) = f(n, a_{N{{+}\!{+}}}(n)), which is what we wanted to show.
  • On the other hand, if n{{+}\!{+}} = N{{+}\!{+}} then by definition of a_{N{{+}\!{+}}} we have a_{N{{+}\!{+}}}(n{{+}\!{+}}) = a_{N{{+}\!{+}}}(N{{+}\!{+}}) = f(N, a_N(N)). Since N \leq N, by definition of a_{N{{+}\!{+}}} we have a_{N{{+}\!{+}}}(N) = a_N(N). Also since n{{+}\!{+}} = N{{+}\!{+}}, by Axiom 2.4 we see that n=N. Putting these equalities together we see that a_{N{{+}\!{+}}}(n{{+}\!{+}}) = f(N, a_{N{{+}\!{+}}}(N)) = f(n, a_{N{{+}\!{+}}}(n)), which is what we wanted.

In either case, we have shown that the desired function a_{N{{+}\!{+}}} exists, so this closes the induction.

Now we show that for each N, the function a_N is unique. Let N be given, and suppose g : \{n \in \mathbf N : n \leq N\} \to X is a function such that g(0) = c and g(n{{+}\!{+}}) = f(n, g(n)) for all n < N. We will show using induction on n that a_N = g, i.e. a_N(n) = g(n) for all n \leq N. For the base case, we have a_N(0) = c = g(0). Now suppose inductively that if n \leq N, then a_N(n) = g(n). We must show that if n{{+}\!{+}} \leq N then a_N(n{{+}\!{+}}) = g(n{{+}\!{+}}). So suppose n{{+}\!{+}} \leq N. We have n \leq n{{+}\!{+}} \leq N so a_N(n) = g(n) by inductive hypothesis. Thus a_N(n{{+}\!{+}}) = f(n, a_N(n)) = f(n, g(n)) = g(n{{+}\!{+}}). This closes the induction.

Now that we have the functions a_N for each natural number N, we can finally define a. We define a(n) := a_n(n) for each natural number n. We verify that a has the desired properties. First, we have a(0) = a_0(0) = c. Now we will use induction. Let Q(n) be the statement “a(n{{+}\!{+}}) = f(n, a(n))”. We will show that Q(n) is true for all n \in \mathbf N. For the base case, we must show that a(1) = f(0, a(0)). By definition of a, we have a(1) = a_1(1). Now a_1 has the property that a_1(n{{+}\!{+}}) = f(n, a_1(n)) for all n < 1 so for n=0 we get a_1(0{{+}\!{+}}) = f(0, a_1(0)). Thus a(1) = a_1(1) = f(0, a_1(0)) = f(0, a(0)), where the last step follows because a_1(0) = c = a(0).

Now suppose inductively that Q(n) is true, i.e. a(n{{+}\!{+}}) = f(n, a(n)). We must show that Q(n{{+}\!{+}}) is true, i.e. a((n{{+}\!{+}}){{+}\!{+}}) = f(n{{+}\!{+}}, a(n{{+}\!{+}})). Starting from the left-hand side, by definition of a, we have a((n{{+}\!{+}}){{+}\!{+}}) = a_{(n{{+}\!{+}}){{+}\!{+}}}((n{{+}\!{+}}){{+}\!{+}}). By definition of a_{(n{{+}\!{+}}){{+}\!{+}}}, we have a_{(n{{+}\!{+}}){{+}\!{+}}}((n{{+}\!{+}}){{+}\!{+}}) = f(n{{+}\!{+}}, a_{(n{{+}\!{+}}){{+}\!{+}}}(n{{+}\!{+}})). Now starting from the right-hand side, we have f(n{{+}\!{+}}, a(n{{+}\!{+}})) = f(n{{+}\!{+}}, a_{n{{+}\!{+}}}(n{{+}\!{+}})) by definition of a. Thus if we can show that a_{(n{{+}\!{+}}){{+}\!{+}}}(n{{+}\!{+}}) = a_{n{{+}\!{+}}}(n{{+}\!{+}}) then we will be done. We invoke the Lemma below with N:=n{{+}\!{+}} and M:=(n{{+}\!{+}}){{+}\!{+}}, which closes the induction.

Lemma. Let N, M be natural numbers such that N \leq M. Then a_N(n) = a_M(n) for all n \leq N.

Proof. We prove this by inducting on n. For the base case, we have a_N(0) = c = a_M(0). Now suppose inductively that if n \leq N then a_N(n) = a_M(n). We want to show that if n{{+}\!{+}} \leq N then a_N(n{{+}\!{+}}) = a_M(n{{+}\!{+}}). So suppose n{{+}\!{+}} \leq N. Then we have n \leq n{{+}\!{+}} \leq N so by inductive hypothesis a_N(n) = a_M(n). Now we have

\begin{aligned} a_N(n{{+}\!{+}}) &= f(n, a_N(n)) \\ &= f(n, a_M(n)) \\ &= a_M(n{{+}\!{+}})\end{aligned}

This closes the induction. \Box

We have shown the existence of a. Our final task is to show the uniqueness of a. Let h : \mathbf N \to X be a function such that h(0) = c and h(n{{+}\!{+}}) = f(n, h(n)) for all n \in \mathbf N. We want to show that a= h, i.e. a(n) = h(n) for all n \in \mathbf N. We prove this by induction on n. For the base case, we have a(0) = c = h(0). Now suppose inductively that a(n) = h(n). We want to show that a(n{{+}\!{+}}) = h(n{{+}\!{+}}). We have a(n{{+}\!{+}}) = f(n, a(n)) = f(n, h(n)) = h(n{{+}\!{+}}), where the middle equality follows by the inductive hypothesis. This closes the induction.

Model solution (challenge version)

First we show by induction that the sets A_N, B_N exist. Let P(N) be the statement “there exists a pair of sets A_N, B_N with the properties (a) through (f), and with the additional property that N \in A_N”.

For the base case, N=0, we will show that A_N := \{0\} and B_N := \mathbf N \setminus \{0\} work. (a), (b) We have A_N \cap B_N = \emptyset and A_N \cup B_N = \mathbf N by definition of set difference. (c) Also, 0 \in A_N and (d) 0{{+}\!{+}} \in B_N since 0{{+}\!{+}} \ne 0. (e) Now suppose n \in B_N. Then n{{+}\!{+}} \ne 0 so n{{+}\!{+}} \in B_N. (f) Now suppose n \in A_N and n \ne 0. But n \in A_N implies n=0 so we have both n=0 and n\ne 0. Thus vacuously we have n{{+}\!{+}} \in A_N. So now we have shown that A_0, B_0 satisfy the properties (a) through (f). The additional property is equivalent to (c) in this case, so we have shown that P(0) is true.

Now suppose inductively that P(N) is true. We want to show that there are sets A_{N{{+}\!{+}}} and B_{N{{+}\!{+}}} that also have properties (a) through (f), and that N{{+}\!{+}} \in A_{N{{+}\!{+}}}. We define A_{N{{+}\!{+}}} := A_N \cup \{N{{+}\!{+}}\} and B_{N{{+}\!{+}}} := B_N \setminus \{N{{+}\!{+}}\}. (a), (b) We have A_{N{{+}\!{+}}} \cap B_{N{{+}\!{+}}} = (A_N \cup \{N{{+}\!{+}}\}) \cap (B_N \setminus \{N{{+}\!{+}}\}) = A_N \cap B_N = \emptyset and A_{N{{+}\!{+}}} \cup B_{N{{+}\!{+}}} = (A_N \cup \{N{{+}\!{+}}\}) \cup (B_N \setminus \{N{{+}\!{+}}\}) = A_N \cup B_N = \mathbf N by inductive hypothesis. (c) Also, 0 \in A_N \subseteq A_N \cup \{N{{+}\!{+}}\} = A_{N{{+}\!{+}}} so 0 \in A_{N{{+}\!{+}}}. (d) Next we have to show (N{{+}\!{+}}){{+}\!{+}} \in B_{N{{+}\!{+}}}. But N{{+}\!{+}} \in B_N by inductive hypothesis, so by property (e) we have (N{{+}\!{+}}){{+}\!{+}} \in B_N. Also (N{{+}\!{+}}){{+}\!{+}} \ne N{{+}\!{+}}, which means that in going from B_N to B_{N{{+}\!{+}}}, we couldn’t have removed (N{{+}\!{+}}){{+}\!{+}}. So (N{{+}\!{+}}){{+}\!{+}} \in B_{N{{+}\!{+}}}. (e) Now suppose n \in B_{N{{+}\!{+}}}. Then n \in B_N so n{{+}\!{+}} \in B_N. So if we can show that n\ne N, then n{{+}\!{+}}\ne N{{+}\!{+}} so n{{+}\!{+}} \in B_{N{{+}\!{+}}} (because only N{{+}\!{+}} was removed in going from B_N to B_{N{{+}\!{+}}}). But if n=N then N \in B_N, which would mean N \notin A_N, which contradicts the additional property in the inductive hypothesis. So n{{+}\!{+}} \in B_{N{{+}\!{+}}}. (f) Now suppose n\in A_{N{{+}\!{+}}} and n\ne N{{+}\!{+}}. We want to show that n{{+}\!{+}} \in A_{N{{+}\!{+}}}. Since A_{N{{+}\!{+}}} = A_N \cup \{N{{+}\!{+}}\}, we have n \in A_N. Now we have two cases. If n\ne N, then by inductive hypothesis we have n{{+}\!{+}} \in A_N \subseteq A_{N{{+}\!{+}}}. On the other hand if n=N then n{{+}\!{+}} = N{{+}\!{+}} \in A_N\cup\{N{{+}\!{+}}\} = A_{N{{+}\!{+}}}. Finally we show the additional property, that N{{+}\!{+}} \in A_{N{{+}\!{+}}}. By definition A_{N{{+}\!{+}}} = A_N \cup \{N{{+}\!{+}}\} so N{{+}\!{+}} \in A_{N{{+}\!{+}}}. This completes the proof that P(N{{+}\!{+}}) is true, which closes the induction.

Now we show that the sets A_N, B_N are unique. Fix N, and let C, D be sets such that properties (a) through (f) hold (with C in place of A_N and D in place of B_N). We want to show that C=A_N and D=B_N. We will show using induction that n \in A_N iff n \in C. The base case follows from property (c). Now suppose inductively that n \in A_N iff n \in C. We must show that n{{+}\!{+}} \in A_N iff n{{+}\!{+}} \in C. Suppose n{{+}\!{+}} \in A_N. Either n=N or n\ne N. If n=N, then n{{+}\!{+}}=N{{+}\!{+}} so n{{+}\!{+}} = N{{+}\!{+}} \in B_N, which means n{{+}\!{+}} \notin A_N, a contradiction. Thus n\ne N. Since n{{+}\!{+}} \in A_N, we have n \in A_N (for if n \in B_N then we would have n{{+}\!{+}} \in B_N by property (e)). So by inductive hypothesis, n \in C. Since n\ne N, by property (f) we thus have n{{+}\!{+}} \in C. The converse is proved in exactly the same way (we only used properties (a) through (f) regarding A_N, B_N, rather than their concrete definitions, so we can just relabel the sets in the above proof). This closes the induction. Since A_N=C, we can use properties (a) and (b) to see that B_N=D.

Now we use the sets A_N, B_N to show that the functions a_N exist. Let Q(N) be the statement “there exists a function a_N : A_N \to X such that a_N(0) = c and a_N(n{{+}\!{+}}) = f(n, a_N(n)) for all n \in A_N \setminus \{N\}”. We will show that Q(N) is true for all natural numbers N using induction. For the base case, N=0, we must show that there is a function a_0 : A_0 \to X such that a_0(0) = c and a_0(n{{+}\!{+}}) = f(n, a_0(n)) for all n \in A_0 \setminus \{0\}. We showed above that A_0 = \{0\} so A_0 \setminus \{0\} is empty. We can define a_0(0) := c. This defines a_0 for all of A_0, and the condition that a_0(n{{+}\!{+}}) = f(n, a_0(n)) for all n \in A_0 \setminus \{0\} is vacuously satisfied, since A_0 \setminus \{0\} is empty.

Now suppose inductively that Q(N) is true, i.e. there is a function a_N : A_N \to X such that a_N(0) = c and a_N(n{{+}\!{+}}) = f(n, a_N(n)) for all n \in A_N \setminus \{N\}. We want to show that Q(N{{+}\!{+}}) is true, i.e. there is a function a_{N{{+}\!{+}}} : A_{N{{+}\!{+}}} \to X such that a_{N{{+}\!{+}}}(0) = c and a_{N{{+}\!{+}}}(n{{+}\!{+}}) = f(n, a_{N{{+}\!{+}}}(n)) for all n \in A_{N{{+}\!{+}}} \setminus \{N{{+}\!{+}}\}.

We define a_{N{{+}\!{+}}}(n) := a_N(n) if n \in A_N and a_{N{{+}\!{+}}}(n) := f(N, a_N(N)) if n = N{{+}\!{+}}. To show that this is a valid definition, we must show that every number in A_{N{{+}\!{+}}} is in exactly one of A_N or \{N{{+}\!{+}}\}. To show this, it suffices to show that A_{N{{+}\!{+}}} \subseteq A_N \cup \{N{{+}\!{+}}\} and A_N \cap \{N{{+}\!{+}}\} = \emptyset. By definition, A_{N{{+}\!{+}}} = A_N \cup \{N{{+}\!{+}}\} so A_{N{{+}\!{+}}} \subseteq A_N \cup \{N{{+}\!{+}}\}. To show that A_N \cap \{N{{+}\!{+}}\} = \emptyset we can show that N{{+}\!{+}} \notin A_N. By property (d), N{{+}\!{+}} \in B_N, so by property (a) N{{+}\!{+}} \notin A_N. So the definition of a_{N{{+}\!{+}}} is valid.

Next we show that a_{N{{+}\!{+}}} has the desired properties. Since 0 \in A_N by property (c), we have a_{N{{+}\!{+}}}(0) = a_N(0) = c by definition of a_{N{{+}\!{+}}} and inductive hypothesis that a_N(0) = c. Now we show that a_{N{{+}\!{+}}}(n{{+}\!{+}}) = f(n, a_{N{{+}\!{+}}}(n)) for all n \in A_{N{{+}\!{+}}} \setminus \{N{{+}\!{+}}\}. So suppose n \in A_{N{{+}\!{+}}} \setminus \{N{{+}\!{+}}\}. We have A_{N{{+}\!{+}}} \setminus \{N{{+}\!{+}}\} = (A_N \cup \{N{{+}\!{+}}\}) \setminus \{N{{+}\!{+}}\} = A_N so n \in A_N. We will now show that either n{{+}\!{+}} \in A_N or n=N. Indeed, if n\ne N, then since n \in A_N we can use property (f) to conclude that n{{+}\!{+}} \in A_N. Thus we have two cases:

  • Suppose first that n{{+}\!{+}} \in A_N. By definition of a_{N{{+}\!{+}}} we have a_{N{{+}\!{+}}}(n{{+}\!{+}}) = a_N(n{{+}\!{+}}). By inductive hypothesis, a_N(n{{+}\!{+}}) = f(n, a_N(n)) for all n \in A_N \setminus \{N\}. If n \in B_N, then by property (e) we would have n{{+}\!{+}} \in B_N, which contradicts property (a). So n \notin B_N, which means n must be in A_N by property (b). If n=N then we would have N{{+}\!{+}} = n{{+}\!{+}} \in A_N, which contradicts property (d). Thus n\ne N. So n \in A_N \setminus \{N\}. This means we can use our inductive hypothesis to conclude that a_N(n{{+}\!{+}}) = f(n, a_N(n)). Finally, because n \in A_N, we have a_{N{{+}\!{+}}}(n) := a_N(n) by definition of a_{N{{+}\!{+}}}. Putting all these equalities together, we have a_{N{{+}\!{+}}}(n{{+}\!{+}}) = a_N(n{{+}\!{+}}) = f(n, a_N(n)) = f(n, a_{N{{+}\!{+}}}(n)), which is what we wanted to show.
  • Now suppose n = N. Then n{{+}\!{+}} = N{{+}\!{+}} so by definition of a_{N{{+}\!{+}}} we have a_{N{{+}\!{+}}}(n{{+}\!{+}}) = f(N, a_N(N)). We have N \in A_N. (This is yet another induction proof. Actually I don’t think we need to prove this since this is the “additional property” that we already proved above. But I wrote this part before I did the earlier part, so I’ll just leave it in. For the base case, 0 \in A_0 by property (c). Now suppose inductively that N \in A_N. We want to show that N{{+}\!{+}} \in A_{N{{+}\!{+}}}. But A_{N{{+}\!{+}}} = A_N \cup \{N{{+}\!{+}}\}, and N\in A_N by inductive hypothesis, so N\in A_{N{{+}\!{+}}}. Since N\ne N{{+}\!{+}}, by property (f) we thus have N{{+}\!{+}} \in A_{N{{+}\!{+}}}.) So by definition of a_{N{{+}\!{+}}} we have a_{N{{+}\!{+}}}(N) = a_N(N). Thus a_{N{{+}\!{+}}}(n{{+}\!{+}}) = f(N, a_N(N)) = f(n, a_{N{{+}\!{+}}}(n)), which is what we wanted.

In either case, we have shown that the desired function a_{N{{+}\!{+}}} exists, so this closes the induction.

Now we want to show that for each N, the function a_N is unique. Fix N, and suppose g : A_N \to X is a function such that g(0) = c and g(n{{+}\!{+}}) = f(n, g(n)) for all n \in A_N \setminus \{N\}. We will show using induction on n that a_N=g, i.e. a_N(n)=g(n) for all n \in A_N. For the base case, a_N(0)=c=g(0). Now suppose inductively that if n \in A_N then a_N(n)=g(n). We must show that if n{{+}\!{+}} \in A_N then a_N(n{{+}\!{+}})=g(n{{+}\!{+}}). So suppose n{{+}\!{+}} \in A_N. We have n\in A_N (for if n \in B_N then n{{+}\!{+}}\in B_N by property (e), a contradiction). Thus a_N(n)=g(n) by inductive hypothesis. Thus a_N(n{{+}\!{+}}) = f(n, a_N(n)) = f(n, g(n)) = g(n{{+}\!{+}}). This closes the induction.

Now that we have the functions a_N for each natural number N, we can finally define a. We define a(n) := a_n(n) for each natural number n. We verify that a has the desired properties. First, we have a(0) = a_0(0) = c. Now we will use induction. Let R(n) be the statement “a(n{{+}\!{+}}) = f(n, a(n))”. We will show that R(n) is true for all n \in \mathbf N. For the base case, we must show that a(1) = f(0, a(0)). By definition of a, we have a(1) = a_1(1). Now a_1 has the property that a_1(n{{+}\!{+}}) = f(n, a_1(n)) for all n \in A_1 \setminus \{1\}; since A_1 = A_0 \cup \{1\} = \{0,1\}, we see that A_1 \setminus \{1\} = \{0\}, so for n=0 we get a_1(0{{+}\!{+}}) = f(0, a_1(0)). Thus a(1) = a_1(1) = f(0, a_1(0)) = f(0, a(0)), where the last step follows because a_1(0)=c=a(0).

Now suppose inductively that R(n) is true, i.e. a(n{{+}\!{+}}) = f(n, a(n)). We must show that R(n{{+}\!{+}}) is true, i.e. a((n{{+}\!{+}}){{+}\!{+}}) = f(n{{+}\!{+}}, a(n{{+}\!{+}})). Starting from the left-hand side, by definition of a, we have a((n{{+}\!{+}}){{+}\!{+}}) = a_{(n{{+}\!{+}}){{+}\!{+}}}((n{{+}\!{+}}){{+}\!{+}}). By definition of a_{(n{{+}\!{+}}){{+}\!{+}}}, we have a_{(n{{+}\!{+}}){{+}\!{+}}}((n{{+}\!{+}}){{+}\!{+}}) = f(n{{+}\!{+}}, a_{(n{{+}\!{+}}){{+}\!{+}}}(n{{+}\!{+}})). Now starting from the right-hand side, we have f(n{{+}\!{+}}, a(n{{+}\!{+}})) = f(n{{+}\!{+}}, a_{n{{+}\!{+}}}(n{{+}\!{+}})) by definition of a. Thus if we can show that a_{(n{{+}\!{+}}){{+}\!{+}}}(n{{+}\!{+}}) = a_{n{{+}\!{+}}}(n{{+}\!{+}}) then we will be done. We apply Lemma 2 below with N:=n{{+}\!{+}} and M:=(n{{+}\!{+}}){{+}\!{+}}. Indeed, N=n{{+}\!{+}} \in A_{(n{{+}\!{+}}){{+}\!{+}}}=A_M since A_{(n{{+}\!{+}}){{+}\!{+}}} = A_n \cup \{n{{+}\!{+}}, (n{{+}\!{+}}){{+}\!{+}}\}. Also, N{{+}\!{+}} \in A_{N{{+}\!{+}}} by the additional property. Thus we can apply Lemma 2, and the induction is closed.

Lemma 1. Let N,M be natural numbers. If N \in A_M then A_N \subseteq A_M.

Proof. We prove this by induction on M. For the base case, we must show that if N \in A_0 then A_N \subseteq A_0. So suppose N \in A_0. But A_0 = \{0\} so this means N=0. So we have A_N = A_0 as required.

Now suppose inductively that if N \in A_M then A_N \subseteq A_M. We must show that if N \in A_{M{{+}\!{+}}} then A_N \subseteq A_{M{{+}\!{+}}}. So suppose N \in A_{M{{+}\!{+}}}. Then A_{M{{+}\!{+}}} = A_M \cup \{M{{+}\!{+}}\} so N \in A_M or N=M{{+}\!{+}}. If N\in A_M then by induction hypothesis we have A_N \subseteq A_M \subseteq A_{M{{+}\!{+}}}. If N=M{{+}\!{+}} then A_N = A_{M{{+}\!{+}}}. So in either case, we have A_N\subseteq A_M. This closes the induction. \Box

Lemma 2. Let N,M be natural numbers such that N \in A_M. Then a_N(n) = a_M(n) for all n \in A_N.

Proof. We prove this by inducting on n. For the base case, we have a_N(0) = c = a_M(0). Now suppose inductively that if n \in A_N then a_N(n) = a_M(n). We want to show that if n{{+}\!{+}} \in A_N then a_N(n{{+}\!{+}}) = a_M(n{{+}\!{+}}). So suppose n{{+}\!{+}} \in A_N. Then by Lemma 1 we have A_{n{{+}\!{+}}} \subseteq A_N. Thus we have n \in A_n \subseteq A_{n{{+}\!{+}}} \subseteq A_N. So n \in A_N, which means by inductive hypothesis we have a_N(n) = a_M(n). Now we have a_N(n{{+}\!{+}}) = f(n, a_N(n)) = f(n, a_M(n)) = a_M(n{{+}\!{+}}). This closes the induction. \Box

We have shown the existence of a. Our final task is to show the uniqueness of a. Let h : \mathbf N \to X be a function such that h(0) = c and h(n{{+}\!{+}}) = f(n, h(n)) for all n \in \mathbf N. We want to show that a=h, i.e. a(n) = h(n) for all n \in \mathbf N. We prove this by induction on n. For the base case, we have a(0)=c=h(0). Now suppose inductively that a(n)=h(n). We want to show that a(n{{+}\!{+}})=h(n{{+}\!{+}}). We have a(n{{+}\!{+}})=f(n,a(n))=f(n,h(n))=h(n{{+}\!{+}}), where the middle equality follows by the inductive hypothesis. This closes the induction.

Exercise 4.4.3

Exercise statement

Fill in the gaps marked (why?) in the proof of Proposition 4.4.4.

Proposition 4.4.4. There does not exist any rational number x for which x^2 = 2.

Hints

None.

How to think about the exercise

When we want to show that a natural number cannot be both even and odd, here’s a mistake you might make: you might say that if n is both even and odd, then there is a natural number k such that n = 2k and n = 2k+1. So 2k=2k+1, which means 0=1, a contradiction. This is wrong! You cannot assume that the k that acts as a “witness” for the evenness property is the same as the k that acts as a “witness” for the oddness property. The most we can say is that 2k = 2r + 1 for some natural numbers k and r (without any assumption that k=r).

Once you have 2k = 2r + 1, you might say k-r = 1/2 > 0 so that k-r is a natural number. Then you might prove by induction that m \ne 1/2 for every natural number m. If you remember back to Chapter 2, this is precisely the example given in the book after Remark 2.1.10. In fact, now that we have defined notions such as “integer”, “half-integer”, and “0.5”, this is actually a valid proof. However in my opinion, this is kind of an “impure” proof. The evenness/oddness properties are defined for natural numbers, and the definitions of these properties mention only natural numbers. This means that any fact we can prove about these properties can sort of be done as if we never knew what integers or rational numbers are. The proof I give below does use subtraction (which we never defined for natural numbers), but if you look closely, it uses only “virtual subtraction” (subtraction where we never get negative numbers) so I consider it a more “pure” proof.

Model solution

Every natural number is either even or odd, but not both: We first show using induction that every natural number is even or odd. For the base case, we see that 0 = 2\times 0 + 0 so 0 is even. Now suppose inductively that n is even or odd. We have to show that n+1 is even or odd. If n is even, then n = 2k for some natural number k. Thus n+1 = 2k+1 so n+1 is odd. On the other hand, if n is odd, then n = 2k+1 for some natural number k. Thus n+1 = 2k+2 = 2(k+1) so n+1 is even. In either case, we have shown that n+1 is even or odd, so this closes the induction.

Now we show that a natural number cannot be both even and odd. Suppose for sake of contradiction that there is some number n which is both even and odd. Thus we have n = 2k for some natural number k and n = 2r + 1 for some natural number r. So 2k = 2r + 1. Now we have two cases, r \geq k or r < k. Suppose first that r \geq k. Then 2k = 2r + 1 implies 0 = 2(r-k) + 1. Since r \geq k, we see that r-k is a natural number. Thus by Corollary 2.2.9 we have 1=0, a contradiction. Now suppose that r < k. Then r+1 \leq k so 1 \leq k - r. Since k-r is positive, 0 < k-r so we can add k-r to both sides to get k-r < 2(k-r). This means we have 1 \leq k-r < 2(k-r) = 1, so 1 < 1, a contradiction. In either case we arrive at a contradiction, which means our assumption that n is both even and odd was false.

If p is odd, then p^2 is also odd: If p is odd, we can write p = 2k + 1 for some natural number k. This means p^2 = (2k+1)^2 = 4k^2 + 4k + 1 = 2(2k^2 + 2k) + 1. Thus for r = 2k^2 + 2k we have p^2 = 2r + 1, which shows that p^2 is odd.

Since p^2 = 2q^2, we have q < p: Suppose for sake of contradiction that q \geq p. Then since p,q are positive, we can multiply on both sides by p to get pq \geq p^2 and multiply by q on both sides to get q^2 \geq pq, so that q^2 \geq pq \geq p^2. Also since q is positive, so is q^2, so q^2>0, which means that we can add q^2 to both sides to get 2q^2 > q^2. Putting these two inequalities together, we have 2q^2 > q^2 \geq p^2, which implies that 2q^2 \ne p^2, a contradiction.

Exercise 4.4.2

Exercise statement

A definition: a sequence a_0, a_1, a_2, \ldots of numbers (natural numbers, integers, rationals, or reals) is said to be in infinite descent if we have a_n > a_{n+1} for all natural numbers n (i.e., a_0 > a_1 > a_2 > \ldots).

(a) Prove the principle of infinite descent: that it is not possible to have a sequence of natural numbers which is in infinite descent.

(b) Does the principle of infinite descent work if the sequence a_1, a_2, a_3, \ldots is allowed to take integer values instead of natural number values? What about if it is allowed to take positive rational values instead of natural numbers? Explain.

Hints

  1. For (a): Assume for sake of contradiction that you can find a sequence of natural numbers which is in infinite descent. Since all the a_n are natural numbers, you know that a_n \geq 0 for all n. Now use induction to show in fact that a_n \geq k for all k \in \mathbf N and all n \in \mathbf N, and obtain a contradiction.

How to think about the exercise

For part (b), here’s one way to think about it: We showed in Exercise 4.1.8 that induction does not work for the integers. Since the suggested proof in part (a) is via induction, this is a hint that the principle of infinite descent does not work for the integers. At the very least, the same proof won’t work, so we’ll need to find a different proof.

Also minor pedantic note: the principle of infinite descent says that it is not possible to find a certain sequence. So if the principle of infinite descent does not work for a certain set of numbers, that means that it is possible to find such a sequence.

Model solution

(a) Suppose for sake of contradiction that a_0 > a_1 > a_2 > \ldots is a sequence of natural numbers which is in infinite descent. We will use induction on k to show that a_n \geq k for all k \in \mathbf N and all n \in \mathbf N. (More formally, we let P(k) be the statement “for each n \in \mathbf N, we have a_n \geq k”, and our goal is to prove that P(k) is true for each natural number k.) For the base case, k=0, we see that a_n \geq 0 for all n, since a_0, a_1, a_2, \ldots are all natural numbers. Now suppose inductively that we have a_n \geq k for all n. We must show that a_n \geq k+1 for all n. Let n be given. Then by the inductive hypothesis, we know that a_n \geq k for all n, so in particular for n+1 we have a_{n+1} \geq k. We also know that a_n > a_{n+1} since the sequence is in infinite descent. Thus we have a_n > k by transitivity of order, which means a_n \geq k+1 by Proposition 2.2.12(e). This closes the induction. Now we will obtain a contradiction. For n=0 and k=a_0+1 in particular, we must have a_0 \geq a_0+1. This means 0\geq 1, a contradiction.

(b) The principle of infinite descent does not work if the sequence is allowed to take integer values. Indeed, define a_n := -n for n = 1, 2, 3, \ldots. Then we have a_{n+1} < a_n since -(n+1) < -n (to show this more rigorously, we can start from 0 < 1, subtract one from both sides to obtain -1 < 0, then add -n to both sides to obtain -n-1 < -n). Since we have found an example of an infinite descent, it is possible to have an infinite descent.

The principle of infinite descent does not work if the sequence is allowed to take positive rational values. Two examples are a_n := 1/n and b_n := 2^{-n} for n = 1,2,3,\ldots. We’ll prove that this is an infinite descent for a_1,a_2,a_3,\ldots. We have n+1 > n, so multiplying through by \frac{1}{n(n+1)} gives 1/n > 1/(n+1), i.e. a_n > a_{n+1}.

Exercise 4.4.1

Exercise statement

Prove Proposition 4.4.1.

Proposition 4.4.1 (Interspersing of integers by rationals). Let x be a rational number. Then there exists an integer n such that n \leq x < n+1. In fact, this integer is unique (i.e., for each x there is only one n for which n \leq x < n+1). In particular, there exists a natural number N such that N > x (i.e., there is no such thing as a rational number which is larger than all the natural numbers).

Hints

  1. Use Proposition 2.3.9.

How to think about the exercise

I think this is a great exercise. It’s an interesting fact about numbers, but also the exercise does a good job of bringing in results about integers and natural numbers (which you might be beginning to forget about).

Model solution

Suppose that x \geq 0. Then we can write x = a/b, where a is a natural number and b is a positive integer. By Proposition 2.3.9 there exist natural numbers m,r such that a = mb + r and 0 \leq r < b. Since 0 \leq r < b, we have mb \leq mb + r < mb + b = (m+1)b. Since b is positive, so is 1/b, and we can multiply through by 1/b to obtain m \leq a/b < m+1. Thus we can take n := m to be the integer we are looking for.

Now suppose x is negative. Then -x is positive so by what we showed in the previous paragraph, we have m \leq -x < m+1 for some integer m. Thus we have -(m+1) < x \leq -m. We now have two cases. If x \ne -m, then we can take n := -(m+1) to be the integer. Indeed, we have n+1 = -m and n \leq x < n+1. On the other hand, if x = -m then we can take n := -m to be the integer. We have n = x < n+1 as required.

Next, we show uniqueness. Suppose for sake of contradiction that we had distinct integers n,m such that n \leq x < n+1 and m \leq x < m+1. Since n\ne m we have either n < m or n > m. We can assume n < m, since if n > m we can just repeat what follows by swapping n and m. So we have n < m \leq x < n+1, which we can simplify to n < m < n+1. We will show that this inequality leads to a contradiction. By definition of order, we have m = n + d and n+1 = m + d' for positive integers d,d'. So we have n+1 = n + d + d', which means 1 = d+d'. Since d is positive, by Lemma 2.2.10 there is a natural number k such that d = k+1. Thus we have 1 = k+1+d', i.e. 0 = k + d'. By Corollary 2.2.9, this means that d' = 0, a contradiction. This contradiction shows that we must have n=m.

To find a natural number N such that N > x, we can split into cases. If x is positive or zero, we have an integer n such that n \leq x < n+1, so we can take N := n+1. Then 0 \leq x < N (in particular, N is a natural number). On the other hand, if x is negative, we can take N:=0. We have x < 0 = N as required.

Exercise 4.3.5

Exercise statement

Prove that 2^N \geq N for all positive integers N.

Hints

  1. Use induction.

How to think about the exercise

If N=0, we have 2^N = 2^0 = 1 > 0 = N. If N is negative, then 2^N is positive, so 2^N > N. So in fact this result holds for all integers, not just the positive integers. Why did Tao make us prove it only for positive N? The two reasons I can think of are: (1) if I remember correctly, this result is used later on in the book, and when we do use it, we only need to know this fact for positive N; (2) the method of proof is different for negative N, and the induction step doesn’t quite work if we start with the base case of N=0 (try it!) so it’s simplest if we only prove this for positive N.

Model solution

We can actually prove something stronger, that 2^N > N for all positive integers N. For the base case, we have 2^1 = 2 > 1. Now suppose inductively that 2^N > N. We want to show that 2^{N+1} > N+1. We have 2^{N+1} = 2\times 2^N > 2\times N Also, since N \geq 1, we can add N to both sides to get 2N \geq N+1. Thus we have 2^{N+1} > 2N \geq N+1, which closes the induction.

Exercise 4.3.4

Exercise statement

Prove Proposition 4.3.12.

Proposition 4.3.12 (Properties of exponentiation, II). Let x,y be non-zero rational numbers, and let n,m be integers.

(a) We have x^n x^m = x^{n+m}, (x^n)^m = x^{nm}, and (xy)^n = x^n y^n.
(b) If x \geq y > 0, then x^n \geq y^n > 0 if n is positive, and 0 < x^n \leq y^n if n is negative.
(c) If x,y > 0, n\ne 0, and x^n=y^n, then x=y.
(d) We have |x^n| = |x|^n.

Hints

  1. Induction is not suitable here. Instead, use Proposition 4.3.10.
  2. Actually, you might want to use induction for small portions of this.

How to think about the exercise

I think this is a pretty tough exercise. The main reason is that it’s easy to accidentally use something that you aren’t supposed to know, so you have to be vigilant about making sure each step really does follow from the set of allowed facts.

Model solution

Throughout this exercise, we will use n,m to denote positive or non-negative integers, and -n,-m to denote negative integers. This means that instead of showing e.g. x^n x^m = x^{n+m} for all integers n,m, we will be showing that x^n x^m = x^{n+m} for non-negative integers n,m and separately x^{-n} x^{-m} = x^{-n-m} for negative integers -n, -m. This is consistent with the notation in Definition 4.3.11 and I find that this visual distinction helps quite a bit.

(a) Addition law for exponents: Suppose first that n and m are both non-negative. Then the result follows from Proposition 4.3.10(a). Next, suppose that both -n and -m are negative. Then we have x^{-n} x^{-m} = \frac1{x^n} \cdot \frac1{x^m} = \frac1{x^n x^m}. Since n, m are positive, x^{n}x^{m} = x^{n+m}. Thus we have \frac1{x^{n}x^{m}} = \frac1{x^{n+m}} = x^{-n-m}. Next suppose n is non-negative and -m is negative. We have two cases, n-m \geq 0 or n-m < 0:

  • Suppose first that n-m \geq 0. Then m is positive, so by Proposition 4.3.10(a) we have x^{n-m} x^{m} = x^n. Thus we have x^n x^{-m} = x^{n-m} x^{m} x^{-m} = x^{n-m} x^{m} /x^{m} = x^{n-m}.
  • Suppose instead that n-m < 0. Then -n+m > 0 and since n \geq 0, we can use Proposition 4.3.10(a) to get x^{-n+m} x^n = x^{m}. By definition of exponentiation, x^{n-m} = \frac1{x^{-n+m}}. If we multiply both sides by \frac1{x^n} we obtain x^{n-m} \cdot \frac1{x^n} = \frac1{x^{-n+m}} \cdot \frac1{x^n}. Simplifying the right side, we have \frac1{x^{-n+m}} \cdot \frac1{x^n} = \frac1{x^{-n+m} x^n} = \frac1{x^{m}}. But x^{-m} = 1/x^{m} so we have x^{n-m} \cdot \frac1{x^n} = x^{-m}. So multiplying through by x^n we thus get x^{n-m} = x^n x^{-m}.

The final case when -n is negative and m is non-negative is analogous to the previous case, but we could also note that x^{-n} x^m = x^m x^{-n} and x^{-n+m} = x^{m-n} by the commutativity of multiplication and addition. By the previous case, we know that x^m x^{-n} = x^{m-n} so the result follows.

Multiplication law for exponents: First we show using induction that 1/y^n = (1/y)^n for any natural number n. We already know that y^{-n} = 1/y^n = (y^n)^{-1} and (1/y)^n = (y^{-1})^n, so this will show that all of these expressions are equal. For the base case, we have 1/y^0 = 1/1 = 1 = (1/y)^0. Now suppose inductively that 1/y^n = (1/y)^n. We must show that 1/y^{n+1} = (1/y)^{n+1}. We have 1/y^{n+1} = 1/(y^n y) = (1/y^n)(1/y) = (1/y)^n (1/y) = (1/y)^{n+1}. This closes the induction.

First, suppose n \geq 0 and m \geq 0. Then the result follows from Proposition 4.3.10(a). Next suppose -n < 0 and -m < 0. We must show (x^{-n})^{-m} = x^{(-n)(-m)}. We have

\displaystyle (x^{-n})^{-m} = \frac{1}{(\frac{1}{x^n})^m} = \frac{1}{\frac{1}{(x^n)^m}} = \frac{1}{\frac{1}{x^{nm}}} = x^{nm} = x^{(-n)(-m)}

Next suppose n \geq 0 and -m < 0. We have (x^n)^{-m} = \frac{1}{(x^n)^m} = \frac{1}{x^{nm}} = x^{-nm} = x^{n(-m)}.

Finally suppose -n < 0 and m \geq 0. We have (x^{-n})^m = (\frac1{x^n})^m = \frac{1}{(x^{n})^m} = x^{-nm} = x^{(-n)m}.

Distributing exponents: If n \geq 0, the result follows from Proposition 4.3.10(a). If -n < 0, then (xy)^{-n} = \frac1{(xy)^n} = \frac1{x^n y^n} = \frac1{x^n}\cdot \frac1{y^n} = x^{-n} y^{-n}, which proves the result.

(b) Suppose x \geq y > 0. First suppose n is positive. By Proposition 4.3.10(c) we have x^n \geq y^n \geq 0, so we just have to show that y^n \ne 0. By Proposition 4.3.10(b), y^n =0 if and only if y =0. Since y > 0, we conclude that y^n \ne 0.

Now suppose that -n is negative. We must show that 0 < x^{-n} \leq y^{-n}. Since x,y are positive, \frac1{xy} is positive. We can thus multiply on both sides of x \geq y to obtain 1/y = x \cdot \frac1{xy} \geq y \cdot \frac1{xy} = 1/x > 0. Applying the previous case, we thus obtain (1/y)^n \geq (1/x)^n > 0. But (1/y)^n = y^{-n} and (1/x)^n = x^{-n} (see part (a) for proof) so the result follows.

(c) Let x,y > 0 be given. First suppose that n > 0, and suppose x^n = y^n.  Suppose for sake of contradiction that x\ne y. Then either x < y or x > y. In the former case, x^n < y^n by Proposition 4.3.10(c), which contradicts the fact that x^n=y^n. Similarly, if x > y we obtain a contradiction. Thus the only option left is x=y. Now suppose -n < 0, and suppose that x^{-n} = y^{-n}. By multiplying on both sides by x^n y^n we obtain x^n=y^n. Suppose for sake of contradiction that x \ne y. Then either x < y or x > y. In either case, we see that x^n \ne y^n, a contradiction. Thus we conclude that x=y.

(d) If n \geq 0, this follows from Proposition 4.3.10(d). So suppose -n < 0. Then |x^{-n}| = |(1/x)^n| = |1/x|^n = (1/|x|)^n = 1/|x|^n = |x|^{-n}. The first equality follows from part (a), the second equality follows from Proposition 4.3.10(d), the third equality follows from an easy proof by cases (just show that -(1/x) = 1/(-x)), the fourth equality follows from part (a) again, and the fifth equality follows from the definition of exponentiation.

Exercise 4.3.3

Exercise statement

Prove Proposition 4.3.10.

Proposition 4.3.10 (Properties of exponentiation, I). Let x,y be rational numbers, and let n,m be natural numbers.

(a) We have x^n x^m = x^{n+m}, (x^n)^m = x^{nm}, and (xy)^n = x^n y^n.
(b) Suppose n > 0. Then we have x^n = 0 if and only if x = 0.
(c) If x \geq y \geq 0, then x^n \geq y^n \geq 0. If x > y \geq 0 and n > 0, then x^n > y^n \geq 0.
(d) We have |x^n| = |x|^n.

Hints

  1. Use induction.

How to think about the exercise

This is a pretty straightforward exercise.

Model solution

(a) First we show x^n x^m = x^{n+m}. We fix m and induct on n. For the case n=0, we must show x^0 x^m = x^{0+m}. But x^0 x^m = 1 \times x^m = x^m = x^{0+m}. Now suppose inductively that x^n x^m = x^{n+m}. We must show that x^{n+1} x^m = x^{(n+1)+m}. We have x^{n+1} x^m = x^n \times x \times x^m =(x^n \times x^m) \times x. By the inductive hypothesis, x^n x^m = x^{n+m}, so this is equal to x^{n+m} \times x = x^{n+m+1} = x^{(n+1)+m}. This closes the induction.

Next we show that (x^n)^m = x^{nm}. To do this, we first show that 1^m = 1 for every natural number m. Indeed we can induct on m. For the base case, 1^0=1 by definition of exponentiation, and if we suppose inductively that 1^m=1 then we have 1^{m+1} = 1^m\times 1 = 1\times 1 = 1, which closes the induction. Now we actually prove the result. We fix m and induct on n. For the case n=0, we must show (x^0)^m = x^{0m}. But (x^0)^m = 1^m=1 and x^{0m} = x^0 = 1. For the inductive case, suppose (x^n)^m = x^{nm}. We must show (x^{n+1})^m = x^{(n+1)m}. We have

\begin{aligned} (x^{n+1})^m &= (x^n\times x)^m \\ &= (x^n)^m x^m \\ &= x^{nm}x^m \\ &= x^{nm+m} \\ &= x^{(n+1)m}\end{aligned}

Here the second equality follows from the third result in part (a), which we prove next (there is no circularity since that result does not rely on this one), the third equality uses the inductive hypothesis, and the fourth equality follows from the first result in part (a).

Finally, we show (xy)^n = x^n y^n. We induct on n. For the base case, we have (xy)^0 = 1 = 1\times 1 = x^0 y^0. Now suppose inductively that (xy)^n = x^n y^n. We must show that (xy)^{n+1} = x^{n+1} y^{n+1}. We have (xy)^{n+1} = (xy)^n \times xy = x^n y^n\times xy = x^n\times x \times y^n \times y = x^{n+1} y^{n+1}, where the second equality uses the induction hypothesis.

(b) We can induct on n, starting at n=1. For the base case, we must show that x^1 = 0 if and only if x=0. But this is true since x^1 = x. Now suppose inductively that x^n=0 if and only if x=0. We must show that x^{n+1}=0 if and only if x=0. Suppose first that x^{n+1}=0. Then x^n \times x = 0 so either x^n=0 or x=0 (for if both x^n \ne 0 and x\ne0 then by Proposition 4.2.4 we have x^n(x^n)^{-1} = 1 and xx^{-1}=1 so 1 = x^n(x^n)^{-1} xx^{-1} = (x^n\times x) (x^n)^{-1} x^{-1} = 0, a contradiction). In the latter case we’re done. In the former case, by the inductive hypothesis we see that x=0. Conversely, suppose that x=0. Then x^{n+1} = 0^n \times 0 = 0. This closes the induction.

(If starting the induction at n=1 makes you nervous, you can start at n=0 using the statement “if n > 0, then x^n = 0 if and only if x = 0”. The base case is then vacuous, and you will need to split into cases depending on n=0 or n>0 in the inductive case.)

(c) Suppose x\geq y \geq 0. We want to show x^n \geq y^n \geq 0. We will induct on n. For the base case, we must show x^0 \geq y^0 \geq 0. But this is true, since x^0 = y^0 = 1 \geq 0. Now suppose inductively that x^n \geq y^n \geq 0. We can multiply x \geq y \geq 0 by x^n (which is non-negative by induction hypothesis) to get x^n \times x \geq x^n \times y \geq 0. Similarly, we can multiply x^n \geq y^n \geq 0 by y to get x^n \times y \geq y^n \times y \geq 0. Thus we get x^n \times x \geq x^n \times y\geq y^n \times y \geq 0. Now use x^n \times x = x^{n+1} and y\geq y^n = y^{n+1} to complete the proof.

The version for strict inequality is proved in the same way, except that the induction will start at n=1.

(d) We will induct on n. For the case n=0, we want to show |x^0|=|x|^0. But |x^0| = |1| = 1 = |x|^0. Now suppose inductively that |x^n| = |x|^n. We want to show that |x^{n+1}| = |x|^{n+1}. We have |x^{n+1}| = |x^n \times x| = |x^n|\times |x| = |x|^n \times |x| = |x|^{n+1}. Here, the second equality follows from Proposition 4.3.3(d) and the third equality follows from the inductive hypothesis.

Exercise 4.3.2

Exercise statement

Prove the remaining claims in Proposition 4.3.7.

Proposition 4.3.7. Let x,y,z,w be rational numbers.

(a) If x=y, then x is \varepsilon-close to y for every \varepsilon > 0. Conversely, if x is \varepsilon-close to y for every \varepsilon > 0, then we have x=y.
(b) Let \varepsilon > 0. If x is \varepsilon-close to y, then y is \varepsilon-close to x.
(c) Let \varepsilon, \delta > 0. If x is \varepsilon-close to y, and y is \delta-close to z, then x and z are (\varepsilon+\delta)-close.
(d) Let \varepsilon,\delta > 0. If x and y are \varepsilon-close, and z and w are \delta-close, then x+z and y+w are (\varepsilon+\delta)-close, and x-z and y-w are also (\varepsilon+\delta)-close.
(e) Let \varepsilon > 0. If x and y are \varepsilon-close, they are also \varepsilon'-close for every \varepsilon' > \varepsilon.
(f) Let \varepsilon > 0. If y and z are both \varepsilon-close to x, and w is between y and z (i.e., y \leq w \leq z or z \leq w \leq y), then w is also \varepsilon-close to x.
(g) Let \varepsilon > 0. If x and y are \varepsilon-close, and z is non-zero, then xz and yz are \varepsilon|z|-close.
(h) Let \varepsilon, \delta > 0. If x and y are \varepsilon-close, and z and w are \delta-close, then xz and yw are (\varepsilon|z| + \delta|x| + \varepsilon\delta)-close.

Hints

  1. You may want to use Proposition 4.3.3.

How to think about the exercise

Part (e) of this proposition justifies why we should care about small values of \varepsilon in analysis. Since the property is satisfied automatically for all large values of \varepsilon, it is only with the small values that we are in danger of not satisfying the property.

I don’t have much to say about the exercise itself. There are a lot of parts, but each part is straightforward.

Model solution

(a) If x=y, then x-y=0 so d(x,y) = |x-y| = 0 \leq \varepsilon.

Conversely, suppose x and y are \varepsilon-close for every \varepsilon > 0. Suppose for sake of contradiction that x \ne y. Then x-y \ne 0 so |x-y| is a positive number. Since x and y are \varepsilon-close for every \varepsilon > 0, in particular for \varepsilon := |x-y|/2 we must have |x-y| \leq |x-y|/2. We can add -|x-y|/2 to both sides to get |x-y|/2 \leq 0, a contradiction.

(b) This follows from Proposition 4.3.3(f), the symmetry of distance.

(c) We have d(x,y) \leq \varepsilon and d(y,z) \leq \delta. By the triangle inequality, we have d(x,z) \leq d(x,y) + d(y,z) \leq \varepsilon + \delta.

(d) We have d(x,y) \leq \varepsilon and d(z,w) \leq \delta. Using the triangle inequality, we have

\begin{aligned} d(x+z, y+w) &= |(x+z) - (y+w)| \\ &= |(x-y) + (z-w)| \\ &\leq |x-y| + |z-w| \\ &= d(x,y) + d(z,w) \\ &\leq \varepsilon + \delta\end{aligned}

Similarly we have

\begin{aligned} d(x-z, y-w) &= |(x-z) - (y-w)| \\ &= |(x-y) + (w-z)| \\ &\leq |x-y| + |w-z| \\ &= d(x,y) + d(w,z) \\ &\leq \varepsilon + \delta\end{aligned}

The only extra step here is to use the symmetry of distance, d(w,z)=d(z,w).

(e) We have d(x,y) \leq \varepsilon < \varepsilon' so the result follows from the transitivity of order.

(f) Suppose first that y \leq w \leq z. Subtract x from each expression to get y-x \leq w-x \leq z-x. Since |y-x| \leq \varepsilon and |z-x| \leq \varepsilon, we can use Proposition 4.3.3(c) to get -\varepsilon \leq y-x \leq \varepsilon and -\varepsilon \leq z-x \leq \varepsilon. So summarizing all the inequalities, we have -\varepsilon \leq y-x \leq w-x \leq z-x \leq \varepsilon which means that -\varepsilon \leq w-x \leq \varepsilon, so |w-x| \leq \varepsilon by Proposition 4.3.3(c) again.

The case for z \leq w \leq y is similar.

(g) We have |x-y| \leq \varepsilon. Since z is non-zero, we can multiply by |z| > 0 on both sides to get |x-y||z| \leq \varepsilon |z|. But |x-y||z| = |(x-y)z| = |xz-yz| by Proposition 4.3.3(d). Thus |xz-yz| \leq \varepsilon|z|, which means that xz and yz are \varepsilon|z|-close.

(h) This was already shown in the book.

Exercise 4.3.1

Exercise statement

Prove Proposition 4.3.3.

Proposition 4.3.3 (Basic properties of absolute value and distance). Let x,y,z be rational numbers.

(a) (Non-degeneracy of absolute value) We have |x| \geq 0. Also, |x| = 0 if and only if x is 0.
(b) (Triangle inequality for absolute value) We have |x+y| \leq |x| + |y|.
(c) We have the inequalities -y \leq x \leq y if and only if y \geq |x|. In particular, we have -|x| \leq x \leq |x|.
(d) (Multiplicativity of absolute value) We have |xy| = |x| |y|. In particular, |{-x}| = |x|.
(e) (Non-degeneracy of distance) We have d(x,y) \geq 0. Also, d(x,y) = 0 if and only if x=y.
(f) (Symmetry of distance) d(x,y) = d(y,x).
(g) (Triangle inequality for distance) d(x,z) \leq d(x,y) + d(y,z).

Hints

  1. While all of these claims can be proven by dividing into cases, such as when x is positive, negative, or zero, several parts of the proposition can be proven without such a tedious division into cases. For instance one can use earlier parts of the proposition to prove later ones.
  2. You can also prove some of the later parts first, to help with proving the earlier parts. This is fine to do as long as you don’t then use the earlier part to prove the later part (which would be circular).

How to think about the exercise

I want to prove a few supplementary facts here that will be useful in the proofs below. I think proving these will make some of the proofs below conceptually cleaner. It’s totally possible to do this exercise without proving any of these supplementary facts.

In two of the proofs for the exercise (parts (b) and (d)) I prove a square version first, and use it to conclude the non-square version. I wish I had an explanation of why this works, but I don’t. It’s a simple and standard trick, but I don’t know why it should work.

Fact 1. Let x be a rational number. Then |x|^2 = x^2.

Proof. If x is positive or zero, this is true by definition of absolute value. If x is negative, then we have |x|^2 = (-x)(-x) = x^2. \Box

Fact 2. Let x,y,z be rational numbers. If x \leq y and z is non-negative, then xz \leq yz.

Proof. We have several cases. If x < y and z is positive, then this follows from Proposition 4.2.9(e). If z=0, then xz = 0 \leq 0 = yz. If x = y, then xz = yz so xz \leq yz. \Box

Fact 3. Let x,y be non-negative rational numbers. If x^2 = y^2, then x = y.

Proof. Suppose first that x=0 or y=0. If x=0 then 0=x^2 =y^2. Thus y^2=0 so y cannot be positive or negative. (For if y > 0 then by Proposition 4.2.9(e) we could multiply by y on both sides to obtain y^2 > 0, a contradiction. Similarly for y < 0.) So by trichotomy we see that y=0. Next, if y=0, then x^2 = y^2 = 0 so we have x=0. In either case, x=y.

Now suppose both x and y are positive. Suppose for sake of contradiction that x\ne y. Then by order trichotomy we have two cases, x<y or x>y. If x < y then x^2 < y^2, a contradiction. And if x>y then x^2 > y^2, which again is a contradiction. Thus we are left to conclude that x=y. \Box

Fact 4. Let x,y be non-negative rational numbers. If x^2 \leq y^2, then x \leq y.

Proof. We prove the contrapositive. Suppose x > y. We want to show that x^2 > y^2. Since x > y, we must have x \geq y and x\ne y. By Fact 2, we have x^2 \geq y^2. If x^2=y^2 then by Fact 3 we would have x=y, which contradicts the fact that x\ne y. Thus we must have x^2 \geq y^2 and x^2\ne y^2, i.e. x^2 > y^2. \Box

Finally, a psychological note: I found this to be the hardest post to write so far. I wrote parts of it one day, then gave up, then wrote some other parts on a second day, then put it down again, and came back to it on a third day and didn’t do much, put it down, and finally came back to finish it on a fourth day. For some reason, it felt very annoying to type everything up. I don’t think I got this feeling when working through the book myself, originally.

Model solution

(a) By order trichotomy, we have three cases: x > 0, x=0, and x < 0. If x>0 then x-0 = x is a positive number, so |x| = x \geq 0. If x=0 then |x| = 0 \geq 0. If x < 0 then x-0 = x is negative, so |x| = -x is positive, hence -x-0 is positive which means we have -x > 0. Therefore, |x| = -x \geq 0.

Now suppose |x|=0. If x were positive or negative, the absolute value would be a positive number. So by trichotomy, we see that x must be zero. Conversely, if x=0 then by definition of absolute value we have |x|=0.

(b) There are several ways to prove this. Here are two:

  1. We can split into cases depending on the value of x+y. If x+y=0 then |x+y| = 0 \leq |x|+|y| since by part (a) we know that both |x| \geq 0 and |y| \geq 0. If x+y is positive then |x+y| = x+y \leq |x|+|y| by part (c) (there is no risk of circularity since the proof of part (c) does not require part (b)). If x+y is negative then |x+y| = (-x) + (-y) \leq |{-x}| + |{-y}| = |x|+|y| by parts (c) and (d) (again there is no circularity since the proof of (d) does not require (b)).
  2. By Fact 4, it suffices to show that |x+y|^2 \leq (|x|+|y|)^2. We have (|x|+|y|)^2 = |x|^2 + 2|x||y| + |y|^2 by Exercise 2.3.4 (more precisely, the version of this result for rational numbers, which can be proved in exactly the same way). By parts (c) and (d) we have 2|x||y| =2|xy| \geq 2xy. Thus we have (|x|+|y|)^2 \geq x^2 + 2xy + y^2 = (x+y)^2 = |x+y|^2, where we have used Fact 1.

(c) Suppose -y \leq x \leq y. If x \geq 0 then |x| = x so we have |x| = x \leq y. If x < 0 then we have |x| = -x. Using -y \leq x and a slight modification of Exercise 4.2.6, we have y \geq -x = |x|.

Now suppose y \geq |x|. If x \geq 0 then y \geq |x| = x \geq 0. This means y\geq 0, so -y \leq 0. Thus we have y \geq x \geq 0 \geq -y. If x < 0 then |x| = -x. Thus we have y \geq |x| = -x > 0. This means y is positive, so -y is negative. So we have y \geq -x \geq -y, which means -y \leq x \leq y.

To show that -|x| \leq x \leq |x|, take y:=|x|. Then we have |x|\geq |x| so by the above result we must have -|x| \leq x \leq |x|.

(d) Here are two ways to prove this:

  1. We can prove by cases on the values of x and y. For instance, if x is negative and y is positive, then xy is negative so |xy| = -xy = |x||y|. The other cases can be shown in a similar way, and are omitted.
  2. Both |xy| and |x||y| are non-negative by part (a), so by Fact 3 it suffices to show that |xy|^2 = (|x||y|)^2. We have |xy|^2 = (xy)^2 = x^2y^2 = |x|^2 |y|^2 = (|x||y|)^2, where we have used Fact 1.

To show |{-x}| = |x|, take y:=-1.

(e) By part (a), d(x,y) = |x-y| \geq 0.

By part (a), we have |x-y| = 0 if and only if x-y = 0. But we know that |x-y| = d(x,y) and that x-y=0 iff x=y. Thus d(x,y) = 0 if and only if x=y.

(f) By part (d), d(x,y) = |x-y| = |{-(x-y)}| = |y-x| = d(y,x).

(g) By part (b) using x-y and y-z as inputs, we have |(x-y) + (y-z)| \leq |x-y| + |y-z|. But |(x-y) + (y-z)| = |x-z| = d(x,z) and |x-y| + |y-z| = d(x,y) + d(y,z) so the result follows.

Exercise 4.2.6

Exercise statement

Show that if x,y,z are rational numbers such that x < y and z is negative, then xz > yz.

Hints

None.

How to think about the exercise

This is a simple exercise, so I don’t have anything to say.

Model solution 1

Suppose x < y. By definition of order, this means that x-y is a negative number. Since z is also negative, we have x-y = (-a)/b and z=(-c)/d for positive rational numbers a,b,c,d. Thus xz - yz = (x-y)z = (ac)/(bd) is a positive rational, so xz > yz as required.

Model solution 2

Suppose x < y. By Proposition 4.2.9(d) we can add -y to both sides to obtain x-y < 0. Again by Proposition 4.2.9(d) we can add -x to both sides to obtain -y < -x. Since z is a negative rational, -z is a positive rational. Thus by Proposition 4.2.9(e) we have (-y)(-z) < (-x)(-z), i.e. yz < xz. By Proposition 4.2.9(b) we have xz > yz as required.