Category Archives: Solutions

Exercise 9.1.13

July 28, 2025 Issa Rice Leave a comment

Exercise statement

Prove Theorem 9.1.24.

Theorem 9.1.24 (Heine-Borel theorem for the line). Let $X$ be a subset of $\mathbf R$ . Then the following two statements are equivalent:

(a) $X$ is closed and bounded.

(b) Given any sequence $(a_n)_{n=0}^\infty$ of real numbers which takes values in $X$ (i.e., $a_n \in X$ for all $n$ ), there exits a subsequence $(a_{n_j})_{j=0}^\infty$ of the original sequence, which converges to some number $L$ in $X$ .

Hints

To show (a) implies (b), use the Bolzano-Weierstrass theorem (Theorem 6.6.8) and Corollary 9.1.17.
To show (b) implies (a), argue by contradiction, using Corollary 9.1.17 to establish that $X$ is closed (note by Issa: actually I don’t think proof by contradiction is needed here). You will need the axiom of choice to show that $X$ is bounded, as in Lemma 8.4.5. (Note by Issa: the axiom of choice is also not directly needed for this part, but proving this without direct use of the axiom of choice requires first showing that $X$ is closed, and showing that $X$ is closed implicitly requires the axiom of choice because it depends on Lemma 9.1.14.)

How to think about the exercise

This is a fairly simple exercise so I don’t have much to say.

Model solution 1

First we will show that (a) implies (b). Suppose $X$ is closed and bounded. Let $(a_n)_{n=0}^\infty$ be a sequence which takes values in $X$ . Since $X$ is bounded, this means that there exists some real number $M > 0$ such that $X \subseteq [-M, M]$ . Since $a_n \in X$ for each $n$ , this means that $a_n \in [-M, M]$ for each $n$ as well. But this means $-M \leq a_n \leq M$ , i.e., $|a_n| \leq M$ for each $n$ . So $(a_n)_{n=0}^\infty$ is a bounded sequence. Thus by the Bolzano-Weierstrass theorem (Theorem 6.6.8), there exists a subsequence $(a_{n_j})_{j=0}^\infty$ which converges to some limit $L \in \mathbf R$ . The only thing left to show now is that $L \in X$ . By assumption $X$ is closed, and since $(a_n)_{n=0}^\infty$ is a sequence in $X$ this means that $(a_{n_j})_{j=0}^\infty$ is also a sequence in $X$ . Thus we can apply Corollary 9.1.17 to conclude that $L = \lim_{j\to\infty} a_{n_j} \in X$ as desired.

Now we prove the converse, that (b) implies (a). Suppose that given any sequence $(a_n)_{n=0}^\infty$ in $X$ , there exits a subsequence $(a_{n_j})_{j=0}^\infty$ which converges to some number $L\in X$ . Our goal is to show that $X$ is both closed and bounded.

We start by showing that $X$ is closed. By Corollary 9.1.17, it suffices to show that every convergent sequence $(a_n)_{n=0}^\infty$ of elements in $X$ has its limit in $X$ . So let $(a_n)_{n=0}^\infty$ be a convergent sequence consisting of elements in $X$ . Call the limit $L$ . Our goal is to show that $L \in X$ . By our assumption (b), there exists a subsequence $(a_{n_j})_{j=0}^\infty$ which converges to some number $L' \in X$ . But now by Proposition 6.6.5, since $(a_n)_{n=0}^\infty$ converges to $L$ , we know that its subsequence $(a_{n_j})_{j=0}^\infty$ must also converge to $L$ , so in fact $L = L'$ , and thus $L \in X$ as required.

Finally, we show that $X$ is bounded. Suppose for the sake of contradiction that $X$ is not bounded. Thus for each natural number $n$ , there exists some element $x_n \in X$ such that $|x_n| > n$ . More formally, for each natural number $n$ , we are choosing some element $x_n$ from the set $\{ x \in X : |x| > n \}$ (which is non-empty for each $n$ because $X$ is not bounded) in order to define a sequence $(x_n)_{n=0}^\infty$ ; making infinitely many choices like this requires the axiom of choice. By our assumption (b), there exists some subsequence $(x_{n_j})_{j=0}^\infty$ which converges to some number $L \in X$ . But $|x_{n_j}| > n_j \geq j$ for each $j$ so $(x_{n_j})_{j=0}^\infty$ is not bounded, and unbounded sequences cannot be convergent (Corollary 6.1.17). We have thus shown that $(x_{n_j})_{j=0}^\infty$ is both convergent and not convergent, a contradiction. This contradiction shows that $X$ must in fact be bounded.

Model solution 2

Here we prove just the fact that (b) implies $X$ is bounded, without directly using the axiom of choice. Thanks to Leshuri and The Royal Group for help with this proof.

From Model solution 1, we know that $X$ is closed; note that proving this fact implicitly used the axiom of choice because it depended on Corollary 9.1.17, which depends on Lemma 9.1.14. Lemma 9.1.14 requires the axiom of choice. Thanks to The Royal Group for pointing this out to me. However, in the rest of the proof below, we won’t make any new appeals to the axiom of choice. So you could view the proof below as an axiom-of-choice-free proof that property (b) (which is called sequential compactness) together with the assumption that $X$ is closed implies that $X$ is bounded. See also Remark 8.4.6.

To make the proof easier to understand, we first prove a lemma that $X$ is bounded if and only if $X$ is both bounded above and bounded below. More formally, we will show that if $X$ is bounded, then there exists some $M > 0$ such that $x \leq M$ for all $x \in X$ , and also that there exists some $N < 0$ such that $N \leq x$ for all $x \in X$ .

If $X$ is bounded, then there is some $M > 0$ such that $-M \leq x \leq M$ , so $M$ witnesses the fact that $X$ is bounded above, and $-M$ witnesses the fact that $X$ is bounded below. Conversely, if $X$ is bounded above, say by $M > 0$ , and $X$ is bounded below, say by $N < 0$ , then we can consider the bound $M - N$ . Then for all $x \in X$ we have $x \leq M < M - N$ and $-(M - N) = -M + N < N \leq x$ . Thus $M - N$ bounds the set $X$ .

With that lemma out of the way, we will now show that $X$ is bounded. Suppose for the sake of contradiction that $X$ is not bounded. By the lemma above, $X$ is either not bounded above or not bounded below. We will show the case when $X$ is not bounded above; the case when $X$ is not bounded below is very similar.

Since $X$ is not bounded above, the set $X_n := \{ x \in X : x > n \}$ is non-empty for each natural number $n$ . Define $x_n := \inf(X_n)$ . Notice that we did not choose from the set $X_n$ , rather we simply constructed the exact $x_n$ we needed, so we did not use the axiom of choice in constructing the sequence $(x_n)_{n=0}^\infty$ .

We want to show that each $x_n$ lies in $X$ . First we show that $x_n$ is an adherent point of $X_n$ . Let $\varepsilon > 0$ be given. Since $x_n$ is the greatest lower bound of $X_n$ , we see that $x_n - \varepsilon$ is not a lower bound of $X_n$ , so there exists some $x \in X_n$ such that $x_n \leq x < x_n + \varepsilon$ . So subtracting $x_n$ from each side, this means $-\varepsilon < 0 \leq x - x_n < \varepsilon$ so $|x - x_n| < \varepsilon$ . But this means $x_n$ is $\varepsilon$ -adherent to $X_n$ , and since $\varepsilon$ was arbitrary, this means that $x_n$ is an adherent point of $X_n$ . In fact, we have proved something more. Since $X_n \subseteq X$ , the $x \in X_n$ which witnessed the fact that $x_n$ is an adherent point $X_n$ is also an element of $X$ . So in fact, $x_n$ is an adherent point of $X$ as well. But $X$ was already shown to be a closed set, so by Definition 9.1.15, we see that $x_n \in X$ .

The hard part is now done. By our assumption (b), there exists some subsequence $(x_{n_j})_{j=0}^\infty$ which converges to some number $L \in X$ . But $x_{n_j} \geq n_j \geq j$ for each $j$ so $(x_{n_j})_{j=0}^\infty$ is not bounded, and unbounded sequences cannot be convergent (Corollary 6.1.17). We have thus shown that $(x_{n_j})_{j=0}^\infty$ is both convergent and not convergent, a contradiction. This contradiction shows that $X$ must in fact be bounded.

Solutions

Exercise 3.1.8

June 20, 2025 Issa Rice Leave a comment

Exercise statement

Let $A,B$ be sets. Prove the absorption laws $A \cap (A \cup B) = A$ and $A \cup (A \cap B) = A$ .

Hints

What are the ways to show that two sets are equal?

How to think about the exercise

Let’s take the first law, $A \cap (A \cup B) = A$ . It would be simple enough to prove this by splitting into two statements, $A \cap (A \cup B) \subseteq A$ and $A \cap (A \cup B) \supseteq A$ . And then to say, for the first one, let $x \in A \cap (A \cup B)$ . Since $x \in A \cap (A \cup B)$ , we have $x \in A$ and $x \in A \cup B$ . But $x \in A$ is what we wanted, so we are done, and we have $A \cap (A \cup B) \subseteq A$ . Then we do the other side. Then we do the other equality. It’s kind of a tedious exercise. Is there a faster way, perhaps by direct algebraic manipulation?

Well, one thing to notice is that $A \cap (A \cup B)$ looks expandable, using Proposition 3.1.27 (in the 4th edition; it’s Proposition 3.1.28 in the 3rd edition of the book). In particular, using distributivity we get $A \cap (A \cup B) = (A \cap A) \cup (A \cap B)$ . Now using the identity property ( $A \cap A = A$ ) from the same Proposition, we can write this as $A \cup (A \cap B)$ . Now look at what we have: that’s just the left-hand side of the second absorption law! If we expand this using distributivity and then simplify using identity again, we get $A \cup (A \cap B) = (A \cup A) \cap (A \cup B) = A \cap (A \cup B)$ . Uh oh, we ended up right where we started. So this approach doesn’t get us all the way to a proof, but at least we can prove one of the absorption laws in terms of the other, which does cut down on about half of the tedium!

You might wonder how to know ahead of time whether it is possible to prove a set equality like this using algebra versus whether it is impossible and you need to go back to the equality of sets axiom (Axiom 3.2 in the 4th edition; Definition 3.1.4 in 3rd edition). I don’t actually know the answer to this. If anyone knows something about this, please write to me in the comments.

Model solution

Let’s first consider the second absorption law, $A \cup (A \cap B) = A$ . By the distributivity and identity properties from Proposition 3.1.27, we have

$A \cup (A \cap B) = (A \cup A) \cap (A \cup B) = A \cap (A \cup B)$

But $A \cap (A \cup B)$ is the left-hand side of the first absorption law. So if we can show the first absorption law, we get the second absorption law too simply by using the above chain of equalities.

To show $A \cap (A \cup B) = A$ , we will show both $A \cap (A \cup B) \subseteq A$ and $A \cap (A \cup B) \supseteq A$ . First let $x \in A \cap (A \cup B)$ . Then we have $x \in A$ and $x \in A \cup B$ . But $x \in A$ is what we wanted, so this shows $A \cap (A \cup B) \subseteq A$ . Now let $x \in A$ . Our goal is to show $x \in A \cap (A \cup B)$ . Since $x \in A$ , we have that $x \in A$ or $x \in B$ . Thus $x \in A \cup B$ . But now we have both $x \in A$ and $x \in A \cup B$ , so we have $x \in A \cap (A \cup B)$ as desired. This completes the proof that $A \cap (A \cup B) \supseteq A$ . Since we’ve now shown each set is a subset of the other, we have $A \cap (A \cup B) = A$ , and the proof is done.

Solutions

Exercise 6.6.2

October 21, 2023 Issa Rice Leave a comment

Exercise statement

Can you find two sequences $(a_n)_{n=0}^\infty$ and $(b_n)_{n=0}^\infty$ which are not the same sequence, but such that each is a subsequence of the other?

Hints

None.

How to think about the exercise

In the book, Tao writes “The property of being a subsequence is reflexive and transitive, though not symmetric” (meaning it fails to be an equivalence relation). Using terminology from Definition 8.5.1, this exercise shows that the relation “is a subsequence of” is not anti-symmetric either, and hence is not a partial order. (Contrast this with the relation “is a subset of” for sets, which is anti-symmetric, and also happens to be reflexive and transitive, and so is a partial order.)

You will see in the solution below that in finding a counterexample, we are taking advantage of how the sequence elements repeat. Must all the examples we can find have repeating elements? I think it’s possible to prove this, and I think it’s even possible to prove a slightly stronger claim that all examples must have infinitely repeating elements, so that at least for the class of non-repeating (or non-infinitely repeating) sequences, “is a subsequence of” is a partial order.

Model solution

Consider $(a_n)_{n=0}^\infty = (0,1,0,1,0,1,\ldots)$ defined by

$\displaystyle a_n := \begin{cases}0 & \text{if }n\text{ is even} \\ 1 & \text{if }n\text{ is odd}\end{cases}$

and $(b_n)_{n=0}^\infty = (1,0,1,0,1,0,\ldots)$ defined by

$\displaystyle b_n := \begin{cases}1 & \text{if }n\text{ is even} \\ 0 & \text{if }n\text{ is odd}\end{cases}$

The function $f : \mathbf N \to \mathbf N$ defined by $f(n) := n+1$ is increasing (why?), and we have $b_n = a_{f(n)}$ and $a_n = b_{f(n)}$ for each $n \in \mathbf N$ . So each sequence is a subsequence of the other, as required.

(To show this rigorously we’d need to define “even” and “odd”, prove a version of Euclidean division with uniqueness so we can say that a natural number is exactly one of even or odd, and show that a natural number $n$ is even (respectively odd) iff $f(n)$ is odd (respectively even). But I think this exercise was intended just as an opportunity to quickly come up with a counterexample and move on, not a place to spend a bunch of time proving facts about number theory. But if anyone in the comments wants help making this rigorous, I can maybe do this at some point.)

Solutions

Exercise 6.4.5

October 19, 2023 Issa Rice 2 Comments

Exercise statement

Use Lemma 6.4.13 to prove Corollary 6.4.14.

Corollary 6.4.14 (Squeeze test). Let $(a_n)_{n=m}^\infty$ , $(b_n)_{n=m}^\infty$ , and $(c_n)_{n=m}^\infty$ be sequences of real numbers such that

$a_n \leq b_n \leq c_n$

for all $n \geq m$ . Suppose also that $(a_n)_{n=m}^\infty$ and $(c_n)_{n=m}^\infty$ both converge to the same limit $L$ . Then $(b_n)_{n=m}^\infty$ is also convergent to $L$ .

Hints

You will probably also want to make use of some other results from this section.

How to think about the exercise

This is one of those exercises that you can do purely by pattern matching, without any deep thought. Let me show what I mean by doing the exercise in this way.

We are supposed to use Lemma 6.4.13. Lemma 6.4.13 has a hypothesis that looks like $a_n \leq b_n$ . Here we have $a_n \leq b_n \leq c_n$ instead. That means we get for free:

$\displaystyle \sup(a_n)_{n=m}^\infty \leq \sup(b_n)_{n=m}^\infty \leq \sup(c_n)_{n=m}^\infty$

$\displaystyle \inf(a_n)_{n=m}^\infty \leq \inf(b_n)_{n=m}^\infty \leq \inf(c_n)_{n=m}^\infty$

$\displaystyle \limsup_{n\to\infty} a_n \leq \limsup_{n\to\infty} b_n \leq \limsup_{n\to\infty} c_n$

$\displaystyle \liminf_{n\to\infty} a_n \leq \liminf_{n\to\infty} b_n \leq \liminf_{n\to\infty} c_n$

Can we do anything useful with these inequalities? Nothing immediately comes to mind. But there’s still some hypotheses we haven’t used yet, namely that both $(a_n)_{n=m}^\infty$ and $(c_n)_{n=m}^\infty$ converge to the same limit $L$ . Is there some previous result we have learned that allows us to connect limits to things like $\sup, \inf, \limsup, \liminf$ ? Flipping backwards in the book, we find Proposition 6.4.12(f), which mentions both $\limsup, \liminf$ and limits. Can we make use of this result? The first part seems relevant. It says that if a limit exists, then both the limit superior and limit inferior are equal to it. So we can update two of the inequalities above:

$L=\displaystyle \limsup_{n\to\infty} a_n \leq \limsup_{n\to\infty} b_n \leq \limsup_{n\to\infty} c_n=L$

$L=\displaystyle \liminf_{n\to\infty} a_n \leq \liminf_{n\to\infty} b_n \leq \liminf_{n\to\infty} c_n=L$

This seems useful. We’ve constrained the values of both $\limsup_{n\to\infty} b_n$ and $\liminf_{n\to\infty} b_n$ . In fact, they both have to be equal to $L$ . At this point, the second part of Proposition 6.4.12(f) becomes useful, and completes the result.

So the general strategy for this sort of exercise is to just flip backwards through the book, trying to look for any result that seems useful (e.g. mentions some of the same kinds of objects we are currently working with), and then see what new facts we can get.

Model solution

Since we have $a_n \leq b_n \leq c_n$ for all $n \geq m$ , Lemma 6.4.13 gives us

$\displaystyle \limsup_{n\to\infty} a_n \leq \limsup_{n\to\infty} b_n \leq \limsup_{n\to\infty} c_n$

and

$\displaystyle \liminf_{n\to\infty} a_n \leq \liminf_{n\to\infty} b_n \leq \liminf_{n\to\infty} c_n$

But since $\lim_{n\to \infty} a_n = L$ , Proposition 6.4.12(f) gives us $\limsup_{n\to\infty} a_n = \liminf_{n\to\infty} a_n = L$ . Similarly, since $\lim_{n\to \infty} c_n = L$ , Proposition 6.4.12(f) gives us $\limsup_{n\to\infty} c_n = \liminf_{n\to\infty} c_n = L$ . Thus the above inequalities become

$\displaystyle L \leq \limsup_{n\to\infty} b_n \leq L$

and

$\displaystyle L \leq \liminf_{n\to\infty} b_n \leq L$

This means that $\limsup_{n\to\infty} b_n = L$ and $\liminf_{n\to\infty} b_n = L$ . So by Proposition 6.4.12(f) again, we have $\lim_{n\to\infty} b_n = L$ as desired.

Solutions

Exercise 8.5.5

October 17, 2023 Issa Rice Leave a comment

Exercise statement

Let $f: X \to Y$ be a function from one set $X$ to another set $Y$ . Suppose that $Y$ is partially ordered with some ordering relation $\leq_Y$ . Define a relation $\leq_X$ on $X$ by defining $x \leq_X x'$ if and only if $f(x) <_Y f(x')$ or $x=x'$ . Show that this relation $\leq_X$ turns $X$ into a partially ordered set. If we know in addition that the relation $\leq_Y$ makes $Y$ totally ordered, does this mean that the relation $\leq_X$ makes $X$ totally ordered also? If not, what additional assumption needs to be made on $f$ in order to ensure that $\leq_X$ makes $X$ totally ordered?

Hints

Why did Tao define $\leq_X$ by the condition $f(x) <_Y f(x')$ or $x=x'$ rather than simply the condition $f(x) \leq_Y f(x')$ ?

How to think about the exercise

The first part of the exercise, where you must show that $\leq_X$ is partially ordered, is straightforward so I won’t comment on it. The latter parts of the exercise are more interesting and may be tricky, so I will discuss how I thought through the exercise.

As you might have guessed, when Tao asks you whether something is true rather than just telling you to prove a fact, it typically means the statement in question is not true. Let’s try to see why. I’m going to make use of two heuristics here, which I will call “try a simple example” and “try to prove the false statement and see what goes wrong”.

Try a simple example

One way to solve problems that pretty reliably works is to just to try scribbling down some simple concrete examples. I started out by drawing some dots to represent the sets $X$ and $Y$ , with order signs to indicate that $Y$ is totally ordered:

And then I just drew some arrows to represent the function $f$ :

I tried to not make the function too nice, so that it would appear to be a “random” function. In particular, I chose a function that was neither injective nor surjective. Now, can we totally order $X$ ? Notice that the rightmost dot in $X$ is clearly larger than either of the dots on the left, but among the first and second dots, there’s no clear ordering between them.

Because the lack of total ordering happened at exactly the spot where $f$ failed to be injective (we have two points in $X$ both mapping to the same point in $Y$ ), our first guess should be that injectivity is the additional assumption we need to force $\leq_X$ to be a total order. And sure enough, drawing a few more examples should convince you this is the case.

Try to prove the false statement and see what goes wrong

Another good strategy when trying to figure out what assumptions are needed to show something is to just try proving the thing, and seeing what goes wrong and what extra thing you need. In this case, we assume $\leq_Y$ is a total order, and then we try to prove that $\leq_X$ is also a total order. To do this, let $x, x' \in X$ . We want to show that $x \leq_X x'$ or $x' \leq_X x$ . In other words, we want to show that $f(x) <_Y f(x')$ or $f(x') <_Y f(x)$ or $x=x'$ . The case $x=x'$ is easy; maybe we just got lucky and happened to pick the same element out of $X$ both times. So suppose $x\ne x'$ . Can we always show that $f(x) <_Y f(x')$ or $f(x') <_Y f(x)$ ? Since we assumed that $\leq_Y$ is a total order, we know that $f(x) \leq_Y f(x')$ or $f(x') \leq_Y f(x)$ . Maybe we can show that when $f(x) \leq_Y f(x')$ we actually have $f(x) <_Y f(x')$ so that $x \leq_X x'$ and similarly when $f(x') \leq_Y f(x)$ we’ll have $x' \leq_X x$ ? In either case, the important thing we must show in order to prove strict inequality is $f(x) \ne f(x')$ . And here we see the problem: we have only assumed $x\ne x'$ , and now we’re trying to show $f(x) \ne f(x')$ . This is the defining property of an injection.

Another way to see the problem is that we know $f(x) <_Y f(x')$ or $f(x') <_Y f(x)$ , but we might have both $f(x) \leq_Y f(x')$ and $f(x') \leq_Y f(x)$ , in which case anti-symmetry of $\leq_Y$ gives us that $f(x) = f(x')$ . In other words, we’ll end up with $x\ne x'$ such that $f(x) = f(x')$ , which you might notice is precisely a failure of $f$ to be injective.

Model solution

We first show that the relation $\leq_X$ is a partial ordering of $X$ . To do this, we must show that $\leq_X$ satisfies reflexivity, anti-symmetry, and transitivity. We will go through each property one by one:

Reflexivity: Let $x \in X$ . We must show that $x \leq_X x$ . But $x=x$ , so by definition of $\leq_X$ we have $x \leq_X x$ .
Anti-symmetry: Let $x,x' \in X$ and suppose we have both $x \leq_X x'$ and $x' \leq_X x$ . To show that $\leq_X$ is anti-symmetric, we must now show that $x=x'$ . Suppose for the sake of contradiction that $x\ne x'$ . Since $x \leq_X x'$ , this means that $f(x) <_Y f(x')$ (because we’ve ruled out the possibility that $x=x'$ ), and since $x' \leq_X x$ this means that $f(x') <_Y f(x)$ . Since $\leq_Y$ is a partial order, this means that we have both $f(x) = f(x')$ (via anti-symmetry of $\leq_Y$ ) as well as $f(x) \ne f(x')$ (by definition of $<_Y$ ). This contradiction shows that we have $x=x'$ after all, and thus $\leq_X$ is anti-symmetric.
Transitivity: Let $x, x', x'' \in X$ and suppose we have both $x \leq_X x'$ and $x' \leq_X x''$ . To show transitivity, we must now prove that $x \leq_X x''$ . If $x=x'$ or $x'=x''$ then we automatically have transitivity; for instance if $x=x'$ then $x' \leq_X x''$ can be written as $x \leq_X x''$ , and similarly for the other case. Thus we only need to consider the case where $x\ne x'$ and $x'\ne x''$ . In this case, we have both $f(x) <_Y f(x')$ and $f(x') <_Y f(x'')$ . Now by transitivity of $\leq_Y$ we have $f(x) \leq_Y f(x'')$ . But not only that: if $f(x)=f(x'')$ we would have $f(x'') \leq_Y f(x)$ which combined with $f(x) <_Y f(x')$ would (by transitivity) show that $f(x'') \leq_Y f(x')$ . So we would have both $f(x'') \leq_Y f(x')$ and $f(x') <_Y f(x'')$ , so by anti-symmetry we would have $f(x'')=f(x')$ , a contradiction of $f(x') <_Y f(x'')$ . So we must have $f(x)\ne f(x'')$ after all, which combined with $f(x) \leq_Y f(x'')$ means $f(x) <_Y f(x'')$ . Thus $x \leq_X x''$ .

Now suppose that $\leq_Y$ is a total order. We will show that this does not necessarily mean $\leq_X$ is a total order. Any non-injective function $f$ will do, but for concreteness, let $X := \{1,2\}$ , let $Y := \mathbf Z$ , and let $\leq_Y$ be the usual ordering relation $\leq$ on the integers. Define $f : X \to Y$ by $f(x) := 0$ for all $x \in X$ . Now $\leq_X$ cannot order $1$ and $2$ because all three of the statements $f(1) < f(2)$ , $f(2) < f(1)$ , and $1=2$ are false. This shows that $\leq_X$ is not a total order. In fact, this happens anytime $f$ is not injective, as we will presently show.

The additional assumption that $f$ is injective is both necessary and sufficient to ensure that $\leq_X$ makes $X$ totally ordered. First suppose that $f$ is injective. We want to show that $\leq_X$ is a total ordering of $X$ . So let $x,x' \in X$ . We want to show that $x \leq_X x'$ or $x' \leq_X x$ . In other words, we must show that $f(x) <_Y f(x')$ or $f(x') <_Y f(x)$ or $x=x'$ . If $x=x'$ are done, so suppose $x\ne x'$ . Since $f$ is injective, this means $f(x) \ne f(x')$ . But now by the assumption that $\leq_Y$ is a total order, we must have $f(x) \leq_Y f(x')$ or $f(x') \leq_Y f(x)$ . Since the option $f(x)=f(x')$ is not available, this means that $f(x) <_Y f(x')$ or $f(x') <_Y f(x)$ . This completes the first direction.

Now suppose that $f$ is not injective. Then there must be elements $x,x' \in X$ such that $x\ne x'$ and $f(x)=f(x')$ . Our goal is to show that $\leq_X$ is not a total order. We cannot have $x \leq_X x'$ since this would mean $f(x) <_Y f(x')$ (remember, $f(x)=f(x')$ ) or $x=x'$ , and we know both are false. We also cannot have $x' \leq_X x$ for similar reasons.

Solutions

Exercise 3.3.6

October 14, 2023 Issa Rice Leave a comment

Exercise statement

Let $f : X \to Y$ be a bijective function, and let $f^{-1} : Y \to X$ be its inverse. Verify the cancellation laws $f^{-1}(f(x)) = x$ for all $x \in X$ and $f(f^{-1}(y)) = y$ for all $y \in Y$ . Conclude that $f^{-1}$ is also invertible, and has $f$ as its inverse (thus $(f^{-1})^{-1} = f$ ).

Hints

None.

How to think about the exercise

On one hand, this is a pretty obvious exercise and all of the results shown in it should make sense to you. On the other hand, I think it can be tricky to stick to just the definitions and what we know about bijective functions, and be vigilant so as to not smuggle in any extra “obvious” facts about bijective functions. But as long as you are extra careful, you should just be able to straightforwardly do the exercise.

Model solution

Let $x \in X$ be arbitrary. Now $f(x) \in Y$ . We can define the variable $y := f(x)$ . Since $f$ is bijective, for this $y \in Y$ , there is exactly one $x' \in X$ such that $f(x') = y$ , and we denote this $x'$ by $f^{-1}(y)$ . But clearly this $x'$ must be the same as our $x$ . If that’s not clear to you, you can start with $f(x') = y = f(x)$ and use the injectivity of $f$ to conclude $x'=x$ . So now just simply chasing down the equalities we know about, we see that $x=x'=f^{-1}(y)=f^{-1}(f(x))$ (in the last equality, we’re using the substitution axiom of equality). But that’s just what we wanted to show. So we’re done with the first claim. Note that we didn’t really need the variable $y$ at all in this part, but I think using the extra variable makes things a bit clearer.

Now let $y \in Y$ be arbitrary (this is a new variable we are declaring, and may not have anything to do with the $y$ in the previous paragraph). Since $f$ is bijective, for our given $y$ there is exactly one $x \in X$ such that $f(x) = y$ , and we denote this $x$ by $f^{-1}(y)$ . In other words, $x = f^{-1}(y)$ . Now what happens if we just apply $f$ to both sides of that equation? We get $f(x) = f(f^{-1}(y))$ . But we also know that $f(x)=y$ , so we have $y=f(x) = f(f^{-1}(y))$ , which is exactly what we wanted to show, so we’re done with the second cancellation law.

Now we have to show that $f^{-1}$ is invertible. First we show that it is injective. Let $y,y' \in Y$ , and suppose $f^{-1}(y) = f^{-1}(y')$ . Using the second cancellation law, we apply $f$ to both sides and get $y=y'$ . So $f^{-1}$ is injective as required. Now we show it’s also surjective. Let $x \in X$ . We need to find some $y \in Y$ such that $f^{-1}(y)=x$ . Why don’t we just take $f(x) \in Y$ as our $y$ ? (Since all we have to work with are $x$ , $f$ , and $f^{-1}$ , we don’t really have much choice in constructing an element of $Y$ .) Indeed, applying the first cancellation law, we get $f^{-1}(f(x))=x$ , so $f^{-1}$ is surjective.

Finally, we need to show that $f^{-1}$ has $f$ as its inverse. Since $f^{-1}$ is bijective, for every $x \in X$ there is exactly one $y \in Y$ such that $f^{-1}(y)=x$ . By the textbook, this value of $y$ is denoted $(f^{-1})^{-1}(x)$ . If we can show that this $(f^{-1})^{-1}(x)$ is the same as $f(x)$ , we will have shown that $(f^{-1})^{-1}=f$ . To do this, we simply take the equation $f^{-1}(y)=x$ and apply $f$ to both sides, to get $f(f^{-1}(y))=f(x)$ . By the second cancellation law, this becomes $y=f(x)$ . But we already know that $y = (f^{-1})^{-1}(x)$ , so this means $(f^{-1})^{-1}(x)=f(x)$ , and now we are done.

Solutions

Exercise 7.2.2

May 23, 2023 Issa Rice Leave a comment

Exercise statement

Prove Proposition 7.2.5.

Proposition 7.2.5. Let $\sum_{n=m}^\infty a_n$ be a formal series of real numbers. Then $\sum_{n=m}^\infty a_n$ converges if and only if, for every real number $\varepsilon > 0$ , there exists an integer $N \geq m$ such that

$\displaystyle \left|\sum_{n=p}^q a_n\right| \leq \varepsilon$ for all $p,q \geq N$ .

Hints

Use Proposition 6.1.12 and Theorem 6.4.18.
If you want to do this exercise correctly, it is not as easy as just applying the given Proposition and/or Theorem!

How to think about the exercise

If you look up the hints, it is quite obvious how the exercise is done. I wanted to give a thought process of how to approach the problem if one randomly saw it (i.e. without accompanying hints), so in what follows I won’t be assuming that the hints have been given.

Let’s write out the two sides of the “if and only if” by making the quantifiers explicit:

For all $\varepsilon > 0$ there exists $N \geq m$ such that for all $p,q \geq N$ we have $\left|\sum_{n=p}^q a_n\right| \leq \varepsilon$ (1)

The claim that $\sum_{n=m}^\infty a_n$ converges is:

For all $\varepsilon > 0$ there exists $N \geq m$ such that for all $k \geq N$ we have $\left|\sum_{n=m}^k a_n - S\right| \leq \varepsilon$ (2)

Above, I have used $S$ to denote the sum. It’s actually equal to $S=\sum_{n=m}^\infty a_n$ .

This shows something interesting. When formalizing what it meant for a series to converge, i.e., when writing out (2), we had to refer to what the sum converges to, i.e., to $S$ . But in (1), we did not even have to refer to the final sum! In other words, (1) encodes the claim “this series converges” without talking about what the series converges to. This sounds awfully familiar … didn’t we have a way to say “this sequence converges” without referring to the limit of the sequence? That’s right, this is sounding exactly like what Cauchy sequences are. And in fact, if our pattern-matching is active, we might even notice that in (1) we had “for all $p,q \geq N$ ” which is reminiscent of the “for all $j,k \geq N$ ” that we used a lot when dealing with Cauchy sequences (the variable names don’t matter of course).

So let’s see … if $S_N = \sum_{n=m}^N a_n$ is the $N$ th partial sum of the series, then $\sum_{n=p}^q a_n$ is basically equal to $S_q - S_p$ . (How did I see this? It was so quick it’s hard for me to say what my mind did here. I suppose it’s pattern-matching on things I’ve seen before, e.g. the integral rule $\int_a^b + \int_b^c = \int_a^c$ , which can be written in subtracted form as $\int_a^c - \int_a^b =\int_b^c$ .) So $\left|\sum_{n=p}^q a_n\right| \leq \varepsilon$ is basically saying $|S_q - S_p| \leq \varepsilon$ . So (1) is basically saying that $(S_N)_{N=m}^\infty$ is Cauchy. And we have exactly the theorems to show that “is Cauchy” and “is convergent” are the same for sequences.

I said “basically” above because $\sum_{n=p}^q a_n = S_q - S_p$ only holds when $q \geq p$ . When $q < p$ , the sum $\sum_{n=p}^q a_n = 0$ whereas $S_q - S_p$ is not necessarily zero; instead it’s equal to $-\sum_{n=q+1}^p a_n$ . And actually, I’ve made an off-by-one error: $\sum_{n=p}^q a_n$ should actually be $S_q - S_{p-1}$ . So this exercise is perhaps not as simple we thought! The first challenge was to figure out the “big idea” or strategy to use (i.e., recognize that Cauchyness is relevant), and the second challenge is now to make our idea actually work by getting all of the details right. Let’s write out all the variations of the claims we are working with, to get things straight:

$(S_N)_{N=m}^\infty$ is Cauchy (a)

For all $\varepsilon > 0$ there exists $N \geq m$ such that for all $p,q \geq N$ we have $|S_q - S_p| \leq \varepsilon$ (b)

For all $\varepsilon > 0$ there exists $N \geq m$ such that for all $p,q \geq N$ we have $|S_q - S_{p-1}| \leq \varepsilon$ (c)

For all $\varepsilon > 0$ there exists $N \geq m$ such that for all $p,q \geq N$ we have $\left|\sum_{n=p}^q a_n\right| \leq \varepsilon$ (d)

Our goal is to show that (a) and (d) are the same (why do we want to show both directions? It’s because the original Proposition is stated as an “if and only if”). That will allow us to use the fact that a sequence is Cauchy iff it is convergent.

It’s clear that (a) and (b) are equivalent, since (b) is just what it means for a sequence to be Cauchy. So we just need to show that (b) and (c) are equivalent and that (c) and (d) are equivalent.

(c) implies (b) since if $|S_q - S_{p-1}| \leq \varepsilon$ holds for all $p,q \geq N$ , then since $p+1 \geq N$ as well, we have $|S_q - S_{(p+1)-1}| \leq \varepsilon$ , i.e., $|S_q - S_p| \leq \varepsilon$ .

(b) implies (c) since if we have some $N$ such that $|S_q - S_p| \leq \varepsilon$ for all $p,q \geq N$ , then we can pick $N+1$ and see that if $p,q \geq N+1$ then $p-1,q \geq N$ and so $|S_q - S_{p-1}| \leq \varepsilon$ .

(The above two paragraphs are similar to Exercise 6.1.3 and Exercise 6.1.4. In all cases, what matters is what happens at the limits so shifting things around a bit won’t change whether a statement is true.)

So now we just have to show that (c) and (d) are equivalent. Being very mindful of off-by-one errors now, if $q \geq p$ we have $\sum_{n=p}^q a_n = S_q - S_{p-1}$ , and if $q < p$ instead we have $\sum_{n=p}^q a_n = 0$ and

$\begin{aligned}S_q - S_{p-1} &= (a_1 + \cdots + a_q) - (a_1 + \cdots + a_q + a_{q+1} + \cdots + a_p) \\ &= -(a_{q+1} + \cdots + a_p) \\ &= -\sum_{n=q+1}^p a_n\end{aligned}$

So whatever $p,q$ are, we have $\left|\sum_{n=p}^q a_n\right|\leq |S_q - S_{p-1}|$ . This means (c) implies (d): if the bigger thing is less than $\varepsilon$ , then so is the smaller thing. To show that (d) implies (c) as well, let $\varepsilon > 0$ . Since (d) is true, we have some $N$ ; pick this $N$ . Now let $p,q \geq N$ . If $q \geq p$ , we have $\sum_{n=p}^q a_n = S_q - S_{p-1}$ so we have $|S_q - S_{p-1}| = \left|\sum_{n=p}^q a_n\right| \leq \varepsilon$ . If $q < p$ then from (d) we have $\left|\sum_{n=q}^p a_n\right| \leq \varepsilon$ . But we know that $S_q - S_{p-1} = -\sum_{n=q+1}^p a_n$ , so actually we want to show that $\left|\sum_{n=q+1}^p a_n\right| \leq \varepsilon$ . But this is possible since $q+1 \geq N$ . So in other words we have

$\displaystyle |S_q - S_{p-1}| = \left|\sum_{n=q+1}^p a_n\right| \leq \varepsilon$

That was a lot of fiddly details, but we’re finally done.

Model solution

Let $S_N = \sum_{n=m}^N a_n$ be the $N$ th partial sum of the series $\sum_{n=m}^\infty a_n$ . By definition of series convergence, $\sum_{n=m}^\infty a_n$ converges iff the sequence $(S_N)_{N=m}^\infty$ converges.

We will show that $(S_N)_{N=m}^\infty$ is Cauchy if and only if, for every $\varepsilon > 0$ there exists an integer $N \geq m$ such that for all $p,q \geq N$ we have $\left|\sum_{n=p}^q a_n \right| \leq \varepsilon$ . If we can do this, then Theorem 6.4.18 will guarantee the result that we are trying to show.

First suppose that $(S_N)_{N=m}^\infty$ is Cauchy. Let $\varepsilon > 0$ . Since $(S_N)_{N=m}^\infty$ is Cauchy, this means that there exists an integer $N \geq m$ such that for all $p,q \geq N$ we have $|S_q - S_p| \leq \varepsilon$ . Consider the number $N+1$ . If $p,q \geq N+1$ , then $q, p-1 \geq N$ . Now we have two cases. If $q \geq p$ , then the sum $\sum_{n=p}^q a_n$ is equal to $S_q - S_{p-1}$ (Lemma 7.1.4(a)), so we have $\left|\sum_{n=p}^q a_n\right| = |S_q - S_{p-1}| \leq \varepsilon$ by Cauchyness. If on the other hand $q < p$ , then $\left|\sum_{n=p}^q a_n\right| = 0 < \varepsilon$ (Definition 7.1.1). Let us summarize: we let $\varepsilon > 0$ , then found a number $N+1 \geq m$ such that if $p,q \geq N+1$ then $\left|\sum_{n=p}^q a_n\right| \leq \varepsilon$ . So we’ve completed the first direction of the proof.

Now suppose that for every $\varepsilon > 0$ there exists an integer $N \geq m$ such that for all $p,q \geq N$ we have $\left|\sum_{n=p}^q a_n \right| \leq \varepsilon$ . Let $\varepsilon > 0$ . From our assumption, we are given some $N \geq m$ , so pick this $N$ . Now let $p,q \geq N$ . We have $\left|\sum_{n=p}^q a_n \right| \leq \varepsilon$ , and we want to show $|S_q - S_p| \leq \varepsilon$ . We have three cases:

If $q=p$ then $|S_q - S_p| = 0 < \varepsilon$ .
Next, if $q > p$ then $q \geq p+1$ and we have $\sum_{n=p+1}^q a_n = S_q - S_p$ by Lemma 7.1.4(a). And since $q,p+1 \geq N$ we can say that $\left|\sum_{n=p+1}^q a_n \right| \leq \varepsilon$ . But this means $|S_q - S_p| \leq \varepsilon$ as required.
Finally, if $q < p$ then $q+1 \leq p$ and we have $S_q - S_p = -(S_p - S_q) = -\sum_{n=q+1}^p a_n$ . But since $q+1, p \geq N$ we have $\left|\sum_{n=q+1}^p a_n\right| \leq \varepsilon$ , so $|S_q - S_p| = |S_p - S_q| = \left|\sum_{n=q+1}^p a_n\right| \leq \varepsilon$ as required.

In all three cases we have shown that $|S_q - S_p| \leq \varepsilon$ so we are done.

Solutions

Exercise 6.6.4

May 20, 2023 Issa Rice Leave a comment

Exercise statement

Prove Proposition 6.6.5. (Note that one of the two implications has a very short proof.)

Proposition 6.6.5 (Subsequences related to limits). Let $(a_n)_{n=0}^\infty$ be a sequence of real numbers, and let $L$ be a real number. Then the following two statements are logically equivalent (each one implies the other):

(a) The sequence $(a_n)_{n=0}^\infty$ converges to $L$ .
(b) Every subsequence of $(a_n)_{n=0}^\infty$ converges to $L$ .

Hints

For one of the directions, use Lemma 6.6.4.

How to think about the exercise

This is a pretty straightforward exercise. I could leave it at that, but I want to try something new. So here’s a kind of stream-of-consciousness account of how I approached the problem:

We want to show (a) and (b) are the same. So first suppose (a) is true. We want to show (b) is true. What does that mean? Let’s unwrap:

Let $(a_n)_{n=0}^\infty$ be a sequence of real numbers, and suppose $(a_n)_{n=0}^\infty$ converges to $L \in \mathbf R$ . Now we want to show that every subsequence of $(a_n)_{n=0}^\infty$ converges to $L$ . It would be good to have an arbitrary subsequence and to give it a name. So let $(b_n)_{n=0}^\infty$ be a subsequence of $(a_n)_{n=0}^\infty$ . What does this mean? By definition, this means that there is a strictly increasing function $f : \mathbf N \to \mathbf N$ such that $b_n = a_{f(n)}$ for all $n \in \mathbf N$ .

Now, we want to show that $(b_n)_{n=0}^\infty$ converges to $L$ . How do we do that? To show that any sequence converges at all, the usual thing to do is to let $\varepsilon > 0$ be arbitrary. Now we have to find some $N$ such that $n \geq N$ implies $|b_n - L| \leq \varepsilon$ . How do we find such an $N$ ? The only possible way is to use some fact about $(a_n)_{n=0}^\infty$ , because that’s the only sequence we know anything about. We know that $(a_n)_{n=0}^\infty$ converges to $L$ . That means that for any $\varepsilon > 0$ (including the one we defined earlier), there is $N$ such that $n \geq N$ implies $|a_n - L| \leq \varepsilon$ . Okay, that is pretty close to what we want. Since we don’t know any other $N$ to use, let’s just pick this $N$ that we know exists, for now. We know that:

if $n \geq N$ then $|a_n - L|\leq \varepsilon$ . (*)

We want to show that:

if $n \geq N$ then $|b_n - L|\leq \varepsilon$ ;

or, making the relation between $(a_n)_{n=0}^\infty$ and $(b_n)_{n=0}^\infty$ clear, we want to show that:

if $n \geq N$ then $|a_{f(n)} - L|\leq \varepsilon$ .

So in other words, assuming $n \geq N$ , we have $|a_n - L|\leq \varepsilon$ and we want $|a_{f(n)} - L|\leq \varepsilon$ . Can we get what we want? We could if we knew that $f(n) \geq N$ (because of (*)). In other words, we want to know if $n \geq N$ implies $f(n) \geq N$ . One way this could be true is if $f(n) \geq n$ , so that $f(n) \geq n \geq N$ . (Here I have “cheated” a bit. I have seen the pattern “ $f(n) \geq n$ ” many times before, so it was immediately obvious to me that I should use it. I tried to give an account of how one might have reached the same conclusion without having known the pattern ahead of time, but I don’t know if I’ve succeeded.) This seems pretty plausibly true. The identity function seems like the slowest-growing strictly increasing function (it just counts up one by one), but other strictly increasing functions can be faster-growing (they can skip some numbers). It seems like an easy case of induction. It’s clear that $f(0) \geq 0$ , and if $f(n) \geq n$ then since $f(n+1) > f(n)$ , we have $f(n+1) > n$ , so $f(n+1) \geq n+1$ . So that completes the proof that (a) implies (b).

Now we do the other direction. Suppose every subsequence of $(a_n)_{n=0}^\infty$ converges to $L$ . We want to show that $(a_n)_{n=0}^\infty$ converges to $L$ . Since we know something about every subsequence, it seems like we ought to pick some clever particular subsequence to get what we want – oh, we can just pick $(a_n)_{n=0}^\infty$ to be the subsequence!

Model solution

First we show that (a) implies (b). Suppose the sequence $(a_n)_{n=0}^\infty$ converges to $L$ . We want to show that every subsequence of $(a_n)_{n=0}^\infty$ converges to $L$ . So let $(b_n)_{n=0}^\infty$ be a subsequence of $(a_n)_{n=0}^\infty$ . This means there is some strictly increasing function $f : \mathbf N \to \mathbf N$ such that $b_n = a_{f(n)}$ for all $n \in \mathbf N$ . We will show that $(b_n)_{n=0}^\infty$ converges to $L$ . Let $\varepsilon > 0$ . Since $(a_n)_{n=0}^\infty$ converges to $L$ , there exists some $N \geq 0$ such that:

for all $n\geq N$ we have $|a_n - L| \leq \varepsilon$ . (*)

Pick this $N$ , and let $n \geq N$ . We want to show that $|b_n - L| \leq \varepsilon$ . If we can show that $f(n) \geq n$ for all $n \in \mathbf N$ , then we would have $f(n) \geq n \geq N$ , so by (*) we have $|a_{f(n)} - L| \leq \varepsilon$ . By definition of $(b_n)_{n=0}^\infty$ this means we have $|b_n - L| \leq \varepsilon$ as required.

So it remains to show that $f(n) \geq n$ for all $n \in \mathbf N$ . We proceed by induction. The base case $f(0) \geq 0$ follows since the range of $f$ is $\mathbf N$ . Now suppose inductively that $f(n) \geq n$ . Since $f$ is strictly increasing, we have $f(n+1) > f(n)$ . Thus we have $f(n+1) > n$ . By Proposition 2.2.12(e) we have $f(n+1) \geq n+1$ as required. This closes the induction.

Next we show that (b) implies (a). This follows immediately from Lemma 6.6.4: $(a_n)_{n=0}^\infty$ is a subsequence of itself, so if every subsequence of $(a_n)_{n=0}^\infty$ converges to $L$ then in particular $(a_n)_{n=0}^\infty$ must converge to $L$ .

Solutions

Exercise 6.6.1

April 1, 2023 Issa Rice 1 Comment

Exercise statement

Let $(a_n)_{n=0}^\infty$ , $(b_n)_{n=0}^\infty$ , and $(c_n)_{n=0}^\infty$ be sequences of real numbers. Then $(a_n)_{n=0}^\infty$ is a subsequence of $(a_n)_{n=0}^\infty$ . Furthermore, if $(b_n)_{n=0}^\infty$ is a subsequence of $(a_n)_{n=0}^\infty$ , and $(c_n)_{n=0}^\infty$ is a subsequence of $(b_n)_{n=0}^\infty$ , then $(c_n)_{n=0}^\infty$ is a subsequence of $(a_n)_{n=0}^\infty$ .

Hints

You may need to use induction depending on how careful you want to be.

How to think about the exercise

This exercise is almost mechanically straightforward (i.e. requires just unwrapping the definitions and figuring out the right functions to use), but there is one complication, which is that Tao decided to use a “simpler” definition of a strictly increasing function on the natural numbers. Because of this definition, we have to do a little bit of extra work to show the transitivity result.

The induction is even more complicated by the fact that the statement we want to prove by induction, when phrased in a natural way, starts at $1$ rather than $0$ , but the book does not prove that induction works with an arbitrary base case. I’ve split that part off into a lemma to make the proof here look conceptually cleaner.

Model solution

First we show that $(a_n)_{n=0}^\infty$ is a subsequence of itself. The function $f : \mathbf N \to \mathbf N$ defined by $f(n) := n$ (i.e. the identity function on the set of natural numbers) is strictly increasing because $f(n+1) = n+1 > n = f(n)$ . We also have $a_n = a_{f(n)}$ for each $n \in \mathbf N$ . Thus $(a_n)_{n=0}^\infty$ is a subsequence of $(a_n)_{n=0}^\infty$ .

Now suppose that $(b_n)_{n=0}^\infty$ is a subsequence of $(a_n)_{n=0}^\infty$ , and that $(c_n)_{n=0}^\infty$ is a subsequence of $(b_n)_{n=0}^\infty$ . We want to show that $(c_n)_{n=0}^\infty$ is a subsequence of $(a_n)_{n=0}^\infty$ . Since $(b_n)_{n=0}^\infty$ is a subsequence of $(a_n)_{n=0}^\infty$ , there exists a function $g : \mathbf N \to \mathbf N$ which is strictly increasing such that $b_n = a_{g(n)}$ for each $n \in \mathbf N$ . Similarly, since $(c_n)_{n=0}^\infty$ is a subsequence of $(b_n)_{n=0}^\infty$ , there exists a function $h: \mathbf N \to \mathbf N$ which is strictly increasing such that $c_n = b_{h(n)}$ for each $n \in \mathbf N$ . By composing $g$ and $h$ , we can produce a function which selects elements out of $(a_n)_{n=0}^\infty$ that make up the sequence $(c_n)_{n=0}^\infty$ . Concretely, we have $c_n = b_{h(n)} = a_{g(h(n))} = a_{(g\circ h)(n)}$ .

It remains to verify that $g\circ h : \mathbf N \to \mathbf N$ is a strictly increasing function. We first show that for a strictly increasing function $g : \mathbf N \to \mathbf N$ , if $n > m$ then $g(n) > g(m)$ . (This is actually the same thing as Exercise 6.1.1 but I had forgotten about that exercise when I wrote this post. So you can read along below or skip the subproof and refer to Exercise 6.1.1 instead.)

Fix a natural number $m$ , and consider the proposition $P(d)$ defined by “ $g(m+d) > g(m)$ ”. We will use induction to show that $P(d)$ is true for all $d \geq 1$ .

For the base case $d=1$ , we must show that $g(m+1) > g(m)$ , but this is true since $g$ is a strictly increasing function.

Now let $d \geq 1$ and suppose inductively that $P(d)$ is true, i.e. $g(m+d) > g(m)$ . We want to show that $P(d+1)$ is true, i.e. $g(m+d+1) > g(m)$ . By inductive hypothesis, we have $g(m+d) > g(m)$ . Also, because $g$ is strictly increasing, we have $g((m+d) + 1) > g(m+d)$ . So by transitivity of inequality, we have $g(m+d+1) > g(m)$ as required. This closes the induction.

Now to actually prove the result that if $n > m$ then $g(n) > g(m)$ , we see that $n-m \geq 1$ so $P(n-m)$ is true, i.e. $g(n)= g(m+(n-m)) > g(m)$ .

With the result above, it is easy to show that $g\circ h$ is a strictly increasing function. Since $h$ is a strictly increasing function, we have $h(n+1) > h(n)$ by definition. Now using the result above, we have $g(h(n+1)) > g(h(n))$ . But this is the same thing as $(g\circ h)(n+1) > (g\circ h)(n)$ .

Solutions

Induction with arbitrary base case

April 1, 2023 Issa Rice 2 Comments

In this post, I will state a version of mathematical induction that uses an arbitrary natural number base case. I will assume that the reader has worked through the book up to and including section 2.2; I will also be making use of of subtracting, which is only introduced in section 4.1 (but the use of subtraction is not fundamental, and the same result can be shown using Lemma 2.2.10 instead, and replacing all mentions of $n-1$ with “the number $a$ such that $a{{+}\!{+}} = n$ ”).

I’m splitting this result off as a lemma because it’s very useful for Exercise 6.6.1 but is very much not the point of Exercise 6.6.1, so I thought it would be distracting to include as part of the solution for that exercise.

Exercise statement

Lemma. Let $n_0$ be a natural number, and let $P(n)$ be a property pertaining to an arbitrary natural number $n$ . Suppose that $P(n_0)$ is true. Suppose also that for each $n\geq n_0$ , if $P(n)$ is true then $P(n+1)$ is also true. Then $P(n)$ is true for all $n \geq n_0$ .

Model solution 1

In this solution, we make use of the strong principle of induction. The strong principle of induction is a stronger version of induction with arbitrary base case, since it already allows an arbitrary case (and is stronger because it allows one to use more assumptions when proving the inductive case). So the idea below is to specialize/weaken the strong principle of induction.

Let $n_0$ be a natural number, and let $P(n)$ be a property pertaining to an arbitrary natural number $n$ . Suppose that $P(n_0)$ is true. Suppose also that for each $n\geq n_0$ , if $P(n)$ is true then $P(n+1)$ is also true. We must show that $P(n)$ is true for all $n \geq n_0$ .

We use the strong principle of induction. We must show the following implication for each $n \geq n_0$ : if $P(n')$ is true for all natural numbers $n_0 \leq n' < n$ , then $P(n)$ is also true. So let $n \geq n_0$ be a natural number, and suppose $P(n')$ is true for all natural numbers $n_0 \leq n' < n$ . We have two cases:

If $n=n_0$ , then there is no natural number $n'$ such that $n_0 \leq n' < n_0$ , so the hypothesis of the implication is vacuously true. That means we have to show that $P(n_0)$ is true. But we are already given that $P(n_0)$ is true.
If $n > n_0$ , then we have $n-1 \geq n_0$ by Proposition 2.2.12(e) (why?). So we have $n_0 \leq n-1 < n$ . Since we assumed that $P(n')$ is true for all natural numbers $n_0 \leq n' < n$ , this means that in particular $P(n-1)$ is true. Since $n-1 \geq n_0$ , we have that $P((n-1)+1)$ is true, i.e. $P(n)$ is true as required.

We showed that $P(n)$ is true in both cases, so this closes the induction.

Model solution 2

Define the statement $Q(m)$ to be $P(n_0+m)$ . We will show using (ordinary) induction that $Q(m)$ is true for all natural numbers $m$ .

For the base case, we must show that $Q(0)$ is true. But this is just the statement $P(n_0)$ , which we assumed to be true.

Now suppose inductively that $Q(m)$ is true. We must show that $Q(m+1)$ is true. Since $Q(m)$ is true, we know that $P(n_0+m)$ is true. By assumption about the property $P$ , we know that $P(n_0+m+1)$ is true. But this means $Q(m+1)$ is true. This closes the induction.

We have now shown that $Q(m)$ is true for all natural numbers $m$ . Unwrapping the definition of $Q$ , this means that $P(n_0+m)$ is true for all $m$ . But our goal was to show that $P(n)$ is true for all $n \geq n_0$ . So let $n \geq n_0$ . By Definition 2.2.11, this means that there is some $m$ such that $n = n_0 + m$ . But we know that for all $m$ the statement $P(n_0+m)$ is true. So this means that in particular for our $m$ , we have $P(n_0+m)$ . But since $n = n_0 + m$ this means that $P(n)$ as desired.