Exercise 6.3.4

Exercise statement

Explain why Proposition 6.3.10 fails when x > 1. In fact, show that the sequence (x^n)_{n=1}^\infty diverges when x > 1. Compare this with the argument in Example 1.2.3; can you now explain the flaws in the reasoning in that example?

Hints

  1. Prove by contradiction and use the identity (1/x)^n x^n = 1 and the limit laws in Theorem 6.1.19.

How to think about the exercise

One thing that’s good to do when you’re trying to understand why a theorem fails in a particular case is to run through the proof of the theorem and locate the first step where the proof doesn’t make sense. This point comes early in the proof of Proposition 6.3.10 in the book. In particular, the sequence (x^n)_{n=1}^\infty is not decreasing when x > 1.

Model solution

Let x > 1. Suppose for sake of contradiction that \lim_{n\to\infty} x^n = L for some real number L, i.e. that the sequence (x^n)_{n=1}^\infty converges. Since x > 1, we have 0 < 1/x < 1. Thus by Proposition 6.3.10 we have \lim_{n\to\infty} (1/x)^n = 0. By the limit laws (Theorem 6.1.19(b)) we thus have \lim_{n\to\infty} ((1/x)^n x^n) = \left(\lim_{n\to\infty} (1/x)^n\right) \left(\lim_{n\to\infty} x^n\right) = 0L = 0. But (1/x)^n x^n = 1 by Proposition 4.3.10(a), which we can use thanks to Proposition 5.6.3. Thus we can compute \lim_{n\to\infty} ((1/x)^n x^n) = \lim_{n\to\infty} 1 = 1. This is a contradiction, and completes the proof.

Example 1.2.3 assumes that the limit \lim_{n\to\infty} x^n exists for all real x, which is false. If we restrict to 0 < x < 1 then indeed we have \lim_{n\to\infty} x^n = 0 by Proposition 6.3.10.
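As an informal sanity check (not part of the proof), here is a quick numerical illustration in Python; the value x = 1.5 is just a sample choice. The powers x^n grow without bound, while (1/x)^n x^n stays equal to 1 (up to floating-point error), which is exactly the tension the contradiction argument exploits.

# Informal numerical illustration only; floating-point arithmetic, so values are approximate.
x = 1.5  # sample value with x > 1

for n in (1, 10, 50, 100):
    print(n, x**n, (1 / x)**n * x**n)
# x**n grows without bound as n increases, yet (1/x)**n * x**n stays (approximately) 1,
# so x**n cannot converge to a real number L: the product would then tend to 0*L = 0.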

Exercise 7.3.2

Exercise statement

Prove Lemma 7.3.3.

Lemma 7.3.3 (Geometric series). Let x be a real number. If |x| \geq 1, then the series \sum_{n=0}^\infty x^n is divergent. If however |x| < 1, then the series is absolutely convergent and

\displaystyle \sum_{n=0}^\infty x^n = 1/(1-x).

Hints

  1. For the first part, use the zero test.
  2. For the second part, first use induction to establish the geometric series formula \sum_{n=0}^N x^n = (1-x^{N+1})/(1-x) and then apply Lemma 6.5.2.

How to think about the exercise

In the case where |x| < 1, Tao suggests that we “use induction”. I think this is a pretty unhelpful hint. It nudges you toward a particular, unenlightening kind of induction proof where you just blindly go through the motions – it is a totally valid proof, and will work without any trouble, but I just don’t like it. Here I want to walk through a better proof.

Let N be a natural number. The Nth partial sum of the series is

\displaystyle S_N := \sum_{n=0}^N x^n = 1 + x + x^2 + \cdots + x^{N-1} + x^N

The big insight is to notice that if you multiply this by x, you almost get the (N+1)th partial sum, S_{N+1}. In fact, we have

\displaystyle 1 + xS_N = 1 + x + x^2 + x^3 + \cdots + x^N + x^{N+1} = S_{N+1}

But for any series, the difference between the (N+1)th partial sum and the Nth partial sum is just the (N+1)th term. In other words, we have S_{N+1} - S_N = x^{N+1}. Substituting the identity S_{N+1} = 1 + xS_N from above, we have

x^{N+1} = S_{N+1} - S_N = (1 + xS_N) - S_N = 1 + (x-1)S_N

Rearranging, we obtain the desired formula

\displaystyle S_N = \frac{1 - x^{N+1}}{1-x}

Dividing by 1-x is valid since |x| < 1 hence x\ne 1.

The above wasn’t entirely rigorous because we used a bunch of “\cdots” to make it easier to see the pattern. But it is very simple to convert this into a rigorous proof, which we will do below in the model solution.

Model solution

First suppose |x| \geq 1. Consider the terms of this series 1, x, x^2, x^3, \ldots. By Lemma 6.5.2, this sequence either converges to 1 or else diverges. In either case, the sequence does not converge to zero, so by the zero test (Corollary 7.2.6) the series \sum_{n=0}^\infty x^n diverges.

Now suppose |x| < 1. We will show the geometric series formula \sum_{n=0}^N x^n = (1-x^{N+1})/(1-x). Write S_N for the Nth partial sum. Then we have

\displaystyle \begin{aligned}1 + xS_N &= 1 + x\sum_{n=0}^N x^n \\ &= 1 + \sum_{n=0}^N x^{n+1} \\ &= 1 + \sum_{n=1}^{N+1} x^n \\ &= \sum_{n=0}^{N+1} x^n \\ &= S_{N+1}\end{aligned}

We also have S_{N+1} - S_N = x^{N+1}, since consecutive partial sums of any series differ by the next term of the series. Combining these two equalities gives x^{N+1} = S_{N+1} - S_N = (1 + xS_N) - S_N = 1 + (x-1)S_N, i.e. (1-x)S_N = 1 - x^{N+1}. Since |x| < 1 implies x \ne 1, we may divide by 1-x to obtain S_N = (1-x^{N+1})/(1-x) as desired.

To show that the series \sum_{n=0}^\infty x^n is convergent, we directly compute the limit \lim_{N\to\infty} S_N. By Lemma 6.5.2 we have \lim_{N\to\infty} x^{N+1} = 0. Thus by the limit laws we have

\displaystyle \lim_{N\to\infty} S_N = \lim_{N\to\infty} \frac{1 - x^{N+1}}{1-x} = \frac{1 - \lim_{N\to\infty} x^{N+1}}{1-x} = \frac{1}{1-x}

as desired.

Finally we show that when |x| < 1 the series is absolutely convergent. We must show that the series \sum_{n=0}^\infty |x^n| converges. By Proposition 4.3.10(d) (which we can use thanks to Proposition 5.6.3), we have |x^n| = |x|^n. Thus it suffices to show that the series \sum_{n=0}^\infty |x|^n converges. But this is just a geometric series with the common ratio |x|, so the series converges (to \sum_{n=0}^\infty |x|^n = 1/(1-|x|)) by our proof above.
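As a quick empirical check of the formula S_N = (1-x^{N+1})/(1-x) and of the limit 1/(1-x) (again, not part of the proof), here is a small Python snippet; the value x = 0.3 is just a sample choice.

# Informal check of S_N = (1 - x**(N + 1)) / (1 - x) and of the limit 1/(1 - x).
x = 0.3  # sample value with |x| < 1

for N in (0, 1, 5, 20):
    S_N = sum(x**n for n in range(N + 1))      # the partial sum 1 + x + ... + x**N
    closed_form = (1 - x**(N + 1)) / (1 - x)   # the formula proved above
    print(N, S_N, closed_form)

print("limit:", 1 / (1 - x))  # the partial sums approach this value as N grows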

Exercise 7.2.4

Exercise statement

Prove Proposition 7.2.9.

Proposition 7.2.9 (Absolute convergence test). Let \sum_{n=m}^\infty a_n be a formal series of real numbers. If this series is absolutely convergent, then it is also conditionally convergent. Furthermore, in this case we have the triangle inequality

\displaystyle \left|\sum_{n=m}^\infty a_n\right| \leq \sum_{n=m}^\infty |a_n|

Hints

  1. Use Proposition 7.2.5.
  2. Use Proposition 7.1.4(e).

How to think about the exercise

In the proof below, we use Proposition 7.2.5 twice: once to go from the fact that \sum_{n=m}^\infty a_n absolutely converges to an inequality, and a second time to go from an inequality to the fact that \sum_{n=m}^\infty a_n conditionally converges. Make sure you introduce \varepsilon > 0 at the start of the proof, before you use Proposition 7.2.5 for the first time.

Model solution

Suppose \sum_{n=m}^\infty a_n is absolutely convergent. We want to show this series is conditionally convergent. To do so, we will use Proposition 7.2.5, which is also known as Cauchy’s convergence test. So let \varepsilon > 0 be a real number. Since \sum_{n=m}^\infty a_n is absolutely convergent, this means that \sum_{n=m}^\infty |a_n| is a convergent series. Thus there exists an integer N \geq m such that \left|\sum_{n=p}^q |a_n|\right| \leq \varepsilon for all p,q \geq N. The quantity \sum_{n=p}^q |a_n| is non-negative for all p,q (this is an easy induction), so in fact we have \left|\sum_{n=p}^q |a_n|\right| = \sum_{n=p}^q |a_n| which means that \sum_{n=p}^q |a_n| \leq \varepsilon for all p,q \geq N.

By the triangle inequality of Proposition 7.1.4(e), we have \left|\sum_{n=p}^q a_n\right| \leq \sum_{n=p}^q |a_n| for all p,q. Thus we have found an N\geq m such that for all p,q \geq N we have

\displaystyle \left|\sum_{n=p}^q a_n\right| \leq \sum_{n=p}^q |a_n| \leq \varepsilon

which, by Proposition 7.2.5 again, proves convergence of \sum_{n=m}^\infty a_n.

It remains to show the triangle inequality, \left|\sum_{n=m}^\infty a_n\right| \leq \sum_{n=m}^\infty |a_n|. Let \alpha_N := \left|\sum_{n=m}^N a_n\right| be the absolute value of the Nth partial sum of the series \sum_{n=m}^\infty a_n, and let \beta_N := \sum_{n=m}^N |a_n| be the Nth partial sum of the series \sum_{n=m}^\infty |a_n|. By Proposition 7.1.4(e), we have \alpha_N \leq \beta_N for each N\geq m. Thus by the comparison principle, Lemma 6.4.13, we have the inequality \limsup_{N\to\infty} \alpha_N \leq \limsup_{N\to\infty} \beta_N.

We proved above that the series \sum_{n=m}^\infty a_n converges. Thus by the limit laws, Theorem 6.1.19 (c) and (g) (and the fact that |x| = \max(x, -x) for all real numbers x), we know that the sequence (\alpha_N)_{N=m}^\infty converges to \left|\sum_{n=m}^\infty a_n\right| (i.e. we know that the series converges, so this means the sequence of partial sums converges, which means by the limit laws that the sequence of absolute values of the partial sums also converges). Thus by Proposition 6.4.12(f) we have \limsup_{N\to\infty} \alpha_N = \lim_{N\to\infty} \alpha_N.

Similarly, by the hypothesis that \sum_{n=m}^\infty a_n is absolutely convergent, we know that the sequence (\beta_N)_{N=m}^\infty converges. Thus \limsup_{N\to\infty} \beta_N = \lim_{N\to\infty} \beta_N.

Combining these equalities with the inequality from above, we have

\displaystyle \left|\sum_{n=m}^\infty a_n\right| = \lim_{N\to\infty} \alpha_N \leq \lim_{N\to\infty} \beta_N = \sum_{n=m}^\infty |a_n|

as desired.
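As an informal illustration of the triangle inequality (the choice of series is arbitrary and this is not part of the proof), here is a Python check on the absolutely convergent series with a_n = (-1/2)^n, comparing the two sides on partial sums.

# Informal check of |sum a_n| <= sum |a_n| on partial sums of a sample series.
def a(n):
    return (-0.5)**n  # sample absolutely convergent series, starting at m = 0

for N in (0, 5, 10, 50):
    lhs = abs(sum(a(n) for n in range(N + 1)))
    rhs = sum(abs(a(n)) for n in range(N + 1))
    print(N, lhs, rhs, lhs <= rhs)
# The left side tends to 2/3 and the right side tends to 2, consistent with the inequality.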

Exercise 3.2.1

Exercise statement

Show that the universal specification axiom, Axiom 3.8, if assumed to be true, would imply Axioms 3.2, 3.3, 3.4, 3.5, and 3.6. (If we assume that all natural numbers are objects, we also obtain Axiom 3.7.) Thus, this axiom, if permitted, would simplify the foundations of set theory tremendously (and can be viewed as one basis for an intuitive model of set theory known as “naive set theory”). Unfortunately, as we have seen, Axiom 3.8 is “too good to be true”!

Hints

None.

How to think about the exercise

This is a pretty straightforward exercise, so I don’t have anything to say.

Model solution

Axiom 3.2: Let P(x) be any statement pertaining to an object x that is always false, for instance the statement x \ne x or the statement 0=1. (You might not like using x\ne x because equality might not be defined for all objects, but typically in formal systems equality is defined for all objects and has the properties listed in Appendix A.7.) Then by Axiom 3.8 the set \{x : P(x)\text{ is true}\} exists. In other words, we have y \in \{x : P(x)\text{ is true}\} if and only if P(y) is true. Since P(y) is always false, this means that y \in \{x : P(x)\text{ is true}\} is also always false, which means the set does not contain any elements, i.e. for every object y we have y \notin \{x : P(x)\text{ is true}\}. This is exactly what Axiom 3.2 claims, so we are done.

Axiom 3.3: Let a be an object, and consider the statement P(x) defined as x=a. By Axiom 3.8, the set \{x : x=a\} exists. Thus the singleton set \{a\} exists. If a,b are objects, we can consider the statement Q(x) defined as “x=a or x=b”. Then by Axiom 3.8 the set \{x : x=a\text{ or }x=b\} exists, which is the pair set \{a,b\}. This proves Axiom 3.3.

Axiom 3.4: Let A,B be two sets. Consider the statement P(x) defined as “x\in A or x\in B”. Then by Axiom 3.8 the set \{x : P(x)\text{ is true}\} exists. Thus we have y \in \{x : P(x)\text{ is true}\} \iff (y \in A \text{ or }y \in B), which means \{x : P(x)\text{ is true}\} is the union of A and B.

Axiom 3.5: Let A be a set, and let P(x) be a statement pertaining to an object x \in A. Let Q(x) be the statement “x \in A and P(x) is true”. Then by Axiom 3.8 the set \{x : Q(x)\text{ is true}\} exists. Thus we have y \in \{x : Q(x)\text{ is true}\} \iff (y \in A \text{ and }P(y)\text{ is true}). This is what we mean by the set \{x \in A : P(x)\}, so this proves Axiom 3.5.

Axiom 3.6: Let A be a set, and let P(x,y) be a statement pertaining to x,y such that for each x \in A there is at most one y for which P(x,y) is true. Let Q(y) be the statement “P(x,y) is true for some x \in A”. Then by Axiom 3.8, the set \{y : Q(y) \text{ is true}\} exists. Thus we have z \in \{y : Q(y) \text{ is true}\} iff Q(z) is true iff P(x,z) is true for some x \in A. This is precisely what is meant by the set \{y : P(x,y) \text{ is true for some } x \in A\}, so this proves Axiom 3.6.

Axiom 3.7: Suppose all natural numbers are objects. Let P(x) be the statement “x is a natural number”. Then the set \{x : P(x)\text{ is true}\} exists by Axiom 3.8. This set is \mathbf N.
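As a loose, informal analogy (nothing in the proof depends on it): Python’s set operations and comprehensions mirror several of the axioms that Axiom 3.8 subsumes. The constructions below either list their elements outright or draw from an already-existing set, whereas Axiom 3.8 forms \{x : P(x)\text{ is true}\} with no ambient set at all, which is what makes it “too good to be true”.

# Informal analogy only: Python comprehensions always draw elements from an
# existing collection, unlike Axiom 3.8, which quantifies over all objects.
A = {1, 2, 3}
B = {3, 4}

pair = {1, 2}                                  # Axiom 3.3: the pair set {a, b}
union = A | B                                  # Axiom 3.4: the union of A and B
specification = {x for x in A if x % 2 == 1}   # Axiom 3.5: {x in A : P(x) is true}
replacement = {10 * x for x in A}              # Axiom 3.6: replace each x in A by the unique y with P(x, y) true

print(pair, union, specification, replacement)
# e.g. {1, 2} {1, 2, 3, 4} {1, 3} {10, 20, 30} (set printing order may vary)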

Exercise 6.3.2

Exercise statement

Prove Proposition 6.3.6.

Proposition 6.3.6 (Least upper bound property). Let (a_n)_{n=m}^\infty be a sequence of real numbers, and let x be the extended real number x := \sup(a_n)_{n=m}^\infty. Then we have a_n \leq x for all n \geq m. Also, whenever M \in \mathbf R^* is an upper bound for (a_n)_{n=m}^\infty (i.e., a_n \leq M for all n \geq m), we have x \leq M. Finally, for every extended real number y for which y < x, there exists at least one n \geq m for which y < a_n \leq x.

Hints

  1. Use Theorem 6.2.11.

How to think about the exercise

This is a straightforward exercise so I don’t have anything to say.

Model solution

Let E := \{a_n : n \geq m\}. Then by Definition 6.3.1 we have x = \sup(E). By Theorem 6.2.11(a) we have z \leq \sup(E) for every z \in E. Given any n \geq m we have a_n \in E so a_n \leq \sup(E) = x as desired.

Next, let M \in \mathbf R^* be an upper bound for (a_n)_{n=m}^\infty. This means that M is an upper bound for the set E, since given any z \in E we have z = a_n for some n \geq m. Thus by Theorem 6.2.11(b) we have \sup(E) \leq M. Since x = \sup(E), this shows that x \leq M as required.

Finally, let y be an extended real number such that y < x. We want to find an n \geq m for which y < a_n \leq x. We will first find an n \geq m such that y < a_n. Suppose for sake of contradiction that such an n does not exist, i.e. for every n \geq m we have a_n \leq y. This means that y is an upper bound for the sequence, so by the previous paragraph we have x \leq y, which contradicts the fact that y < x. This contradiction shows that there is some n \geq m such that y < a_n. By the first paragraph of this proof, we have a_n \leq x. Thus we have found an n \geq m such that y < a_n \leq x as desired.

Exercise 3.3.8

Exercise statement

If X is a subset of Y, let \iota_{X \to Y} : X \to Y be the inclusion map from X to Y, defined by mapping x \mapsto x for all x \in X, i.e., \iota_{X\to Y}(x) := x for all x \in X. The map \iota_{X\to X} is in particular called the identity map on X.

(a) Show that if X \subseteq Y \subseteq Z then \iota_{Y \to Z} \circ \iota_{X \to Y} = \iota_{X \to Z}.

(b) Show that if f : A \to B is any function, then f = f \circ \iota_{A \to A} = \iota_{B \to B} \circ f.

(c) Show that, if f : A \to B is a bijective function, then f \circ f^{-1} = \iota_{B\to B} and f^{-1} \circ f = \iota_{A \to A}.

(d) Show that if X and Y are disjoint sets, and f : X \to Z and g : Y \to Z are functions, then there is a unique function h : X \cup Y \to Z such that h \circ \iota_{X \to X \cup Y} = f and h \circ \iota_{Y \to X \cup Y} = g.

Hints

None.

How to think about the exercise

I want to comment on just part (d), because there is an interesting observation that isn’t spelled out in the exercise statement. The existence of the function h follows from the fact that X and Y are disjoint. If there were an element x_0 \in X\cap Y, then f could send x_0 to z_1 while g could send x_0 to z_2 \ne z_1. Any purported h would then have to decide between following f and following g, and it is impossible to follow both.

On the other hand, the uniqueness of the function h follows from the fact that X and Y are exhaustive: writing W := X \cup Y, every w \in W lies in either X or Y, so the two requirements h \circ \iota_{X\to W} = f and h\circ \iota_{Y\to W} = g pin down the value of h at every point of W. If instead there were some w_0 \in W with w_0 \notin X and w_0 \notin Y, then we could set h(w_0) := z_1 and \tilde h(w_0) := z_2 \ne z_1; both h and \tilde h would satisfy h \circ \iota_{X\to W} = f and h\circ \iota_{Y\to W} = g, yet h\ne \tilde h.

Moreover, the proof of existence does not require that X and Y be exhaustive, and the proof of uniqueness does not require that X and Y be disjoint.

Pay attention to where in the proof below (and in your own proof) these facts are (or are not) used!
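If a concrete model helps, here is the observation in Python, treating functions on finite sets as dictionaries (an informal analogy with made-up values): merging two dictionaries with disjoint key sets never forces a choice, and any function on the union of the key sets that restricts correctly to each piece is pinned down key by key.

# Informal model of part (d): functions on finite sets as dicts, with made-up values.
f = {1: "a", 2: "b"}  # plays the role of f : X -> Z with X = {1, 2}
g = {3: "c"}          # plays the role of g : Y -> Z with Y = {3}, disjoint from X

# Existence: because the key sets are disjoint, the merge never has to choose
# between a value coming from f and a conflicting value coming from g.
h = {**f, **g}
print(h)  # {1: 'a', 2: 'b', 3: 'c'}

# Uniqueness: any candidate defined on X ∪ Y that restricts to f on X and to g on Y
# is determined at every key, because every key lies in X or in Y.
for w in h:
    assert h[w] == (f[w] if w in f else g[w])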

Model solution

(a) Both \iota_{Y \to Z} \circ \iota_{X \to Y} and \iota_{X \to Z} have the same domain and range, so we just need to verify that their outputs agree on all inputs. Let x \in X. Then \iota_{Y \to Z} \circ \iota_{X \to Y}(x) = \iota_{Y \to Z}(\iota_{X \to Y}(x)) = \iota_{Y \to Z}(x) = x, where the final equality follows since x \in Y. Similarly, we have \iota_{X \to Z}(x) = x. This shows that \iota_{Y \to Z} \circ \iota_{X \to Y} = \iota_{X \to Z} as required.

(b) The compositions f \circ \iota_{A \to A} and \iota_{B \to B} \circ f make sense and do not alter the domain and range of f, so we just need to verify that all three functions agree in their behavior. Let a \in A. Then \iota_{A \to A}(a) = a so f \circ \iota_{A \to A}(a) = f(a). Since f(a) \in B, we have \iota_{B \to B} \circ f(a) = \iota_{B \to B}(f(a)) = f(a). All three functions have the same output, namely f(a), so they are equal.

(c) Let b \in B. Since f is bijective, there is exactly one a \in A such that f(a) = b, namely a = f^{-1}(b). Thus we have f \circ f^{-1}(b) = f(f^{-1}(b)) = f(a) = b = \iota_{B\to B}(b). Since b \in B was arbitrary, this shows that f \circ f^{-1} = \iota_{B\to B}.

Let a \in A. Then f(a) \in B. Since f is bijective, there is exactly one element of A that maps to f(a). We already know that a \in A is such an element, so we must have f^{-1}(f(a)) = a. But we also have \iota_{A \to A}(a) = a. Thus we have f^{-1} \circ f = \iota_{A \to A} as required.

(d) Define h as follows: for each w \in X \cup Y, if w \in X then h(w) := f(w) and if w \in Y then h(w) := g(w). Since X and Y are disjoint, this definition makes sense: if w \in X \cup Y then exactly one of w \in X or w \in Y is true.

The function h \circ \iota_{X \to X \cup Y} is the composition of \iota_{X \to X \cup Y} : X \to X\cup Y and h : X \cup Y \to Z so it has domain X and range Z, which is the same as the domain and range of f.  Moreover, if x \in X, then h \circ \iota_{X \to X \cup Y}(x) = h(x) = f(x), so h \circ \iota_{X \to X \cup Y} = f.

Similarly, the function h \circ \iota_{Y \to X \cup Y} is the composition of \iota_{Y \to X \cup Y} : Y \to X \cup Y and h : X \cup Y \to Z so it has domain Y and range Z, which is the same as the domain and range of g. Moreover, if y \in Y, then h \circ \iota_{Y \to X \cup Y}(y) = h(y) = g(y), so h \circ \iota_{Y \to X \cup Y} = g.

Now we verify the uniqueness of h. Let \tilde h : X\cup Y \to Z be another function such that \tilde h \circ \iota_{X \to X \cup Y} = f and \tilde h \circ \iota_{Y \to X \cup Y} = g. We want to show h = \tilde h. Let w \in X \cup Y. We have two cases:

  • If w \in X then w = \iota_{X\to X\cup Y}(w) so \tilde h(w) = \tilde h(\iota_{X\to X\cup Y}(w)) = \tilde h \circ \iota_{X \to X \cup Y}(w) = f(w). By definition of h, we also have h(w) = f(w). Thus \tilde h(w) = h(w).
  • Similarly, if w \in Y then w = \iota_{Y\to X\cup Y}(w) so \tilde h(w) = \tilde h(\iota_{Y\to X\cup Y}(w)) = \tilde h \circ \iota_{Y \to X \cup Y}(w) = g(w). By definition of h, we also have h(w) = g(w). Thus \tilde h(w) = h(w).

We have shown that h(w) = \tilde h(w) for all w \in X \cup Y, so h = \tilde h.

Exercise 6.2.2

Exercise statement

Prove Theorem 6.2.11.

Theorem 6.2.11. Let E be a subset of \mathbf R^*. Then the following statements are true.

(a) For every x \in E we have x \leq \sup(E) and x \geq \inf(E).
(b) Suppose that M \in \mathbf R^* is an upper bound for E, i.e., x \leq M for all x \in E. Then we have \sup(E) \leq M.
(c) Suppose that M \in \mathbf R^* is a lower bound for E, i.e., x \geq M for all x \in E. Then we have \inf(E) \geq M.

Hints

  1. You may need to break into cases depending on whether +\infty or -\infty belongs to E.
  2. You can of course use Definition 5.5.10, provided that E consists only of real numbers.
  3. When proving the parts about the infimum, you don’t need to split into cases.

How to think about the exercise

The geometric picture, in terms of the real number line with special points at infinity, should feel intuitive to you. The point of this exercise is just to verify that our formal definition really does work the way we intuitively expect.

This exercise is a little tricky if you want to do the least amount of work. (Doing the exercise by splitting everything into cases is fine, but that’s no fun.)

Model solution

(a) We first show that x \leq \sup(E) for all x \in E. Let x \in E. We have three cases:

  • If E is contained in \mathbf R, then x is a real number, and by Definition 5.5.10 we have x \leq \sup(E): if E is empty, then the result is vacuous; if E is non-empty and bounded above, then the result follows because \sup(E) is the least upper bound; and if E is non-empty and not bounded above, then \sup(E) = +\infty so the result follows by Definition 6.2.3(b).
  • If E contains +\infty, then x \leq +\infty = \sup(E) by Definition 6.2.3(b) and Definition 6.2.6(b).
  • If E does not contain +\infty but does contain -\infty, then we have two cases. If x is real, then x \in E \setminus \{-\infty\}, and E \setminus \{-\infty\} is contained in \mathbf R, so x \leq \sup(E \setminus \{-\infty\}) by the first case; moreover \sup(E \setminus \{-\infty\}) = \sup(E) by Definition 6.2.6(c). On the other hand, if x = -\infty then x \leq \sup(E) by Definition 6.2.3(c); the value of x is so small that we don’t even care what \sup(E) is.

Next we show that x \geq \inf(E) for all x \in E. Let x \in E. Then -x \in -E so by what we just showed, we have -x \leq \sup(-E). Thus by Proposition 6.2.5(d) we have x \geq -\sup(-E) = \inf(E) as required.

(b) We have three cases:

  • If E is contained in \mathbf R, then the result follows by Definition 5.5.10: if E is empty, then \sup(E) = -\infty \leq M; if E is non-empty and is bounded above, then \sup(E) is the least upper bound while M is an upper bound, so \sup(E) \leq M; if E is non-empty and has no upper bound, then M must be +\infty (otherwise we could find a real number in E that is larger than M), so \sup(E) \leq +\infty = M.
  • If E contains +\infty, then we must have +\infty \leq M since M is an upper bound for E. Thus we have \sup(E) = +\infty \leq M as required.
  • If E does not contain +\infty but does contain -\infty, then \sup(E) = \sup(E \setminus \{-\infty\}). We also have x \leq M for all x \in E \setminus \{-\infty\}, since E \setminus \{-\infty\} is a subset of E. Thus by the first case we have \sup(E \setminus \{-\infty\}) \leq M, from which the result follows.

(c) Let y \in -E. Then y=-x for some x \in E. Since x \in E and M is a lower bound for E, we have x \geq M, which means y=-x \leq -M. Since y \in -E was arbitrary, this means that -M is an upper bound for -E. By part (b), we thus have \sup(-E) \leq -M, which means \inf(E) = -\sup(-E) \geq M as desired.

Exercise 6.3.1

Exercise statement

Verify the claim in Example 6.3.4.

Let a_n := 1/n; thus (a_n)_{n=1}^\infty is the sequence 1, 1/2, 1/3, \ldots . Then \sup(a_n)_{n=1}^\infty = 1 and \inf(a_n)_{n=1}^\infty = 0.

Hints

  1. You might want to use Exercise 5.4.4.

How to think about the exercise

This is a straightforward exercise so I don’t have anything to say.

Model solution

First we show that 1 is the supremum of the sequence. Let 1/n be an element of the sequence. Then n \geq 1, so 1/n \leq 1, which shows that 1 is an upper bound for the sequence. Now let M be any upper bound for the sequence, so that 1/n \leq M for all n \geq 1. Taking n=1 in particular, we get 1 = 1/1 \leq M. Thus 1 is an upper bound which is less than or equal to every upper bound, i.e. 1 is the least upper bound (supremum) of the sequence, as required.

Next we show that 0 is the infimum of the sequence. Each n \geq 1 is positive, so by Definition 4.2.6 we see that 1/n is positive, i.e. 1/n > 0. Thus 0 is a lower bound for the sequence. Let L be any lower bound for the sequence, so that 1/n \geq L for all n \geq 1. We want to show that L \leq 0. Suppose for sake of contradiction that L > 0. Then by Exercise 5.4.4 there exists a positive integer N \geq 1 such that L > 1/N > 0. But 1/N is an element of the sequence, so this contradicts the fact that L is a lower bound for the sequence. Hence we must have L \leq 0, which shows that 0 is the greatest lower bound (infimum) as required.
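As an informal illustration of the infimum argument (not part of the proof): given any candidate lower bound L > 0, the integer N := \lfloor 1/L\rfloor + 1 is one explicit witness of the kind Exercise 5.4.4 provides, since N > 1/L and hence 1/N < L. A quick Python check, done in exact rational arithmetic to avoid floating-point issues:

# Informal check: for each candidate lower bound L > 0, the term 1/N with
# N = floor(1/L) + 1 lies strictly below L, so L cannot be a lower bound.
import math
from fractions import Fraction

for L in (Fraction(1, 2), Fraction(3, 10), Fraction(1, 100)):
    N = math.floor(1 / L) + 1
    print(L, N, Fraction(1, N), Fraction(1, N) < L)
# Each line ends in True: no positive L can be a lower bound for the sequence.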

Exercise 9.7.2

Exercise statement

Let f : [0,1] \to [0,1] be a continuous function. Show that there exists a real number x in [0,1] such that f(x) = x. This point x is known as a fixed point of f, and this result is a basic example of a fixed point theorem, which play an important rôle in certain types of analysis.

Hints

  1. Apply the intermediate value theorem to the function f(x) - x.

How to think about the exercise

This result just says that if the graph of the function starts on the left edge of the unit square and ends on the right edge (it never leaves the square, since the values of f lie in [0,1]), then it must cross the diagonal y = x at some point.

Model solution

Define the function g : [0,1] \to \mathbf R by g(x) := f(x) - x. Then g(0) = f(0) \in [0,1] and g(1) = f(1)-1. Since 0\leq f(1) \leq 1 we have -1 \leq f(1) - 1 \leq 0, thus g(1) \in [-1, 0]. Thus we have g(1) \leq 0 \leq g(0). We also know that g is continuous: f is continuous by assumption, the identity function x \mapsto x is continuous by Example 9.4.3, and the difference of two continuous functions is continuous by Proposition 9.4.9. Hence we can apply the intermediate value theorem to find an x \in [0,1] such that g(x) = 0. For this value of x, we have f(x) - x = g(x) = 0, i.e. f(x)=x as desired.
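The intermediate value theorem argument also suggests a numerical procedure: bisect on g(x) = f(x) - x. Here is a rough Python sketch, using the sample function f(x) = \cos x (which does map [0,1] into [0,1]); the tolerance and the choice of f are arbitrary, and this is only an illustration of the idea, not part of the proof.

# Rough numerical sketch: find a fixed point of a continuous f : [0,1] -> [0,1]
# by bisecting on g(x) = f(x) - x, mirroring the intermediate value theorem proof.
import math

def f(x):
    return math.cos(x)  # sample choice; cos maps [0, 1] into [cos 1, 1], a subset of [0, 1]

def fixed_point(f, tol=1e-12):
    lo, hi = 0.0, 1.0        # g(lo) = f(0) - 0 >= 0 and g(hi) = f(1) - 1 <= 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) - mid >= 0:
            lo = mid         # keep g(lo) >= 0
        else:
            hi = mid         # keep g(hi) <= 0
    return (lo + hi) / 2

x = fixed_point(f)
print(x, f(x))  # approximately 0.739085..., the fixed point of cos on [0, 1]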

Exercise 3.4.7

Exercise statement

Let X, Y be sets. Define a partial function from X to Y to be any function f : X' \to Y' whose domain X' is a subset of X, and whose range Y' is a subset of Y. Show that the collection of all partial functions from X to Y is itself a set.

Hints

  1. Use Exercise 3.4.6, the power set axiom, the replacement axiom, and the union axiom.

How to think about the exercise

If X' \subseteq X and Y' \subseteq Y, then we can form the set (Y')^{X'} using the power set axiom; this gives us all of the partial functions with domain X' and range Y'. Now we just want to “loop over” all combinations of X', Y', and take the union to “flatten” the result. In other words, we want the set

\displaystyle \bigcup\{ (Y')^{X'} : X' \subseteq X\text{ and }Y'\subseteq Y\}

But we are not yet done, because we must justify all of the steps using the axioms and results we have so far.

First of all, to use the replacement axiom, we need the “looping over” construct to use the “is an element” relation rather than the “is a subset” relation, i.e. to be in the form \{f(x) : x \in A\} rather than \{f(x) : x \subseteq A\} as above. This is easy to fix: by Lemma 3.4.9, the sets 2^X and 2^Y exist, so the set we want is

\displaystyle \bigcup\{ (Y')^{X'} : X' \in 2^X\text{ and }Y'\in 2^Y\}

Now we run into the problem that the replacement axiom only allows us to “loop over” a single set, so we can’t be looping over the sets 2^X and 2^Y simultaneously. To turn this double loop into a single loop, we can use the notion of ordered pairs and the Cartesian product; but it turns out that in this book, these ideas are only introduced in section 3.5, the next section! If we cheat a little and use these ideas, then the set 2^X \times 2^Y = \{(X', Y') : X' \in 2^X\text{ and }Y' \in 2^Y\} exists. Now we can use the replacement axiom to replace each ordered pair (X', Y') by the set (Y')^{X'}. In other words, we have the set

\displaystyle \bigcup\{ (Y')^{X'} : (X', Y') \in 2^X \times 2^Y\}

I consider this an entirely acceptable solution to this exercise: it is the natural, obvious thing one comes up with. Alternatively, if this solution isn’t acceptable, then I consider this exercise to be pretty unfair! With that complaint out of the way, let’s see if we can find a different solution that does not require Cartesian products.

So, we can only loop over one set at a time, and we aren’t allowed to take the Cartesian product to compress the double loop into a single loop. What to do? If you are familiar with computer programming, you might at this point have an idea.

In Python, given a list X = [1,2,3] and another list Y = [4,5], how can we print all of the ordered pairs (x, y) where x is in X and y is in Y? One idea is to use the Cartesian product functionality that is provided by the language:

from itertools import product

X = [1, 2, 3]
Y = [4, 5]

for p in product(X, Y):
    print(p)
# prints the following:
# (1, 4)
# (1, 5)
# (2, 4)
# (2, 5)
# (3, 4)
# (3, 5)

But a more obvious idea is to just use a nested loop:

for x in X:
    for y in Y:
        print((x, y))
# prints the following:
# (1, 4)
# (1, 5)
# (2, 4)
# (2, 5)
# (3, 4)
# (3, 5)

(Aside for people who know Haskell: You can think of [(x,y) | x <- [1,2,3], y <- [4,5]] versus [[(x,y) | y <- [4,5]] | x <- [1,2,3]]; the latter must be flattened using concat, analogously to how we must apply union twice below. Consider also currying, e.g. the difference between add :: (Int, Int) -> Int; add (x,y) = x+y and add' :: Int -> (Int -> Int); add' = \x -> (\y -> x+y). The former eats both arguments at once while the latter eats one argument, spitting out a function which can eat an argument. This gives us a way to define functions which can essentially take multiple arguments, without actually taking multiple arguments, analogously to how here we want to simulate taking from multiple sets without actually taking from multiple sets at once.)

The benefit of the nested loop, for our purposes, is that we are only looping over a single list or set at a time, so the axiom of replacement can be applied. How will this work? First, fix a set X' \subseteq X. Then we can form the set \{(Y')^{X'} : Y' \in 2^Y\} using the replacement axiom. This gives us all of the partial functions with domain X'. Now we just need to loop over all of the possible domains, by forming the set \{\{(Y')^{X'} : Y' \in 2^Y\} : X' \in 2^X\}. But since we formed these sets in stages, we must now take the union twice (instead of once, as we did in our first solution above) to “flatten” this set, so our final answer is \bigcup\bigcup\{\{(Y')^{X'} : Y' \in 2^Y\} : X' \in 2^X\}.

Let’s take an example to make sure this actually works. Let X = \{1\} and Y = \{2,3\}. Then we have 2^X = \{\emptyset, \{1\}\} and 2^Y = \{\emptyset, \{2\}, \{3\}, \{2,3\}\}. The partial functions from X to Y are the following:

  • f_1: Domain \emptyset and range \emptyset, which is an empty function.
  • f_2: Domain \emptyset and range \{2\}, which is an empty function.
  • f_3: Domain \emptyset and range \{3\}, which is an empty function.
  • f_4: Domain \emptyset and range \{2,3\}, which is an empty function.
  • f_5: Domain \{1\} and range \{2\}, which maps 1 \mapsto 2.
  • f_6: Domain \{1\} and range \{3\}, which maps 1 \mapsto 3.
  • f_7: Domain \{1\} and range \{2,3\}, which maps 1 \mapsto 2.
  • f_8: Domain \{1\} and range \{2,3\}, which maps 1 \mapsto 3.

Let’s first check the sets \{(Y')^{X'} : Y' \in 2^Y\} for a fixed X'. The subsets of X are \emptyset and \{1\} so we have

\begin{aligned}\{(Y')^{\emptyset} : Y' \in 2^Y\} &= \{\emptyset^\emptyset, \{2\}^\emptyset, \{3\}^\emptyset, \{2,3\}^\emptyset\} \\ &= \{\{f_1\}, \{f_2\}, \{f_3\}, \{f_4\}\}\end{aligned}

and

\begin{aligned}\{(Y')^{\{1\}} : Y' \in 2^Y\} &= \{\emptyset^{\{1\}}, \{2\}^{\{1\}}, \{3\}^{\{1\}}, \{2,3\}^{\{1\}}\} \\ &= \{\emptyset, \{f_5\}, \{f_6\}, \{f_7, f_8\}\}\end{aligned}

Now, the set \{\{(Y')^{X'} : Y' \in 2^Y\} : X' \in 2^X\} is equal to \{\{(Y')^{\emptyset} : Y' \in 2^Y\}, \{(Y')^{\{1\}} : Y' \in 2^Y\}\}, which by the above computation is just \{\{\{f_1\}, \{f_2\}, \{f_3\}, \{f_4\}\}, \{\emptyset, \{f_5\}, \{f_6\}, \{f_7, f_8\}\}\}. Taking the union once gives us \{\{f_1\}, \{f_2\}, \{f_3\}, \{f_4\}, \emptyset, \{f_5\}, \{f_6\}, \{f_7, f_8\}\}, and taking the union again gives us \{f_1,f_2,f_3,f_4,f_5,f_6,f_7, f_8\}, which is precisely what we wanted.

You’ve probably noticed that we could have taken the union at an intermediate stage: \bigcup\{(Y')^{\emptyset} : Y' \in 2^Y\} = \{f_1,f_2,f_3,f_4\} and \bigcup \{(Y')^{\{1\}} : Y' \in 2^Y\} = \{f_5,f_6,f_7, f_8\}. So we can form the set \left\{ \bigcup\{(Y')^{X'} : Y' \in 2^Y\} : X' \in 2^X\right\}, which equals \{\{f_1,f_2,f_3,f_4\},\{f_5,f_6,f_7, f_8\}\}; taking the union again then gives us the same answer. In the solution below we will use this latter expression.
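Here is an informal Python version of the toy computation above; it simply mirrors the nested loop and is of course not part of the proof. A partial function is recorded as the pair of its graph (a dict) and its range, since, as in the list above, f_2 and f_3 differ only in their range.

from itertools import chain, combinations, product

def powerset(s):
    """All subsets of s, as tuples."""
    s = list(s)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

X = {1}
Y = {2, 3}

# Mirror the nested loop: for each domain X' in 2^X and each range Y' in 2^Y,
# collect every function from X' to Y' (one choice of value in Y' per element of X'),
# then flatten.
partial_functions = []
for X_prime in powerset(X):
    for Y_prime in powerset(Y):
        for values in product(Y_prime, repeat=len(X_prime)):
            graph = dict(zip(X_prime, values))
            partial_functions.append((graph, set(Y_prime)))

for graph, rng in partial_functions:
    print(graph, "with range", rng)
# Prints 8 partial functions, matching f_1, ..., f_8 above.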

This is one of those exercises where it can be difficult to tell when one is done (alternatively, the difficulty depends a lot on where one decides one is “off the hook”). I can easily imagine people skipping the proof that the resulting set is actually the set of all partial functions, for example.

Model solution

By Lemma 3.4.9, the sets 2^X and 2^Y exist. Fix some subset X' of X. Then the set \{(Y')^{X'} : Y' \in 2^Y\} exists by the power set axiom and replacement axiom. (Formally, we can proceed as follows. Let P(a,b) be the predicate b = a^{X'}, pertaining to sets a,b. If a is a set, then a^{X'} exists by the power set axiom. Thus for each a there is exactly one (hence at most one) b such that P(a,b). Thus by the replacement axiom the set \{b : b = a^{X'}\text{ is true for some }a \in 2^Y\} exists.)

By the union axiom, the set \bigcup \{(Y')^{X'} : Y' \in 2^Y\} exists. By the replacement axiom again, the set \left\{\bigcup \{(Y')^{X'} : Y' \in 2^Y\} : X' \in 2^X\right\} exists. (Formally, we can define the predicate Q(a,b) by b = \bigcup \{(Y')^{a} : Y' \in 2^Y\}.)

By the union axiom again, the set \bigcup \left\{\bigcup \{(Y')^{X'} : Y' \in 2^Y\} : X' \in 2^X\right\} exists. It remains to show that this set is the set of all partial functions from X to Y. Let f be an element of this set. Then by the union axiom, this means f \in S for some S \in \left\{\bigcup \{(Y')^{X'} : Y' \in 2^Y\} : X' \in 2^X\right\}. By the replacement axiom, S = \bigcup \{(Y')^{X'} : Y' \in 2^Y\} for some X' \subseteq X. Thus f \in \bigcup \{(Y')^{X'} : Y' \in 2^Y\} for some X' \subseteq X. By the union axiom, this means f \in S' for some S' \in \{(Y')^{X'} : Y' \in 2^Y\}. By the replacement axiom, S' = (Y')^{X'} for some Y' \subseteq Y. Thus f \in (Y')^{X'} for some Y' \subseteq Y and X' \subseteq X. This means f is a function from X' to Y', which means it is a partial function from X to Y. This shows that every element of \bigcup \left\{\bigcup \{(Y')^{X'} : Y' \in 2^Y\} : X' \in 2^X\right\} is a partial function from X to Y.

Conversely, we now show that every partial function from X to Y is in the set \bigcup \left\{\bigcup \{(Y')^{X'} : Y' \in 2^Y\} : X' \in 2^X\right\}. Let f : X'' \to Y'' be a function, where X'' \subseteq X and Y'' \subseteq Y. Then f \in (Y'')^{X''}. Thus f \in (Y')^{X''} for some Y' \in 2^Y, since Y'' \in 2^Y. If we define S' := (Y'')^{X''}, then S' \in \{(Y')^{X''} : Y' \in 2^Y\}. Thus we have f \in S' for some S' \in \{(Y')^{X''} : Y' \in 2^Y\}, which means that f \in \bigcup\{(Y')^{X''} : Y' \in 2^Y\}. Since X'' \in 2^X, we thus see that f \in \bigcup\{(Y')^{X'} : Y' \in 2^Y\} for some X' \in 2^X. If we define S := \bigcup \{(Y')^{X''} : Y' \in 2^Y\}, then S \in \left\{\bigcup \{(Y')^{X'} : Y' \in 2^Y\} : X' \in 2^X\right\}. Thus we have f \in S for some S \in \left\{\bigcup \{(Y')^{X'} : Y' \in 2^Y\} : X' \in 2^X\right\}, which means that f \in \bigcup \left\{\bigcup \{(Y')^{X'} : Y' \in 2^Y\} : X' \in 2^X\right\}. This is what we wanted to show, so we are done.
