Let be sets such that , and suppose that there is an injection . Define the sets recursively by setting , and then for all natural numbers . Prove that the sets are all disjoint from each other (i.e., whenever ). Also show that if is the function defined by setting when , and when , then does indeed map to and is a bijection between the two. In particular, and have the same cardinality.
- Draw a picture.
- You don’t lose much by taking , and this simplifies the picture.
- Beware that Tao’s notation is non-standard here: typically in setting up the Schröder–Bernstein theorem we are given injections and (with no subset relation assumed between and ) and are asked to define a bijection . In Tao’s book, the usual is just the inclusion map (see Exercise 3.3.8 for a definition), the usual is instead called , and the usual bijection is instead called . If you are referring to other expositions as you work on this exercise, you may want to temporarily convert Tao’s notation to match the notation in these other expositions.
How to think about the exercise
This is a pretty tough exercise, if you really want to understand what is going on instead of just bashing out a proof.
To simplify the exposition, let us assume that . At the end of this section we will show how to adjust in the case where we only assume , and in the model solution below we will revert to only assuming .
To set the stage, let’s draw a picture of two sets with :
(Throughout this post, I will be copying from Richard Hammack’s exposition the idea of using suggestive shapes for the two sets as well as the idea generally of visually representing the sets by embedding them within each other.)
We define :
We have an injection , i.e. since we assumed . This means that we can put a copy of the set inside :
Notice two things: first the shape of is the same as the shape of . This is because we assumed that is injective, hence it doesn’t “deform” the shape by mapping multiple starting points onto the same point. Second, does not take up all of , since we didn’t assume that is a bijection.
Now we define . Since , we have . Thus we should see a copy of the green stuff within the copy of that we produced:
We can keep going further with this same process. Our copy of (namely ) has a copy of within it (namely ) so using we can place a copy of our copy of (namely ) inside our copy of (namely ):
By the injectivity of , this “copy of a copy” of again has a shape similar to that of the original . The green stuff that is is embedded within , so it also gets copied inside of . We can define and see this as the third layer of green stuff in the picture:
Hopefully by this point you get the idea. As we make more and more copies of within copies of within copies of , we accumulate more and more green stuff. The union of all of the green stuff is , and the white stuff we are left with is the complement, .
Having all of these copies within copies is all nice and trippy, but how does this help us define a bijection ? What do we even want, in terms of the picture we have drawn? We want to take each point in and map it to some point in in such a way that each point in corresponds to exactly one point in . The trick to doing this is the following: each point colored in white can stay put, and each point colored in green can “climb up” one level to hit a green point in an outer level:
In other words, the points in map to via , the points in map to via , and so on (that’s all the green stuff “climbing up”), while the points in the rest of just stay where they are (that’s all the white stuff). If you look at the picture, I hope you can see that no matter which point in you pick, there will be exactly one point in that maps to it.
Now that we have the idea, how do we write down the formal definition of this bijection ? The green stuff is , so you might be tempted to write the following:
But actually, what is ? We defined , so the points in aren’t even in the domain of ! This means we don’t need to include in the union, so we can instead define:
with the union index starting at . In the end, starting the union at or does not change the behavior of : we never encounter the case where because is empty.
That concludes the intuition for the construction outlined in the exercise statement. Here are a few other points I want to make:
- Why do we need to show that the sets are all disjoint from each other? It is not necessary to show this in order to prove the bijectivity of . My understanding is that it is just an interesting additional fact one can prove about this setup, and also justifies drawing the separate green parts in a way that looks disjoint.
- Why do we assume that is an injection rather than a bijection (as was originally the case in the 1st and 2nd editions of the book)? I haven’t thought about this in detail. I think it makes sense to use the weaker assumption of injectivity if bijectivity is not required, especially since the Schröder–Bernstein theorem only assumes injectivity.
- What happens when we only assume rather than as above? The pictures above change slightly: now there is a bigger set surrounding both and . Otherwise the pictures don’t change. In the proof there is slightly more work since we must verify that when we use in the definition of , the output lands in instead of .
- Since is assumed to be an injection rather than a bijection, what does it even mean to write for some ? For an injection , the map is a bijection (why?), so is a (bijective) function. So whenever we can define . If we leave undefined. This means that in the model solution below, when we define , we must verify that if then .
- Why did Tao assume that instead of allowing to be unrelated sets (like in most expositions of the Schröder–Bernstein theorem)? Assuming simplifies the drawing: if are unrelated sets, one would need to draw two pictures instead of one (like on this page). By making a subset of , there is just one picture, and there is a simple rule for defining the bijection : if you’re in a white area, stay put, and if you’re in a green area, “climb up” one level to take up a larger amount of space in a bijective way.
Finally, I want to consider a particularly simple case of the construction outlined in this exercise: what if the sets are all finite? If there is an injection , that means by Exercise 3.6.7. But since , we also have by Proposition 3.6.14(c). Thus . If then we would have by Proposition 3.6.14(c) again, a contradiction. Thus we must have . Similarly we have and so . Since , we can use Proposition 3.6.14(c) to conclude again that . Thus in the finite case, all three sets must be the same. So we can try something like . Then , and , and so on, so that as well. This means that in the definition of , we only ever encounter the case where . This does indeed give a bijection, since the identity map is a bijection.
First we show that the sets are all disjoint from each other. We do this by showing the following statement: for every natural number and every positive integer , the sets and are disjoint. Once we have proved this, given two distinct natural numbers, we can let be the larger one so that . Then we can write for some positive integer and see that and are disjoint.
To prove the statement from above, let be a positive integer. We will induct on . For the base case, , we must show that and are disjoint. But and so these two sets are disjoint: if we would have and , a contradiction. Now suppose inductively that and are disjoint. We need to show that and are disjoint. Suppose for sake of contradiction that there is some . Then since and there exist and such that and . But is injective so implies . Since and are disjoint, this is impossible. This closes the induction.
Now define as in the exercise statement:
We first verify that maps from to . The things to check are:
- is defined for each : the value is only defined when , so we must verify that if then . So suppose . Then for some natural number , so we have as required.
- If then lands in : We have for some natural number , so for some . Thus . But and for each positive , so in any case we have as required.
- If then lands in : In this case, we still have since is in the domain of , which means since .
Thus maps from to .
Now we show that is injective. Let . We must show that implies . So suppose . We have three cases:
- Both and are in : We have . Since is injective, is a bijection. Thus whenever is defined, we can assume it acts bijectively. We thus have .
- Neither nor is in : By definition of we have and so means .
- Exactly one of the two elements is in : We can suppose and (we can relabel and and repeat the same steps if it is the other way around). We have and so . But for some natural number , so . Thus . This means is either in or is in . But is in (as it is in the domain of ) so it can’t be in , and by hypothesis of this case, so we have a contradiction. This means that this case actually isn’t possible so we don’t need to worry about it.
Now we show that is surjective. To do this, let . We must find some such that . We have two cases, or .
- If then for some natural number so , which means that . Thus if we let then we have .
- On the other hand, if then is also not in the smaller set . Thus for we have .
In each case we have found some such that , so is surjective.
Acknowledgments: Thanks to Satira for having extensive discussions with me about this exercise.