Exercise statement
Let
be sets such that
, and suppose that there is an injection
. Define the sets
recursively by setting
, and then
for all natural numbers
. Prove that the sets
are all disjoint from each other (i.e.,
whenever
). Also show that if
is the function defined by setting
when
, and
when
, then
does indeed map
to
and is a bijection between the two. In particular,
and
have the same cardinality.
Hints
- Draw a picture.
- You don’t lose much by taking
, and this simplifies the picture.
- Beware that Tao’s notation is non-standard here: typically in setting up the Schröder–Bernstein theorem we are given injections
and
(with no subset relation assumed between
and
) and are asked to define a bijection
. In Tao’s book, the usual
is just the inclusion map
(see Exercise 3.3.8 for a definition), the usual
is instead called
, and the usual bijection
is instead called
. If you are referring to other expositions as you work on this exercise, you may want to temporarily convert Tao’s notation to match the notation in these other expositions.
How to think about the exercise
This is a pretty tough exercise, if you really want to understand what is going on instead of just bashing out a proof.
To simplify the exposition, let us assume that
. At the end of this section we will show how to adjust in the case where we only assume
, and in the model solution below we will revert to only assuming
.
To set the stage, let’s draw a picture of two sets
with
:

(Throughout this post, I will be copying from Richard Hammack’s exposition the idea of using suggestive shapes for the two sets as well as the idea generally of visually representing the sets by embedding them within each other.)
We define
:

We have an injection
, i.e.
since we assumed
. This means that we can put a copy
of the set
inside
:

Notice two things: first the shape of
is the same as the shape of
. This is because we assumed that
is injective, hence it doesn’t “deform” the shape by mapping multiple starting points onto the same point. Second,
does not take up all of
, since we didn’t assume that
is a bijection.
Now we define
. Since
, we have
. Thus we should see a copy of the green stuff within the copy of
that we produced:

We can keep going further with this same process. Our copy of
(namely
) has a copy of
within it (namely
) so using
we can place a copy of our copy of
(namely
) inside our copy of
(namely
):

By the injectivity of
, this “copy of a copy” of
again has a shape similar to that of the original
. The green stuff that is
is embedded within
, so it also gets copied inside of
. We can define
and see this as the third layer of green stuff in the picture:

Hopefully by this point you get the idea. As we make more and more copies of
within copies of
within copies of
, we accumulate more and more green stuff. The union of all of the green stuff is
, and the white stuff we are left with is the complement,
.
Having all of these copies within copies is all nice and trippy, but how does this help us define a bijection
? What do we even want, in terms of the picture we have drawn? We want to take each point in
and map it to some point in
in such a way that each point in
corresponds to exactly one point in
. The trick to doing this is the following: each point colored in white can stay put, and each point colored in green can “climb up” one level to hit a green point in an outer level:

In other words, the points in
map to
via
, the points in
map to
via
, and so on (that’s all the green stuff “climbing up”), while the points in the rest of
just stay where they are (that’s all the white stuff). If you look at the picture, I hope you can see that no matter which point in
you pick, there will be exactly one point in
that maps to it.
Now that we have the idea, how do we write down the formal definition of this bijection
? The green stuff is
, so you might be tempted to write the following:

But actually, what is
? We defined
, so the points in
aren’t even in the domain of
! This means we don’t need to include
in the union, so we can instead define:

with the union index starting at
. In the end, starting the union at
or
does not change the behavior of
: we never encounter the case where
because
is empty.
That concludes the intuition for the construction outlined in the exercise statement. Here are a few other points I want to make:
- Why do we need to show that the sets
are all disjoint from each other? It is not necessary to show this in order to prove the bijectivity of
. My understanding is that it is just an interesting additional fact one can prove about this setup, and also justifies drawing the separate green parts in a way that looks disjoint.
- Why do we assume that
is an injection rather than a bijection (as was originally the case in the 1st and 2nd editions of the book)? I haven’t thought about this in detail. I think it makes sense to use the weaker assumption of injectivity if bijectivity is not required, especially since the Schröder–Bernstein theorem only assumes injectivity.
- What happens when we only assume
rather than
as above? The pictures above change slightly: now there is a bigger set
surrounding both
and
. Otherwise the pictures don’t change. In the proof there is slightly more work since we must verify that when we use
in the definition of
, the output lands in
instead of
.
- Since
is assumed to be an injection rather than a bijection, what does it even mean to write
for some
? For an injection
, the map
is a bijection (why?), so
is a (bijective) function. So whenever
we can define
. If
we leave
undefined. This means that in the model solution below, when we define
, we must verify that if
then
.
- Why did Tao assume that
instead of allowing
to be unrelated sets (like in most expositions of the Schröder–Bernstein theorem)? Assuming
simplifies the drawing: if
are unrelated sets, one would need to draw two pictures instead of one (like on this page). By making
a subset of
, there is just one picture, and there is a simple rule for defining the bijection
: if you’re in a white area, stay put, and if you’re in a green area, “climb up” one level to take up a larger amount of space in a bijective way.
Finally, I want to consider a particularly simple case of the construction outlined in this exercise: what if the sets
are all finite? If there is an injection
, that means
by Exercise 3.6.7. But since
, we also have
by Proposition 3.6.14(c). Thus
. If
then we would have
by Proposition 3.6.14(c) again, a contradiction. Thus we must have
. Similarly we have
and
so
. Since
, we can use Proposition 3.6.14(c) to conclude again that
. Thus in the finite case, all three sets must be the same. So we can try something like
. Then
, and
, and so on, so that
as well. This means that in the definition of
, we only ever encounter the case where
. This does indeed give a bijection, since the identity map
is a bijection.
Model solution
First we show that the sets
are all disjoint from each other. We do this by showing the following statement: for every natural number
and every positive integer
, the sets
and
are disjoint. Once we have proved this, given two distinct natural numbers, we can let
be the larger one so that
. Then we can write
for some positive integer
and see that
and
are disjoint.
To prove the statement from above, let
be a positive integer. We will induct on
. For the base case,
, we must show that
and
are disjoint. But
and
so these two sets are disjoint: if
we would have
and
, a contradiction. Now suppose inductively that
and
are disjoint. We need to show that
and
are disjoint. Suppose for sake of contradiction that there is some
. Then since
and
there exist
and
such that
and
. But
is injective so
implies
. Since
and
are disjoint, this is impossible. This closes the induction.
Now define
as in the exercise statement:

We first verify that
maps from
to
. The things to check are:
is defined for each
: the value
is only defined when
, so we must verify that if
then
. So suppose
. Then
for some natural number
, so we have
as required.
- If
then
lands in
: We have
for some natural number
, so
for some
. Thus
. But
and
for each positive
, so in any case we have
as required.
- If
then
lands in
: In this case, we still have
since
is in the domain of
, which means
since
.
Thus
maps from
to
.
Now we show that
is injective. Let
. We must show that
implies
. So suppose
. We have three cases:
- Both
and
are in
: We have
. Since
is injective,
is a bijection. Thus whenever
is defined, we can assume it acts bijectively. We thus have
.
- Neither
nor
is in
: By definition of
we have
and
so
means
.
- Exactly one of the two elements is in
: We can suppose
and
(we can relabel
and
and repeat the same steps if it is the other way around). We have
and
so
. But
for some natural number
, so
. Thus
. This means
is either in
or is in
. But
is in
(as it is in the domain of
) so it can’t be in
, and
by hypothesis of this case, so we have a contradiction. This means that this case actually isn’t possible so we don’t need to worry about it.
Now we show that
is surjective. To do this, let
. We must find some
such that
. We have two cases,
or
.
- If
then
for some natural number
so
, which means that
. Thus if we let
then we have
.
- On the other hand, if
then
is also not in the smaller sets
and
. From
we see that
, so
is defined. And from
we see that
. Thus for
we have
.
In each case we have found some
such that
, so
is surjective.
Acknowledgments: Thanks to Satira for having extensive discussions with me about this exercise.