2.6: Counting Past Infinity

Last updated
Save as PDF

Page ID: 6720

\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \) \( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)\(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\) \(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\)\(\newcommand{\AA}{\unicode[.8,0]{x212B}}\)

As children, we all learned to answer the question “How many?” by counting with numbers: 1, 2, 3, 4, . . . . But the question of “How many?” was asked and answered long before the abstract concept of number was in- vented. The answer can be given in terms of “as many as.” How many cousins do you have? As many cousins as I have fingers on both hands. How many sheep do you own? As many sheep as there are notches on this stick. How many baskets of wheat must I pay in taxes? As many baskets as there are stones in this box. The question of how many things are in one collection of objects is answered by exhibiting another, more convenient, collection of objects that has just as many members.

In set theory, the idea of one set having just as many members as another set is expressed in terms of one-to-one correspondence. A one-to-one correspondence between two sets A and B pairs each element of A with an element of B in such a way that every element of B is paired with one and only one element of A. The process of counting, as it is learned by children, establishes a one-to-one correspondence between a set of \(n\) objects and the set of numbers from 1 to \(n\). The rules of counting are the rules of one-to-one correspondence: Make sure you count every object, make sure you don’t count the same object more than once. That is, make sure that each object corresponds to one and only one number. Earlier in this chapter, we used the fancy name “bijective function” to refer to this idea, but we can now see it as as an old, intuitive way of answering the question “How many?”

In counting, as it is learned in childhood, the set \(\{1, 2, 3, . . . , n\}\) is used as a typical set that contains \(n\) elements. In mathematics and computer science, it has become more common to start counting with zero instead of with one, so we define the following sets to use as our basis for counting:

\[\begin{align*} N_{0} &= \emptyset \text{ a set with 0 elements}
\\[4pt] N_{1} &= \{0\} \text{ a set with 1 elements}
\\[4pt] N_{2} &= \{0, 1\} \text{ a set with 2 elements}
\\[4pt] N_{3} &= \{0,1,2\} \text{ a set with 3 elements}
\\[4pt] N_{4} &= \{0,1,2,3\} \text{ a set with 4 elements} \end{align*}\]

and so on. In general, \(N_{n} = \{0,1,2,...,n − 1\}\) for each \(n ∈ \mathbb{N}\). For each natural number \(n\), \(N_{n}\) is a set with \(n\) elements. Note that if \(n \ne m\), then there is no one-to-one correspondence between \(N_{n}\) and \(N_{m}\). This is obvious, but like many obvious things is not all that easy to prove rigorously, and we omit the argument here.

Theorem

For each \(n ∈ \mathbb{N}\), let \(N_{n}\) be the set \(N_{n} = {0,1,...,n − 1}\). If \(n \ne m\), then there is no bijective function from \(N_{m}\) to \(N_{n}\).

We can now make the following definitions:

A set \(A\) is said to be finite if there is a one-to-one correspondence between \(A\) and \(N_{n}\) for some natural number \(n\). We then say that \(n\) is the cardinality of \(A\). The notation |\(A\)| is used to indicate the cardinality of \(A\). That is, if \(A\) is a finite set, then |\(A\)| is the natural number \(n\) such that there is a one-to-one correspondence between \(A\) and \(N_{n}\). A set that is not finite is said to be infinite. That is, a set \(B\) is infinite if for every \(n ∈ \mathbb{N}\), there is no one-to-one correspondence between \(B\) and \(N_{n}\).

Fortunately, we don’t always have to count every element in a set individually to determine its cardinality. Consider, for example, the set \(A × B\), where \(A\) and \(B\) are finite sets. If we already know \(|A|\) and \(|B|\), then we can determine \(|A × B|\) by computation, without explicit counting of elements. In fact, \(|A × B| = |A| · |B|\). The cardinality of the cross product \(A × B\) can be computed by multiplying the cardinality of \(A\) by the cardinality of \(B\). To see why this is true, think of how you might count the elements of \(A × B\). You could put the elements into piles, where all the ordered pairs in a pile have the same first coordinate. There are as many piles as there are elements of \(A\), and each pile contains as many ordered pairs as there are elements of \(B\). That is, there are \(|A|\) piles, with \(|B|\) items in each. By the definition of multiplication, the total number of items in all the piles is \(|A| · |B|\). A similar result holds for the cross product of more that two finite sets. For example,\(|A×B×C|=|A|·|B|·|C|\).

It’s also easy to compute \(|A∪B|\) in the case where \(A\) and \(B\) are disjoint finite sets. (Recall that two sets \(A\) and \(B\) are said to be disjoint if they have no members in common, that is, if \(A ∩ B = ∅\).) Suppose \(|A| = n\) and \(|B| = m\). If we wanted to count the elements of \(A ∪ B\), we could use the \(n\) numbers from 0 to \(n − 1\) to count the elements of \(A\) and then use the \(m\) numbers from \(n\) to \(n+m−1\) to count the elements of \(B\). This amounts to a one-to-one correspondence between \(A ∪ B\) and the set \(N_{n+m}\). We see that \(|A∪B| = n+m\). That is, for disjoint finite sets \(A\) and \(B\),\(|A ∪ B| = |A| + |B|\).

What about \(A ∪ B\), where \(A\) and \(B\) are not disjoint? We have to be careful not to count the elements of \(A ∩ B\) twice. After counting the elements of \(A\), there are only \(|B|−|A∩B|\) new elements in \(B\) that still need to be counted. So we see that for any two finite sets \(A\) and \(B\), \(|A∪B| =|A|+|B|−|A∩B|\).

What about the number of subsets of a finite set \(A\)? What is the relationship between \(|A|\) and \(|\mathcal{P}(A)|\)? The answer is provided by the following theorem.

A finite set with cardinality \(n\) has \(2^{n}\) subsets.

Proof. Let \(P(n)\) be the statement “Any set with cardinality \(n\) has \(2^{n}\) subsets.” We will use induction to show that \(P(n)\) is true for all \(n ∈ N\).

Base case: For \(n = 0, P (n)\) is the statement that a set with cardinality 0 has \(2^{0}\) subsets. The only set with 0 elements is the empty set. The empty set has exactly 1 subset, namely itself. Since \(2^{0} = 1\), \(P(0)\) is true.

Inductive case: Let \(k\) be an arbitrary element of \(\mathbb{N}\), and assume that \(P(k)\) is true. That is, assume that any set with cardinality \(k\) has \(2^{k}\) elements. (This is the induction hypothesis.) We must show that \(P(k + 1)\) follows from this assumption. That is, using the assumption that any set with cardinality \(k\) has \(2^{k}\) subsets, we must show that any set with cardinality \(k + 1\) has \(2^{k+1}\) subsets.

Let \(A\) be an arbitrary set with cardinality \(k + 1\). We must show that \(|\mathcal{P}(A)| = 2k+1\). Since \(|A| > 0\), \(A\) contains at least one element. Let \(x\) be some element of \(A\), and let \(B = \setminus A��{x}\). The cardinality of \(B\) is \(k\), so we have by the induction hypothesis that \(|\mathcal{P}(B)| = 2^{k}\). Now, we can divide the subsets of \(A\) into two classes: subsets of \(A\) that do not contain \(x\) and subsets of \(A\) that do contain \(x\). Let \(Y\) be the collection of subsets of \(A\) that do not contain \(x\), and let \(X\) be the collection of subsets of \(A\) that do contain \(x\).\(X\) and \(Y\) are disjoint, since it is impossible for a given subset of \(A\) both to contain and to not contain \(x\). It follows that \( |\mathcal{P}(A)| = |X ∪ Y | = |X| + |Y |\).

Now, a member of \(Y\) is a subset of \(A\) that does not contain \(x\). But that is exactly the same as saying that a member of \(Y\) is a subset of \(B\). So \(Y =P(B)\), which we know contains \(2^{k}\) members. As for \(X\), there is a one-to-one correspondence between \(\mathcal{P}(B)\) and \(X\). Namely, the function \(f : \mathcal{P}(B) → X\) defined by \(f(C) = C ∪ {x}\) is a bijective function. (The proof of this is left as an exercise.) From this, it follows that \(|X| = |\mathcal{P}(B)| = 2^{k}\). Putting these facts together, we see that \(|\mathcal{P}(A)| = |X|+|Y| = 2^{k} +2^{k} = 2·2^{k} = 2^{k+1}\). This completes the proof that \(P(k) → P (k + 1)\).

We have seen that the notation \(A^{B}\) represents the set of all functions from \(B\) to \(A\). Suppose \(A\) and \(B\) are finite,and that \(|A|=n\) and \(|B|=m\). Then ��\(|AB��|�� = n^{m} = |A|^{|B|}\). (This fact is one of the reasons why the notation \(A^{B}\) is reasonable.) One way to see this is to note that there is a one-to-one correspondence between \(A^{B]\) and across product \(A×A×··A\),where the number of terms in the cross product is \(m\). (This will be shown in one of the exercises at the end of this section.) It follows that ��\(|A^{B}| �� = |A| · |A| · · · |A| =n · n · · · n\), where the factor \(n\) occurs m times in the product. This product is, by definition, \(n^{m}\).

This discussion about computing cardinalities is summarized in the following theorem:

Let A and B be finite sets. Then

\(|A×B|=|A|·|B|.\)
\(|A∪B|=|A|+|B|−|A∩B|.\)
If \(A\) and \(B\) are disjoint then \(|A∪B|=|A|+|B|\).
\(|��A^{B}|�� = |A|^{|B|}.\)
\(|\mathcal{P}(A)| = 2|A|.\)

When it comes to counting and computing cardinalities, this theorem is only the beginning of the story. There is an entire large and deep branch of mathematics known as combinatorics that is devoted mostly to the problem of counting. But the theorem is already enough to answer many questions about cardinalities.

For example, suppose that \(|A| = n\) and \(|B| = m\). We can form the set \(\mathcal{P}(A×B)\), which consists of all subsets of \(A×B\). Using the theorem, we can compute that \(|\mathcal{P}(A × B)| = 2^{|A×B|} = 2^{|A|·|B|} = 2^{nm}\). If we assume that \(A\) and \(B\) are disjoint, then we can compute that ��\(|A^{A∪B}|�� = |A|^{|A∪B|} = nn+m\).

To be more concrete, let \(X = {a,b,c,d,e}\) and let \(Y = {c,d,e,f}\) where \(a, b, c, d, e,\) and \(f\) are distinct. Then \(|X ×Y| = 5·4 = 20\) while \(|X∪Y|=5+4−|{c,d,e}|=6\) and \(|��X^{Y}|��=5^{4} =625\).

We can also answer some simple practical questions. Suppose that in a restaurant you can choose one appetizer and one main course. What is the number of possible meals? If \(A\) is the set of possible appetizers and \(C\) is the set of possible main courses, then your meal is an ordered pair belonging to the set \(A × C\). The number of possible meals is \(|A × C|\), which is the product of the number of appetizers and the number of main courses.

Or suppose that four different prizes are to be awarded, and that the set of people who are eligible for the prizes is \(A\). Suppose that \(|A| = n\). How many different ways are there to award the prizes? One way to answer this question is to view a way of awarding the prizes as a function from the set of prizes to the set of people. Then, if \(P\) is the set of prizes, the number of different ways of awarding the prizes is \(|��A^{P} |\)��. Since \(|P | = 4\) and \(|A| = n\), this is \(n^{4}\). Another way to look at it is to note that the people who win the prizes form an ordered tuple \((a, b, c, d)\), which is an element of \(A × A × A × A\). So the number of different ways of awarding the prizes is \(|A×A×A×A|\), which is \(|A|·|A|·|A|·|A|\). This is \(|A|^{4}\), or \(n^{4}\), the same answer we got before.

So far, we have only discussed finite sets. \(\mathbb{N}\), the set of natural numbers \({0, 1, 2, 3, . . . }\), is an example of an infinite set. There is no one-to-one correspondence between \(\mathbb{N}\) and any of the finite sets \(N_{n}\). Another example of an infinite set is the set of even natural numbers, \(E = {0,2,4,6,8,...}\). There is a natural sense in which the sets \(\mathbb{N}\) and \(E\) have the same number of elements. That is, there is a one-to-one correspondence between them. The function \(f : N → E\) defined by \(f(n) = 2n\) is bijective. We will say that \(\mathbb{N}\) and \(E\) have the same cardinality, even though that cardinality is not a finite number. Note that \(E\) is a proper subset of \(\mathbb{N}\). That is, \(\mathbb{N}\) has a proper subset that has the same cardinality as \(\mathbb{N}\).

We will see that not all infinite sets have the same cardinality. When it comes to infinite sets, intuition is not always a good guide. Most people seem to be torn between two conflicting ideas. On the one hand, they think, it seems that a proper subset of a set should have fewer elements than the set itself. On the other hand, it seems that any two infinite sets should have the same number of elements. Neither of these is true, at least if we define having the same number of elements in terms of one-to-one correspondence.

A set \(A\) is said to be countably infinite if there is a one-to-one correspondence between \(\mathbb{N}\) and \(A\). A set is said to be countable if it is either finite or countably infinite. An infinite set that is not countably infinite is said to be uncountable. If \(X\) is an uncountable set, then there is no one-to-one correspondence between \(\mathbb{N}\) and \(X\).

The idea of “countable infinity” is that even though a countably infinite set cannot be counted in a finite time, we can imagine counting all the elements of \(A\), one-by-one, in an infinite process. A bijective function \(f : \mathbb{N} → A\) provides such an infinite listing: \((f(0), f(1), f(2), f(3), . . . )\). Since \(f\) is onto, this infinite list includes all the elements of \(A\). In fact, making such a list effectively shows that \(A\) is countably infinite, since the list amounts to a bijective function from \(\mathbb{N}\) to \(A\). For an uncountable set, it is impossible to make a list, even an infinite list, that contains all the elements of the set.

Before you start believing in uncountable sets, you should ask for an example. In Chapter 1, we worked with the infinite sets \(\mathbb{Z}\) (the integers), \(\mathbb{Q}\)(the rationals), \(\mathbb{R}\) (the reals), and \(\mathbb{R} \setminus�� \mathbb{Q}\)(the irrationals). Intuitively, these are all “bigger” than \(\mathbb{N}\), but as we have already mentioned, intuition is a poor guide when it comes to infinite sets. Are any of \(\mathbb{Z, Q, R,}\) and \(mathbb{R} \setminus�� \mathbb{Q}\) in fact uncountable?

It turns out that both \(\mathbb{Z}\) and \(\mathbb{Q}\) are only countably infinite. The proof that \(\mathbb{Z}\) is countable is left as an exercise; we will show here that the set of non-negative rational numbers is countable. (The fact that \(\mathbb{Q}\) itself is countable follows easily from this.) The reason is that it’s possible to make an infinite list containing all the non-negative rational numbers. Start the list with all the non-negative rational numbers \(n/m\) such that \(n + m = 1\). There is only one such number, namely 0/1. Next come numbers with \(n + m = 2\). They are 0/2 and 1/1, but we leave out 0/2 since it’s just another way of writing 0/1, which is already in the list. Now, we add the numbers with \(n + m = 3\), namely 0/3, 1/2, and 2/1. Again, we leave out 0/3, since it’s equal to a number already in the list. Next come numbers with \(n + m = 4\). Leaving out 0/4 and 2/2 since they are already in the list, we add 1/3 and 3/1 to the list. We continue in this way, adding numbers with \(n+m = 5\), then numbers with \(n+m = 6\), and so on. The list looks like

\[(\frac{0}{1}, \frac{1}{1}, \frac{1}{2}, \frac{2}{1}, \frac{1}{3}, \frac{3}{1}, \frac{1}{4}, \frac{2}{3}, \frac{3}{2}, \frac{4}{1}, \frac{1}{5}, \frac{5}{1}, \frac{1}{6}, \frac{2}{5},...)\nonumber\]

This process can be continued indefinitely, and every non-negative rational number will eventually show up in the list. So we get a complete, infinite list of non-negative rational numbers. This shows that the set of non-negative rational numbers is in fact countable.

On the other hand, \(\mathbb{R}\) is uncountable. It is not possible to make an infinite list that contains every real number. It is not even possible to make a list that contains every real number between zero and one. Another way of saying this is that every infinite list of real numbers between zero and one, no matter how it is constructed, leaves something out. To see why this is true, imagine such a list, displayed in an infinitely long column. Each row contains one number, which has an infinite number of digits after the decimal point. Since it is a number between zero and one, the only digit before the decimal point is zero. For example, the list might look like this:

\[0.90398937249879561297927654857945. . . \nonumber \]

\[0.12349342094059875980239230834549. . . \nonumber\]

\[0.22400043298436234709323279989579. . . \nonumber\]

\[0.50000000000000000000000000000000. . . \nonumber\]

\[0.77743449234234876990120909480009. . . \nonumber\]

\[0.77755555588888889498888980000111. . . \nonumber\]

\[0.12345678888888888888888800000000. . . \nonumber\]

\[0.34835440009848712712123940320577. . . \nonumber\]

\[0.93473244447900498340999990948900. . . \nonumber\]

This is only (a small part of) one possible list. How can we be certain that every such list leaves out some real number between zero and one? The trick is to look at the digits shown in bold face. We can use these digits to build a number that is not in the list. Since the first number in the list has a 9 in the first position after the decimal point, we know that this number cannot equal any number of, for example, the form 0.4. . . . Since the second number has a 2 in the second position after the decimal point, neither of the first two numbers in the list is equal to any number that begins with 0.44. . . . Since the third number has a 4 in the third position after the decimal point, none of the first three numbers in the list is equal to any number that begins 0.445.... We can continue to construct a number in this way, and we end up with a number that is different from every number in the list. The \(n^{th}\) digit of the number we are building must differ from the \(n^{th}\) digit of the \(n^{th}\) number in the list. These are the digits shown in bold face in the above list. To be definite, I use a 5 when the corresponding boldface number is 4, and otherwise I use a 4. For the list shown above, this gives a number that begins 0.44544445. . . . The number constructed in this way is not in the given list, so the list is incomplete. The same construction clearly works for any list of real numbers between zero and one. No such list can be a complete listing of the real numbers between zero and one, and so there can be no complete listing of all real numbers. We conclude that the set \(\mathbb{R}\) is uncountable.

The technique used in this argument is called diagonalization. It is named after the fact that the bold face digits in the above list lie along a diagonal line. This proof was discovered by a mathematician named Georg Cantor, who caused quite a fuss in the nineteenth century when he came up with the idea that there are different kinds of infinity. Since then, his notion of using one-to-one correspondence to define the cardinalities of infinite sets has been accepted. Mathematicians now consider it almost intuitive that \(\mathbb{N,Z,} and \mathbb{Q}\) have the same cardinality while \(\mathbb{R}\) has a strictly larger cardinality.

Suppose that \(X\) is an uncountable set, and that \(K\) is a countable subset of \(X\). Then the set \(X \setminus�� K\) is uncountable.

Proof. Let \(X\) be an uncountable set. Let \(K ⊆ X\), and suppose that \(K\) is countable. Let \(L = X \setminus�� K\). We want to show that \(L\) is uncountable. Suppose that \(L\) is countable. We will show that this assumption leads to a contradiction.

Note that \(X=K∪(X \setminus ��K)=K∪L\). You will show in Exercise11 of this section that the union of two countable sets is countable. Since \(X\) is the union of the countable sets \(K\) and \(L\), it follows that \(X\) is countable. But this contradicts the fact that \(X\) is uncountable. This contradiction proves the theorem.

In the proof, both \(q\) and \(¬q\) are shown to follow from the assumptions, where \(q\) is the statement “\(X\) is countable.” The statement \(q\) is shown to follow from the assumption that \(X \setminus�� K\) is countable. The statement \(¬q\) is true by assumption. Since \(q\) and \(¬q\) cannot both be true, at least one of the assumptions must be false. The only assumption that can be false is the assumption that \(X \setminus�� K\) is countable.

This theorem, by the way, has the following easy corollary. (A corollary is a theorem that follows easily from another, previously proved theorem.)

Let \(X\) be any set. Then there is no one-to-one correspondence between \(X\) and \(\mathcal{P}(X)\).

Proof. Given an arbitrary function \(f : X → \mathcal{P}(X)\), we can show that \(f\) is not onto. Since a one-to-one correspondence is both one-to-one and onto, this shows that \(f\) is not a one-to-one correspondence.

Recall that \(\mathcal{P}(X)\) is the set of subsets of \(X\). So, for each \(x ∈ X, f(x)\) is a subset of \(X\). We have to show that no matter how \(f\) is defined, there is some subset of \(X\) that is not in the image of \(f\).

Given \(f\),we define \(A\) to be the set \(A=\{x∈X|x̸∈f(x)\}\). The test “\(x ̸∈ f(x)\)” makes sense because \(f(x)\) is a set. Since \(A⊆X\), we have that \(A∈P(X)\). However, \(A\) is not in the image of \(f\). That is,for every \(y∈X,A \ne f(y)\).6 To see why this is true, let \(y\) be any element of \(X\). There are two cases to consider. Either \(y ∈ f(y)\) or \(y ̸∈ f(y)\). We show that whichever case holds, \(A \ne f(y)\). If it is true that \(y ∈ f(y)\), then by the definition of \(A\), \(y∈A\). Sincey∈f(y)buty̸∈A,f(y)andAdonothavethesameelements and therefore are not equal. On the other hand, suppose that y ̸∈ f(y).