Skip to main content

Section 0.3 Basic set theory

Note: 1–3 lectures (some material can be skipped, covered lightly, or left as reading)

Before we start talking about analysis, we need to fix some language. Modern 1  analysis uses the language of sets, and therefore that is where we start. We talk about sets in a rather informal way, using the so-called “naïve set theory.” Do not worry, that is what majority of mathematicians use, and it is hard to get into trouble. The reader has hopefully seen the very basics of set theory and proof writing before, and this section should be a quick refresher.

Subsection 0.3.1 Sets

Definition 0.3.1.

A set is a collection of objects called elements or members. A set with no objects is called the empty set and is denoted by \(\emptyset\) (or sometimes by \(\{ \}\)).

Think of a set as a club with a certain membership. For example, the students who play chess are members of the chess club. The same student can be a member of many different clubs. However, do not take the analogy too far. A set is only defined by the members that form the set; two sets that have the same members are the same set.

Most of the time we will consider sets of numbers. For example, the set

\begin{equation*} S := \{ 0, 1, 2 \} \end{equation*}

is the set containing the three elements 0, 1, and 2. By “\(:=\)”, we mean we are defining what \(S\) is, rather than just showing equality. We write

\begin{equation*} 1 \in S \end{equation*}

to denote that the number 1 belongs to the set \(S\text{.}\) That is, 1 is a member of \(S\text{.}\) At times we want to say that two elements are in a set \(S\text{,}\) so we write “\(1,2 \in S\)” as a shorthand for “\(1 \in S\) and \(2 \in S\text{.}\)

Similarly, we write

\begin{equation*} 7 \notin S \end{equation*}

to denote that the number 7 is not in \(S\text{.}\) That is, 7 is not a member of \(S\text{.}\)

The elements of all sets under consideration come from some set we call the universe. For simplicity, we often consider the universe to be the set that contains only the elements we are interested in. The universe is generally understood from context and is not explicitly mentioned. In this course, our universe will most often be the set of real numbers.

While the elements of a set are often numbers, other objects, such as other sets, can be elements of a set. A set may also contain some of the same elements as another set. For example,

\begin{equation*} T := \{ 0, 2 \} \end{equation*}

contains the numbers 0 and 2. In this case all elements of \(T\) also belong to \(S\text{.}\) We write \(T \subset S\text{.}\) See Figure 1 for a diagram.

Figure 1. A diagram of the example sets \(S\) and its subset \(T\text{.}\)

Definition 0.3.2.

  1. A set \(A\) is a subset of a set \(B\) if \(x \in A\) implies \(x \in B\text{,}\) and we write \(A \subset B\text{.}\) That is, all members of \(A\) are also members of \(B\text{.}\) At times we write \(B \supset A\) to mean the same thing.

  2. Two sets \(A\) and \(B\) are equal if \(A \subset B\) and \(B \subset A\text{.}\) We write \(A = B\text{.}\) That is, \(A\) and \(B\) contain exactly the same elements. If it is not true that \(A\) and \(B\) are equal, then we write \(A \not= B\text{.}\)

  3. A set \(A\) is a proper subset of \(B\) if \(A \subset B\) and \(A \not= B\text{.}\) We write \(A \subsetneq B\text{.}\)

For the example \(S\) and \(T\) defined above, \(T \subset S\text{,}\) but \(T \not= S\text{.}\) So \(T\) is a proper subset of \(S\text{.}\) If \(A = B\text{,}\) then \(A\) and \(B\) are simply two names for the same exact set.

To define sets, one often uses the set building notation,

\begin{equation*} \bigl\{ x \in A : P(x) \bigr\} . \end{equation*}

This notation refers to a subset of the set \(A\) containing all elements of \(A\) that satisfy the property \(P(x)\text{.}\) Using \(S = \{ 0, 1, 2 \}\) as above, \(\{ x \in S : x \not= 2 \}\) is the set \(\{ 0, 1 \}\text{.}\) The notation is sometimes abbreviated as \(\bigl\{ x : P(x) \bigr\}\text{,}\) that is, \(A\) is not mentioned when understood from context. Furthermore, \(x \in A\) is sometimes replaced with a formula to make the notation easier to read.

Example 0.3.3.

The following are sets including the standard notations.

  1. The set of natural numbers, \(\N := \{ 1, 2, 3, \ldots \}\text{.}\)

  2. The set of integers, \(\Z := \{ 0, -1, 1, -2, 2, \ldots \}\text{.}\)

  3. The set of rational numbers, \(\Q := \bigl\{ \frac{m}{n} : m, n \in \Z \text{ and } n \not= 0 \bigr\}\text{.}\)

  4. The set of even natural numbers, \(\{ 2m : m \in \N \}\text{.}\)

  5. The set of real numbers, \(\R\text{.}\)

Note that \(\N \subset \Z \subset \Q \subset \R\text{.}\)

We create new sets out of old ones by applying some natural operations.

Definition 0.3.4.

  1. A union of two sets \(A\) and \(B\) is defined as

    \begin{equation*} A \cup B := \{ x : x \in A \text{ or } x \in B \} . \end{equation*}
  2. An intersection of two sets \(A\) and \(B\) is defined as

    \begin{equation*} A \cap B := \{ x : x \in A \text{ and } x \in B \} . \end{equation*}
  3. A complement of \(B\) relative to \(A\) (or set-theoretic difference of \(A\) and \(B\)) is defined as

    \begin{equation*} A \setminus B := \{ x : x \in A \text{ and } x \notin B \} . \end{equation*}
  4. We say complement of \(B\) and write \(B^c\) instead of \(A \setminus B\) if the set \(A\) is either the entire universe or if it is the obvious set containing \(B\text{,}\) and is understood from context.

  5. We say sets \(A\) and \(B\) are disjoint if \(A \cap B = \emptyset\text{.}\)

The notation \(B^c\) may be a little vague at this point. If the set \(B\) is a subset of the real numbers \(\R\text{,}\) then \(B^c\) means \(\R \setminus B\text{.}\) If \(B\) is naturally a subset of the natural numbers, then \(B^c\) is \(\N \setminus B\text{.}\) If ambiguity can arise, we use the set difference notation \(A \setminus B\text{.}\)

Figure 2. Venn diagrams of set operations, the result of the operation is shaded.

We illustrate the operations on the Venn diagrams in Figure 2. Let us now establish one of most basic theorems about sets and logic.


The first statement is proved by the second statement if we assume the set \(A\) is our “universe.”

Let us prove \(A \setminus (B \cup C) = (A \setminus B) \cap (A \setminus C)\text{.}\) Remember the definition of equality of sets. First, we must show that if \(x \in A \setminus (B \cup C)\text{,}\) then \(x \in (A \setminus B) \cap (A \setminus C)\text{.}\) Second, we must also show that if \(x \in (A \setminus B) \cap (A \setminus C)\text{,}\) then \(x \in A \setminus (B \cup C)\text{.}\)

So let us assume \(x \in A \setminus (B \cup C)\text{.}\) Then \(x\) is in \(A\text{,}\) but not in \(B\) nor \(C\text{.}\) Hence \(x\) is in \(A\) and not in \(B\text{,}\) that is, \(x \in A \setminus B\text{.}\) Similarly \(x \in A \setminus C\text{.}\) Thus \(x \in (A \setminus B) \cap (A \setminus C)\text{.}\)

On the other hand suppose \(x \in (A \setminus B) \cap (A \setminus C)\text{.}\) In particular, \(x \in (A \setminus B)\text{,}\) so \(x \in A\) and \(x \notin B\text{.}\) Also as \(x \in (A \setminus C)\text{,}\) then \(x \notin C\text{.}\) Hence \(x \in A \setminus (B \cup C)\text{.}\)

The proof of the other equality is left as an exercise.

The result above we called a Theorem, while most results we call a Proposition, and a few we call a Lemma (a result leading to another result) or Corollary (a quick consequence of the preceding result). Do not read too much into the naming. Some of it is traditional, some of it is stylistic choice. It is not necessarily true that a Theorem is always “more important” than a Proposition or a Lemma.

We will also need to intersect or union several sets at once. If there are only finitely many, then we simply apply the union or intersection operation several times. However, suppose we have an infinite collection of sets (a set of sets) \(\{ A_1, A_2, A_3, \ldots \}\text{.}\) We define

\begin{equation*} \begin{aligned} & \bigcup_{n=1}^\infty A_n := \{ x : x \in A_n \text{ for some } n \in \N \} , \\ & \bigcap_{n=1}^\infty A_n := \{ x : x \in A_n \text{ for all } n \in \N \} . \end{aligned} \end{equation*}

We can also have sets indexed by two natural numbers. For example, we can have the set of sets \(\{ A_{1,1}, A_{1,2}, A_{2,1}, A_{1,3}, A_{2,2}, A_{3,1}, \ldots \}\text{.}\) Then we write

\begin{equation*} \bigcup_{n=1}^\infty \bigcup_{m=1}^\infty A_{n,m} = \bigcup_{n=1}^\infty \left( \bigcup_{m=1}^\infty A_{n,m} \right) . \end{equation*}

And similarly with intersections.

It is not hard to see that we can take the unions in any order. However, switching the order of unions and intersections is not generally permitted without proof. For instance,

\begin{equation*} \bigcup_{n=1}^\infty \bigcap_{m=1}^\infty \{ k \in \N : mk < n \} = \bigcup_{n=1}^\infty \emptyset = \emptyset . \end{equation*}


\begin{equation*} \bigcap_{m=1}^\infty \bigcup_{n=1}^\infty \{ k \in \N : mk < n \} = \bigcap_{m=1}^\infty \N = \N. \end{equation*}

Sometimes, the index set is not the natural numbers. In such a case we require a more general notation. Suppose \(I\) is some set and for each \(\lambda \in I\text{,}\) there is a set \(A_\lambda\text{.}\) Then we define

\begin{equation*} \bigcup_{\lambda \in I} A_\lambda := \{ x : x \in A_\lambda \text{ for some } \lambda \in I \} , \qquad \bigcap_{\lambda \in I} A_\lambda := \{ x : x \in A_\lambda \text{ for all } \lambda \in I \} . \end{equation*}

Subsection 0.3.2 Induction

When a statement includes an arbitrary natural number, a common method of proof is the principle of induction. We start with the set of natural numbers \(\N = \{ 1,2,3,\ldots \}\text{,}\) and we give them their natural ordering, that is, \(1 < 2 < 3 < 4 < \cdots\text{.}\) By \(S \subset \N\) having a least element, we mean that there exists an \(x \in S\text{,}\) such that for every \(y \in S\text{,}\) we have \(x \leq y\text{.}\)

The natural numbers \(\N\) ordered in the natural way possess the so-called well ordering property. We take this property as an axiom; we simply assume it is true.

The principle of induction is the following theorem, which is in a sense 2  equivalent to the well ordering property of the natural numbers.


Let \(S\) be the set of natural numbers \(m\) for which \(P(m)\) is not true. Suppose for contradiction that \(S\) is nonempty. Then \(S\) has a least element by the well ordering property. Call \(m \in S\) the least element of \(S\text{.}\) We know \(1 \notin S\) by hypothesis. So \(m > 1\text{,}\) and \(m-1\) is a natural number as well. Since \(m\) is the least element of \(S\text{,}\) we know that \(P(m-1)\) is true. But the induction step says that \(P(m-1+1) = P(m)\) is true, contradicting the statement that \(m \in S\text{.}\) Therefore, \(S\) is empty and \(P(n)\) is true for all \(n \in \N\text{.}\)

Sometimes it is convenient to start at a different number than 1, all that changes is the labeling. The assumption that \(P(n)\) is true in “if \(P(n)\) is true, then \(P(n+1)\) is true” is usually called the induction hypothesis.

Example 0.3.7.

Let us prove that for all \(n \in \N\text{,}\)

\begin{equation*} 2^{n-1} \leq n! \qquad \text{(recall } n! = 1 \cdot 2 \cdot 3 \cdots n\text{)}. \end{equation*}

We let \(P(n)\) be the statement that \(2^{n-1} \leq n!\) is true. By plugging in \(n=1\text{,}\) we see that \(P(1)\) is true.

Suppose \(P(n)\) is true. That is, suppose \(2^{n-1} \leq n!\) holds. Multiply both sides by 2 to obtain

\begin{equation*} 2^n \leq 2(n!) . \end{equation*}

As \(2 \leq (n+1)\) when \(n \in \N\text{,}\) we have \(2(n!) \leq (n+1)(n!) = (n+1)!\text{.}\) That is,

\begin{equation*} 2^n \leq 2(n!) \leq (n+1)!, \end{equation*}

and hence \(P(n+1)\) is true. By the principle of induction, \(P(n)\) is true for all \(n \in \N\text{.}\) In other words, \(2^{n-1} \leq n!\) is true for all \(n \in \N\text{.}\)

Example 0.3.8.

We claim that for all \(c \not= 1\text{,}\)

\begin{equation*} 1 + c + c^2 + \cdots + c^n = \frac{1-c^{n+1}}{1-c} . \end{equation*}

Proof: It is easy to check that the equation holds with \(n=1\text{.}\) Suppose it is true for \(n\text{.}\) Then

\begin{equation*} \begin{split} 1 + c + c^2 + \cdots + c^n + c^{n+1} & = ( 1 + c + c^2 + \cdots + c^n ) + c^{n+1} \\ & = \frac{1-c^{n+1}}{1-c} + c^{n+1} \\ & = \frac{1-c^{n+1} + (1-c)c^{n+1}}{1-c} \\ & = \frac{1-c^{n+2}}{1-c} . \end{split} \end{equation*}

Sometimes, it is easier to use in the inductive step that \(P(k)\) is true for all \(k = 1,2,\ldots,n\text{,}\) not just for \(k=n\text{.}\) This principle is called strong induction and is equivalent to the normal induction above. The proof of that equivalence is left as an exercise.

Subsection 0.3.3 Functions

Informally, a set-theoretic function \(f\) taking a set \(A\) to a set \(B\) is a mapping that to each \(x \in A\) assigns a unique \(y \in B\text{.}\) We write \(f \colon A \to B\text{.}\) An example function \(f \colon S \to T\) taking \(S := \{ 0, 1, 2 \}\) to \(T := \{ 0, 2 \}\) can be defined by assigning \(f(0) := 2\text{,}\) \(f(1) := 2\text{,}\) and \(f(2) := 0\text{.}\) That is, a function \(f \colon A \to B\) is a black box, into which we stick an element of \(A\) and the function spits out an element of \(B\text{.}\) Sometimes \(f\) is called a mapping or a map, and we say \(f\) maps \(A\) to \(B\).

Often, functions are defined by some sort of formula; however, you should really think of a function as just a very big table of values. The subtle issue here is that a single function can have several formulas, all giving the same function. Also, for many functions, there is no formula that expresses its values.

To define a function rigorously, first let us define the Cartesian product.

Definition 0.3.10.

Let \(A\) and \(B\) be sets. The Cartesian product is the set of tuples defined as

\begin{equation*} A \times B := \bigl\{ (x,y) : x \in A, y \in B \bigr\} . \end{equation*}

For instance, \(\{ a,b \} \times \{ c , d\} = \bigl\{ (a,c), (a,d), (b,c), (b,d) \bigr\}\text{.}\) A more complicated example is the set \([0,1] \times [0,1]\text{:}\) a subset of the plane bounded by a square with vertices \((0,0)\text{,}\) \((0,1)\text{,}\) \((1,0)\text{,}\) and \((1,1)\text{.}\) When \(A\) and \(B\) are the same set we sometimes use a superscript 2 to denote such a product. For example, \([0,1]^2 = [0,1] \times [0,1]\) or \(\R^2 = \R \times \R\) (the Cartesian plane).

Definition 0.3.11.

A function \(f \colon A \to B\) is a subset \(f\) of \(A \times B\) such that for each \(x \in A\text{,}\) there exists a unique \(y \in B\) for which \((x,y) \in f\text{.}\) We write \(f(x) = y\text{.}\) Sometimes the set \(f\) is called the graph of the function rather than the function itself.

The set \(A\) is called the domain of \(f\) (and sometimes confusingly denoted \(D(f)\)). The set

\begin{equation*} R(f) := \{ y \in B : \text{there exists an } x \in A \text{ such that } f(x)=y \} \end{equation*}

is called the range of \(f\text{.}\) The set \(B\) is called the codomain of \(f\text{.}\)

It is possible that the range \(R(f)\) is a proper subset of the codomain \(B\text{,}\) while the domain of \(f\) is always equal to \(A\text{.}\) We generally assume that the domain of \(f\) is nonempty.

Example 0.3.12.

From calculus, you are most familiar with functions taking real numbers to real numbers. However, you saw some other types of functions as well. The derivative is a function mapping the set of differentiable functions to the set of all functions. Another example is the Laplace transform, which also takes functions to functions. Yet another example is the function that takes a continuous function \(g\) defined on the interval \([0,1]\) and returns the number \(\int_0^1 g(x) \,dx\text{.}\)

Definition 0.3.13.

Consider a function \(f \colon A \to B\text{.}\) Define the image (or direct image) of a subset \(C \subset A\) as

\begin{equation*} f(C) := \bigl\{ f(x) \in B : x \in C \bigr\} . \end{equation*}

Define the inverse image of a subset \(D \subset B\) as

\begin{equation*} f^{-1}(D) := \bigl\{ x \in A : f(x) \in D \bigr\} . \end{equation*}

In particular, \(R(f) = f(A)\text{,}\) the range is the direct image of the domain \(A\text{.}\)

Figure 3. Example of direct and inverse images for the function \(f \colon \{ 1,2,3,4 \} \to \{ a,b,c,d \}\) defined by \(f(1) := b\text{,}\) \(f(2) := d\text{,}\) \(f(3) := c\text{,}\) \(f(4) := b\text{.}\)

Example 0.3.14.

Define the function \(f \colon \R \to \R\) by \(f(x) := \sin(\pi x)\text{.}\) Then \(f\bigl([0,\nicefrac{1}{2}]\bigr) = [0,1]\text{,}\) \(f^{-1}\bigl(\{0\}\bigr) = \Z\text{,}\) etc.

Read the last line of the proposition as \(f^{-1}( B \setminus C) = A \setminus f^{-1} (C)\text{.}\)


We start with the union. If \(x \in f^{-1}( C \cup D)\text{,}\) then \(x\) is taken to \(C\) or \(D\text{,}\) that is, \(f(x) \in C\) or \(f(x) \in D\text{.}\) Thus \(f^{-1}( C \cup D) \subset f^{-1} (C) \cup f^{-1} (D)\text{.}\) Conversely if \(x \in f^{-1}(C)\text{,}\) then \(x \in f^{-1}(C \cup D)\text{.}\) Similarly for \(x \in f^{-1}(D)\text{.}\) Hence \(f^{-1}( C \cup D) \supset f^{-1} (C) \cup f^{-1} (D)\text{,}\) and we have equality.

The rest of the proof is left as an exercise.

The proposition does not hold for direct images. We do have the following weaker result.

The proof is left as an exercise.

Definition 0.3.17.

Let \(f \colon A \to B\) be a function. The function \(f\) is said to be injective or one-to-one if \(f(x_1) = f(x_2)\) implies \(x_1 = x_2\text{.}\) In other words, \(f\) is injective if for all \(y \in B\text{,}\) the set \(f^{-1}(\{y\})\) is empty or consists of a single element. We call such an \(f\) an injection.

If \(f(A) = B\text{,}\) then we say \(f\) is surjective or onto. In other words, \(f\) is surjective if the range and the codomain of \(f\) are equal. We call such an \(f\) a surjection.

If \(f\) is both an surjective and injective, then we say \(f\) is bijective or that \(f\) is a bijection.

When \(f \colon A \to B\) is a bijection, then the inverse image of a single element, \(f^{-1}(\{y\})\text{,}\) is always a unique element of \(A\text{.}\) We then consider \(f^{-1}\) as a function \(f^{-1} \colon B \to A\) and we write simply \(f^{-1}(y)\text{.}\) In this case, we call \(f^{-1}\) the inverse function of \(f\text{.}\) For instance, for the bijection \(f \colon \R \to \R\) defined by \(f(x) := x^3\text{,}\) we have \(f^{-1}(x) = \sqrt[3]{x}\text{.}\)

Definition 0.3.18.

Consider \(f \colon A \to B\) and \(g \colon B \to C\text{.}\) The composition of the functions \(f\) and \(g\) is the function \(g \circ f \colon A \to C\) defined as

\begin{equation*} (g \circ f)(x) := g\bigl(f(x)\bigr) . \end{equation*}

For example, if \(f \colon \R \to \R\) is \(f(x):=x^3\) and \(g \colon \R \to \R\) is \(g(y) = \sin(y)\text{,}\) then \((g \circ f)(x) = \sin(x^3)\text{.}\)

Subsection 0.3.4 Relations and equivalence classes

We often compare two objects in some way. We say \(1 < 2\) for natural numbers, or \(\nicefrac{1}{2} = \nicefrac{2}{4}\) for rational numbers, or \(\{ a,c \} \subset \{ a,b,c \}\) for sets. The `\(<\)', `\(=\)', and `\(\subset\)' are examples of relations.

Definition 0.3.19.

Given a set \(A\text{,}\) a binary relation on \(A\) is a subset \(\sR \subset A \times A\text{,}\) which are those pairs where the relation is said to hold. Instead of \((a,b) \in \sR\text{,}\) we write \(a \, \sR \, b\text{.}\)

Example 0.3.20.

Take \(A := \{ 1,2,3 \}\text{.}\)

Consider the relation `\(<\)'. The corresponding set of pairs is \(\bigl\{ (1,2), (1,3), (2,3) \bigr\}\text{.}\) So \(1 < 2\) holds as \((1,2)\) is in the corresponding set of pairs, but \(3 < 1\) does not hold as \((3,1)\) is not in the set.

Similarly, the relation `\(=\)' is defined by the set of pairs \(\bigl\{ (1,1), (2,2), (3,3) \big\}\text{.}\)

Any subset of \(A \times A\) is a relation. Let us define the relation \(\dagger\) via \(\bigl\{ (1,2), (2,1), (2,3), (3,1) \bigr\}\text{,}\) then \(1 \dagger 2\) and \(3 \dagger 1\) are true, but \(1 \dagger 3\) is not.

Definition 0.3.21.

Let \(\sR\) be a relation on a set \(A\text{.}\) Then \(\sR\) is said to be

  1. Reflexive if \(a\, \sR \, a\) for all \(a \in A\text{.}\)

  2. Symmetric if \(a\, \sR \, b\) implies \(b \, \sR \, a\text{.}\)

  3. Transitive if \(a\, \sR \, b\) and \(b \, \sR \, c\) implies \(a\, \sR \, c\text{.}\)

If \(\sR\) is reflexive, symmetric, and transitive, then it is said to be an equivalence relation.

Example 0.3.22.

Let \(A := \{ 1,2,3 \}\) as above. The relation `\(<\)' is transitive, but neither reflexive nor symmetric. The relation `\(\leq\)' defined by \(\bigl\{ (1,1), (1,2), (1,3), (2,2), (2,3), (3,3) \bigr\}\) is reflexive and transitive, but not symmetric. Finally, a relation `\(\star\)' defined by \(\bigl\{ (1,1), (1,2), (2,1), (2,2), (3,3) \bigr\}\) is an equivalence relation.

Equivalence relations are useful in that they divide a set into sets of “equivalent” elements.

Definition 0.3.23.

Let \(A\) be a set and \(\sR\) an equivalence relation. An equivalence class of \(a \in A\text{,}\) often denoted by \([a]\text{,}\) is the set \(\{ x \in A : a\, \sR \, x \}\text{.}\)

For example, given the relation `\(\star\)' above, there are two equivalence classes, \([1] = [2] = \{ 1,2 \}\) and \([3] = \{ 3 \}\text{.}\)

Reflexivity guarantees that \(a \in [a]\text{.}\) Symmetry guarantees that if \(b \in [a]\text{,}\) then \(a \in [b]\text{.}\) Finally, transitivity guarantees that if \(a \in [b]\) and \(b \in [c]\text{,}\) then \(a \in [c]\text{.}\) In particular, we have the following proposition, whose proof is an exercise.

Example 0.3.25.

The set of rational numbers can be defined as equivalence classes of a pair of an integer and a natural number, that is elements of \(\Z \times \N\text{.}\) The relation is defined by \((a,b) \sim (c,d)\) whenever \(ad = bc\text{.}\) It is left as an exercise to prove that `\(\sim\)' is an equivalence relation. Usually the equivalence class \(\bigl[(a,b)\bigr]\) is written as \(\nicefrac{a}{b}\text{.}\)

Subsection 0.3.5 Cardinality

A subtle issue in set theory and one generating a considerable amount of confusion among students is that of cardinality, or “size” of sets. The concept of cardinality is important in modern mathematics in general and in analysis in particular. In this section, we will see the first really unexpected theorem.

Definition 0.3.26.

Let \(A\) and \(B\) be sets. We say \(A\) and \(B\) have the same cardinality when there exists a bijection \(f \colon A \to B\text{.}\) We denote by \(\abs{A}\) the equivalence class of all sets with the same cardinality as \(A\) and we simply call \(\abs{A}\) the cardinality of \(A\text{.}\)

For example, \(\{ 1,2,3 \}\) has the same cardinality as \(\{ a,b,c \}\) by defining a bijection \(f(1) := a\text{,}\) \(f(2) := b\text{,}\) \(f(3) := c\text{.}\) Clearly the bijection is not unique.

The existence of a bijection really is an equivalence relation. The identity, \(f(x) := x\text{,}\) is a bijection showing reflexivity. If \(f\) is a bijection, then so is \(f^{-1}\) showing symmetricity. If \(f \colon A \to B\) and \(g \colon B \to C\) are bijections, then \(g \circ f\) is a bijection of \(A\) and \(C\) showing transitivity. A set \(A\) has the same cardinality as the empty set if and only if \(A\) itself is the empty set: If \(B\) is nonempty, then no function \(f \colon B \to \emptyset\) can exist. In particular, there is no bijection of \(B\) and \(\emptyset\text{.}\)

Definition 0.3.27.

Suppose \(A\) has the same cardinality as \(\{ 1,2,3,\ldots,n \}\) for some \(n \in \N\text{.}\) We then write \(\abs{A} := n\text{.}\) If \(A\) is empty, we write \(\abs{A} := 0\text{.}\) In either case, we say that \(A\) is finite.

We say \(A\) is infinite or “of infinite cardinality” if \(A\) is not finite.

That the notation \(\abs{A} = n\) is justified we leave as an exercise. That is, for each nonempty finite set \(A\text{,}\) there exists a unique natural number \(n\) such that there exists a bijection from \(A\) to \(\{ 1,2,3,\ldots,n \}\text{.}\)

We can order sets by size.

Definition 0.3.28.

We write

\begin{equation*} \abs{A} \leq \abs{B} \end{equation*}

if there exists an injection from \(A\) to \(B\text{.}\) We write \(\abs{A} = \abs{B}\) if \(A\) and \(B\) have the same cardinality. We write \(\abs{A} < \abs{B}\) if \(\abs{A} \leq \abs{B}\text{,}\) but \(A\) and \(B\) do not have the same cardinality.

We state without proof that \(A\) and \(B\) have the same cardinality if and only if \(\abs{A} \leq \abs{B}\) and \(\abs{B} \leq \abs{A}\text{.}\) This is the so-called Cantor–Bernstein–Schröder theorem. Furthermore, if \(A\) and \(B\) are any two sets, we can always write \(\abs{A} \leq \abs{B}\) or \(\abs{B} \leq \abs{A}\text{.}\) The issues surrounding this last statement are very subtle. As we do not require either of these two statements, we omit proofs.

The truly interesting cases of cardinality are infinite sets. We will distinguish two types of infinite cardinality.

Definition 0.3.29.

If \(\abs{A} = \abs{\N}\text{,}\) then \(A\) is said to be countably infinite. If \(A\) is finite or countably infinite, then we say \(A\) is countable. If \(A\) is not countable, then \(A\) is said to be uncountable.

The cardinality of \(\N\) is usually denoted as \(\aleph_0\) (read as aleph-naught) 3 .

Example 0.3.30.

The set of even natural numbers has the same cardinality as \(\N\text{.}\) Proof: Let \(E \subset \N\) be the set of even natural numbers. Given \(k \in E\text{,}\) write \(k=2n\) for some \(n \in \N\text{.}\) Then \(f(n) := 2n\) defines a bijection \(f \colon \N \to E\text{.}\)

In fact, let us mention without proof the following characterization of infinite sets: A set is infinite if and only if it is in one-to-one correspondence with a proper subset of itself.

Example 0.3.31.

\(\N \times \N\) is a countably infinite set. Proof: Arrange the elements of \(\N \times \N\) as follows \((1,1)\text{,}\) \((1,2)\text{,}\) \((2,1)\text{,}\) \((1,3)\text{,}\) \((2,2)\text{,}\) \((3,1)\text{,}\) .... That is, always write down first all the elements whose two entries sum to \(k\text{,}\) then write down all the elements whose entries sum to \(k+1\) and so on. Define a bijection with \(\N\) by letting 1 go to \((1,1)\text{,}\) 2 go to \((1,2)\text{,}\) and so on. See Figure 4.

Figure 4. Showing \(\N \times \N\) is countable.

Example 0.3.32.

The set of rational numbers is countable. Proof: (informal) For positive rational numbers follow the same procedure as in the previous example, writing \(\nicefrac{1}{1}\text{,}\) \(\nicefrac{1}{2}\text{,}\) \(\nicefrac{2}{1}\text{,}\) etc. However, leave out fractions (such as \(\nicefrac{2}{2}\)) that have already appeared. The list would continue: \(\nicefrac{1}{3}\text{,}\) \(\nicefrac{3}{1}\text{,}\) \(\nicefrac{1}{4}\text{,}\) \(\nicefrac{2}{3}\text{,}\) etc. For all rational numbers, include \(0\) and the negative numbers: \(0\text{,}\) \(\nicefrac{1}{1}\text{,}\) \(\nicefrac{-1}{1}\text{,}\) \(\nicefrac{1}{2}\text{,}\) \(\nicefrac{-1}{2}\text{,}\) etc.

For completeness, we mention the following statements from the exercises. If \(A \subset B\) and \(B\) is countable, then \(A\) is countable. The contrapositive of the statement is that if \(A\) is uncountable, then \(B\) is uncountable. As a consequence, if \(\abs{A} < \abs{\N}\text{,}\) then \(A\) is finite. Similarly, if \(B\) is finite and \(A \subset B\text{,}\) then \(A\) is finite.

We give the first truly striking result. First, we need a notation for the set of all subsets of a set.

Definition 0.3.33.

The power set of a set \(A\text{,}\) denoted by \(\sP(A)\text{,}\) is the set of all subsets of \(A\text{.}\)

For example, if \(A := \{ 1,2\}\text{,}\) then \(\sP(A) = \bigl\{ \emptyset, \{ 1 \}, \{ 2 \}, \{ 1, 2 \} \bigr\}\text{.}\) In particular, \(\abs{A} = 2\) and \(\abs{\sP(A)} = 4 = 2^2\text{.}\) In general, for a finite set \(A\) of cardinality \(n\text{,}\) the cardinality of \(\sP(A)\) is \(2^n\text{.}\) This fact is left as an exercise. Hence, for a finite set \(A\text{,}\) the cardinality of \(\sP(A)\) is strictly larger than the cardinality of \(A\text{.}\) What is an unexpected and striking fact is that this statement is still true for infinite sets.


There exists an injection \(f \colon A \to \sP(A)\text{.}\) For \(x \in A\text{,}\) define \(f(x) := \{ x \}\text{.}\) Thus, \(\abs{A} \leq \abs{\sP(A)}\text{.}\)

To finish the proof, we must show that no function \(g \colon A \to \sP(A)\) is a surjection. Suppose \(g \colon A \to \sP(A)\) is a function. So for \(x \in A\text{,}\) \(g(x)\) is a subset of \(A\text{.}\) Define the set

\begin{equation*} B := \bigl\{ x \in A : x \notin g(x) \bigr\} . \end{equation*}

We claim that \(B\) is not in the range of \(g\) and hence \(g\) is not a surjection. Suppose for contradiction that there exists an \(x_0\) such that \(g(x_0) = B\text{.}\) Either \(x_0 \in B\) or \(x_0 \notin B\text{.}\) If \(x_0 \in B\text{,}\) then \(x_0 \notin g(x_0) = B\text{,}\) which is a contradiction. If \(x_0 \notin B\text{,}\) then \(x_0 \in g(x_0) = B\text{,}\) which is again a contradiction. Thus such an \(x_0\) does not exist. Therefore, \(B\) is not in the range of \(g\text{,}\) and \(g\) is not a surjection. As \(g\) was an arbitrary function, no surjection exists.

One particular consequence of this theorem is that there do exist uncountable sets, as \(\sP(\N)\) must be uncountable. A related fact is that the set of real numbers (which we study in the next chapter) is uncountable. The existence of uncountable sets may seem unintuitive, and the theorem caused quite a controversy at the time it was announced. The theorem not only says that uncountable sets exist, but that there in fact exist progressively larger and larger infinite sets \(\N\text{,}\) \(\sP(\N)\text{,}\) \(\sP(\sP(\N))\text{,}\) \(\sP(\sP(\sP(\N)))\text{,}\) etc.

Subsection 0.3.6 Exercises

Exercise 0.3.1.

Show \(A \setminus (B \cap C) = (A \setminus B) \cup (A \setminus C)\text{.}\)

Exercise 0.3.2.

Prove that the principle of strong induction is equivalent to the standard induction.

Exercise 0.3.4.

  1. Prove Proposition 0.3.16.

  2. Find an example for which equality of sets in \(f( C \cap D) \subset f (C) \cap f (D)\) fails. That is, find an \(f\text{,}\) \(A\text{,}\) \(B\text{,}\) \(C\text{,}\) and \(D\) such that \(f( C \cap D)\) is a proper subset of \(f(C) \cap f(D)\text{.}\)

Exercise 0.3.5.

(Tricky)   Prove that if \(A\) is nonempty and finite, then there exists a unique \(n \in \N\) such that there exists a bijection between \(A\) and \(\{ 1, 2, 3, \ldots, n \}\text{.}\) In other words, the notation \(\abs{A} := n\) is justified. Hint: Show that if \(n > m\text{,}\) then there is no injection from \(\{ 1, 2, 3, \ldots, n \}\) to \(\{ 1, 2, 3, \ldots, m \}\text{.}\)

Exercise 0.3.6.


  1. \(A \cap (B \cup C) = (A \cap B) \cup (A \cap C)\text{.}\)

  2. \(A \cup (B \cap C) = (A \cup B) \cap (A \cup C)\text{.}\)

Exercise 0.3.7.

Let \(A \Delta B\) denote the symmetric difference, that is, the set of all elements that belong to either \(A\) or \(B\text{,}\) but not to both \(A\) and \(B\text{.}\)

  1. Draw a Venn diagram for \(A \Delta B\text{.}\)

  2. Show \(A \Delta B = (A \setminus B) \cup (B \setminus A)\text{.}\)

  3. Show \(A \Delta B = (A \cup B) \setminus ( A \cap B)\text{.}\)

Exercise 0.3.8.

For each \(n \in \N\text{,}\) let \(A_n := \{ (n+1)k : k \in \N \}\text{.}\)

  1. Find \(A_1 \cap A_2\text{.}\)

  2. Find \(\bigcup_{n=1}^\infty A_n\text{.}\)

  3. Find \(\bigcap_{n=1}^\infty A_n\text{.}\)

Exercise 0.3.9.

Determine \(\sP(S)\) (the power set) for each of the following:

  1. \(S = \emptyset\text{,}\)

  2. \(S = \{1\}\text{,}\)

  3. \(S = \{1,2\}\text{,}\)

  4. \(S = \{1,2,3,4\}\text{.}\)

Exercise 0.3.10.

Let \(f \colon A \to B\) and \(g \colon B \to C\) be functions.

  1. Prove that if \(g \circ f\) is injective, then \(f\) is injective.

  2. Prove that if \(g \circ f\) is surjective, then \(g\) is surjective.

  3. Find an explicit example where \(g \circ f\) is bijective, but neither \(f\) nor \(g\) is bijective.

Exercise 0.3.11.

Prove by induction that \(n < 2^n\) for all \(n \in \N\text{.}\)

Exercise 0.3.12.

Show that for a finite set \(A\) of cardinality \(n\text{,}\) the cardinality of \(\sP(A)\) is \(2^n\text{.}\)

Exercise 0.3.13.

Prove \(\frac{1}{1\cdot 2} + \frac{1}{2\cdot 3} + \cdots + \frac{1}{n(n+1)} = \frac{n}{n+1}\) for all \(n \in \N\text{.}\)

Exercise 0.3.14.

Prove \(1^3 + 2^3 + \cdots + n^3 = {\left( \frac{n(n+1)}{2} \right)}^2\) for all \(n \in \N\text{.}\)

Exercise 0.3.15.

Prove that \(n^3 + 5n\) is divisible by \(6\) for all \(n \in \N\text{.}\)

Exercise 0.3.16.

Find the smallest \(n \in \N\) such that \(2{(n+5)}^2 < n^3\) and call it \(n_0\text{.}\) Show that \(2{(n+5)}^2 < n^3\) for all \(n \geq n_0\text{.}\)

Exercise 0.3.17.

Find all \(n \in \N\) such that \(n^2 < 2^n\text{.}\)

Exercise 0.3.19.

Give an example of a countably infinite collection of finite sets \(A_1, A_2, \ldots\text{,}\) whose union is not a finite set.

Exercise 0.3.20.

Give an example of a countably infinite collection of infinite sets \(A_1, A_2, \ldots\text{,}\) with \(A_j \cap A_k\) being infinite for all \(j\) and \(k\text{,}\) such that \(\bigcap_{j=1}^\infty A_j\) is nonempty and finite.

Exercise 0.3.21.

Suppose \(A \subset B\) and \(B\) is finite. Prove that \(A\) is finite. That is, if \(A\) is nonempty, construct a bijection of \(A\) to \(\{ 1,2,\ldots,n \}\text{.}\)

Exercise 0.3.22.

Prove Proposition 0.3.24. That is, prove that if \(\sR\) is an equivalence relation on a set \(A\text{,}\) then every \(a \in A\) is in exactly one equivalence class. Then prove that \(a \, \sR \, b\) if and only if \([a] = [b]\text{.}\)

Exercise 0.3.23.

Prove that the relation `\(\sim\)' in Example 0.3.25 is an equivalence relation.

Exercise 0.3.24.

  1. Suppose \(A \subset B\) and \(B\) is countably infinite. By constructing a bijection, show that \(A\) is countable (that is, \(A\) is empty, finite, or countably infinite).

  2. Use part a) to show that if \(\abs{A} < \abs{\N}\text{,}\) then \(A\) is finite.

Exercise 0.3.25.

(Challenging)   Suppose \(\abs{\N} \leq \abs{S}\text{,}\) or in other words, \(S\) contains a countably infinite subset. Show that there exists a countably infinite subset \(A \subset S\) and a bijection between \(S \setminus A\) and \(S\text{.}\)

The term “modern” refers to late 19th century up to the present.
To be completely rigorous, this equivalence is only true if we also assume as an axiom that \(n-1\) exists for all natural numbers bigger than \(1\text{,}\) which we do. In this book, we are assuming all the usual arithmetic holds.
For the fans of the TV show Futurama, there is a movie theater in one episode called an \(\aleph_0\)-plex.
Named after the German mathematician Georg Ferdinand Ludwig Philipp Cantor 5  (1845–1918).
For a higher quality printout use the PDF versions: or