˜ ˜

as n ’ ∞. Thus φφ(y) = y for all y ∈ Y . Similarly, φφ(y ) = y for all

˜

y ∈ Y . Thus φ is the inverse of φ, and φ must be bijective.

Exercise 14.1.7. Let (X, d) be a metric space.

(i) Show that

d(x, y) ’ d(u, v) ¤ d(x, u) + d(y, v),

and deduce that

|d(x, y) ’ d(u, v)| ¤ d(x, u) + d(y, v)

for all x, y, u, v ∈ X.

(ii) If xn ’ x and yn ’ y, show that d(xn , yn ) ’ d(x, y) as n ’ ∞.

Exercise 14.1.8. The proof of Lemma 14.1.5 looks quite complicated but the

di¬culty lies in asking the right questions in the right order, rather than an-

swering them. Outline the questions answered in the proof of Lemma 14.1.5,

paying particular attention to the ¬rst two paragraphs.

We can now repose question B more precisely.

Question C: If (X, d) is a metric space, can we ¬nd a complete metric space

(Z, ρ) such that Z ⊇ X, X is dense in (Z, ρ) and d(u, v) = ρ(u, v) for all u,

v ∈ X?

Question C : If (X, d) is a metric space, can we ¬nd a complete metric

˜ ˜

space (Y, d) and a map θ : X ’ Y such that θX is dense in (Y, d) and

˜

d(θu, θv) = d(u, v) for all u, v ∈ X?

We shall leave these questions aside for the moment. Instead, we shall

give some examples of how behaviour on a dense subset may force behaviour

on the whole space. (Note, however that Exercise 5.7.1 and its preceeding

discussion shows that this need not always be the case.)

Lemma 14.1.9. Suppose (X, d) is a complete metric space with a dense

subset E. Suppose that E is a vector space (over F where F = R or F = C)

E such that d(x, y) = x ’E y E for all x, y ∈ E. Then X can

with norm

be given the structure of a normed vector space with norm such that E is

a vector subspace of X and d(x, y) = x ’ y for all x, y ∈ X.

359

Please send corrections however trivial to twk@dpmms.cam.ac.uk

Proof. Suppose1 that (E, +E , —E , E , F) is a normed vector space with ad-

dition denoted by +E and scalar multiplication denoted by —E .

Suppose x, y ∈ X. We can ¬nd xn , yn ∈ E such that d(xn , x), d(yn , y) ’

0. We know that xn and yn must be Cauchy sequences, and so

d(xn +E yn , xm +E ym ) = (xn +E yn ) ’E (xm +E ym ) E

= (xn ’E xm ) +E (yn ’E ym ) E ¤ xn ’E xm + y n ’ E ym

E E

= d(xn , xm ) + d(yn , ym ) ’ 0

as n, m ’ ∞. Thus xn +yn is a Cauchy sequence in X and must have a limit

z, say. Suppose now that xn , yn ∈ E are such that d(xn , x), d(yn , y) ’ 0.

As before, xn + yn must have a limit z , say. We want to show that z = z .

To do this, observe that, since

d(z, z ) ¤ d(z, xn +E yn ) + d(xn +E yn , xn +E yn ) + d(xn +E yn , z)

= d(z, xn +E yn ) + d(z , xn +E yn ) + (xn +E yn ) ’E (xn +E yn )

= d(z, xn +E yn ) + d(z , xn +E yn ) + (xn ’E xn ) +E (yn ’E yn ) E

¤ d(z, xn +E yn ) + d(z , xn +E yn ) + xn ’E xn E + yn ’E yn E

= d(z, xn +E yn ) + d(z , xn +E yn ) + d(xn , xn ) + d(yn , yn )

¤ d(z, xn +E yn ) + d(z , xn +E yn ) + d(xn , x) + d(xn , x) + d(yn , y) + d(yn , y)

’0+0+0+0+0+0=0

as n ’ ∞, it follows that z = z . We can, therefore, de¬ne x + y as the limit

of xn +E yn when xn , yn ∈ E and d(xn , x), d(yn , y) ’ 0. Observe that, if

x, y ∈ E, then, setting xn = x, yn = y, it follows that x +E y = x + y.

We leave it as an exercise for the reader to show that if » ∈ F and x ∈ X,

we can de¬ne »x unambiguously as the limit of » —E xn when xn ∈ E and

d(xn , x) ’ 0 and that, with this de¬nition, » —E x = »x whenever » ∈ F and

x ∈ E.

Now we must check that X with the newly de¬ned operations is indeed

a vector space. This is routine. For example, if x, y ∈ X and » ∈ F we can

¬nd xn , yn ∈ E such that xn ’ x and yn ’ y. Since E is vector space,

» —E (xn +E yn ) = » —E xn +E » —E yn

But, by our de¬nitions, xn +E yn ’ x+y, so »—E (xn +E yn ) ’ »(x+y). Again,

by de¬nition, »—E xn ’ »x and »—E yn ’ »y, so »—E xn +E »—E yn ’ »x+»y.

The uniqueness of limits now gives us

»(x + y) = »x + »y.

1

This proof with its plethora of subscript E™s may help explain why mathematicians

are prepared to put up with some ambiguity in order to achieve notational simplicity.

360 A COMPANION TO ANALYSIS

The remaining axioms are checked in the same way.

Finally, we must check that there is a norm on X such that d(x, y) =

x ’ y . First, observe that, if x, y ∈ X, we can ¬nd xn , yn ∈ E such that

xn ’ x and yn ’ y. We have

d(xn , yn ) = xn ’E yn = d(xn ’E yn , 0)

E

where 0 is the zero vector in E (and so in X). Since

xn ’E yn = xn +E (’1) — yn ’ x + (’1)y = x ’ y,

we have d(xn ’E yn , 0) ’ d(x ’ y, 0). But d(xn , yn ) ’ d(x, y) and the limit

is unique, so we have

d(x, y) = d(x ’ y, 0).

We now set a = d(a, 0) and check that is indeed a norm.

Exercise 14.1.10. Fill in the gaps in the proof of Lemma 14.1.9.

(i) Show that, if » ∈ F and x ∈ X, we can de¬ne »x unambiguously

as the limit of » —E xn when xn ∈ E and d(xn , x) ’ 0 and that, with this

de¬nition, » —E x = »x whenever » ∈ F and x ∈ E.

(ii) If you are of a careful disposition, check all the axioms for a vector

space hold for X with our operations. Otherwise, choose the two which seem

to you hardest and check those.

(iii) Check that de¬ned at the end of the proof is indeed a norm.

Lemma 14.1.11. Let (X, d) be a complete metric space with a dense subset

E. Suppose that E is a vector space (over F where F = R or F = C) with

inner product , E such that d(x, y) = x ’E y E for all x, y ∈ E, where

E is the norm induced by the inner product. Then X can be given the

structure of a vector space with inner product , such that E is a vector

subspace of X and x, y E = x, y for all x, y ∈ E.

Proof. By Lemma 14.1.9, X can be given the structure of a normed vector

space with norm such that E is a vector subspace of X and d(x, y) =

x ’ y for all x, y ∈ X. We need to show that it can be given an inner

product consistent with this norm.

First, observe that, if x, y ∈ X, we can ¬nd xn , yn ∈ E such that xn ’ x

and yn ’ y. Automatically, xn and yn form Cauchy sequences. Since any

Cauchy sequence is bounded, we can ¬nd a K such that xn , yn ¤ K for

361

Please send corrections however trivial to twk@dpmms.cam.ac.uk

all n. Thus, using the Cauchy-Schwarz inequality,

| xn , yn ’ xm , ym E| = | x n , yn ’ ym E + x n ’ xm , ym E |

E

¤ | x n , yn ’ ym E | + | x n ’ xm , ym E |

¤ x n y n ’ ym + x n ’ xm y m

¤ K y n ’ ym + K x n ’ xm ’ 0 + 0 = 0

and the xn , yn E form a Cauchy sequence in F converging to t.

Suppose now that xn , yn ∈ E are such that xn ’ x and yn ’ y. As

before, xn , yn E must have a limit t , say. We want to show that t = t .

Note that, also as before, we can ¬nd a K such that xn , yn ¤ K for

all n. Thus, using the Cauchy-Schwarz inequality again,

| xn , yn ’ xn , yn E| = | x n , yn ’ yn E + x n ’ xn , yn E |

E

¤ | x n , yn ’ yn E | + | x n ’ xn , yn E |

¤ x n y n ’ yn + x n ’ xn y n

¤ K yn ’ yn + K xn ’ xn ’ 0 + 0 = 0,

so t = t . We can thus de¬ne x, y unambiguously by choosing xn , yn ∈ E

such that xn ’ x and yn ’ y and taking x, y to be the limit of xn , yn E .

Setting xn = x, yn = y for all n we see that x, y = x, y E whenever

x, y ∈ E.

We observe that, if x ∈ X and we choose xn ∈ E such that xn ’ x, then

2

’ x, x

xn = xn , xn E

and

2

= d(xn , 0)2 ’ d(x, 0)2 = x 2

xn

so, by the uniqueness of limits, x 2 = x, x . The veri¬cation that , is

an inner product on X is left to the reader.

(An alternative proof which uses an important link between inner prod-

ucts and their derived norms is oulined in Exercises K.297 and K.298.)

Exercise 14.1.12. Complete the proof of Lemma 14.1.11 by showing that

, is an inner product on X.

By now the reader should have the formed the opinion that what seemed

to be a rather di¬cult proof technique when ¬rst used in the proof of Lemma 14.1.5

is, in fact, rather routine (though requiring continuous care). If she requires

further convincing, she can do Exercise K.299.

362 A COMPANION TO ANALYSIS

14.2 The solution

In this section we show how, starting from a metric space (X, d), we can

˜

construct a complete metric space (Y, d) and a map θ : X ’ Y such that θX

˜ ˜

is dense in (Y, d) and d(θu, θv) = d(u, v) for all u, v ∈ X. We give a direct

proof which can serve as a model in similar circumstances. (Exercise K.300

gives a quick and elegant proof which, however, only applies in this particular

case.)

We start by considering the space X of Cauchy sequences x = (xn )∞ . n=1

We write x ∼ y if x, y ∈ X and d(xn , yn ) ’ 0 as n ’ ∞. The reader should

verify the statements made in the next exercise.

Exercise 14.2.1. Suppose x, y, z ∈ X. Then

(i) x ∼ x.

(ii) If x ∼ y, then y ∼ x.

(ii) If x ∼ y and y ∼ z, then x ∼ z.

We write [x] = {y : y ∼ x} and call [x] the equivalence class of x. We

take Y to be the set of all such equivalence classes. The reader should verify

the statement made in the next exercise.

Exercise 14.2.2. Each x ∈ X belongs to exactly one equivalence class in Y .

We now want to de¬ne a metric on Y . The de¬nition follows a pattern

which is familiar from the previous section. Suppose a, b ∈ Y . If a = [x] and

b = [y], then

|d(xn , yn ) ’ d(xm , ym )| ¤ |d(xn , yn ) ’ d(xn , ym )| + |d(xn , ym ) ’ d(xm , ym )|

¤ d(yn , ym ) + d(xn , xm ) ’ 0 + 0 = 0

as n, m ’ ∞. Thus d(xn , yn ) is a Cauchy sequence in R and tends to a limit

t, say. Suppose now that a = [x ] and b = [y ]. Since

|d(xn , yn ) ’ d(xn , yn )| ¤ |d(xn , yn ) ’ d(xn , yn )| + |d(xn , yn ) ’ d(xn , yn )|

¤ d(yn , yn ) + d(xn , xn ) ’ 0 + 0 = 0,

˜

we have d(xn , yn ) ’ t as n ’ ∞. We can thus de¬ne d([x], [y]) unambigu-

ously by

˜

d([x], [y]) = lim d(xn , yn ).

n’∞

The reader should verify the statements made in the next exercise.

363

Please send corrections however trivial to twk@dpmms.cam.ac.uk

˜

Exercise 14.2.3. (i) (Y, d) is a metric space.

(ii) If we set θ(x) = [(x, x, x, . . . )], then θ is a map from X to Y such

˜

that d(θu, θv) = d(u, v).

˜

(iii) If a ∈ Y , then a = [x] for some x ∈ X. Show that d(θ(xn ), a) ’ 0

˜

as n ’ ∞, and so θ(X) is dense in (Y, d).

Lemma 14.2.4. If we adopt the hypotheses and notation introduced in this

˜

section, then (Y, d) is complete.

˜

Proof. Suppose [y(1)], [y(2)], . . . is a Cauchy sequence in (Y, d). Since θ(X)

˜ ˜

is dense in (Y, d), we can ¬nd xj ∈ X such that d(θ(xj ), [y(j)]) < j ’1 . We

observe that

˜

d(xj , xk ) = d(θ(xj ), θ(xk ))

˜ ˜ ˜

¤ d(θ(xj ), [y(j)]) + d(θ(xk ), [y(k)]) + d([y(j)], [y(k)])

˜

< j ’1 + k ’1 + d([y(j)], [y(k)]),

and so the xj form a Cauchy sequence in (X, d).

˜

We now wish to show that d([y(n)], [x]) ’ 0 as n ’ ∞. Let > 0 be

given. Since the sequence [y(n)] is Cauchy, we can ¬nd an M such that

˜

d([y(j)], [y(k)]) < /2 for all j, k ≥ M .

We now choose N ≥ M such that N ’1 < /6 and observe that the inequality

proved in the last paragraph gives

d(xj , xk ) < 5 /6 for all j, k ≥ N .

Thus

˜

d([x], θ(xk )) ¤ 5 /6 for all k ≥ N ,

and

˜ ˜ ˜

d([x], [y(k)]) ¤ d([x], θ(xk )) + d(θ(xk ), [y(k)]) < 5 /6 + k ’1 ¤ 5 /6 + N ’1 <

for all k ≥ N , so we are done.

Combining the results of this section with Lemma 14.1.5, we have the

following theorem.

Theorem 14.2.5. If (X, d) is a metric space, then there exists an essentially

˜ ˜ ˜

unique complete metric space (Y, d) such that X is dense in (Y, d) and d

restricted to X 2 is d.

364 A COMPANION TO ANALYSIS

˜

We call (Y, d) the completion of (X, d). Lemma 14.1.9 and Lemma 14.1.11

now give the following corollaries.

Lemma 14.2.6. (i) The completion of a normed vector space is a normed

vector space.

(ii) The completion of an inner product vector space is an inner product

vector space.

Exercise 14.2.7. Readers of a certain disposition (with which the author

sometimes sympathises and sometimes does not) will ¬nd the statements of

Theorem 14.2.5 and Lemma 14.2.6 unsatisfactorily lax. Redraft them in the

more rigorous style of Lemma 14.1.5 and Lemmas 14.1.9 and 14.1.11.

Exercise 14.2.8. If (X, d) is already complete, then the uniqueness of the

˜

completion means that (essentially) (X, d) = (Y, d). Go through the construc-

˜

tion of (Y, d) in this section and explain in simple terms why the construction

does indeed work as stated.

Why do we construct the reals? ™

14.3

Mathematicians know from experience that it is usually easier to ask ques-

tions than to answer them. However, in some very important cases, asking

the correct question may represent a greater intellectual breakthrough than

answering it.

From the time the study of Euclid™s Elements was reintroduced into Eu-

rope until about 1750, few of those who studied it can have doubted that it

described the geometry of the world, or, accepting that it did describe the

geometry of the world, asked why it did so. Amongst those who did was the

philosopher Kant who proposed the following interesting answer. According

to Kant, there exist certain a priori principles such that our minds are bound

to interpret the world in terms of these a priori principles. The constraints

imposed by the axioms of Euclidean geometry provided an example of such

a priori principles.

Some mathematicians felt that the axioms stated by Euclid were not

yet in the best form. In particular, they believed that the so called parallel

axiom2 was unsatisfactory and they sought to prove it from the other axioms.

Between 1750 and 1840 a number of independent workers saw that the

questions.

(A) Why is the parallel axiom true?

2

This is now usually stated in the following form. Given a line l and a point P not on

the line, there exists one and only one line l through P which does not intersect l.

365

Please send corrections however trivial to twk@dpmms.cam.ac.uk

(B) Why is Euclid™s geometry true?

should be replaced by the new questions

(A ) Do there exist mathematical systems obeying all the axioms of Euclid

except the parallel axiom and disobeying the parallel axiom?

(B ) Does Euclid™s geometry describe the world?

Although Gauss did not publish his views, he asked both of these new

questions and answered yes to the ¬rst and no to the second. He even tried

to check the validity of Euclidean geometry by physical experiment3 . In a

private letter he stated that geometry should not be classed ˜with arithmetic,

which is purely a priori, but with mechanics™.

Gauss, Bolyai, Lobachevsky and Riemann developed the theory of various

non-Euclidean geometries but did not show that they were consistent. Thus

although they believed strongly that the answer to question (A ) was no,

they did not prove it. (Certainly they did not prove it explicitly.) However,

in the second half of the 19th century mathematicians (notably Beltrami)

showed how to construct models for various non-Euclidean geometries using

Euclidean geometry. In modern terms, they showed that, if Euclid™s axioms

(including the parallel axiom) form a consistent system, then the same set

of axioms with the parallel axiom replaced by some appropriate alternative

axiom remains consistent.

Finally, in 1917, Einstein™s theory of general relativity con¬rmed the sus-

picion voiced from time to time by various farsighted mathematicians that

question (B ) should be replaced by

(B ) Is Euclid™s geometry the most convenient description of the world?

and that the answer to the new question was no.

In 1800, Euclid™s geometry was the only rigorously developed part of

mathematics. Initially, those, like Cauchy and Gauss, who sought to rigorise

the calculus, took the notion of magnitude (or as we would say real number)

as implicit (or, as Kant would say, a priori) but it gradually became clear that

the rigorous development of the calculus depended on (or was identical with)

a close study of the real numbers. However, as I tried to show in Chapter 2,

the properties of the real line that we want are by no means evident.

Faced with this problem, the natural way forward is to show that the

complicated system constituted by the real numbers can be constructed from

the simpler system of the rationals. Thus, if we believe that the rationals

exist, we must believe that the reals exist. (Gauss and others had already

shown how to construct C from R showing that, if the real numbers are

3

Euclid uses the parallel axiom to prove that the sum of the angles of a triangle add up

to 180 degrees. Gauss used surveying instruments to measure the angles of a triangle of

mountain peaks. To within the limits of experimental error, the results con¬rmed Euclid™s

prediction.

366 A COMPANION TO ANALYSIS

acceptable, then so are the complex numbers.) Simpler constructions enabled

mathematicians to obtain the rationals from the integers and the integers

from the positive integers.

In the next sequence of exercises we outline these simpler constructions.

The full veri¬cation of the statements made is long and tedious and the

reader should do as much or little as she wishes. Nothing in what follows

depends on these exercises. I have felt free to use notions like equivalence

relations, integral domains and so on which I have avoided in the main text.

Exercise 14.3.1. (Obtaining Z from N.) Here N = {0, 1, 2, . . . }. We

take (N, +, —) with its various properties as given. Show that the relation

(r, s) ∼ (r , s ) if r + s = r + s

is an equivalence relation on N2 .

We write Z = N2 / ∼ for the set of equivalence classes

[(r, s)] = {(r , s ) ∈ N2 : (r, s) ∼ (r , s )}.

Show that the following give well de¬ned operations

[(r, s)] + [(u, v)] = [(r + u, s + v)], [(r, s)] — [(u, v)] = [(ru + sv, su + rv)].

Explain why you think we chose these de¬nitions. Show that (Z, +, —) obeys

the algebraic rules that it should (in more formal terms, that (Z, +, —) is an

integral domain). If you just wish to prove a selection of results, you might

choose

(a) There exists a unique 0 ∈ Z such that x + 0 = x for all x ∈ Z.

(b) If x ∈ Z, then we can ¬nd a y ∈ Z such that x + y = 0 (we write

y = ’x).

(c) If x, y, z ∈ Z, then x — (y + z) = x — y + x — z.

(d) There exists a unique 1 ∈ Z such that x — 1 = x for all x ∈ Z.

(e) If z = 0 and x — z = y — z, then x = y.

Show that the mapping θ : N ’ Z given by θ(n) = [(n, 0)] is injective

and preserves addition and multiplication (that is θ(n + m) = θ(n) + θ(m)

and θ(n — m) = θ(n) — θ(m)). Explain brie¬‚y why this means that we may

consider N as a subset of Z.

Show that, if P = θ(N), then P obeys the appropriate version of axioms

(P1) to (P3) in Appendix A. We write x > y if x + (’y) ∈ P.

Exercise 14.3.2. (Obtaining Q from Z.) We take (Z, +, —) with its var-

ious properties as given. Show that the relation

(r, s) ∼ (r , s ) if r — s = s — r

367

Please send corrections however trivial to twk@dpmms.cam.ac.uk

is an equivalence relation on Z — (Z \ {0}).

We write Q = Z — (Z \ {0})/ ∼ for the set of equivalence classes

[(r, s)] = {(r , s ) ∈ Z — (Z \ {0}) : (r, s) ∼ (r , s )}.

Show that the following give well de¬ned operations

[(r, s)] + [(u, v)] = [(rv + su, sv)], [(r, s)] — [(u, v)] = [(ru, sv)].

Explain why you think we chose these de¬nitions. Show that (Q, +, —) obeys

the algebraic rules (A1) to (A4), (M1) to (M4) and (D) set out in Appendix A

(in algebraists™ terms (Q, +, —) is a ¬eld). If you just wish to prove a

selection of results you might choose

(a) There exists a unique 0 ∈ Q such that x + 0 = x for all x ∈ Q.

(b) If x ∈ Q, then we can ¬nd a y ∈ Q such that x + y = 0 (we write

y = ’x).

(c) If x, y, z ∈ Q, then x — (y + z) = x — y + x — z.

(d) There exists a unique 1 ∈ Z such that x — 1 = x for all x ∈ Q.

(e) If x = 0, then we can ¬nd a y ∈ Q such that xy = 1 .

De¬ne an appropriate P which obeys the appropriate version of axioms

(P1) to (P3) in Appendix A. [If you cannot do this, you do not understand

what is going on.] We write x > y if x + (’y) ∈ P. In algebraists™ terms,

(Q, +, —, >) is an ordered ¬eld.

De¬ne an injective map θ : Z ’ Q which preserves addition, multiplica-

tion and order (thus n > m implies θ(n) > θ(m)) and show that it has the

stated properties. Explain brie¬‚y why this means that we may consider Z as

a subset (more strictly as a sub-ordered integral domain) of Q.

Exercise 14.3.3. (Obtaining C from R.) This is somewhat easier than

the previous two exercises, since it does not involve equivalence classes and

it is obvious that the de¬nitions actually work. We take (R, +, —) with its

various properties as given.

We write C = R2 and de¬ne the following operations on C.

(r, s) + (u, v) = (r + u, s + v), (r, s) — (u, v) = (ru ’ sv, rv + su).

Explain why you think we chose these de¬nitions. Show that (C, +, —)

obeys the algebraic rules (A1) to (A4), (M1) to (M4) and (D) set out in

Appendix A (in algebraists™ terms (C, +, —) is a ¬eld). If you just wish to

prove a selection of results you might choose

(a) There exists a unique 0 ∈ C such that z + 0 = z for all z ∈ C.

(b) If z ∈ C, then we can ¬nd a w ∈ C such that z + w = 0 (we write

w = ’z).

368 A COMPANION TO ANALYSIS

(c) If z, w, u ∈ C, then z — (w + u) = z — w + z — u.

(d) There exists a unique 1 ∈ C such that z — 1 = z for all z ∈ C.

(e) If z = 0, then we can ¬nd a w ∈ C such that zw = 1 .

De¬ne an appropriate injective map θ : R ’ C which preserves addition

and multiplication (thus θ is a ¬eld morphism) and show that it has the

stated properties. Explain brie¬‚y why this means that we can consider R as

a sub¬eld of C.

Show that (0, 1) — (0, 1) = θ(’1). Show that, if we adopt the usual con-

vention of considering R as a sub¬eld of C and writing i = (0, 1), then every

element z of C can be written as z = a + bi with a, b ∈ R in exactly one way.

Show also that (0, ’1) — (0, ’1) = θ(’1). [This corresponds to the ambi-

guity introduced by referring to i as ˜the square root of ’1™. In C there are

two square roots of ’1 and there is no reason to prefer one square root rather

than the other.]

Mathematicians and philosophers have by now become so skilled in doubt

that, whatever their private beliefs, few of them would care to maintain in

public that the system of the positive integers (often called the ˜natural

numbers™) is given a priori4 . None the less, most mathematicians feel that,

in some sense, the system of the positive integers is much more ˜basic™ or

˜simple™ or ˜natural™ than, say, the system of the complex numbers. If we can

construct the complex numbers from the integers we feel that much more

con¬dent in the system of the complex numbers.

Much of mathematics consists in solving problems and proving theorems,

but part of its long term progress is marked by changing and extending the

nature of the problems posed. To a mathematician in 1800, the question ˜Can

you construct the reals from the rationals?™ would be incomprehensible. One

of the many obstacles to understanding the question would be the nature of

the object constructed. Let us look at our construction of Z from N in

Exercise 14.3.1. We say, in e¬ect, that ˜the integer 1 is the equivalence

class of ordered pairs (n, m) of natural numbers such that their di¬erence

n ’ m is the natural number 1™. The non-mathematician is entitled to object

that, whatever the integer 1 may be, it is certainly not that. To this, the

mathematician replies that the system constructed in Exercise 14.3.1 has

exactly the properties that we wish the integers to have and ˜if it looks like

4

It is all very well to maintain that one plus one must always make two but this fails

when the one plus one are rabbits of di¬erent sexes or rain drops which run together.

A stronger objection deals with statements like n + m = m + n. This statement might,

perhaps, be a priori true if n = 2 and m = 3 but can it really be a priori true if n and m

are integers so large that I could not write them down in my lifetime?

369

Please send corrections however trivial to twk@dpmms.cam.ac.uk

a duck, swims like a duck and quacks like a duck then it is a duck5 ™. To this,

the philosopher objects that she can construct a clockwork toy that looks like

a duck, swims like a duck and quacks like a duck. The mathematician replies

that, if all that we want from a duck is that it should look like a duck, swim

like a duck and quack like a duck, then, for our purposes, the clockwork toy

is a duck.

How do we construct the reals? ™

14.4

As the reader no doubt expects, we model the construction of the reals from

the rationals on the completion arguments in Sections 14.1 and 14.2. As the

reader, no doubt, also expects, the construction will be long and relatively

easy with a great deal of routine veri¬cation left to her. However, it is impor-

tant not to underestimate the subtlety of the task. Our earlier completion

arguments used the existence and properties of the real numbers, our new

arguments cannot. The reader should picture a street mime juggling non-

existent balls. As the mime continues, the action of juggling slowly brings

the balls into existence, at ¬rst in dim outline and then into solid reality.

We take the ordered ¬eld (Q, +, —, >) as given. We recall that our

de¬nitions of convergence and Cauchy sequence apply in any ordered ¬eld.

If xn ∈ Q and x ∈ Q, we say that xn ’ x if, given any ∈ Q with > 0, we

can ¬nd an N ( ) > 0 such that

|xn ’ x| < for all n ≥ N ( ) > 0.

We say that a sequence xn in Q is Cauchy if, given any ∈ Q with > 0, we

can ¬nd an N ( ) > 0 such that

|xn ’ xm | < for all n, m ≥ N ( ) > 0.

We start by considering the space X of Cauchy sequences x = (xn )∞ in

n=1

Q. We write x ∼ y if x, y ∈ X and xn ’ yn ’ 0 as n ’ ∞. The reader

should verify the statements made in the next exercise.

Exercise 14.4.1. Suppose x, y, z ∈ X. Then

(i) x ∼ x.

(ii) If x ∼ y, then y ∼ x.

(ii) If x ∼ y and y ∼ z, then x ∼ z.

5

This should be contrasted with the position in medical diagnosis. ˜If it looks like a

duck, swims like a duck and barks like a dog, then it is a duck. Every case presents atypical

symptoms.™

370 A COMPANION TO ANALYSIS

We write [x] = {y : y ∼ x} and call [x] the equivalence class of x. We

take R to be the set of all such equivalence classes. The reader should verify

the statements made in the next exercise.

Exercise 14.4.2. Each x ∈ X belongs to exactly one equivalence class in R.

Lemma 14.4.3. (i) If x, y ∈ X, then, taking x+y to be the sequence whose

nth term is xn + yn , we have x + y ∈ X.

(ii) If x, y, x , y , ∈ R and x ∼ x , y ∼ y , then x + y ∼ x + y .

(iii) If x, y ∈ X, then, taking x — y to be the sequence whose nth term

is xn — yn , we have x — y ∈ X.

(iv) If x, y, x , y , ∈ R and x ∼ x , y ∼ y , then x — y ∼ x — y .

Proof. I leave parts (i) and (ii) to the reader.

(iii) Since x and y are Cauchy sequences they are bounded, that is we

can ¬nd a K such that |xn |, |yn | ¤ K for all n. Thus (writing ab = a — b)

|xn yn ’xm ym | ¤ |xn yn ’ xn ym | + |xn ym ’ xm ym |

= |xn ||yn ’ ym | + |ym ||xn ’ xm | ¤ K(|yn ’ ym | + |xn ’ xm |),

and so the xn yn form a Cauchy sequence.

(iv) As in (iii), we can ¬nd a K such that |xn |, |yn | ¤ K for all n, and so

|xn yn ’xn yn | ¤ |xn yn ’ xn yn | + |xn yn ’ xn yn |

= |xn ||yn ’ yn | + |yn ||xn ’ xn | ¤ K(|yn ’ yn | + |xn ’ xn |) ’ 0

as n ’ ∞, as required.

Lemma 14.4.3 tells us that we can de¬ne addition and multiplication of real

numbers by the formulae

[x] + [y] = [x + y], [x] — [y] = [x — y].

Lemma 14.4.4. The system (R, +, —) is a ¬eld (that is to say, it satis¬es

the rules (A1) to (A4), (M1) to (M4) and (D) set out in Appendix A.)

Proof. I claim, and the reader should check, that the veri¬cation is simple

except possibly in the case of rule (M4). (We take 0 = [a] with aj = 0 for all

j and 1 = [b] with bj = 1 for all j.)

Rule (M4) states that, if [x] ∈ R and [x] = [a] (where aj = 0 for all j),

then we can ¬nd a [y] ∈ R such that [x] — [y] = [b] (where bj = 1 for all j).

To prove this, observe that, if [x] = [a], then xn = xn ’ an 0 as n ’ 0.

Thus we can ¬nd an ∈ Q with > 0 such that, given any N , we can ¬nd

371

Please send corrections however trivial to twk@dpmms.cam.ac.uk

an M ≥ N such that |xM | > . Since the sequence xn is Cauchy, we can ¬nd

an N (0) such that

|xn ’ xm | < /2 for all n, m ≥ N (0) > 0.

Choose N (1) ≥ N (0) such that |xN (1) | > . We then have

|xn | ≥ |xN (1) | ’ |xn ’ xN (1) | > /2

for all n ≥ N (0).

Set yj = 1 for j < N (0) and yj = x’1 for j ≥ N (0). If n, m ≥ N (0) we

j

have

|yn ’ ym | = |xn ’ xm ||xn |’1 |xm |’1 ¤ 4 ’2

|xn ’ xm |,

so the sequence yn is Cauchy. Since xj yj = 1 = bj for all j ≥ N (0), we have

[x] — [y] = [b], as required.

We now introduce an order on R. This is an essential step before we

can introduce distance and limits. We shall need the result of the following

exercise.

Exercise 14.4.5. (i) Let aj = 0 for all j. By adapting the argument of

the third paragraph of the proof of Lemma 14.4.4, show that, if [x] ∈ R and

[x] = [a], then either

(a) there exists an N > 0 and an ∈ Q with > 0 such that xn >

for all n ≥ N , or

(b) there exists an N > 0 and an ∈ Q with > 0 such that ’xn >

for all n ≥ N .

(ii) Suppose that [x] ∈ R and there exists an N > 0 and an ∈ Q with

> 0 such that xn > for all n ≥ N . Show that, if [y] = [x], then there

exists an N > 0 such that yn > /2 for all n ≥ N .

Lemma 14.4.6. The set P of [x] ∈ R such that there exists an N > 0 and

an ∈ Q with > 0 such that xn > for all n ≥ N is well de¬ned. It has

the following properties.

(P1) If [x], [y] ∈ P, then [x] + [y] ∈ P.

(P2) If [x], [y] ∈ P, then [x] — [y] ∈ P.

(P3) If [x] ∈ R, then one and only one of the following three statements

is true [x] ∈ P, ’[x] ∈ P or [x] = [a] where aj = 0 for all j.

Proof. The fact that P is well de¬ned follows from part (ii) of Exercise 14.4.5.

If 1 ∈ Q, 1 > 0, xn ≥ 1 for n ≥ N1 and 2 ∈ Q, 2 > 0, yn ≥ 2 for n ≥ N2 ,

372 A COMPANION TO ANALYSIS

then xn + yn ≥ 1 + 2 > 0 and xn — yn ≥ 1 — 2 > 0 for all n ≥ max(N1 , N2 ),

so conclusions (P1) and (P2) follow.

By part (i) of Exercise 14.4.5, we know that, if [x] ∈ R, then at least

one of the three statements [x] ∈ P, ’[x] ∈ P or [x] = [a] is true. Since the

statements are exclusive, exactly one is true.

As usual, we write [x] > [y] if [x] ’ [y] ∈ P. If [x] ∈ R then, as we have just

shown, either [x] ∈ P and we de¬ne |[x]| = [x] or ’[x] ∈ P and we de¬ne

|[x]| = ’[x] or [x] = [a] (where aj = 0 for all j) and we set |[x]| = [a].

If w ∈ Q, we set wn = w for all n and write θ(w) = [w]. We leave it to

the reader to verify the next lemma.

Lemma 14.4.7. The map θ : Q ’ R is injective and preserves addition,

multiplication and order.

Note that θ(0) = [a] where aj = 0 for all j.

We have now shown that (R, +, —, >) is an ordered ¬eld and so our

de¬nitions of convergence and Cauchy sequence apply. If [x(n)] ∈ R and

[x] ∈ R, we say that [x(n)] ’ [x], if given any [ ] ∈ R with [ ] > θ(0), we

can ¬nd an N ([ ]) > 0 such that

|[x(n)] ’ [x]| < [ ] for all n ≥ N ([ ]).

We say that a sequence [x(n)] in R is Cauchy if, given any [ ] ∈ R, with

[ ] > θ(0) we can ¬nd an N ([ ]) > 0 such that

|[x(n)] ’ [x(m)]| < [ ] for all n, m ≥ N ([ ]).

Lemma 14.4.8. Given any [ ] ∈ R with [ ] > θ(0), we can ¬nd a y ∈ Q

with y > 0 such that [ ] > θ(y).

Proof. This follows from our de¬nition of > via the set P and part (ii) of

Exercise 14.4.5.

Lemma 14.4.9. Given any [x], [y] ∈ R with [x] > [y], we can ¬nd a z ∈ Q

such that [x] > θ(z) > [y].

Proof. By de¬nition, [x] ’ [y] > θ(0) so, by Lemma 14.4.8, we can ¬nd a

v ∈ Q with v > 0 such that [x] ’ [y] > θ(v) > θ(0). Setting u = v/5, we

obtain a u ∈ Q with u > 0 such that [x] ’ [y] > θ(5u) > θ(0). It follows that

we can ¬nd an M such that

xj ’ yj > 4u for all j ≥ M .

373

Please send corrections however trivial to twk@dpmms.cam.ac.uk

Now choose an N ≥ M such that |yj ’ yk | < u and so, in particular,

yj ’ yk > ’u for all j, k ≥ N .

Set z = yN + 2u. By the displayed inequalities just obtained, we have

xk > 4u + yk = (yN + 2u) + (yk ’ yN ) + 2u > z + u

and

yk < y N + u ¤ z ’ u

for all k ≥ N . Thus [x] > θ(z) > [y] as required.

Up to now, our arguments have only used the fact that (Q, +, —, >) is

an ordered ¬eld. We now use the fact speci¬c to (Q, +, —, >) that every

rational x > 0 can be written as x = p/q with p and q strictly positive

integers. Since p/q ≥ 1/q, Lemma 14.4.8 implies the axiom of Archimedes.

Lemma 14.4.10. We have θ(1/n) ’ θ(0) as n ’ ∞.

Exercise 14.4.11. Give the details of the proof of Lemma 14.4.10.

We now show that every Cauchy sequence in (R, +, —, >) converges.

The proof runs along the same lines as the proof of Lemma 14.2.4

Lemma 14.4.12. If [y(1)], [y(2)], [y(3)], . . . is a Cauchy sequence in R,

then we can ¬nd an [x] ∈ R such that [y(j)] ’ [x] as j ’ ∞.

Proof. Since [y(j)] + θ(j ’1 ) > [y(j)], it follows from Lemma 14.4.9 that we

can ¬nd an xj ∈ Q such that

[y(j)] + θ(j ’1 ) > θ(xj ) > [y(j)]

and so

|θ(xj ) ’ [y(j)]| < j ’1 .

We observe that

θ(|xj ’ xk |) = |θ(xj ) ’ θ(xk )|

¤ |θ(xj ) ’ [y(j)]| + |θ(xk ) ’ [y(k)]| + |[y(j)] ’ [y(k)]|

< θ(j ’1 ) + θ(k ’1 ) + |[y(j)] ’ [y(k)]|.

374 A COMPANION TO ANALYSIS

Given any ∈ Q, we can ¬nd an N such that N ’1 < /3 and |[y(j)]’[y(k)]| <

θ( /3) for all j, k ≥ N . The results of this paragraph show that

|xj ’ xk | < /3 + /3 + /3 =

for all j, k ≥ N , and so the xj form a Cauchy sequence.

We now wish to show that [y(n)] ’ [x] as n ’ ∞. Let [ ] > θ(0) be

given. Since the sequence [y(n)] is Cauchy, we can ¬nd an M such that

|[y(j)] ’ [y(k)]| < θ(1/2)[ ] for all j, k ≥ M .

We now use the axiom of Archimedes (Lemma 14.4.10) to show that we can

pick an N ≥ M such that θ(N ’1 ) < θ(1/6)[ ] and observe that the inequality

proved in the last paragraph gives

|θ(xj ) ’ θ(xk )| < θ(1/2)[ ] for all j, k ≥ N .

Thus

|[x] ’ θ(xk )| ¤ θ(1/2)[ ] for all k ≥ N ,

and

|[x] ’ [y(k)]| ¤ |[x] ’ θ(xk )| + |θ(xk ) ’ [y(k)]| ¤ θ(1/2)[ ] + θ(N ’1 ) < [ ]

for all k ≥ N , so we are done.

Exercise 4.6.22 shows that the general principle of convergence and the axiom

of Archimedes together imply the fundamental axiom of analysis. We have

thus proved our desired theorem.

Theorem 14.4.13. (R, +, —, >) is an ordered ¬eld satisfying the funda-

mental axiom of analysis.

As we remarked earlier, most of the construction given in this section

applies to any ordered ¬eld. In Appendix G we push this idea a little further.

As might be expected from Lemma 14.1.5 and the accompanying discussion,

the real number system is unique (in an appropriately chosen sense). The

details are given in Appendix A.

375

Please send corrections however trivial to twk@dpmms.cam.ac.uk

Paradise lost? ™™

14.5

Morris Kline once wrote that ˜A proof tells us where to concentrate our

doubts.™ Our construction of the reals from the rationals is such a remarkable

result that it merits further examination. We know that the rationals are

countable and the reals are uncountable. Where in the proof does the switch

from countable to uncountable occur? Inspection of the proof shows that

the switch happens right at the beginning with the sentence ˜We start by

considering the space X of Cauchy sequences x = (xn )∞ in Q.™ If pressed

n=1

to justify this step I might split it into two:-

(a) Consider the set QN of all sequences x = (xn )∞ in Q (that is all

n=1

functions x : N ’ Q, where N is the set of strictly positive integers).

(b) Consider the subset X of QN consisting of all sequences which satisfy

the Cauchy condition.

Unfortunately my justi¬cation resembles the following pair of de¬nitions.

(a ) Consider the set „¦ of all sets.

(b ) Consider the subset ¦ of „¦ consisting of all A ∈ „¦ such that A ∈ A.

/

As the reader no doubt knows, the de¬nition of ¦ leads to a contradiction.

If ¦ ∈ ¦ then ¦ ∈ ¦, but if ¦ ∈ ¦ then ¦ ∈ ¦.

/ /

There are two schools of thought regarding this paradox. The radical

school wishes to ban all de¬nitions of the form (a) and (b) from mathematics.

At ¬rst sight this seems an easy task “ a return to the simpler and healthier

practices of our ancestors. However, a householder who investigates a trace of

dry rot in one room may well discover that her whole house is so riddled with

fungus that the whole structure must be torn down and rebuilt. It turns out

that classical mathematics (that is the kind of mathematics discussed in this

book) depends so much on de¬nitions and arguments which are unacceptable

to the radicals that they have to rebuild mathematics from scratch.

The various schemes for rebuilding restrict mathematics to the study

of ˜constructible objects™ and break decisively even at the most elementary

level with the traditional practices of mathematicians. A striking example of

this break is that, in the rebuilt structure, all functions are continuous. In

Appendix H, I try to indicate why this is so.

The conservative school, to which most mathematicians belong, says that

there are ˜natural rules™ which bar at least one of (a ) and (b ) whilst al-

lowing both (a) and (b). Just as there are di¬erent radical theories with

di¬erent criteria as to what is a ˜constructible object™, so there are di¬erent

systems of conservative ˜natural rules™. The standard system of rules (the

Zermelo-Fraenkel system) is set out with great charm and clarity in Hal-

mos™s little masterpiece Naive Set Theory [20]. Halmos also shows how the

system of positive integers can be constructed in a rather natural way using

376 A COMPANION TO ANALYSIS

the Zermelo-Fraenkel axioms. The axioms for set theory must be supple-

mented by a system of rules setting out what kind of statements are allowed

in mathematics6 and what laws of inference we may use. Once you have read

Halmos, then Johnstone™s Notes on Logic and Set Theory [26] provide an

excellent short7 but rigorous account of all that the interested non-specialist

needs to know.

All of the analysis in this book and almost all of mathematics that

is presently known can be deduced from the small collection of Zermelo-

Fraenkel axioms in the same way and with the same standard of rigour8 as

Euclid deduced his geometry from his axioms9 . However, plausible as the

Zermelo-Fraenkel axioms seem, their plausibility is based on our experience

with ¬nite sets and to construct classical mathematics we must apply these

axioms to in¬nite sets which lie outside our physical and mental experience.

We cannot therefore be sure that the axioms may not lead to some subtle

contradiction10 . Kline quotes Poincar´, a witty, though not very profound,

e

critic of set theory as saying ˜We have put a fence around the herd to protect

it from the wolves but we do not know whether some wolves were not already

within the fence.™

Few mathematicians of the conservative school seem worried by this.

Partly, this is because in nearly a century of continuous scrutiny no one

has produced a contradiction. (Personally, I am less convinced that the hu-

man race will survive the next hundred years than that the Zermelo-Fraenkel

system will survive the century.) Partly it is because, if a contradiction ap-

peared in the Zermelo-Fraenkel system, we could simply use one of the rival

conservative systems of axioms or, if the worst came to the worst, we could

admit that the radicals were right, after all, and study one of their recon-

structed systems. (Modern mathematics thus appears as a ship towing a

long line of lifeboats. If the ship sinks we move to the ¬rst lifeboat. If the

¬rst lifeboat sinks we move to the second and so on.) Partly it is because

the failure of the Zermelo-Fraenkel system would be one of the most inter-

6

Thus excluding ˜This sentence is false or meaningless™, ˜The smallest positive integera

not de¬nable in less than ¬fty English words™, ˜When does the next train leave for London?™

and so on.

7

I emphasise the word short. The path from the Zermelo-Fraenkel axioms to the

positive integers can be traversed easily in a 24 hour course.

8

I choose my words carefully. Modern mathematicians have found gaps in Euclid™s

reasoning but none of these is large enough to invalidate the work.

9

The massive Bourbaki volumes constitute a proof of this statement.

10

Hilbert tried to avoid this problem by ¬nding a model for the Zermelo-Fraenkel system

inside some system like those proposed by the radicals which would be ˜obviously™ free

of contradiction. Most mathematicians agree that G¨del™s theorem makes it extremely

o

unlikely that such a programme could be successful.

377

Please send corrections however trivial to twk@dpmms.cam.ac.uk

esting things that could happen to mathematics. And ¬nally, we have come

to accept that no really interesting part of mathematics can be proved free

of contradiction11 .

So far in this section, I have tried to give a fair representation of the

thoughts of others. For the short remainder of this section, I shall give my

own views. Mathematics changes slowly. After ten years, some problems

have been solved and some new problems have arisen, but the subject is

recognisably the same. However, over a century it changes in ways which

mathematicians at the beginning of that century could not conceive. The

20th century has, for example, seen special and general relativity, quan-

tum mechanics, the electronic computer, the theorems of G¨del and Turing

o

and Kolmogorov™s axiomatisation of probability. The 19th century saw non-

Euclidean geometry, complex variable theory, quaternions and Cantor™s set

theory. The 17th and 18th century saw the invention of the calculus.

Given this, it seems presumptuous to seek a permanent foundation for

mathematics. It is a happy accident that almost all of mathematics, as we

now know it, can be deduced from one set of axioms but, just as history has

taught us that there is not one geometry but many, so it is not unlikely that

the future will present us with di¬erent mathematics deduced from radically

di¬erent axiom schemes. The art of the mathematician will still consist in

the rigorous proof of unexpected conclusions from simple premises but those

premises will change in ways that we cannot possibly imagine.

11

A hungry fox tried to reach some clusters of grapes which he saw hanging from a vine

trained on a tree, but they were too high. So he went o¬ and comforted himself by saying

˜They weren™t ripe anyhow™. (Aesop)

Appendix A

The axioms for the real

numbers

Axioms for an ordered ¬eld. An ordered ¬eld (F, +, —, >) is a set F,

together with two binary operations + and — and a binary relation >, obeying

the following rules. (We adopt the notation ab = a — b.)

(A1) a + (b + c) = (a + b) + c.

(A2) a + b = b + a.

(A3) There is a unique element 0 ∈ F such that a + 0 = a for all a ∈ F.

(A4) If a ∈ F, then we can ¬nd an ’a ∈ F such that a + (’a) = 0.

[Conditions (A1) to (A4) tell us that (F, +) is a commutative group. We

write a ’ b = a + (’b).]

(M1) a(bc) = (ab)c.

(M2) ab = ba.

(M3) There is a unique element 1 ∈ F such that a1 = a for all a ∈ F.

(M4) If a ∈ F and a = 0, then we can ¬nd an a’1 ∈ F such that

aa’1 = 1.

[Conditions (M1) to (M4) tell us, among other things, that (F \ {0}, —)

is a commutative group.]

(D) a(b + c) = ab + ac.

[Condition (D) is called the distributive law.]

If we write P = {a ∈ F : a > 0}, then

(P1) If a, b ∈ P, then a + b ∈ P.

(P2) If a, b ∈ P, then ab ∈ P.

(P3) If a ∈ F, then one and only one of the following three statements

is true: a ∈ P, ’a ∈ P or a = 0.

We call P the set of strictly positive elements of F and write a > b if and

only if a ’ b ∈ P.

379

380 A COMPANION TO ANALYSIS

Note that, although we state the axioms here, we shall not use

them explicitly in this book. Everything depending on these axioms is,

so far as we are concerned, mere algebra.

The following series of exercises show that, up to the appropriate iso-

morphism, the reals are the unique ordered ¬eld satisfying the fundamental

axiom. Although the reader should probably go through such a proof at some

time, it will become easier as she gets more experience (for example if she

has read the ¬rst four sections of Chapter 14) and is not needed elsewhere in

this book.

Exercise A.1. We work in an ordered ¬eld (F, +, —, >). You should cite

explicitly each of the axioms from the list given on page 379 that you use. We

write 1F for the unit of F (that is, for the unique element of F with 1F a = a

for all a ∈ F) and 0F for the zero of F (that is, for the unique element of F

with 0F + a = a for all a ∈ F).

(i) By considering the equality

ab + a(’b) + (’a)(’b) = ab + a(’b) + (’a)(’b),

show that (’a)(’b) = ab for all a, b ∈ F.

(ii) By (i), we have, in particular, a2 = (’a)2 . By applying axiom (P3),

show that a2 > 0F whenever a = 0F .

(iii) Deduce that 1F > 0F .

Exercise A.2. We continue with the discussion of Exercise A.1. If n is a

strictly positive integer, let us write

n

nF = 1 F + 1 F + · · · + 1 F .

Show, by induction, or otherwise, that nF > 0F , and so in particular nF = 0F .

In the language of modern algebra, we have shown that an ordered ¬eld

has characteristic ∞. It is a standard result that any ¬eld of characteristic

∞ contains a copy of the rationals.

Exercise A.3. We continue with the discussion of Exercise A.2. If n is a

strictly negative integer, let us write nF = ’(’n)F .

(i) Show that if we set „ (n) = nF , then „ : Z ’ F is a well de¬ned

injective map with

„ (n + m) = „ (n) + „ (m),

„ (nm) = „ (n)„ (m),

„ (n) > 0F whenever n > 0,

381

Please send corrections however trivial to twk@dpmms.cam.ac.uk

for all n, m ∈ Z.

(ii) If n, m ∈ Z and m = 0, let us set φ(n/m) = „ (n)„ (m)’1 . Show

that φ : Q ’ F is a well de¬ned (remember that rn/rm = n/m for r = 0)

injective map with

φ(x + y) = φ(x) + φ(y),

φ(xy) = φ(x)φ(y),

φ(x) > 0F whenever x > 0,

for all x, y ∈ Q.

So far we have merely been doing algebra.

Exercise A.4. We continue with the discussion of Exercise A.3, but now

we assume that F satis¬es the fundamental axiom. Let us write K = φ(Q)

(informally, K is a copy of Q in F). Prove the following slight improvement

of Lemma 1.5.6. If x ∈ F, we can ¬nd xj ∈ K with x1 ¤ x2 ¤ . . . such that

xj ’ x as j ’ ∞.

Exercise A.5. We continue with the discussion of Exercise A.4, so, in par-

ticular, we assume that F satis¬es the fundamental axiom.

(i) If x ∈ R, then, by Exercise A.4, we can ¬nd xj ∈ Q with x1 ¤ x2 ¤ . . .

such that xj ’ x. Show, by using the fundamental axiom, that there is a

a ∈ F such that φ(xj ) ’ a as j ’ ∞.

(ii) Show further that, if xj ∈ Q and xj ’ x, then φ(xj ) ’ a as j ’ ∞.

(iii) Conclude that we may de¬ne θ : R ’ F by adopting the procedure of

part (i) and setting θ(x) = a.

(iv) Show that θ : R ’ F is a bijective map.

(v) Show that, if x, y ∈ R, then

θ(x + y) = θ(x) + θ(y),

θ(xy) = θ(x)θ(y),

θ(x) > 0F whenever x > 0.

Appendix B

Countability

Most of my readers will already have met countability and will not need to

read this appendix. If you have not met countability this appendix gives a

˜quick and dirty™ introduction1 .

De¬nition B.1. A set E is called countable if E = … or the elements of E

can be written out as a sequence ej (possibly with repetition), that is, we can

¬nd e1 , e2 , . . . such that

E = {e1 , e2 , e3 . . . }

Lemma B.2. A set E is countable if and only if E is ¬nite or the elements

of E can be written out as a sequence ej without repetition, that is, we can

¬nd e1 , e2 , . . . all unequal such that

E = {e1 , e2 , e3 . . . }

Proof. If the elements of E can be written out as a sequence ej (possibly

with repetition), then either E is ¬nite or we can obtain E as a sequence

without repetition by striking out all terms equal to some previous one.

If E is ¬nite, then either E = …, so E is countable by de¬nition or we can

write E = {ej : 1 ¤ j ¤ N }. Setting en = eN for n ≥ N , we have

E = {e1 , e2 , e3 . . . }

Exercise B.3. Show that any subset of a countable set is countable.

1

The standard treatment via injectivity gives more ¬‚exible methods which are easily

extended to more general situations

383

384 A COMPANION TO ANALYSIS

b1 b3 b6 b10 . . . a1,1 a1,2 a1,3 a1,4 . . .

b2 b5 b9 . . . a2,1 a2,2 a2,3 . . .

b4 b8 . . . a3,1 a3,2 . . .

b7 . . . a4,1 . . .

... ...

Figure B.1: A sequence of sequences set out as a sequence

The key result on countability, at least so far as we are concerned, is given

in the next theorem.

∞

Theorem B.4. If A1 , A2 , . . . are countable, then so is Aj .

j=1

In other words, the countable union of countable sets is countable.

Proof. We leave it to the reader to give the proof when all of the Aj are empty

(easy) or not all the Aj are empty but all but ¬nitely many are (either modify

the proof below or adjoin in¬nitely many copies of one of the non-empty Aj ).

In the remaining case, in¬nitely many of the Aj are non-empty and, by

striking out all the empty sets, we may assume that all of the Aj are non-

empty. Let us write

Aj = {aj,1 , aj,2 , aj,3 , . . . }.

If we write

N (r) = 1 + 2 + · · · + (r ’ 1) = r(r ’ 1)/2

and set bN (r)+k = ar’k+1,k [1 ¤ k ¤ r, 1 ¤ r] (so that we follow the pattern

shown in Figure B.1), we see that

∞

Aj = {b1 , b2 , b3 . . . },

j=1

∞

and so Aj is countable.

j=1

Exercise B.5. Carry out the proofs asked for in the ¬rst sentence of the

proof of Theorem B.4

Theorem B.4 forms the basis for an argument2 known as the ˜hamburger

argument™. (Consider the set Aj of hamburgers to be found in a sphere

2

Marianna Cs¨rnyei claims that this is taught in Hungarian primary schools, but she

o

may exagerate.

385

Please send corrections however trivial to twk@dpmms.cam.ac.uk

radius j kilometres with centre the middle of Trafalgar square. Clearly A j

is ¬nite and so countable. But the set A of all hamburgers in the universe

is given by A = ∞ Aj so, since the countable union of countable sets

j=1

is countable, A is countable. We can enumerate all the hamburgers in the

universe as a sequence.)

Lemma B.6. The set Q of rational numbers is countable.

Proof. Let

Aj = {r/s : |r| + |s| ¤ j, s = 0, r, s ∈ Z}.

The set Aj is ¬nite and so countable. But Q = ∞ Aj so, since the countable

j=1

union of countable sets is countable, Q is countable.

In Exercise 1.6.7 we showed that the closed interval [a, b] with a < b is not

a countable set and so (by Exercise B.3) R is uncountable. Lemma B.6 thus

provides an indirect but illuminating proof that R = Q.

Exercise B.7. (Existence of transcendentals.) A real number x is called

algebraic if

n

ar x r = 0

r=0

for some ar ∈ Z [0 ¤ r ¤ n], an = 0 and n ≥ 1, (in other words x is a root of

a non-trivial polynomial with integer coe¬cients). The ¬rst proof that not all

real numbers are algebraic was due to Liouville (see Exercise K.12). Cantor

used the idea of countability to give a new proof of this.

Let Aj be the set of all real roots of all polynomials P (t) = n ar tr such

r=0

that ar ∈ Z [0 ¤ r ¤ n], an = 0 and n ≥ 1 and

n

|ar | ¤ j.

n+

r=0

By considering properties of the Aj , show that the set of algebraic numbers

is countable. Deduce that not all real numbers are algebraic.

Exercise B.8. (i) A set S of non-negative real numbers is such that there

exists a positive constant K such that

x<K

x∈T

386 A COMPANION TO ANALYSIS

for every ¬nite subset T of S. Prove that the set of non-zero elements of S

is countable.

(ii) A set S of real numbers is such that, whenever a1 , a2 , . . . are distinct

members of S,

∞

an converges.

n=1

∞

Show that we can ¬nd b1 , b2 , . . . such that n=1 bn is absolutely convergent

and

S ⊆ {0} ∪ {bn : n ≥ 1}.

Exercise B.9. Consider a function f : (a, b) ’ R. Recall that we say that

c is a strict maximum (or strict local maximum) of f if we can ¬nd ± and β

such that a ¤ ± < c < β ¤ b such that

f (x) < f (c) for all x ∈ (±, β) with x = c.

By associating each such c with an interval with rational end points, or oth-

erwise, show that the set E of strict maxima is countable.

Give an example to show that E can be in¬nite.

Does the result remain true if we replace (a, b) by the real line R? Does an

appropriate version of the result (to be stated) remain true if we replace (a, b)

by R2 ? Does the result remain true if we replace ˜strict maxima™ by ˜maxima™

(that is, replace f (x) < f (c) in the de¬ning condition by ˜f (x) ¤ f (c)™)?

It is worth remarking that Exercise B.7 re¬‚ects a general principle.

Plausible statement B.10. The set of real numbers that we can describe

in any way is countable.

Plausible argument. I have only a ¬nite number N of symbols on my type-

writer. Thus the set Aj of real numbers that I can describe using j keystrokes

has at most N j members. Since Aj is ¬nite and so countable, we see that

the set ∞ Aj of real numbers that I can describe in any way is countable.

j=1

The reader should be warned that there are severe philosophical and math-

ematical obstacles in the way of any attempt to make precise the statement

and argument above. Never the less, most mathematicians are inclined to

accept the general thrust of the argument. When we talk about R we are

talking about a set most of whose members we can not describe.

Here is another reason to prefer rigour to intuition.

Appendix C

The care and treatment of

counterexamples

Mathematics students dislike questions beginning ˜¬nd a proof or counterex-

ample™. They know that it is much harder to prove or disprove a result if

you are not sure which of the two possible outcomes will occur.

In research we are faced with many uncertainties. We do not know

whether our problem is easy, hard or beyond our capabilities (whereas in

an exam we know that the problems will not exceed a certain level of dif-

¬culty). We often do not know if the solution of a particular problem will

open a path to further progress or turn out to be a dead end. And, whatever

our suspicions or hopes, we do not know whether the result we seek is true

or false.

In these circumstances, mathematicians often adopt a two pronged attack.

First they try to prove the result. When their attack fails, they try to identify

the point where it fails and try to construct a counterexample centred round

the point of failure. If they cannot construct a counterexample they try to

identify the reason why they cannot do so and this may give them a hint

as to how get round their previous di¬culty in the proof. By alternating

between determined attempts to seek a proof and determined attempts to

¬nd a counterexample they hope to arrive at a useful result.

If their attack ends in a counterexample, the counterexample may suggest

how to strengthen hypotheses or weaken conclusions so as to produce a true

theorem.

If the attack ends in a theorem the matter does not ¬nish there. Ex-

amination questions and exercises such as are found in this book are made

easier by the fact that we know that all the information supplied is relevant.

We know that, in real life, problems are made much harder by an excess

387

388 A COMPANION TO ANALYSIS

of irrelevant information1 . We expect, sometimes wrongly but often rightly,

that a proof which depends on unnecessary hypotheses will be less clear

than one which only uses the essential hypotheses. Thus, once a theorem

is established, we usually embark on a systematic programme of weakening

hypotheses and strengthening conclusions, not only in the hope of obtaining

a stronger theorem but also in the hope of obtaining a better proof. A the-

orem is therefore frequently accompanied by a collection of counterexamples

showing that hypotheses cannot be weakened or conclusions strengthened.

What I have said so far is probably obvious to the reader and to most

students of mathematics. However many students fail to draw the obvious

moral. Since counterexamples are not ends in themselves but objects which

we study in the hope of obtaining positive results,

counterexamples should be as simple as possible.

Let me illustrate this slogan by considering the notion of continuity. The

¬rst question we must ask after de¬ning the notion of a continuous function

is whether all functions are continuous. Here is the simplest counterexample

I can think of.

Example C.1. If we de¬ne f : R ’ R by

f (t) = 0 for t = 0

f (0) = 1,

then f is not continuous at 0.

The reader may feel that this is arti¬cial since f can be rede¬ned at one

point (by setting f (0) = 0) in such a way as to make it continuous. We may

therefore ask for a function which cannot be so de¬ned.

Example C.2. If we de¬ne H : R ’ R by

H(t) = 0 for t < 0

H(t) = 1 for t > 0

H(0) = a,

then, whatever the value of the real number a, H is not continuous at 0.

1

Remember the old riddle. A bus sets out from a depot with only the driver on board.

At the ¬rst stop two people get on. At the next, three get on and one gets o¬. At the

next, ¬ve get on and two get o¬. At the next, four get o¬ and two get on. At the next

three get o¬ and two get on. At the next ¬ve get on and two get o¬. How many stops has

the bus made?

389

Please send corrections however trivial to twk@dpmms.cam.ac.uk

Proof. If f : R ’ R is continuous at 0 then

f (1/n) ’ f (’1/n) ’ f (0) ’ f (0) = 0

as n ’ ∞. Since

H(1/n) ’ H(’1/n) = 1 0

as n ’ ∞, H is not continuous at 0.

The discontinuity exhibited by H is of a particularly simple kind since the

right and left limits

H(0+) = lim H(t) and H(0’) = lim H(t)

t’0, t>0 t’0, t<0

exist. We may, therefore, ask if there exist functions without such left and

right limits.

Example C.3. If we de¬ne f : R ’ R by

f (t) = 1/t for t = 0

f (0) = 0,

then f has neither a left limit nor a right limit at 0.

We may feel that this is unsatisfactory and ask whether this phenomenon

can occur for a bounded function.

Example C.4. If we de¬ne f : R ’ R by

f (t) = sin(1/t) for t = 0

f (0) = 0,

then f is bounded and has neither a left limit nor a right limit at 0.

1

Proof. Observe that f (1/(2nπ)) = 0 ’ 0 but f (1/((2n + 2 )π) = 1 ’ 1 as

n ’ ∞. Thus f (t) does not tend to a limit as t ’ 0 through values of t > 0.

Similarly, f (t) does not tend to a limit as t ’ 0 through values of t < 0.

Exercise C.5. Find a bounded function f : R ’ R such that f (t) ’ f (0)

as t ’ 0 through values of t > 0 but f (t) does not tend to a limit as t ’ 0

through values of t < 0.

The next exercise is a trivial complementary observation with a one line

proof.

390 A COMPANION TO ANALYSIS

Exercise C.6. Show that if f : R ’ R is such that f (t) ’ f (0) as t ’ 0

through values of t > 0 and f (t) ’ f (0) as t ’ 0 through values of t < 0

then f is continuous at 0.

In view of Exercise C.6 we may ask if a function F : R2 ’ R is such

that F (ut) ’ F (0) as t ’ 0 for all unit vectors u then F is continuous at

0 = (0, 0). This is a harder question and, instead of sitting around trying

to think of a function which might provide a counterexample, we actively

construct one ˜by hand™.

Exercise C.7. (i) Let rn = 1/n and θn = π/(2n) and take

xn = (rn cos θn , rn sin θn ).

Show that we can ¬nd δn > 0 such that xn /2 > δn > 0 and, writing

Bn = {y : xn ’ y ¤ δn },

we know that the following statement is true. If u is any unit vector, the line

{tu : t ∈ R} intersects at most one of the Bn . Give a sketch to illustrate

your result.

(ii) Show that there exists a continuous function gn : Rn ’ R such that

gn (xn ) = 1,

for all y ∈ Bn .

gn (y) = 0 /

Explain why setting F (x) = ∞ gn (x) gives a well de¬ned function. Show

n=1

that F (ut) ’ F (0) as t ’ 0 for all unit vectors u but F is not continuous

at 0.

Exercise C.8. (This exercise requires material from Chapter 6.)

(i) Show that we can ¬nd a twice continuously di¬erentiable function h :

R ’ R such that

h(0) = 1,

for all |t| > 1/2

h(t) = 0

1 ≥ h(t) ≥ 0 for all t.

If δ > 0 and z ∈ R2 sketch the function g : R2 ’ R given by g(x) =

h(δ ’1 z ’ x ).

(ii) By taking F = ∞ n’1 gn with appropriately chosen gn (you will

n=1

need to be more careful in the choice than we were in the previous exercise),

show that there exists a continuous function F : R2 ’ R with directional

derivative zero in all directions at 0 but which is not di¬erentiable at 0.

(iii) Why is the result of Exercise C.8 stronger than that of Example 7.3.14?

[We give a slight strengthening of this result in Exercise K.314.]

391

Please send corrections however trivial to twk@dpmms.cam.ac.uk

Exercise C.9. Consider a function f : R2 ’ R.

(i) Show that f is continuous at 0 = (0, 0) if and only if, whenever we

have a sequence xn ’ 0, it follows that f (xn ) ’ f (0) as n ’ ∞.

(ii) Show that f is continuous at 0 if and only if, whenever we have a

continuous function γ : [0, 1] ’ R2 with γ(0) = 0, it follows that f (γ(t)) ’

f (0) as t ’ 0 through values of t > 0.

Thus, if f (x) tends to f (0) along every path leading to 0, then f is

continuous. However, as we have seen, the result fails if we replace ˜every

path™ by ˜every straight-line path™.

If we wish to understand the ways in which a function can fail to be

continuous at a point, it is surely more instructive to begin with the simple

Example C.1 rather than rush to the complicated Exercise C.7. In the same

way, if we wish to give an example of a continuous function which fails to

be di¬erentiable it is surely best to look at the function f : R ’ R given

by f (x) = |x|, rather than the fairly intricate construction of a continuous

nowhere di¬erentiable function given in Exercise K.223.

Proceeding from the simple to the complex as we did in studying dis-

continuity at a point has a further advantage that it gives us a number of

examples of functions to think about rather than just one. To quote Halmos

If I had to describe my conclusions [on how to study mathe-

matics] in one word, I™d say examples. They are to me of paramount

importance. Every time I learn a new concept (free group, or

pseudodi¬erential operator, or paracompact space) I look for ex-

amples ” and, of course, non-examples. The examples should

include, whenever possible, the typical ones and the extreme de-

generate ones. Yes, to be sure, R2 is a real vector space, but so

is the space of all real polynomials with real coe¬cients and the

set of all analytic functions de¬ned on the open disc ” and what

about the 0-dimensional case. Are there examples that satisfy

all the conditions of the de¬nition except 1x = x? Is the set of

all real-valued monotone increasing functions de¬ned on, say, the

unit interval a real vector space? How about all the monotone

ones, both increasing and decreasing . . .

A good stock of examples, as large as possible, is indispens-

able for a thorough understanding of any concept and when I

want to learn something new, I make it my ¬rst job to build one.

Sometimes to be sure, that might require more of the theory than

the mere de¬nition reveals. When we ¬rst learn about transcen-

dental numbers, for instance, it is not at all obvious that such

392 A COMPANION TO ANALYSIS

things exist. Just the opposite happens when we are ¬rst told

about measurable sets. There are plenty of them, there is no

doubt about that; what is less easy is to ¬nd a single example of

a set that is not like that2 . [The quotation is from I Want to be

a Mathematician[21], a book well worth reading in its entirety.]

This quotation brings me to my second point. Halmos talks about typical

examples and illustrates his remarks with the idea of a vector space. In some

sense typical vector spaces exist since we have the following classi¬cation

theorem from algebra.

Theorem C.10. A ¬nite dimensional vector space over R is isomorphic to

Rn for some positive integer n.

[There is a similar but duller result for in¬nite dimensional vector spaces.]

There is no such result for continuous functions and I suspect that, to

misquote Haldane ˜The typical continuous function is not only odder than we

imagine, it is odder than we can possibly imagine™3 . This is why mathemati-

cians demand so much rigour in proof. When we showed, in Theorem 4.3.4,

that every real-valued continuous function on a closed bounded set in Rn is

bounded and attains its bounds we proved a result which applied not only to

all continuous functions that we know but to all continuous functions that

the human race will ever know and to continuous functions which nobody

will ever know and, if my belief is right, to continuous functions which are

so wild as to be literally unimaginable4 .

Even if we restrict ourselves to well behaved functions, I would argue that

it is much harder than it looks to obtain a su¬ciently large library of typical

functions. (See Remark 4 on page 347.)

On the other hand, when we talk about an example or a counterexample

we are talking about a single object and we do not need the obsessive rigour

demanded by a proof which will apply to many objects. To show that a

unicorn exists we need only exhibit a single unicorn. To show that no unicorn

exists requires much more careful argument.

2

The set considered in Exercise 8.1.4 is an example. (T.W.K.)

3

In spite of Theorem C.10, this may be true of vector spaces as well. Conway used to

say that one dimension is easy, two harder, three very hard, four still harder, . . . , that

somewhere about twenty eight things became virtually impossible but once you reached

in¬nite dimensions things became easier again.

4

Since the 1890™s it has been known that there exist everywhere di¬erentiable functions

which take a strict local maximum value at a dense set of points. Until you read this

footnote, did you imagine that such a thing was possible? Can you imagine it now? For

a modern proof see [27].

393

Please send corrections however trivial to twk@dpmms.cam.ac.uk

Demonstrations involving a single known object can be much

simpler than those involving a multitude of unknown objects.

Appendix D

A more general view of limits

One of the more tedious parts of the standard treatment of analysis given in

this book is the repeated de¬nition of various kinds of limits. Here is a small

selection.

De¬nition D.1. If an is a sequence of real numbers and a is a real number,

we say that an ’ a as n ’ ∞ if, given any > 0, we can ¬nd an n0 ( ) such

that |a ’ an | < for all n > n0 ( ).

De¬nition D.2. If an is a sequence of real numbers and a is a real number,

we say that an ’ ∞ as n ’ ∞ if, given any K, we can ¬nd an n0 (K) such

that an > K for all n > n0 (K).

De¬nition D.3. If f : (a, b) ’ R is a function and t ∈ (a, b), we say that

f (x) ’ c as x ’ t if, given any > 0, we can ¬nd an δ0 ( ) such that

|f (x) ’ c| < for all x ∈ (a, b) with 0 = |t ’ x| < δ0 ( ).

De¬nition D.4. If f : (a, b) ’ R is a function and t ∈ (a, b), we say that

f (x) ’ c as x ’ t through values of x > t if, given any > 0, we can ¬nd

an δ0 ( ) such that and |f (x) ’ c| < for all x ∈ (a, b) with 0 < x ’ t < δ 0 ( ).

In theory, each of these de¬nitions should be accompanied by a collection

of lemmas showing that each limit behaves in the way every other limit

behaves. In practice, such lemmas are left to the reader. (In this book I have

tended to look most carefully at de¬nitions of the type of De¬nition D.1.)

Experience seems to show that this procedure works quite well, both from

the pedagogic and the mathematical point of view, but it is, to say the least,

untidy.

In his book Limits, A New Approach to Real Analysis [2], Beardon pro-

poses an approach based on directed sets which avoids all these di¬culties1 .

1

The notion of a directed set goes back to E. H. Moore in 1910 but Beardon™s is the

only elementary text I know which uses it.

395

396 A COMPANION TO ANALYSIS

Figure D.1: A family tree for division

For a mixture of reasons, some good and some not so good, the teaching of

elementary analysis is extremely conservative2 and it remains to be seen if

this innovation will be widely adopted.

De¬nition D.5. A relation on a non-empty set X is called a direction if

(i) whenever x y and y z, then x z, and

(ii) whenever x, y ∈ X, we can ¬nd z ∈ X such that z x and z y.

We call (X, ) a directed set.