
\[ = -\int_a^b h(t)\,\frac{d}{dt}F_{,3}(t, f(t), f'(t))\,dt. \]

Combining the results of the last two sentences, we see that

\[ 0 = \int_a^b h(t)\left( F_{,2}(t, f(t), f'(t)) - \frac{d}{dt}F_{,3}(t, f(t), f'(t)) \right) dt. \]

Since this result must hold for all $h \in A$, we see that

\[ F_{,2}(t, f(t), f'(t)) - \frac{d}{dt}F_{,3}(t, f(t), f'(t)) = 0 \]

for all $t \in [a, b]$ (for details see Lemma 8.4.7 below) and this is the result we set out to prove.

196 A COMPANION TO ANALYSIS

In order to tie up loose ends, we need the following lemma.

Lemma 8.4.7. Suppose $f : [a, b] \to \mathbb{R}$ is continuous and

\[ \int_a^b f(t)h(t)\,dt = 0, \]

whenever $h : [a, b] \to \mathbb{R}$ is an infinitely differentiable function with $h(a) = h(b) = 0$. Then $f(t) = 0$ for all $t \in [a, b]$.

Proof. By continuity, we need only prove that $f(t) = 0$ for all $t \in (a, b)$. Suppose that, in fact, $f(x) \neq 0$ for some $x \in (a, b)$. Without loss of generality we may suppose that $f(x) > 0$ (otherwise, consider $-f$). Since $(a, b)$ is open and $f$ is continuous we can find a $\delta > 0$ such that $[x - \delta, x + \delta] \subseteq (a, b)$ and $|f(t) - f(x)| < f(x)/2$ for $t \in [x - \delta, x + \delta]$. This last condition tells us that $f(t) > f(x)/2$ for $t \in [x - \delta, x + \delta]$.

In Example 7.1.6 we constructed an infinitely differentiable function $E : \mathbb{R} \to \mathbb{R}$ with $E(t) = 0$ for $t \le 0$ and $E(t) > 0$ for $t > 0$. Setting $h(t) = E(t - x + \delta)E(-t + x + \delta)$ when $t \in [a, b]$, we see that $h$ is an infinitely differentiable function with $h(t) > 0$ for $t \in (x - \delta, x + \delta)$ and $h(t) = 0$ otherwise (so that, in particular, $h(a) = h(b) = 0$). By standard results on the integral,

\[ \int_a^b f(t)h(t)\,dt = \int_{x-\delta}^{x+\delta} f(t)h(t)\,dt \ge \int_{x-\delta}^{x+\delta} \frac{f(x)}{2}\,h(t)\,dt = \frac{f(x)}{2}\int_{x-\delta}^{x+\delta} h(t)\,dt > 0, \]

so we are done.
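The construction in the proof can be tried numerically. The sketch below is illustrative only: it assumes the familiar choice $E(t) = \exp(-1/t)$ for $t > 0$ (the text only says that some such $E$ was built in Example 7.1.6), and the function $f(t) = \cos 3t$, the point $x = 0.4$ and the radius $\delta = 0.05$ are hypothetical values chosen so that $|f(t) - f(x)| < f(x)/2$ on $[x - \delta, x + \delta]$.

```python
import math

def E(t):
    """Assumed smooth cut-off: 0 for t <= 0, exp(-1/t) for t > 0."""
    return math.exp(-1.0 / t) if t > 0 else 0.0

def bump(t, x, delta):
    """h(t) = E(t - x + delta) * E(-t + x + delta), as in the proof."""
    return E(t - x + delta) * E(-t + x + delta)

def midpoint_integral(func, lo, hi, cells=20000):
    """Plain midpoint rule, accurate enough for this smooth integrand."""
    width = (hi - lo) / cells
    return width * sum(func(lo + (i + 0.5) * width) for i in range(cells))

a, b = 0.0, 1.0
x, delta = 0.4, 0.05                 # illustrative values
f = lambda t: math.cos(3.0 * t)      # continuous, changes sign on [a, b], f(x) > 0
h = lambda t: bump(t, x, delta)

integral_fh = midpoint_integral(lambda t: f(t) * h(t), a, b)
lower_bound = (f(x) / 2.0) * midpoint_integral(h, x - delta, x + delta)
```

Although $f$ takes both signs on $[a, b]$, the bump $h$ only "sees" the small interval where $f$ is positive, so `integral_fh` comes out strictly positive and at least as large as `lower_bound`.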

Exercise 8.4.8. State explicitly the ‘standard results on the integral’ used in the last sentence of the previous proof and show how they are applied.

Theorem 8.4.6 is often stated in the following form. If the function $y : [a, b] \to \mathbb{R}$ minimises $J$ then

\[ \frac{\partial F}{\partial y} = \frac{d}{dx}\frac{\partial F}{\partial y'}. \]

This is concise but can be confusing to the novice⁶.

⁶ It certainly confused me when I met it for the first time.

Please send corrections however trivial to twk@dpmms.cam.ac.uk

The Euler-Lagrange equation can only be solved explicitly in a small number of special cases. The next exercise (which should be treated as an exercise in calculus rather than analysis) shows how, with the exercise of some ingenuity, we can solve the brachistochrone problem with which we started. Recall that this asked us to minimise

\[ J(f) = \frac{1}{(2g)^{1/2}} \int_a^b \left( \frac{1 + f'(x)^2}{\kappa - f(x)} \right)^{1/2} dx. \]

Exercise 8.4.9. We use the notation and assumptions of Theorem 8.4.6.

(i) Suppose that $F(u, v, w) = G(v, w)$ (often stated as ‘$t$ does not appear explicitly in $F = F(t, y, y')$’). Show that the Euler-Lagrange equation becomes

\[ G_{,1}(f(t), f'(t)) = \frac{d}{dt} G_{,2}(f(t), f'(t)) \]

and may be rewritten

\[ \frac{d}{dt}\Bigl( G(f(t), f'(t)) - f'(t)\,G_{,2}(f(t), f'(t)) \Bigr) = 0. \]

Deduce that

\[ G(f(t), f'(t)) - f'(t)\,G_{,2}(f(t), f'(t)) = c \]

where $c$ is a constant. (This last result is often stated as $F - y'\,\dfrac{\partial F}{\partial y'} = c$.)

(ii) (This is not used in the rest of the question.) Suppose that $F(u, v, w) = G(u, w)$. Show that

\[ G_{,2}(t, f'(t)) = c \]

where $c$ is a constant.

(iii) By applying (i), show that solutions of the Euler-Lagrange equation associated with the brachistochrone are solutions of

\[ \frac{1}{\bigl((\kappa - f(x))(1 + f'(x)^2)\bigr)^{1/2}} = c \]

where $c$ is a constant. Show that this equation can be rewritten as

\[ f'(x) = \left( \frac{B + f(x)}{A - f(x)} \right)^{1/2}. \]

(iv) We are now faced with finding the curve

\[ \frac{dy}{dx} = \left( \frac{B + y}{A - y} \right)^{1/2}. \]

If we are sufficiently ingenious (or we know the answer), we may be led to try and express this curve in parametric form by setting

\[ y = \frac{A - B}{2} - \frac{A + B}{2}\cos\theta. \]

Show that

\[ \frac{dx}{d\theta} = \frac{A + B}{2}(1 + \cos\theta), \]

and conclude that our curve is (in parametric form)

\[ x = a + k(\theta - \sin\theta), \qquad y = b - k\cos\theta \]

for appropriate constants $a$, $b$ and $k$. Thus any curve which minimises the time of descent must be a cycloid.
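As a sanity check on part (iii), one can confirm numerically that the quantity $1/\bigl((\kappa - y)(1 + y'^2)\bigr)^{1/2}$ is constant along a cycloid of the stated parametric form. The sketch below is only an illustration: the constants $a = 0$, $b = 0$, $k = -1$ and the relation $\kappa = b - k$ are assumptions chosen so that $\kappa - y > 0$ along the curve, and the slope is computed as $(dy/d\theta)/(dx/d\theta)$.

```python
import math

def first_integral(theta, b=0.0, k=-1.0):
    """Value of 1 / sqrt((kappa - y)(1 + y'^2)) on x = a + k(theta - sin theta),
    y = b - k cos theta, with the assumed choice kappa = b - k."""
    kappa = b - k
    y = b - k * math.cos(theta)
    dx_dtheta = k * (1.0 - math.cos(theta))
    dy_dtheta = k * math.sin(theta)
    slope = dy_dtheta / dx_dtheta          # dy/dx by the chain rule
    return 1.0 / math.sqrt((kappa - y) * (1.0 + slope ** 2))

# Sample points away from multiples of 2*pi, where dx/dtheta vanishes.
values = [first_integral(t) for t in (0.5, 1.0, 2.0, 3.0, 5.0)]
expected = 1.0 / math.sqrt(2.0)            # equals 1/sqrt(-2k) for k = -1
```

With these constants $(\kappa - y)(1 + y'^2) = -2k$ for every $\theta$, so all entries of `values` agree with `expected` up to rounding.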

It is important to observe that we have shown that any minimising function satisfies the Euler-Lagrange equation and not that any function satisfying the Euler-Lagrange equation is a minimising function. Exactly the same argument (or replacing $J$ by $-J$) shows that any maximising function satisfies the Euler-Lagrange equation. Further, if we reflect on the simpler problem discussed in section 7.3, we see that the Euler-Lagrange equation will be satisfied by functions $f$ such that

\[ G_h(\eta) = J(f + \eta h) \]

has a minimum at $\eta = 0$ for some $h \in E$ and a maximum at $\eta = 0$ for others.

Exercise 8.4.10. With the notation of this section show that, if $f$ satisfies the Euler-Lagrange equations, then $G_h'(0) = 0$.

To get round this problem, examiners ask you to ‘find the values of $f$ which make $J$ stationary’ where the phrase is equivalent to ‘find the values of $f$ which satisfy the Euler-Lagrange equations’. In real life, we use physical intuition or extra knowledge about the nature of the problem to find which solutions of the Euler-Lagrange equations represent maxima and which minima.

Mathematicians spent over a century seeking to find an extension to the Euler-Lagrange method which would enable them to distinguish true maxima and minima. However, they were guided by analogy with the one dimensional (if $f'(0) = 0$ and $f''(0) > 0$ then 0 is a minimum) and finite dimensional case and it turns out that the analogy is seriously defective. In the end, Weierstrass produced examples which made it plain what was going on. We discuss a version of one of them.

Figure 8.1: A problem for the calculus of variations

Consider the problem of minimising

\[ I(f) = \int_0^1 \bigl(1 - (f'(x))^4\bigr)^2 + f(x)^2 \, dx \]

where $f : [0, 1] \to \mathbb{R}$ is once continuously differentiable and $f(0) = f(1) = 0$.

Exercise 8.4.11. We look at

\[ G_h(\eta) = I(\eta h). \]

Show that $G_h(\eta) = 1 + A_h \eta^2 + B_h \eta^4 + C_h \eta^8$ where $A_h$, $B_h$, $C_h$ depend on $h$. Show that, if $h$ is not identically zero, $A_h > 0$ and deduce that $G_h$ has a strict minimum at 0 for all non-zero $h \in E$.

We are tempted to claim that ‘$I(f)$ has a local minimum at $f = 0$’.

Now look at the function $g_n$ [$n$ a strictly positive integer] illustrated in Figure 8.1 and defined by

\[ g_n(x) = x - \frac{2r}{2n} \quad \text{for } \left| x - \frac{2r}{2n} \right| \le \frac{1}{4n}, \]

\[ g_n(x) = \frac{2r + 1}{2n} - x \quad \text{for } \left| x - \frac{2r + 1}{2n} \right| \le \frac{1}{4n}, \]

whenever $r$ is an integer and $x \in [0, 1]$. Ignoring the finite number of points where $g_n$ is not differentiable, we see that $g_n'(x) = \pm 1$ at all other points, and so

\[ I(g_n) = \int_0^1 g_n(x)^2 \, dx \to 0 \quad \text{as } n \to \infty. \]
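The decay of $I(g_n)$ can be checked numerically. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, the integral is approximated by a midpoint rule whose cells are aligned with the kinks of the sawtooth, and the comparison value $1/(48n^2)$ is a direct calculation (the mean of $d^2$ for $d$ uniform on $[-1/(4n), 1/(4n)]$) rather than something stated in the text.

```python
import math

def g(n, x):
    """Sawtooth g_n: slope +1 near even multiples of 1/(2n), slope -1 near odd ones."""
    m = math.floor(2 * n * x + 0.5)      # index of the nearest multiple of 1/(2n)
    d = x - m / (2 * n)
    return d if m % 2 == 0 else -d

def g_slope(n, x):
    """Derivative of g_n away from the kinks: +1 or -1."""
    m = math.floor(2 * n * x + 0.5)
    return 1.0 if m % 2 == 0 else -1.0

def I_of_g(n, cells_per_piece=500):
    """Midpoint-rule value of I(g_n) = int_0^1 (1 - g_n'(x)^4)^2 + g_n(x)^2 dx."""
    cells = 4 * n * cells_per_piece      # kinks fall on cell boundaries
    width = 1.0 / cells
    total = 0.0
    for i in range(cells):
        x = (i + 0.5) * width
        total += (1.0 - g_slope(n, x) ** 4) ** 2 + g(n, x) ** 2
    return total * width

values = {n: I_of_g(n) for n in (1, 2, 4, 8)}
```

The first term of the integrand vanishes identically because $g_n' = \pm 1$, so only the $g_n^2$ term contributes, and each doubling of $n$ divides the value by about four.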

It is clear that we can find a similar sequence of functions $f_n$ which is continuously differentiable by ‘rounding the sharp bits’ as in Figure 8.2.

Figure 8.2: The same problem, smoothed

The reader who wishes to dot the i’s and cross the t’s can do the next exercise.

Exercise 8.4.12. (i) Let $1/2 > \epsilon > 0$ and let $k : [0, 1] \to \mathbb{R}$ be the function such that

\[ k(x) = \epsilon^{-1} x \quad \text{for } 0 \le x \le \epsilon, \]
\[ k(x) = 1 \quad \text{for } \epsilon \le x \le 1 - \epsilon, \]
\[ k(x) = \epsilon^{-1}(1 - x) \quad \text{for } 1 - \epsilon \le x \le 1. \]

Sketch $k$.

(ii) Let $k_n(x) = (-1)^{[2nx]} k(2nx - [2nx])$ for $x \in [0, 1]$. (Here $[2nx]$ means the integer part of $2nx$.) Sketch the function $k_n$.

(iii) Let

\[ K_n(x) = \int_0^x k_n(t)\,dt \]

for $x \in [0, 1]$. Sketch $K_n$. Show that $0 \le K_n(x) \le 1/(2n)$ for all $x$. Show that $K_n$ is once differentiable with continuous derivative. Show that $|K_n'(x)| \le 1$ for all $x$ and identify the set of points where $|K_n'(x)| = 1$.

(iv) Show that there exists a sequence of continuously differentiable functions $f_n : [0, 1] \to \mathbb{R}$, with $f_n(0) = f_n(1) = 0$, such that

\[ I(f_n) \to 0 \quad \text{as } n \to \infty. \]

[This result is slightly improved in Example K.134.]

This example poses two problems. The first is that in some sense the $f_n$ are close to $f_0 = 0$ with $I(f_n) < I(f_0)$, yet the Euler-Lagrange approach of Exercise 8.4.11 seemed to show that $I(f_0)$ was smaller than those $I(f)$ with $f$ close to $f_0$. One answer to this seeming paradox is that, in Exercise 8.4.11, we only looked at $G_h(\eta) = I(\eta h)$ as $\eta$ became small, so we only looked at certain paths approaching $f_0$ and not at all possible modes of approach. As $\eta$ becomes small, not only does $\eta h$ become small but so does $\eta h'$. However, as $n$ becomes large, $f_n$ becomes small but $f_n'$ does not. In general when the Euler-Lagrange method looks at a function $f$ it compares it only with functions which are close to $f$ and have derivative close to $f'$. This does not affect the truth of Theorem 8.4.6 (which says that the Euler-Lagrange equation is a necessary condition for a minimum) but makes it unlikely that the same ideas can produce even a partial converse.

Once we have the notion of a metric space we can make matters even clearer. (See Exercises K.199 to K.201.)

Exercise 8.4.13. This exercise looks back to Section 7.3. Let $U$ be an open subset of $\mathbb{R}^2$ containing $(0, 0)$. Suppose that $f : U \to \mathbb{R}$ has second order partial derivatives on $U$ and these partial derivatives are continuous at $(0, 0)$. Suppose further that $f_{,1}(0, 0) = f_{,2}(0, 0) = 0$. If $u \in \mathbb{R}^2$ we write $G_u(\eta) = f(\eta u)$.

(i) Show that $G_u'(0) = 0$ for all $u \in \mathbb{R}^2$.

(ii) Let $e_1 = (1, 0)$ and $e_2 = (0, 1)$. Suppose that $G_{e_1}''(0) > 0$ and $G_{e_2}''(0) > 0$. Show, by means of an example, that $(0, 0)$ need not be a local minimum for $f$. Does there exist an $f$ with the properties given which attains a local minimum at $(0, 0)$? Does there exist an $f$ with the properties given which attains a local maximum at $(0, 0)$?

(iii) Suppose that $G_u''(0) > 0$ whenever $u$ is a unit vector. Show that $f$ attains a local minimum at $(0, 0)$.

The second problem raised by results like Exercise 8.4.12 is also very

interesting.

Exercise 8.4.14. Use Exercise 8.3.4 to show that $I(f) > 0$ whenever $f : [0, 1] \to \mathbb{R}$ is a continuously differentiable function.

Conclude, using the discussion above, that the set

\[ \{ I(f) : f \text{ continuously differentiable} \} \]

has an infimum (to be identified) but no minimum.

Exercise 8.1. Here is a simpler (but less interesting) example of a variational problem with no solution, also due to Weierstrass. Consider the set $E$ of functions $f : [-1, 1] \to \mathbb{R}$ with continuous derivative and such that $f(-1) = -1$, $f(1) = 1$. Show that

\[ \inf_{f \in E} \int_{-1}^{1} x^2 f'(x)^2 \, dx = 0 \]

but there does not exist any $f_0 \in E$ with $\int_{-1}^{1} x^2 f_0'(x)^2 \, dx = 0$.

The discovery that they had been talking about solutions to problems which might have no solutions came as a severe shock to the pure mathematical community. Of course, examples like the one we have been discussing are ‘artificial’ in the sense that they have been constructed for the purpose but unless we can come up with some criterion for distinguishing ‘artificial’ problems from ‘real’ problems this takes us nowhere. ‘If we have actually seen one tiger, is not the jungle immediately filled with tigers, and who knows where the next one lurks.’ The care with which we proved Theorem 4.3.4 (a continuous function on a closed bounded set is bounded and attains its bounds) and Theorem 4.4.4 (Rolle’s theorem, considered as the statement that, if a differentiable function $f$ on an open interval $(a, b)$ attains a maximum at $x$, then $f'(x) = 0$) are distant echoes of that shock. On the other hand, the new understanding which resulted revivified the study of problems of maximisation and led to much new mathematics.

It is always possible to claim that Nature (with a capital N) will never set ‘artificial’ problems and so the applied mathematician need not worry about these things. ‘Nature is not troubled by mathematical difficulties.’ However, a physical theory is not a description of nature (with a small n) but a model of nature which may well be troubled by mathematical difficulties. There are at least two problems in physics where the model has the characteristic features of our ‘artificial’ problem. In the first, which asks for a description of the electric field near a very sharp charged needle, the actual experiment produces sparking. In the second, which deals with crystallisation as a system for minimising an energy function not too far removed from $I$, photographs reveal patterns not too far removed from Figure 8.1!

8.5 Vector-valued integrals

So far we have dealt only with the integration of functions $f : [a, b] \to \mathbb{R}$. The general programme that we wish to follow would direct us to consider the integration of functions $f : E \to \mathbb{R}^m$ where $E$ is a well behaved subset of $\mathbb{R}^n$. In this section we shall take the first step by considering the special case of a well behaved function $f : [a, b] \to \mathbb{R}^m$. Since $\mathbb{C}$ can be identified with $\mathbb{R}^2$, our special case contains, as a still more special (but very important) case, the integration of well behaved complex-valued functions $f : [a, b] \to \mathbb{C}$.

The definition is simple.

Definition 8.5.1. If $f : [a, b] \to \mathbb{R}^m$ is such that $f_j : [a, b] \to \mathbb{R}$ is Riemann integrable for each $j$, then we say that $f$ is Riemann integrable and $\int_a^b f(x)\,dx = y$ where $y \in \mathbb{R}^m$ and

\[ y_j = \int_a^b f_j(x)\,dx \]

for each $j$.

In other words,

\[ \left( \int_a^b f(x)\,dx \right)_j = \int_a^b f_j(x)\,dx. \]

It is easy to obtain the properties of this integral directly from its definition and the properties of the one dimensional integral. Here is an example.

Lemma 8.5.2. If $\alpha : \mathbb{R}^m \to \mathbb{R}^p$ is linear and $f : [a, b] \to \mathbb{R}^m$ is Riemann integrable, then so is $\alpha f$ and

\[ \int_a^b (\alpha f)(x)\,dx = \alpha\left( \int_a^b f(x)\,dx \right). \]

Proof. Let $\alpha$ have matrix representation $(a_{ij})$. By Lemma 8.2.11,

\[ (\alpha f)_i = \sum_{j=1}^m a_{ij} f_j \]

is Riemann integrable and

\[ \int_a^b \sum_{j=1}^m a_{ij} f_j(x)\,dx = \sum_{j=1}^m a_{ij} \int_a^b f_j(x)\,dx. \]

Comparing this with Definition 8.5.1, we see that we have the required result.

Taking $\alpha$ to be any orthogonal transformation of $\mathbb{R}^m$ to itself, we see that our definition of the integral is, in fact, coordinate independent. (Remember, it is part of our programme that nothing should depend on the particular choice of coordinates we use. The reader may also wish to look at Exercise K.137.)

Choosing a particular orthogonal transformation, we obtain the following nice result.

Theorem 8.5.3. If $f : [a, b] \to \mathbb{R}^m$ is Riemann integrable then

\[ \left\| \int_a^b f(x)\,dx \right\| \le (b - a) \sup_{x \in [a, b]} \| f(x) \|. \]

This result falls into the standard pattern

size of integral ≤ length × sup.

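Proof. If $y$ is a vector in $\mathbb{R}^m$, we can always find a rotation $\alpha$ of $\mathbb{R}^m$ such that $\alpha y$ lies along the $x_1$ axis, that is to say, $(\alpha y)_1 \ge 0$ and $(\alpha y)_j = 0$ for $2 \le j \le m$. Let $y = \int_a^b f(x)\,dx$. Then

\[
\begin{aligned}
\left\| \int_a^b f(x)\,dx \right\| &= \left\| \alpha \int_a^b f(x)\,dx \right\| \\
&= \left( \alpha \int_a^b f(x)\,dx \right)_1 \\
&= \int_a^b (\alpha f(x))_1 \,dx \\
&\le (b - a) \sup_{x \in [a, b]} |(\alpha f(x))_1| \\
&\le (b - a) \sup_{x \in [a, b]} \| \alpha f(x) \| \\
&= (b - a) \sup_{x \in [a, b]} \| f(x) \|.
\end{aligned}
\]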

Exercise 8.5.4. Justify each step in the chain of equalities and inequalities

which concluded the preceding proof.
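A small numerical illustration of Theorem 8.5.3 may help fix ideas. The sketch below assumes a hypothetical example, $f(t) = (\cos t, \sin t)$ on $[0, \pi]$, integrates it component by component in the spirit of Definition 8.5.1 using a midpoint rule, and estimates the supremum of $\|f\|$ on a finite grid.

```python
import math

def integrate_vector(f, a, b, cells=20000):
    """Componentwise midpoint rule, mirroring Definition 8.5.1."""
    width = (b - a) / cells
    dim = len(f(a))
    total = [0.0] * dim
    for i in range(cells):
        value = f(a + (i + 0.5) * width)
        for j in range(dim):
            total[j] += value[j] * width
    return total

def norm(v):
    """Euclidean norm on R^m."""
    return math.sqrt(sum(c * c for c in v))

a, b = 0.0, math.pi
f = lambda t: (math.cos(t), math.sin(t))   # illustrative: a point moving on the unit circle

integral = integrate_vector(f, a, b)       # approximately (0, 2)
lhs = norm(integral)
# sup of ||f|| estimated on a grid; here ||f(t)|| = 1 for every t
rhs = (b - a) * max(norm(f(a + (b - a) * i / 1000)) for i in range(1001))
```

Here the integral has norm about 2 while the bound is $\pi$; the inequality is strict because the values $f(t)$ point in different directions and partly cancel.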

Exercise 8.5.5. Show that the collection $\mathcal{R}$ of Riemann integrable functions $f : [a, b] \to \mathbb{R}^m$ forms a real vector space with the natural operations. If we write

\[ Tf = \int_a^b f(x)\,dx \]

and $\|f\|_\infty = \sup_{t \in [a, b]} \|f(t)\|$, show that $T : \mathcal{R} \to \mathbb{R}^m$ is a linear map and $\|Tf\| \le (b - a)\|f\|_\infty$.

Chapter 9

Developments and limitations of the Riemann integral ♥

9.1 Why go further?

Let us imagine a conversation in the 1880s between a mathematician opposed to the ‘new rigour’ and a mathematician who supported it. The opponent might claim that the definition of the Riemann integral given in section 8.2 was dull and gave rise to no new theorems. The supporter might say, as this book does, that definitions are necessary in order that we know when we have proved something and to understand what we have proved when we have proved it. He would, however, have to admit both the dullness and the lack of theorems. Both sides would regretfully agree that there was probably little more to say about the matter.

Twenty years later, Lebesgue, building on work of Borel and others, produced a radically new theory of integration. From the point of view of Lebesgue’s theory, Riemann integration has a profound weakness. We saw in Lemma 8.2.11 and Exercise 8.2.14 that we cannot leave the class of Riemann integrable functions if we only perform algebraic operations (for example the product of two Riemann integrable functions is again Riemann integrable). However we can leave the class of Riemann integrable functions by performing limiting operations.

Exercise 9.1.1. Let $f_n : [0, 1] \to \mathbb{R}$ be defined by $f_n(r2^{-n}) = 1$ if $r$ is an integer with $0 \le r \le 2^n$, $f_n(x) = 0$ otherwise.

(i) Show that $f_n$ is Riemann integrable.

(ii) Show that there exists an $f : [0, 1] \to \mathbb{R}$, which you should define explicitly, such that $f_n(x) \to f(x)$ as $n \to \infty$, for each $x \in [0, 1]$.

(iii) Show, however, that $f$ is not Riemann integrable.

[See also Exercise K.138.]

The class of Lebesgue integrable functions includes every Riemann integrable function but behaves much better when we perform limiting operations. As an example, which does not give the whole picture but shows the kind of result that can be obtained, contrast Exercise 9.1.1 with the following lemma.

Lemma 9.1.2. Let $f_n : [a, b] \to \mathbb{R}$ be a sequence of Lebesgue integrable functions with $|f_n(x)| \le M$ for all $x \in [a, b]$ and all $n$. If $f_n(x) \to f(x)$ as $n \to \infty$ for each $x \in [a, b]$, then $f$ is Lebesgue integrable and

\[ \int_a^b f_n(x)\,dx \to \int_a^b f(x)\,dx. \]

It is important to realise that mathematicians prize the Lebesgue inte-

gral, not because it integrates more functions (most functions that we meet

explicitly are Riemann integrable), but because it gives rise to beautiful the-

orems and, at a deeper level, to beautiful theories way beyond the reach of

the Riemann integral.

Dieudonné dismisses the Riemann integral with scorn in [13], Chapter VIII.

It may well be suspected that, had it not been for its prestigious name, this [topic] would have been dropped long ago [from elementary analysis courses], for (with due reverence to Riemann’s genius) it is certainly clear to any working mathematician that nowadays such a ‘theory’ has at best the importance of a mildly interesting exercise in the general theory of measure and integration. Only the stubborn conservatism of academic tradition could freeze it into a regular part of the curriculum, long after it had outlived its historical importance.

Stubborn academic conservatives like the present writer would reply that, as a matter of observation, many working mathematicians¹ do not use and have never studied Lebesgue integration and its generalisation to measure theory. Although measure theory is now essential for the study of all branches of analysis and probability, it is not needed for most of number theory, algebra, geometry and applied mathematics.

¹ Of course, it depends on who you consider to be a mathematician. A particular French academic tradition begins by excluding all applied mathematicians, continues by excluding all supporters of the foreign policy of the United States and ends by restricting the title to pupils of the École Normale Supérieure.

It is frequently claimed that Lebesgue integration is as easy to teach as

Riemann integration. This is probably true, but I have yet to be convinced

that it is as easy to learn. Under these circumstances, it is reasonable to

introduce Riemann integration as an ad hoc tool to be replaced later by a

more powerful theory, if required. If we only have to walk 50 metres, it makes

no sense to buy a car.

On the other hand, as the distance to be traveled becomes longer, walking becomes less attractive. We could walk from London to Cambridge but few people wish to do so. This chapter contains a series of short sections showing how the notion of the integral can be extended in various directions. I hope that the reader will find them interesting and instructive but, for the reasons just given, she should not invest too much time and effort in their contents which, in many cases, can be given a more elegant, inclusive and efficient exposition using measure theory.

I believe that, provided it is not taken too seriously, this chapter will be

useful to those who do not go on to do measure theory by showing that

the theory of integration is richer than most elementary treatments would

suggest and to those who will go on to do measure theory by opening their

minds to some of the issues involved.

9.2 Improper integrals ♥

We have defined Riemann integration for bounded functions on bounded intervals. However, the reader will already have evaluated, as a matter of routine, so called ‘improper integrals’² in the following manner

\[ \int_0^1 x^{-1/2}\,dx = \lim_{\epsilon \to 0+} \int_\epsilon^1 x^{-1/2}\,dx = \lim_{\epsilon \to 0+} \bigl[ 2x^{1/2} \bigr]_\epsilon^1 = 2, \]

and

\[ \int_1^\infty x^{-2}\,dx = \lim_{R \to \infty} \int_1^R x^{-2}\,dx = \lim_{R \to \infty} \bigl[ -x^{-1} \bigr]_1^R = 1. \]

A full theoretical treatment of such integrals with the tools at our disposal is apt to lead into a howling wilderness of ‘improper integrals of the first kind’, ‘Cauchy principal values’ and so on. Instead, I shall give a few typical theorems, definitions and counterexamples from which the reader should be able to construct any theory that she needs to justify results in elementary calculus.

² There is nothing particularly improper about improper integrals (at least, if they are absolutely convergent, see page 211), but this is what they are traditionally called. Their other traditional name ‘infinite integrals’ removes the imputation of moral obliquity but is liable to cause confusion in other directions.

Definition 9.2.1. If $f : [a, \infty) \to \mathbb{R}$ is such that $f|_{[a,X]} \in \mathcal{R}[a, X]$ for each $X > a$ and $\int_a^X f(x)\,dx \to L$ as $X \to \infty$, then we say that $\int_a^\infty f(x)\,dx$ exists with value $L$.

Lemma 9.2.2. Suppose $f : [a, \infty) \to \mathbb{R}$ is such that $f|_{[a,X]} \in \mathcal{R}[a, X]$ for each $X > a$. If $f(x) \ge 0$ for all $x$, then $\int_a^\infty f(x)\,dx$ exists if and only if there exists a $K$ such that $\int_a^X f(x)\,dx \le K$ for all $X$.

Proof. As usual we split the proof into two parts dealing with ‘if’ and ‘only if’ separately.

Suppose first that $\int_a^\infty f(x)\,dx$ exists, that is to say $\int_a^X f(x)\,dx$ tends to a limit as $X \to \infty$. Let $u_n = \int_a^n f(x)\,dx$ when $n$ is an integer with $n \ge a$. Since $f$ is positive, $u_n$ is an increasing sequence. Since $u_n$ tends to a limit, it must be bounded, that is to say, there exists a $K$ such that $u_n \le K$ for all $n \ge a$. If $X \ge a$ we choose an integer $N \ge X$ and observe that

\[ \int_a^X f(x)\,dx \le \int_a^N f(x)\,dx = u_N \le K \]

as required.

Suppose, conversely, that there exists a $K$ such that $\int_a^X f(x)\,dx \le K$ for all $X \ge a$. Defining $u_n = \int_a^n f(x)\,dx$ as before, we observe that the $u_n$ form an increasing sequence bounded above by $K$. By the fundamental axiom it follows that $u_n$ tends to a limit $L$, say. In particular, given $\epsilon > 0$, we can find an $n_0(\epsilon)$ such that $L - \epsilon < u_n \le L$ for all $n \ge n_0(\epsilon)$.

If $X$ is any real number with $X > n_0(\epsilon) + 1$, we can find an integer $n$ with $n + 1 \ge X > n$. Since $n \ge n_0(\epsilon)$, we have

\[ L - \epsilon < u_n \le \int_a^X f(x)\,dx \le u_{n+1} \le L \]

and $\left| L - \int_a^X f(x)\,dx \right| < \epsilon$. Thus $\int_a^X f(x)\,dx \to L$ as $X \to \infty$, as required.

Exercise 9.2.3. Show that $\int_0^n \sin(2\pi x)\,dx$ tends to a limit as $n \to \infty$ through integer values, but $\int_0^X \sin(2\pi x)\,dx$ does not tend to a limit as $X \to \infty$.

We use Lemma 9.2.2 to prove the integral comparison test.

Lemma 9.2.4. Suppose $f : [1, \infty) \to \mathbb{R}$ is a decreasing continuous positive function. Then $\sum_{n=1}^\infty f(n)$ exists if and only if $\int_1^\infty f(x)\,dx$ does.

Just as with sums we sometimes say that ‘$\int_1^\infty f(x)\,dx$ converges’ rather than ‘$\int_1^\infty f(x)\,dx$ exists’. The lemma then says ‘$\sum_{n=1}^\infty f(n)$ converges if and only if $\int_1^\infty f(x)\,dx$ does’.

The proof of Lemma 9.2.4 is set out in the next exercise.

Exercise 9.2.5. Suppose $f : [1, \infty) \to \mathbb{R}$ is a decreasing continuous positive function.

(i) Show that

\[ f(n) \ge \int_n^{n+1} f(x)\,dx \ge f(n + 1). \]

(ii) Deduce that

\[ \sum_{n=1}^{N} f(n) \ge \int_1^{N+1} f(x)\,dx \ge \sum_{n=2}^{N+1} f(n). \]

(iii) By using Lemma 9.2.2 and the corresponding result for sums, deduce Lemma 9.2.4.
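The sandwich in part (ii) is easy to see in a concrete case. The sketch below is an illustration only: it takes the hypothetical choice $f(x) = x^{-2}$, for which the integral has the closed form $\int_1^{N+1} x^{-2}\,dx = 1 - 1/(N+1)$, and compares it with the two partial sums.

```python
def f(x):
    """Illustrative decreasing, continuous, positive function on [1, infinity)."""
    return 1.0 / (x * x)

def integral_of_f(lo, hi):
    """Exact value of int_lo^hi x^-2 dx = 1/lo - 1/hi."""
    return 1.0 / lo - 1.0 / hi

def sandwich(N):
    upper = sum(f(n) for n in range(1, N + 1))    # sum_{n=1}^{N} f(n)
    middle = integral_of_f(1.0, N + 1.0)          # int_1^{N+1} f(x) dx
    lower = sum(f(n) for n in range(2, N + 2))    # sum_{n=2}^{N+1} f(n)
    return upper, middle, lower

results = {N: sandwich(N) for N in (1, 10, 1000)}
```

For each $N$ the three numbers appear in decreasing order, and since the integrals stay below 1 the partial sums stay below $f(1) + 1 = 2$, which is the boundedness that Lemma 9.2.2 and its analogue for sums exploit.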

Exercise 9.2.6. (i) Use Lemma 9.2.4 to show that $\sum_{n=1}^\infty n^{-\alpha}$ converges if $\alpha > 1$ and diverges if $\alpha \le 1$.

(ii) Use the inequality established in Exercise 9.2.5 to give a rough estimate of the size of $N$ required to give $\sum_{n=1}^N n^{-1} > 100$.

(iii) Use the methods just discussed to do Exercise 5.1.10.

Exercise 9.2.7. (Simple version of Stirling’s formula.) The ideas of Exercise 9.2.5 have many applications.

(i) Suppose $g : [1, \infty) \to \mathbb{R}$ is an increasing continuous positive function. Obtain inequalities for $g$ corresponding to those for $f$ in parts (i) and (ii) of Exercise 9.2.5.

(ii) By taking $g(x) = \log x$ in part (i), show that

\[ \log (N - 1)! \le \int_1^N \log x \, dx \le \log N! \]

and use integration by parts to conclude that

\[ \log (N - 1)! \le N \log N - N + 1 \le \log N!. \]

(iii) Show that $\log N! = N \log N - N + \theta(N) N$ where $\theta(N) \to 0$ as $N \to \infty$.

[A stronger result is proved in Exercise K.141.]
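The inequalities of part (ii) and the behaviour of $\theta(N)$ in part (iii) can be observed numerically. The sketch below is only an illustration: it relies on the standard library function `math.lgamma`, using the identity $\log n! = \log\Gamma(n + 1)$, and the helper names are hypothetical.

```python
import math

def log_factorial(n):
    """log(n!) computed as lgamma(n + 1)."""
    return math.lgamma(n + 1)

def stirling_bounds(N):
    """Return (log (N-1)!, N log N - N + 1, log N!) for comparison."""
    middle = N * math.log(N) - N + 1
    return log_factorial(N - 1), middle, log_factorial(N)

def theta(N):
    """theta(N) defined by log N! = N log N - N + theta(N) * N."""
    return (log_factorial(N) - N * math.log(N) + N) / N

bounds = {N: stirling_bounds(N) for N in (2, 10, 100, 1000)}
thetas = [theta(N) for N in (10, 1000, 100000)]
```

The three entries of each tuple in `bounds` are in increasing order, and the values in `thetas` shrink towards zero as $N$ grows, in line with part (iii).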

We have a result corresponding to Theorem 4.6.12.

Lemma 9.2.8. Suppose $f : [a, \infty) \to \mathbb{R}$ is such that $f|_{[a,X]} \in \mathcal{R}[a, X]$ for each $X > a$. If $\int_a^\infty |f(x)|\,dx$ exists, then $\int_a^\infty f(x)\,dx$ exists.

It is natural to state Lemma 9.2.8 in the form ‘absolute convergence of the integral implies convergence’.

Exercise 9.2.9. Prove Lemma 9.2.8 by using the argument of Exercise 4.6.14 (i).

Exercise 9.2.10. Prove the following general principle of convergence for integrals.

Suppose $f : [a, \infty) \to \mathbb{R}$ is such that $f|_{[a,X]} \in \mathcal{R}[a, X]$ for each $X > a$. Show that $\int_a^\infty f(x)\,dx$ exists if and only if, given any $\epsilon > 0$, we can find an $X_0(\epsilon) > a$ such that

\[ \left| \int_X^Y f(x)\,dx \right| < \epsilon \]

whenever $Y \ge X \ge X_0(\epsilon)$.

Exercise 9.2.11. (i) Following the ideas of this section and Section 8.5, provide the appropriate definition of $\int_a^\infty f(x)\,dx$ for a function $f : [a, \infty) \to \mathbb{R}^m$.

(ii) By taking components and using Exercise 9.2.10, or otherwise, prove a general principle of convergence for such integrals.

(iii) Use part (ii) and the method of proof of Theorem 4.6.12 to prove the following generalisation of Lemma 9.2.8.

Suppose $f : [a, \infty) \to \mathbb{R}^m$ is such that $f|_{[a,X]} \in \mathcal{R}[a, X]$ for each $X > a$. If $\int_a^\infty \|f(x)\|\,dx$ exists then $\int_a^\infty f(x)\,dx$ exists.

Exercise 9.2.12. Suppose $f : [a, b) \to \mathbb{R}$ is such that $f|_{[a,c]} \in \mathcal{R}[a, c]$ for each $a < c < b$. Produce a definition along the lines of Definition 9.2.1 of what it should mean for $\int_a^b f(x)\,dx$ to exist with value $L$.

State and prove results analogous to Lemma 9.2.2 and Lemma 9.2.8.

Additional problems arise when there are two limits involved.

Example 9.2.13. If $\lambda, \mu > 0$ then

\[ \int_{-\mu R}^{\lambda R} \frac{x}{1 + x^2}\,dx \to \log\frac{\lambda}{\mu} \]

as $R \to \infty$.

Proof. Direct calculation, which is left to the reader.
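A quick numerical look at Example 9.2.13 shows how the limit depends on the way the two ends go to infinity. The sketch below is an illustration under the assumption that we evaluate the integral through the antiderivative $\tfrac12 \log(1 + x^2)$; the values $\lambda = 3$, $\mu = 2$ and the sample radii are hypothetical.

```python
import math

def integral(lam, mu, R):
    """int_{-mu R}^{lam R} x / (1 + x^2) dx via the antiderivative (1/2) log(1 + x^2)."""
    return 0.5 * (math.log1p((lam * R) ** 2) - math.log1p((mu * R) ** 2))

lam, mu = 3.0, 2.0
limit = math.log(lam / mu)

# Distance from the claimed limit for increasing R.
errors = [abs(integral(lam, mu, R) - limit) for R in (1e1, 1e3, 1e5)]

# The symmetric choice lam = mu gives exactly zero for every R.
symmetric = integral(1.0, 1.0, 1e5)
```

Different ratios $\lambda/\mu$ give different limits for the same integrand, which is why a definition that lets both ends tend to infinity independently is needed.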

A pure mathematician gets round this problem by making a definition along these lines.

Definition 9.2.14. If $f : \mathbb{R} \to \mathbb{R}$ is such that $f|_{[-X,Y]} \in \mathcal{R}[-X, Y]$ for each $X, Y > 0$, then $\int_{-\infty}^\infty f(x)\,dx$ exists with value $L$ if and only if the following condition holds. Given $\epsilon > 0$ we can find an $X_0(\epsilon) > 0$ such that

\[ \left| \int_{-X}^{Y} f(x)\,dx - L \right| < \epsilon \]

for all $X, Y > X_0(\epsilon)$.

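Exercise 9.2.15. Let $f : \mathbb{R} \to \mathbb{R}$ be such that $f|_{[-X,Y]} \in \mathcal{R}[-X, Y]$ for each $X, Y > 0$. Show that $\int_{-\infty}^\infty f(x)\,dx$ exists if and only if $\int_0^\infty f(x)\,dx = \lim_{R \to \infty} \int_0^R f(x)\,dx$ and $\int_{-\infty}^0 f(x)\,dx = \lim_{S \to \infty} \int_{-S}^0 f(x)\,dx$ exist. If the integrals exist, show that

\[ \int_{-\infty}^\infty f(x)\,dx = \int_{-\infty}^0 f(x)\,dx + \int_0^\infty f(x)\,dx. \]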

The physicist gets round the problem by ignoring it. If she is a real physicist with correct physical intuition this works splendidly³ but if not, not.

Speaking broadly, improper integrals $\int_E f(x)\,dx$ work well when they are absolutely convergent, that is to say, $\int_E |f(x)|\,dx < \infty$, but are full of traps for the unwary otherwise. This is not a weakness of the Riemann integral but inherent in any mathematical situation where an object only exists ‘by virtue of the cancellation of two infinite objects’. (Recall Littlewood’s example on page 81.)

Example 9.2.16. Suppose we define the PV (principal value) integral by

\[ \mathrm{PV}\!\int_{-\infty}^\infty f(x)\,dx = \lim_{R \to \infty} \int_{-R}^{R} f(x)\,dx \]

whenever the right hand side exists. Show, by considering Example 9.2.13, or otherwise, that the standard rule for change of variables fails for PV integrals.

³ In [8], Boas reports the story of a friend visiting the Princeton common room ‘. . . where Einstein was talking to another man, who would shake his head and stop him; Einstein then thought for a while, then started talking again; was stopped again; and so on. After a while, . . . my friend was introduced to Einstein. He asked Einstein who the other man was. “Oh,” said Einstein, “that’s my mathematician.” ’

9.3 Integrals over areas ♥

At first sight, the extension of the idea of Riemann integration from functions defined on $\mathbb{R}$ to functions defined on $\mathbb{R}^n$ looks like child’s play. We shall do the case $n = 2$ since the general case is a trivial extension.

Let $R = [a, b] \times [c, d]$ and consider $f : R \to \mathbb{R}$ such that there exists a $K$ with $|f(x)| \le K$ for all $x \in R$. We define a dissection $\mathcal{D}$ of $R$ to be a finite collection of rectangles $I_j = [a_j, b_j] \times [c_j, d_j]$ [$1 \le j \le N$] such that

(i) $\bigcup_{j=1}^N I_j = R$,

(ii) $I_i \cap I_j$ is either empty or consists of a segment of a straight line [$1 \le j < i \le N$].

If $\mathcal{D} = \{I_j : 1 \le j \le N\}$ and $\mathcal{D}' = \{I'_k : 1 \le k \le N'\}$ are dissections we write $\mathcal{D} \wedge \mathcal{D}'$ for the set of non-empty rectangles of the form $I_j \cap I'_k$. If every $I'_k \in \mathcal{D}'$ is contained in some $I_j \in \mathcal{D}$ we write $\mathcal{D}' \succ \mathcal{D}$.

We define the upper sum and lower sum associated with $\mathcal{D}$ by

\[ S(f, \mathcal{D}) = \sum_{j=1}^N |I_j| \sup_{x \in I_j} f(x), \]

\[ s(f, \mathcal{D}) = \sum_{j=1}^N |I_j| \inf_{x \in I_j} f(x) \]

where $|I_j| = (b_j - a_j)(d_j - c_j)$, the area of $I_j$.

Exercise 9.3.1. (i) Suppose that $\mathcal{D}$ and $\mathcal{D}'$ are dissections with $\mathcal{D}' \succ \mathcal{D}$. Show, using the method of Exercise 8.2.1, or otherwise, that

\[ S(f, \mathcal{D}) \ge S(f, \mathcal{D}') \ge s(f, \mathcal{D}') \ge s(f, \mathcal{D}). \]

(ii) State and prove a result corresponding to Lemma 8.2.3.

(iii) Explain how this enables us to define upper and lower integrals and hence complete the definition of Riemann integration. We write the integral as

\[ \int_R f(x)\,dA \]

when it exists.

(iv) Develop the theory of Riemann integration on $R$ as far as you can. (You should be able to obtain results like those in Section 8.2 as far as the end of Exercise 8.2.15.) You should prove that if $f$ is continuous on $R$ then it is Riemann integrable.

We can do rather more than just prove the existence of

\[ \int_R f(x)\,dA \]

when $f$ is continuous on the rectangle $R$.

Theorem 9.3.2. (Fubini’s theorem for continuous functions.) Let $R = [a, b] \times [c, d]$. If $f : R \to \mathbb{R}$ is continuous, then the functions $F_1 : [a, b] \to \mathbb{R}$ and $F_2 : [c, d] \to \mathbb{R}$ defined by

\[ F_1(x) = \int_c^d f(x, s)\,ds \quad \text{and} \quad F_2(y) = \int_a^b f(t, y)\,dt \]

are continuous and

\[ \int_a^b F_1(x)\,dx = \int_c^d F_2(y)\,dy = \int_R f(x)\,dA. \]

This result is more usually written as

\[ \int_a^b \left( \int_c^d f(x, y)\,dy \right) dx = \int_c^d \left( \int_a^b f(x, y)\,dx \right) dy = \int_R f(x)\,dA, \]

or, simply,

\[ \int_a^b \int_c^d f(x, y)\,dy\,dx = \int_c^d \int_a^b f(x, y)\,dx\,dy = \int_{[a,b] \times [c,d]} f(x)\,dA. \]

(See also Exercises K.152, K.154 and K.155.)
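The two iterated integrals of Theorem 9.3.2 can be compared numerically for a simple integrand. The sketch below is only an illustration: the integrand $f(x, y) = xy^2 + 1$ on $[0, 1] \times [0, 2]$ is a hypothetical choice with exact double integral $10/3$, and each one dimensional integral is approximated by a midpoint rule.

```python
def midpoint(func, lo, hi, cells=400):
    """One dimensional midpoint rule."""
    width = (hi - lo) / cells
    return width * sum(func(lo + (i + 0.5) * width) for i in range(cells))

a, b, c, d = 0.0, 1.0, 0.0, 2.0
f = lambda x, y: x * y * y + 1.0                   # illustrative continuous integrand

F1 = lambda x: midpoint(lambda s: f(x, s), c, d)   # F1(x) = int_c^d f(x, s) ds
F2 = lambda y: midpoint(lambda t: f(t, y), a, b)   # F2(y) = int_a^b f(t, y) dt

dy_then_dx = midpoint(F1, a, b)                    # int_a^b F1(x) dx
dx_then_dy = midpoint(F2, c, d)                    # int_c^d F2(y) dy
exact = 10.0 / 3.0
```

Both orders of integration agree with each other and with the exact value to within the accuracy of the quadrature. This is a check on one example, not a substitute for the proof outlined in the exercises.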

We prove Theorem 9.3.2 in two exercises.

Exercise 9.3.3. (We use the notation of Theorem 9.3.2.) If $|f(x, s) - f(w, s)| \le \epsilon$ for all $s \in [c, d]$ show that $|F_1(x) - F_1(w)| \le \epsilon(d - c)$. Use the uniform continuity of $f$ to conclude that $F_1$ is continuous.

For the next exercise we recall the notion of an indicator function $I_E$ for a set $E$. If $E \subseteq R$, then $I_E : R \to \mathbb{R}$ is defined by $I_E(a) = 1$ if $a \in E$, $I_E(a) = 0$ otherwise.

Exercise 9.3.4. We use the notation of Theorem 9.3.2. In this exercise interval will mean open, half open or closed interval (that is, intervals of the form $(\alpha, \beta)$, $[\alpha, \beta)$, $(\alpha, \beta]$ or $[\alpha, \beta]$) and rectangle will mean the product of two intervals. We say that $g$ satisfies the Fubini condition if

\[ \int_a^b \left( \int_c^d g(x, y)\,dy \right) dx = \int_c^d \left( \int_a^b g(x, y)\,dx \right) dy = \int_R g(\mathbf{x})\,dA. \]


(i) Show that, given $\epsilon > 0$, we can find rectangles $R_j \subseteq R$ and $\lambda_j \in \mathbb{R}$ such that, writing

\[ H = \sum_{j=1}^{N} \lambda_j I_{R_j}, \]

we have $H(\mathbf{x}) - \epsilon \leq f(\mathbf{x}) \leq H(\mathbf{x}) + \epsilon$ for all $\mathbf{x} \in R$.

(ii) Show by direct calculation that $I_B$ satisfies the Fubini condition whenever $B$ is a rectangle. Deduce that $H$ satisfies the Fubini condition and use (i) (carefully) to show that $f$ does.

All this looks very satisfactory, but our treatment hides a problem. If we look at how mathematicians actually use integrals we find that they want to integrate over sets which are more complicated than rectangles with sides parallel to coordinate axes. (Indeed one of the guiding principles of this book is that coordinate axes should not have a special role.) If you have studied mathematical methods you will have come across the formula for change of variables⁴

\[ \iint_{E'} f(u, v)\,du\,dv = \iint_{E} f(u(x, y), v(x, y)) \left| \frac{\partial(u, v)}{\partial(x, y)} \right| dx\,dy, \]

where

\[ E' = \{ (u(x, y), v(x, y)) : (x, y) \in E \}. \]

Even if you do not recognise the formula, you should see easily that any change of variable formula will involve changing not only the integrand but the set over which we integrate.

It is not hard to come up with an appropriate definition for integrals over a set $E$.

Definition 9.3.5. Let $E$ be a bounded set and $f : E \to \mathbb{R}$ a bounded function. Choose $a < b$ and $c < d$ such that $R = [a, b] \times [c, d]$ contains $E$ and define $\tilde{f} : R \to \mathbb{R}$ by $\tilde{f}(\mathbf{x}) = f(\mathbf{x})$ if $\mathbf{x} \in E$, $\tilde{f}(\mathbf{x}) = 0$ otherwise. If $\int_R \tilde{f}(\mathbf{x})\,dA$ exists, we say that $\int_E f(\mathbf{x})\,dA$ exists and

\[ \int_E f(\mathbf{x})\,dA = \int_R \tilde{f}(\mathbf{x})\,dA. \]

Exercise 9.3.6. Explain briefly why the definition is independent of the choice of $R$.

⁴ This formula is included as a memory jogger only. It would require substantial supporting discussion to explain the underlying conventions and assumptions.


The most important consequence of this definition is laid bare in the next exercise.

Exercise 9.3.7. Let $R = [a, b] \times [c, d]$ and $E \subseteq R$. Let $\mathcal{R}$ be the set of functions $f : R \to \mathbb{R}$ which are Riemann integrable. Then $\int_E f(\mathbf{x})\,dA$ exists for all $f \in \mathcal{R}$ if and only if $I_E \in \mathcal{R}$.

If we think about the meaning of $\int_R I_E(\mathbf{x})\,dA$ we are led to the following definition⁵.

Definition 9.3.8. A bounded set $E$ in $\mathbb{R}^2$ has Riemann area $\int_E 1\,dA$ if that integral exists.

Recall that, if $R = [a, b] \times [c, d]$, we write $|R| = (b - a)(d - c)$.

Exercise 9.3.9. Show that a bounded set $E$ has Riemann area $|E|$ if and only if, given any $\epsilon > 0$, we can find disjoint rectangles $R_i = [a_i, b_i] \times [c_i, d_i]$ [$1 \leq i \leq N$] and (not necessarily disjoint) rectangles $R'_j = [a'_j, b'_j] \times [c'_j, d'_j]$ [$1 \leq j \leq M$] such that

\[ \bigcup_{i=1}^{N} R_i \subseteq E \subseteq \bigcup_{j=1}^{M} R'_j, \qquad \sum_{i=1}^{N} |R_i| \geq |E| - \epsilon \quad\text{and}\quad \sum_{j=1}^{M} |R'_j| \leq |E| + \epsilon. \]

Exercise 9.3.10. Show that, if $E$ has Riemann area and $f$ is defined and Riemann integrable on some rectangle $R = [a, b] \times [c, d]$ containing $E$, then $\int_E f(\mathbf{x})\,dA$ exists and

\[ \left| \int_E f(\mathbf{x})\,dA \right| \leq \sup_{\mathbf{x} \in E} |f(\mathbf{x})|\,|E|. \]

In other words

\[ \text{size of integral} \leq \text{area} \times \text{sup}. \]

Our discussion tells us that in order to talk about

\[ \int_E f(\mathbf{x})\,dA \]

we need to know not only that $f$ is well behaved (Riemann integrable) but that $E$ is well behaved (has Riemann area). Just as the functions $f$ which occur in ‘first mathematical methods’ courses are Riemann integrable, so the

sets E which appear in such courses have Riemann area, though the process

of showing this may be tedious.
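For a concrete set the rectangle criterion of Exercise 9.3.9 can be explored numerically. The sketch below is an illustration under stated assumptions: it takes $E$ to be the closed unit disc, uses grid squares of side $1/n$ as the rectangles (neighbouring squares share edges, which we ignore since edges carry no area), and the function name `disc_area_bounds` is hypothetical. Squares lying inside the disc give a lower estimate and squares meeting the disc give an upper estimate.

```python
import math

def disc_area_bounds(n):
    """Inner and outer grid estimates for the area of the closed unit disc.

    The square [-1, 1] x [-1, 1] is cut into (2n)^2 closed squares of side 1/n.
    Square (i, j) is [i/n, (i+1)/n] x [j/n, (j+1)/n]; tests use integers only.
    """
    def nearest(k):
        # Smallest |coordinate|, in grid units, attained on [k, k+1].
        return 0 if k <= 0 <= k + 1 else min(abs(k), abs(k + 1))

    def farthest(k):
        # Largest |coordinate|, in grid units, attained on [k, k+1].
        return max(abs(k), abs(k + 1))

    inner = outer = 0
    for i in range(-n, n):
        for j in range(-n, n):
            if farthest(i) ** 2 + farthest(j) ** 2 <= n * n:
                inner += 1   # the whole square lies inside the disc
            if nearest(i) ** 2 + nearest(j) ** 2 <= n * n:
                outer += 1   # the square meets the disc
    cell_area = 1.0 / (n * n)
    return inner * cell_area, outer * cell_area

lower, upper = disc_area_bounds(200)
print(lower, upper, math.pi)
```

The two estimates trap $\pi$, and the gap between them shrinks roughly like $1/n$, which is the kind of bookkeeping that makes a full proof tedious rather than hard.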

⁵ Like most of the rest of this chapter, this is not meant to be taken too seriously. What we call ‘Riemann area’ is traditionally called ‘content’. The theory of content is pretty but was rendered obsolete by the theory of measure.


Exercise 9.3.11. (The reader may wish to think about how to do this exercise without actually writing down all the details.)

(i) Show that a rectangle whose sides are not necessarily parallel to the axes has Riemann area and that this area is what we expect.

(ii) Show that a triangle has Riemann area and that this area is what we expect.

(iii) Show that a polygon has Riemann area and that this area is what we expect. (Of course, the answer is to cut it up into a finite number of triangles, but can this always be done?)

However, if we want to go further, it becomes rather hard to decide which

sets are nice and which are not. The problem is already present in the

one-dimensional case, but hidden by our insistence on only integrating over

intervals.

Definition 9.3.12. A bounded set $E$ in $\mathbb{R}$ has Riemann length if, taking any $[a, b] \supseteq E$, we have $I_E \in \mathcal{R}([a, b])$. We say then that $E$ has Riemann length

\[ |E| = \int_a^b I_E(t)\,dt. \]

Exercise 9.3.13. (i) Explain why the definition just given is independent of the choice of $[a, b]$.

(ii) Show that

\[ I_{A \cup B} = I_A + I_B - I_A I_B. \]

Hence show that, if $A$ and $B$ have Riemann length, so does $A \cup B$. Prove similar results for $A \cap B$ and $A \setminus B$.

(iii) By reinterpreting Exercise 9.1.1 show that we can find $A_n \subseteq [0, 1]$ such that $A_n$ has Riemann length for each $n$ but $\bigcup_{n=1}^{\infty} A_n$ does not.

(iv) Obtain results like (ii) and (iii) for Riemann area.

It also turns out that the kind of sets we have begun to think of as nice,

that is open and closed sets, need not have Riemann area.

Lemma 9.3.14. There exist bounded closed and open sets in $\mathbb{R}$ which do not have Riemann length. There exist bounded closed and open sets in $\mathbb{R}^2$ which do not have Riemann area.

The proof of this result is a little complicated so we have relegated it to

Exercise K.156.

Any belief we may have that we have a ‘natural feeling’ for how area behaves under complicated maps is finally removed by an example of Peano.


Theorem 9.3.15. There exists a continuous surjective map $f : [0, 1] \to [0, 1] \times [0, 1]$.

Thus there exists a curve which passes through every point of a square!

A proof depending on the notion of uniform convergence is given in Exer-

cise K.224.

Fortunately all these difficulties vanish like early morning mist in the light of Lebesgue’s theory.

9.4 The Riemann-Stieltjes integral ♥

In this section we discuss a remarkable extension of the notion of integral due to Stieltjes. The reader should find the discussion gives an excellent revision of many of the ideas of Chapter 8.

Before doing so, we must dispose of a technical point. When authors talk about the Heaviside step function $H : \mathbb{R} \to \mathbb{R}$ they all agree that $H(t) = 0$ for $t < 0$ and $H(t) = 1$ for $t > 0$. However, some take $H(0) = 0$, some take $H(0) = 1$ and some take $H(0) = 1/2$. Usually this does not matter but it is helpful to have consistency.

Definition 9.4.1. Let $E \subseteq \mathbb{R}$. We say that a function $f : E \to \mathbb{R}$ is a right continuous function if, for all $x \in E$, $f(t) \to f(x)$ whenever $t \to x$ through values of $t \in E$ with $t > x$.

Exercise 9.4.2. Which definition of the Heaviside step function makes $H$ right continuous?

In the discussion that follows, $G : \mathbb{R} \to \mathbb{R}$ will be a right continuous increasing function. (Exercise K.158 sheds some light on the nature of such functions, but is not needed for our discussion.) We assume further that there exist $A$ and $B$ with $G(t) \to A$ as $t \to -\infty$ and $G(t) \to B$ as $t \to \infty$.

Exercise 9.4.3. If $F : \mathbb{R} \to \mathbb{R}$ is an increasing function show that the following two statements are equivalent:

(i) $F$ is bounded.

(ii) $F(t)$ tends to (finite) limits as $t \to -\infty$ and as $t \to \infty$.

We shall say that any finite set $D$ containing at least two points is a dissection of $\mathbb{R}$. By convention we write

\[ D = \{x_0, x_1, \dots, x_n\} \quad\text{with}\quad x_0 < x_1 < x_2 < \dots < x_n. \]

(Note that we now demand that the $x_j$ are distinct.)


Now suppose $f : \mathbb{R} \to \mathbb{R}$ is a bounded function. We define the upper Stieltjes $G$ sum of $f$ associated with $D$ by

\[ S_G(f, D) = (G(x_0) - A) \sup_{t \leq x_0} f(t) + \sum_{j=1}^{n} (G(x_j) - G(x_{j-1})) \sup_{t \in (x_{j-1}, x_j]} f(t) + (B - G(x_n)) \sup_{t > x_n} f(t). \]

(Note that we use half open intervals, since we have to be more careful about overlap than when we dealt with Riemann integration.)

Exercise 9.4.4. (i) Define the lower Stieltjes $G$ sum $s_G(f, D)$ in the appropriate way.

(ii) Show that, if $D$ and $D'$ are dissections of $\mathbb{R}$, then $S_G(f, D) \geq s_G(f, D')$.

(iii) Define the upper Stieltjes $G$ integral by $I^*(G, f) = \inf_D S_G(f, D)$. Give a similar definition for the lower Stieltjes $G$ integral $I_*(G, f)$ and show that $I^*(G, f) \geq I_*(G, f)$.

If $I^*(G, f) = I_*(G, f)$, we say that $f$ is Riemann-Stieltjes integrable with respect to $G$ and we write

\[ \int_{\mathbb{R}} f(x)\,dG(x) = I^*(G, f). \]

Exercise 9.4.5. (i) State and prove a criterion for Riemann-Stieltjes integrability along the lines of Lemma 8.2.6.

(ii) Show that the set $\mathcal{R}_G$ of functions which are Riemann-Stieltjes integrable with respect to $G$ forms a vector space and the integral is a linear functional (i.e. a linear map from $\mathcal{R}_G$ to $\mathbb{R}$).

(iii) Suppose that $f : \mathbb{R} \to \mathbb{R}$ is Riemann-Stieltjes integrable with respect to $G$, that $K \in \mathbb{R}$ and $|f(t)| \leq K$ for all $t \in \mathbb{R}$. Show that

\[ \left| \int_{\mathbb{R}} f(x)\,dG(x) \right| \leq K(B - A). \]

(iv) Show that, if $f, g : \mathbb{R} \to \mathbb{R}$ are Riemann-Stieltjes integrable with respect to $G$, so is $fg$ (the product of $f$ and $g$).

(v) If $f : \mathbb{R} \to \mathbb{R}$ is Riemann-Stieltjes integrable with respect to $G$, show that $|f|$ is also and that

\[ \int_{\mathbb{R}} |f(x)|\,dG(x) \geq \left| \int_{\mathbb{R}} f(x)\,dG(x) \right|. \]

(vi) Prove that, if $f : \mathbb{R} \to \mathbb{R}$ is a bounded continuous function, then $f$ is Riemann-Stieltjes integrable with respect to $G$. [Hint: Use the fact that $f$ is uniformly continuous on any $[-R, R]$. Choose $R$ sufficiently large.]


The next result is more novel, although its proof is routine (it resembles

that of Exercise 9.4.5 (ii)).

Exercise 9.4.6. Suppose that $F, G : \mathbb{R} \to \mathbb{R}$ are right continuous increasing bounded functions and $\lambda, \mu \geq 0$. Show that, if $f : \mathbb{R} \to \mathbb{R}$ is Riemann-Stieltjes integrable with respect to both $F$ and $G$, then $f$ is Riemann-Stieltjes integrable with respect to $\lambda F + \mu G$ and

\[ \int_{\mathbb{R}} f(x)\,d(\lambda F + \mu G)(x) = \lambda \int_{\mathbb{R}} f(x)\,dF(x) + \mu \int_{\mathbb{R}} f(x)\,dG(x). \]

Exercise 9.4.7. (i) If $a \in \mathbb{R}$, show, by choosing appropriate dissections, that $I_{(-\infty, a]}$ is Riemann-Stieltjes integrable with respect to $G$ and

\[ \int_{\mathbb{R}} I_{(-\infty, a]}(x)\,dG(x) = G(a) - A. \]

(ii) If $a \in \mathbb{R}$, show that $I_{(-\infty, a)}$ is Riemann-Stieltjes integrable with respect to $G$ if and only if $G$ is continuous at $a$. If $G$ is continuous at $a$ show that

\[ \int_{\mathbb{R}} I_{(-\infty, a)}(x)\,dG(x) = G(a) - A. \]

(iii) If $a < b$, show that $I_{(a, b]}$ is Riemann-Stieltjes integrable with respect to $G$ and

\[ \int_{\mathbb{R}} I_{(a, b]}(x)\,dG(x) = G(b) - G(a). \]

(iv) If $a < b$, show that $I_{(a, b)}$ is Riemann-Stieltjes integrable with respect to $G$ if and only if $G$ is continuous at $b$.

Combining the results of Exercise 9.4.7 with Exercise 9.4.5, we see that, if $f$ is Riemann-Stieltjes integrable with respect to $G$, we may define

\[ \int_{(a, b]} f(x)\,dG(x) = \int_{\mathbb{R}} I_{(a, b]}(x) f(x)\,dG(x) \]

and make similar definitions for integrals like $\int_{(-\infty, a]} f(x)\,dG(x)$.

Exercise 9.4.8. Show that, if $G$ is continuous and $f$ is Riemann-Stieltjes integrable with respect to $G$, then we can define $\int_{[a, b]} f(x)\,dG(x)$ and that

\[ \int_{(a, b]} f(x)\,dG(x) = \int_{[a, b]} f(x)\,dG(x). \]


Remark: When we discussed Riemann integration, I said that, in mathematical practice, it was unusual to come across a function that was Lebesgue integrable but not Riemann integrable. In Exercise 9.4.7 (iv) we saw that the function $I_{(a, b)}$, which we come across very frequently in mathematical practice, is not Riemann-Stieltjes integrable with respect to any right continuous increasing function $G$ which has a discontinuity at $b$. In the Lebesgue-Stieltjes theory, $I_{(a, b)}$ is always Lebesgue-Stieltjes integrable with respect to $G$. (Exercise K.161 extends Exercise 9.4.7 a little.)

The next result has an analogous proof to the fundamental theorem of

the calculus (Theorem 8.3.6).

Exercise 9.4.9. Suppose that $G : \mathbb{R} \to \mathbb{R}$ is an increasing function with continuous derivative. Suppose further that $f : \mathbb{R} \to \mathbb{R}$ is a bounded continuous function. If we set

\[ I(t) = \int_{(-\infty, t]} f(x)\,dG(x), \]

show that then $I$ is differentiable and $I'(t) = f(t)G'(t)$ for all $t \in \mathbb{R}$.

Using the mean value theorem, in the form which states that the only

function with derivative 0 is a constant, we get the following result.

Exercise 9.4.10. Suppose that $G : \mathbb{R} \to \mathbb{R}$ is an increasing function with continuous derivative. If $f : \mathbb{R} \to \mathbb{R}$ is a bounded continuous function, show that

\[ \int_{(a, b]} f(x)\,dG(x) = \int_a^b f(x)G'(x)\,dx. \]

Show also that

\[ \int_{\mathbb{R}} f(x)\,dG(x) = \int_{-\infty}^{\infty} f(x)G'(x)\,dx, \]

explaining carefully the meaning of the right hand side of the equation.
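The first formula of Exercise 9.4.10 can be tested numerically. The sketch below rests on assumptions made only for illustration: it takes $G = \tan^{-1}$ and $f = \cos$ on $(0, 2]$, and it replaces the upper and lower Stieltjes sums by the tagged sum $\sum_j f(x_j)(G(x_j) - G(x_{j-1}))$. Since each tag $x_j$ lies in $(x_{j-1}, x_j]$, the tagged sum is trapped between the lower and upper sums for $I_{(a,b]} f$ over a dissection with $x_0 = a$ and $x_n = b$. The helper names are hypothetical.

```python
import math

def stieltjes_sum(f, G, a, b, n):
    """Tagged sum of f(x_j) * (G(x_j) - G(x_{j-1})) on a uniform dissection of [a, b]."""
    xs = [a + (b - a) * j / n for j in range(n + 1)]
    return sum(f(xs[j]) * (G(xs[j]) - G(xs[j - 1])) for j in range(1, n + 1))

def midpoint_integral(g, a, b, n):
    """Midpoint-rule approximation to the ordinary Riemann integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

G = math.atan                       # increasing, bounded, continuous derivative

def G_prime(x):
    return 1.0 / (1.0 + x * x)      # derivative of arctan

f = math.cos                        # bounded continuous integrand

lhs = stieltjes_sum(f, G, 0.0, 2.0, 2000)
rhs = midpoint_integral(lambda x: f(x) * G_prime(x), 0.0, 2.0, 2000)
print(lhs, rhs)
```

The two numbers agree to about three decimal places at this mesh, consistent with (though of course not a proof of) the identity.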

However, there is no reason why we should restrict ourselves even to

continuous functions when considering Riemann-Stieltjes integration.

Exercise 9.4.11. (i) If $c \in \mathbb{R}$, define $H_c : \mathbb{R} \to \mathbb{R}$ by $H_c(t) = 0$ if $t < c$, $H_c(t) = 1$ if $t \geq c$. Show, by finding appropriate dissections, that, if $f : \mathbb{R} \to \mathbb{R}$ is a bounded continuous function, we have

\[ \int_{(a, b]} f(x)\,dH_c(x) = f(c) \]


when $c \in (a, b]$. What happens if $c \notin (a, b]$?

(ii) If $a < c_1 < c_2 < \dots < c_m < b$ and $\lambda_1, \lambda_2, \dots, \lambda_m \geq 0$, find a right continuous function $G : [a, b] \to \mathbb{R}$ such that, if $f : (a, b] \to \mathbb{R}$ is a bounded continuous function, we have

\[ \int_{(a, b]} f(x)\,dG(x) = \sum_{j=1}^{m} \lambda_j f(c_j). \]

Exercise 9.4.11 shows that Riemann-Stieltjes integration provides a framework in which point masses may be considered along with continuous densities⁶.
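A small numerical experiment shows a point mass and a continuous density living in one integral. This is a sketch under assumptions chosen for illustration: $G$ is half a logistic function plus half the step $H_c$ of Exercise 9.4.11 (i) with $c = 1/2$, the integrand is $\cos$ on $(-1, 2]$, and tagged sums stand in for the upper and lower Stieltjes sums. The expected value combines Exercise 9.4.6, Exercise 9.4.10 and Exercise 9.4.11 (i).

```python
import math

def stieltjes_sum(f, G, a, b, n):
    """Tagged sum of f(x_j) * (G(x_j) - G(x_{j-1})) on a uniform dissection of [a, b]."""
    xs = [a + (b - a) * j / n for j in range(n + 1)]
    return sum(f(xs[j]) * (G(xs[j]) - G(xs[j - 1])) for j in range(1, n + 1))

def smooth_part(t):
    # Logistic function: increasing, bounded, with continuous derivative.
    return 1.0 / (1.0 + math.exp(-t))

def step(t, c=0.5):
    # H_c: right continuous unit jump at c.
    return 1.0 if t >= c else 0.0

def G(t):
    # Hypothetical mixed integrator: half density, half point mass at c = 0.5.
    return 0.5 * smooth_part(t) + 0.5 * step(t)

f = math.cos
a, b, c, n = -1.0, 2.0, 0.5, 3000

total = stieltjes_sum(f, G, a, b, n)

# Density contribution: midpoint rule for 0.5 * integral of f * (smooth_part)'.
h = (b - a) / n
mids = [a + (k + 0.5) * h for k in range(n)]
density_part = 0.5 * h * sum(
    f(x) * smooth_part(x) * (1.0 - smooth_part(x)) for x in mids)

expected = density_part + 0.5 * f(c)   # the point mass contributes 0.5 * f(c)
print(total, expected)
```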

The reader may agree with this but still doubt the usefulness of the Riemann-Stieltjes point of view. The following discussion may help change her mind.

What is a real-valued random variable? It is rather hard to give a proper mathematical definition with the mathematical apparatus available in 1880⁷. However any real-valued random variable $X$ is associated with a function

\[ P(x) = \Pr\{X \leq x\}. \]

Exercise 9.4.12. Convince yourself that $P : \mathbb{R} \to \mathbb{R}$ is a right continuous increasing function with $P(t) \to 0$ as $t \to -\infty$ and $P(t) \to 1$ as $t \to \infty$. (Note that, as we have no proper definitions, we can give no proper proofs.)

Even if we have no definition of a random variable, we do have a definition of a Riemann-Stieltjes integral. So, in a typical mathematician’s trick, we turn everything upside down.

Suppose $P : \mathbb{R} \to \mathbb{R}$ is a right continuous increasing function with $P(t) \to 0$ as $t \to -\infty$ and $P(t) \to 1$ as $t \to \infty$. We say that $P$ is associated with a real-valued random variable $X$ if

\[ \Pr\{X \in E\} = \int_{\mathbb{R}} I_E(x)\,dP(x) \]

when $I_E$ is Riemann-Stieltjes integrable with respect to $P$. (Thus, for example, $E$ could be $(-\infty, a]$ or $(a, b]$.) If the reader chooses to read $\Pr\{X \in E\}$ as ‘the probability that $X \in E$’ that is up to her. So far as we are concerned, $\Pr\{X \in E\}$ is an abbreviation for $\int_{\mathbb{R}} I_E(x)\,dP(x)$.

⁶ Note that, although we have justified the concept of a ‘delta function’, we have not justified the concept of ‘the derivative of the delta function’. This requires a further generalisation of our point of view to that of distributions.

⁷ The Holy Roman Empire was neither holy nor Roman nor an empire. A random variable is neither random nor a variable.


In the same way we define the expectation $\mathbb{E}f(X)$ by

\[ \mathbb{E}f(X) = \int_{\mathbb{R}} f(x)\,dP(x) \]

when $f$ is Riemann-Stieltjes integrable with respect to $P$. The utility of this definition is greatly increased if we allow improper Riemann-Stieltjes integrals somewhat along the lines of Definition 9.2.14.

Definition 9.4.13. Let $G$ be as throughout this section. If $f : \mathbb{R} \to \mathbb{R}$, and $R, S > 0$ we define $f_{RS} : \mathbb{R} \to \mathbb{R}$ by

\[ f_{RS}(t) = \begin{cases} f(t) & \text{if } R \geq f(t) \geq -S, \\ -S & \text{if } -S > f(t), \\ R & \text{if } f(t) > R. \end{cases} \]

If $f_{RS}$ is Riemann-Stieltjes integrable with respect to $G$ for all $R, S > 0$, and we can find an $L$ such that, given $\epsilon > 0$, we can find an $R_0(\epsilon) > 0$ such that

\[ \left| \int_{\mathbb{R}} f_{RS}(x)\,dG(x) - L \right| < \epsilon \]

for all $R, S > R_0(\epsilon)$, then we say that $f$ is Riemann-Stieltjes integrable with respect to $G$ with Riemann-Stieltjes integral

\[ \int_{\mathbb{R}} f(x)\,dG(x) = L. \]

(As before, we add a warning that care must be exercised if $\int_{\mathbb{R}} |f(x)|\,dG(x)$ fails to converge.)

Lemma 9.4.14. (Tchebychev’s inequality.) If $P$ is associated with a real-valued random variable $X$ and $\mathbb{E}X^2$ exists then

\[ \Pr\{X > a \text{ or } -a \geq X\} \leq \frac{\mathbb{E}X^2}{a^2}. \]

Proof. Observe that

\[ x^2 \geq a^2 I_{\mathbb{R} \setminus (-a, a]}(x) \]

for all $x$ and so

\[ \int_{\mathbb{R}} x^2\,dP(x) \geq \int_{\mathbb{R}} a^2 I_{\mathbb{R} \setminus (-a, a]}(x)\,dP(x). \]

Thus

\[ \int_{\mathbb{R}} x^2\,dP(x) \geq a^2 \int_{\mathbb{R}} I_{\mathbb{R} \setminus (-a, a]}(x)\,dP(x). \]

In other words,

\[ \mathbb{E}X^2 \geq a^2 \Pr\{X \notin (-a, a]\}, \]

which is what we want to prove.
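As a sanity check, the inequality can be verified directly for a random variable taking finitely many values, where $P$ is a step function and the Riemann-Stieltjes integrals reduce to finite sums. The values and probabilities below are hypothetical and chosen only for illustration.

```python
# Hypothetical discrete distribution: P jumps by probs[k] at values[k].
values = [-3.0, -1.0, 0.0, 2.0, 5.0]
probs = [0.1, 0.2, 0.4, 0.2, 0.1]

def second_moment(values, probs):
    """E X^2 as a finite sum over the jumps of P."""
    return sum(p * x * x for x, p in zip(values, probs))

def tail_probability(values, probs, a):
    """Pr{X > a or -a >= X}, that is, Pr{X not in (-a, a]}."""
    return sum(p for x, p in zip(values, probs) if x > a or x <= -a)

ex2 = second_moment(values, probs)
checks = []
for a in (1.0, 2.0, 3.0, 5.0):
    tail = tail_probability(values, probs, a)
    bound = ex2 / a ** 2
    checks.append((a, tail, bound))
    print(a, tail, bound)
```

In each row the tail probability sits below the bound, sometimes by a wide margin; the inequality is crude but completely general.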

Exercise 9.4.15. (i) In the proof of Tchebychev’s inequality we used various simple results on improper Riemann-Stieltjes integrals without proof. Identify these results and prove them.

(ii) If $P(t) = (\frac{\pi}{2} + \tan^{-1} t)/\pi$, show that $\mathbb{E}X^2$ does not exist. Show that this is also the case if we choose $P$ given by

\[ P(t) = \begin{cases} 0 & \text{if } t < 1, \\ 1 - 2^{-n} & \text{if } 2^n \leq t < 2^{n+1},\ n \geq 0 \text{ an integer.} \end{cases} \]

Exercise 9.4.16. (Probabilists call this result ‘Markov’s inequality’. Analysts simply call it a ‘Tchebychev type inequality’.) Suppose $\phi : [0, \infty) \to \mathbb{R}$ is an increasing continuous positive function. If $P$ is associated with a real-valued random variable $X$ and $\mathbb{E}\phi(X)$ exists, show that

\[ \Pr\{X \notin (-a, a]\} \leq \frac{\mathbb{E}\phi(X)}{\phi(a)}. \]

In elementary courses we deal separately with discrete random variables (typically, in our notation, $P$ is constant on each interval $[n, n + 1)$) and continuous random variables⁸ (in our notation, $P$ has continuous derivative, this derivative is the ‘density function’). It is easy to construct mixed examples.

Exercise 9.4.17. The height of water in a river is a random variable $Y$ with $\Pr\{Y \leq y\} = 1 - e^{-y}$ for $y \geq 0$. The height is measured by a gauge which registers $X = \min(Y, 1)$. Find $\Pr\{X \leq x\}$ for all $x$.

Are there real-valued random variables which are not just a simple mix

of discrete and continuous? In Exercise K.225 (which depends on uniform

convergence) we shall show that there are.

The Riemann-Stieltjes formalism can easily be extended to deal with two

random variables X and Y by using a two dimensional Riemann-Stieltjes

integral with respect to a function

\[ P(x, y) = \Pr\{X \leq x,\ Y \leq y\}. \]

⁸ See the previous footnote on the Holy Roman Empire.


In the same way we can deal with $n$ random variables $X_1, X_2, \dots, X_n$. However, we cannot deal with infinite sequences $X_1, X_2, \dots$ of random variables in the same way. Modern probability theory depends on measure theory.

In the series of exercises starting with Exercise K.162 and ending with

Exercise K.168 we see that the Riemann-Stieltjes integral can be generalised

further.

9.5 How long is a piece of string? ♥

The topic of line integrals is dealt with quickly and efficiently in many texts. The object of this section is to show why the texts deal with the matter in the way they do. The reader should not worry too much about the details and reserve such matters as ‘learning definitions’ for when she studies a more efficient text.

The first problem that meets us when we ask for the length of a curve is that it is not clear what a curve is. One natural way of defining a curve is that it is a continuous map $\gamma : [a, b] \to \mathbb{R}^m$. If we do this it is helpful to consider the following examples.

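$\gamma_1 : [0, 1] \to \mathbb{R}^2$ with $\gamma_1(t) = (\cos 2\pi t, \sin 2\pi t)$

$\gamma_2 : [1, 2] \to \mathbb{R}^2$ with $\gamma_2(t) = (\cos 2\pi t, \sin 2\pi t)$

$\gamma_3 : [0, 2] \to \mathbb{R}^2$ with $\gamma_3(t) = (\cos \pi t, \sin \pi t)$

$\gamma_4 : [0, 1] \to \mathbb{R}^2$ with $\gamma_4(t) = (\cos 2\pi t^2, \sin 2\pi t^2)$

$\gamma_5 : [0, 1] \to \mathbb{R}^2$ with $\gamma_5(t) = (\cos 2\pi t, -\sin 2\pi t)$

$\gamma_6 : [0, 1] \to \mathbb{R}^2$ with $\gamma_6(t) = (\cos 4\pi t, \sin 4\pi t)$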

Exercise 9.5.1. Trace out the curves $\gamma_1$ to $\gamma_6$. State in words how the curves $\gamma_1$, $\gamma_4$, $\gamma_5$ and $\gamma_6$ differ.

Exercise 9.5.2. (i) Which of the curves $\gamma_1$ to $\gamma_6$ are equivalent and which are not, under the following definitions?

(a) Two curves $\tau_1 : [a, b] \to \mathbb{R}^2$ and $\tau_2 : [c, d] \to \mathbb{R}^2$ are equivalent if there exist real numbers $A$ and $B$ with $A > 0$ such that $Ac + B = a$, $Ad + B = b$ and $\tau_1(At + B) = \tau_2(t)$ for all $t \in [c, d]$.

(b) Two curves $\tau_1 : [a, b] \to \mathbb{R}^2$ and $\tau_2 : [c, d] \to \mathbb{R}^2$ are equivalent if there exists a strictly increasing continuous surjective function $\theta : [c, d] \to [a, b]$ such that $\tau_1(\theta(t)) = \tau_2(t)$ for all $t \in [c, d]$.

(c) Two curves $\tau_1 : [a, b] \to \mathbb{R}^2$ and $\tau_2 : [c, d] \to \mathbb{R}^2$ are equivalent if there exists a continuous bijective function $\theta : [c, d] \to [a, b]$ such that $\tau_1(\theta(t)) = \tau_2(t)$ for all $t \in [c, d]$.


(d) Two curves $\tau_1 : [a, b] \to \mathbb{R}^2$ and $\tau_2 : [c, d] \to \mathbb{R}^2$ are equivalent if $\tau_1([a, b]) = \tau_2([c, d])$.

(ii) If you know the definition of an equivalence relation verify that conditions (a) to (d) do indeed give equivalence relations.

Naturally we demand that ‘equivalent curves’ (that is, curves which we consider ‘identical’) should have the same length. I think, for example, that a definition which gave different lengths to the curves described by $\gamma_1$ and $\gamma_2$ would be obviously unsatisfactory. However, opinions may differ as to when two curves are ‘equivalent’. At a secondary school level, most people would say that the appropriate notion of equivalence is that given as (d) in Exercise 9.5.2 and thus the curves $\gamma_1$ and $\gamma_6$ should have the same length. Most of the time, most mathematicians⁹ would say that the curves $\gamma_1$ and $\gamma_6$ are not equivalent and that, ‘since $\gamma_6$ is really $\gamma_1$ done twice’, $\gamma_6$ should have twice the length of $\gamma_1$. If the reader is dubious she should replace the phrase ‘length of curve’ by ‘distance traveled along the curve’.

The following chain of ideas leads to a natural definition of length. Suppose $\gamma : [a, b] \to \mathbb{R}^m$ is a curve (in other words $\gamma$ is continuous). As usual, we consider dissections

\[ D = \{t_0, t_1, t_2, \dots, t_n\} \]

with $a = t_0 \leq t_1 \leq t_2 \leq \dots \leq t_n = b$. We write

\[ L(\gamma, D) = \sum_{j=1}^{n} \|\gamma(t_{j-1}) - \gamma(t_j)\|, \]

where $\|\mathbf{a} - \mathbf{b}\|$ is the usual Euclidean distance between $\mathbf{a}$ and $\mathbf{b}$.

Exercise 9.5.3. (i) Explain why $L(\gamma, D)$ may be considered as the ‘length of the approximating curve obtained by taking straight line segments joining each $\gamma(t_{j-1})$ to $\gamma(t_j)$’.

(ii) Show that, if $D_1$ and $D_2$ are dissections with $D_1 \subseteq D_2$,

\[ L(\gamma, D_2) \geq L(\gamma, D_1). \]

Deduce that, if $D_3$ and $D_4$ are dissections, then

\[ L(\gamma, D_3 \cup D_4) \geq \max(L(\gamma, D_3), L(\gamma, D_4)). \]

The two parts of Exercise 9.5.3 suggest the following definition.

⁹ But not all mathematicians and not all the time. One very important definition of length associated with the name Hausdorff agrees with the school level view.


Definition 9.5.4. We say that a curve $\gamma : [a, b] \to \mathbb{R}^m$ is rectifiable if there exists a $K$ such that $L(\gamma, D) \leq K$ for all dissections $D$. If a curve is rectifiable, we write

\[ \operatorname{length}(\gamma) = \sup_{D} L(\gamma, D), \]

the supremum being taken over all dissections of [a, b].
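The polygonal sums $L(\gamma, D)$ are easy to compute, and doing so for $\gamma_1$ and $\gamma_6$ makes the ‘done twice’ remark concrete. The sketch below is only illustrative: the function names are hypothetical and uniform dissections are used purely for convenience (they are nested when $n$ is repeatedly multiplied by 4, so the sums should increase as in Exercise 9.5.3 (ii)).

```python
import math

def polygonal_length(gamma, dissection):
    """L(gamma, D): total length of the chords joining successive points gamma(t_j)."""
    points = [gamma(t) for t in dissection]
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def uniform_dissection(a, b, n):
    """The dissection a = t_0 < t_1 < ... < t_n = b with equal spacing."""
    return [a + (b - a) * j / n for j in range(n + 1)]

def gamma1(t):
    return (math.cos(2 * math.pi * t), math.sin(2 * math.pi * t))

def gamma6(t):
    return (math.cos(4 * math.pi * t), math.sin(4 * math.pi * t))

sizes = (4, 16, 64, 256, 1024)
lengths1 = [polygonal_length(gamma1, uniform_dissection(0.0, 1.0, n)) for n in sizes]
lengths6 = [polygonal_length(gamma6, uniform_dissection(0.0, 1.0, n)) for n in sizes]
for n, l1, l6 in zip(sizes, lengths1, lengths6):
    print(n, l1, l6)
```

The sums for $\gamma_1$ increase towards $2\pi$ and those for $\gamma_6$ towards $4\pi$, in line with the view that $\gamma_6$ has twice the length of $\gamma_1$.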

Not all curves are rectifiable.

Exercise 9.5.5. (i) Let $f : [0, 1] \to \mathbb{R}$ be the function given by the conditions $f(0) = 0$, $f$ is linear on $[3 \cdot 2^{-n-2}, 2^{-n}]$ with $f(3 \cdot 2^{-n-2}) = 0$ and $f(2^{-n}) = (n + 1)^{-1}$, $f$ is linear on $[2^{-n-1}, 3 \cdot 2^{-n-2}]$ with $f(3 \cdot 2^{-n-2}) = 0$ and $f(2^{-n-1}) = (n + 2)^{-1}$ [$n \geq 0$].

Sketch the graph of $f$ and check that $f$ is continuous. Show that the curve $\gamma : [0, 1] \to \mathbb{R}^2$ given by $\gamma(t) = (t, f(t))$ is not rectifiable.

(ii) Let $g : [-1, 1] \to \mathbb{R}$ be the function given by the conditions $g(0) = 0$, $g(t) = t^2 \sin |t|^{\alpha}$ for $t \neq 0$, where $\alpha$ is real. Show that $g$ is differentiable everywhere, but that, for an appropriate choice of $\alpha$, the curve $\tau : [-1, 1] \to \mathbb{R}^2$ given by $\tau(t) = (t, g(t))$ is not rectifiable.

Exercise 9.5.6. (i) By using the intermediate value theorem, show that a continuous bijective function $\theta : [c, d] \to [a, b]$ is either strictly increasing or strictly decreasing.

(ii) Suppose that $\gamma : [a, b] \to \mathbb{R}^m$ is a rectifiable curve and $\theta : [c, d] \to [a, b]$ is a continuous bijection. Show that $\gamma \circ \theta$ (where $\circ$ denotes composition) is a rectifiable curve and

\[ \operatorname{length}(\gamma \circ \theta) = \operatorname{length}(\gamma). \]

(iii) Let $\tau : [-1, 1] \to \mathbb{R}^2$ be given by $\tau(t) = (\sin \pi t, 0)$. Show that $\operatorname{length}(\tau) = 4$. Comment briefly.

The next exercise is a fairly obvious but very useful observation.

Exercise 9.5.7. Suppose that $\gamma : [a, b] \to \mathbb{R}^m$ is rectifiable. Show that, if $a \leq t \leq b$, then the restriction $\gamma|_{[a, t]} : [a, t] \to \mathbb{R}^m$ is rectifiable. If we write

\[ l_\gamma(t) = \operatorname{length}(\gamma|_{[a, t]}), \]

show that $l_\gamma : [a, b] \to \mathbb{R}$ is an increasing function with $l_\gamma(a) = 0$.

With a little extra effort we can say rather more about $l_\gamma$.


Exercise 9.5.8. We use the hypotheses and notation of Exercise 9.5.7.

(i) Suppose that $\gamma$ has length $L$ and that

\[ D = \{t_0, t_1, t_2, \dots, t_n\} \]

with $a = t_0 \leq t_1 \leq t_2 \leq \dots \leq t_n = b$ is a dissection such that
