
as $(h^2 + k^2)^{1/2} \to 0$. Setting $k = h$, we get

\[ 2^{1/2} = \frac{h + h}{(h^2 + h^2)^{1/2}} = \frac{f(h, h) + h}{(h^2 + h^2)^{1/2}} \to 0 \]

as $h \to 0$, which is absurd. Thus $f$ is not differentiable at $(0, 0)$.

(We give a stronger result in Exercise C.8 and a weaker but slightly easier

result in Exercise 7.3.16.)

Exercise 7.3.15. Write down the details behind the first sentence of our
proof of Example 7.3.14. You will probably wish to quote Lemma 6.2.11 and
Exercise 6.2.17.

Exercise 7.3.16. If

\[ f(x, y) = \frac{xy}{(x^2 + y^2)^{1/2}} \quad \text{for } (x, y) \neq (0, 0), \qquad f(0, 0) = 0, \]

show that $f$ is differentiable except at $(0, 0)$, is continuous at $(0, 0)$ and has
partial derivatives $f_{,1}(0, 0)$ and $f_{,2}(0, 0)$ at $(0, 0)$ but has directional
derivatives in no other directions at $(0, 0)$. Discuss your results briefly using the
ideas of Exercise 7.3.12.

A further exercise on the ideas just used is given as Exercise K.108.

Emboldened by our success, we could well guess immediately a suitable

function to look for in the context of Theorem 7.2.6.

Exercise 7.3.17. Suppose that $f : \mathbb{R}^2 \to \mathbb{R}$ is given by $f(0, 0) = 0$ and

\[ f(r \cos\theta, r \sin\theta) = r^2 \sin 4\theta, \]

for $r > 0$. Show that

\[ f(x, y) = \frac{4xy(x^2 - y^2)}{x^2 + y^2} \]

for $(x, y) \neq 0$. Sketch the contour lines $f(x, y) = h, 2^2 h, 3^2 h, \ldots$ and
compare the result with Figure 7.2.

Exercise 7.3.18. Suppose that

\[ f(x, y) = \frac{xy(x^2 - y^2)}{x^2 + y^2} \quad \text{for } (x, y) \neq (0, 0), \qquad f(0, 0) = 0. \]


Please send corrections however trivial to twk@dpmms.cam.ac.uk

(i) Compute $f_{,1}(0, y)$, for $y \neq 0$, by using standard results of the calculus.
(ii) Compute $f_{,1}(0, 0)$ directly from the definition of the derivative.
(iii) Find $f_{,2}(x, 0)$ for all $x$.
(iv) Compute $f_{,12}(0, 0)$ and $f_{,21}(0, 0)$.
(v) Show that $f$ has first and second partial derivatives everywhere but
$f_{,12}(0, 0) \neq f_{,21}(0, 0)$.
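Readers who like to see numbers before they prove anything may find the following rough numerical check helpful. It is only a sketch: the function names and step sizes are our own choices, and central differences are used as a stand-in for the limits in the definition. It estimates the derivative of $y \mapsto f_{,1}(0, y)$ and of $x \mapsto f_{,2}(x, 0)$ at the origin, without committing to which of these is called $f_{,12}$ and which $f_{,21}$.

```python
def f(x, y):
    """The function of Exercise 7.3.18."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y * (x * x - y * y) / (x * x + y * y)


def d1(x, y, h=1e-6):
    """Central-difference estimate of the partial derivative in the first variable."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)


def d2(x, y, h=1e-6):
    """Central-difference estimate of the partial derivative in the second variable."""
    return (f(x, y + h) - f(x, y - h)) / (2 * h)


def slope_of_d1_along_y_axis(k=1e-3):
    """Estimate d/dy of f_{,1}(0, y) at y = 0; the inner step h is much smaller than k."""
    return (d1(0.0, k) - d1(0.0, -k)) / (2 * k)


def slope_of_d2_along_x_axis(k=1e-3):
    """Estimate d/dx of f_{,2}(x, 0) at x = 0."""
    return (d2(k, 0.0) - d2(-k, 0.0)) / (2 * k)


print(slope_of_d1_along_y_axis())  # close to -1
print(slope_of_d2_along_x_axis())  # close to +1
```

The two estimates have opposite signs, which is what parts (iv) and (v) lead us to expect. A numerical experiment of this kind is, of course, no substitute for the computation from the definition asked for in the exercise.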

It is profoundly unfortunate that Example 7.3.14 and Exercise 7.3.18 seem
to act on some examiners like catnip on a cat. Multi-dimensional calculus
leads towards differential geometry and infinite dimensional calculus
(functional analysis). Both subjects depend on understanding objects which we
know to be well behaved but which our limited geometric intuition makes it
hard for us to comprehend. Counterexamples, such as the ones just produced,
which depend on functions having some precise degree of differentiability are
simply irrelevant.

At the beginning of this section we used a first order local Taylor expansion
and results on linear maps to establish the behaviour of a well behaved
function $f$ near a point $x$ where $Df(x) \neq 0$. We then used a second order
local Taylor expansion and results on bilinear maps to establish the behaviour
of a well behaved function $f$ near a point $x$ where $Df(x) = 0$ on condition
that $D^2 f(x)$ was non-singular. Why should we stop here?

It is not the case that we can restrict ourselves to functions $f$ for which
$D^2 f(x)$ is non-singular at all points.

Exercise 7.3.19. (i) Let $A(t)$ be a $3 \times 3$ real symmetric matrix with
$A(t) = (a_{ij}(t))$. Suppose that the entries $a_{ij} : \mathbb{R} \to \mathbb{R}$ are continuous. Explain why
$\det A : \mathbb{R} \to \mathbb{R}$ is continuous. By using an expression for $\det A$ in terms
of the eigenvalues of $A$, show that, if $A(0)$ is positive definite and $A(1)$ is
negative definite, then there must exist a $c \in (0, 1)$ with $A(c)$ singular.
(ii) Let $m$ be an odd positive integer, $U$ an open subset of $\mathbb{R}^m$ and
$\gamma : [0, 1] \to U$ a continuous map. Suppose that $f : U \to \mathbb{R}$ has continuous
second order partial derivatives on $U$, that $f$ attains a local minimum at $\gamma(0)$
and a local maximum at $\gamma(1)$. Show that there exists a $c \in [0, 1]$ such that
$D^2 f(\gamma(c))$ is singular.

There is nothing special about the choice of m odd in Exercise 7.3.19.

We do the case m = 2 in Exercise K.106 and ambitious readers may wish to

attack the general case themselves (however, it is probably only instructive if

you make the argument watertight). Exercise K.43 gives a slightly stronger

result when m = 1.

However, it is only when $Df(x)$ vanishes and $D^2 f(x)$ is singular at the
same point $x$ that we have problems and we can readily convince ourselves
(note this is not the same as proving) that this is rather unusual.

164 A COMPANION TO ANALYSIS

Exercise 7.3.20. Let $f : \mathbb{R} \to \mathbb{R}$ be given by $f(x) = ax^3 + bx^2 + cx + d$
with $a$, $b$, $c$, $d$ real. Show that there is a $y$ with $f'(y) = f''(y) = 0$ if and
only if one of the following two conditions holds: $a \neq 0$ and $b^2 = 3ac$, or
$a = b = c = 0$.
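The criterion of Exercise 7.3.20 is easy to test by brute force. The sketch below (our own illustration, with hypothetical function names) decides directly whether $f'$ and $f''$ have a common zero, and compares the answer with the stated condition over a small grid of integer coefficients. Exact rational arithmetic is used so that no rounding question arises; the coefficient $d$ plays no part.

```python
from fractions import Fraction


def has_common_zero(a, b, c):
    """True iff some y satisfies f'(y) = f''(y) = 0 for f(x) = a x^3 + b x^2 + c x + d."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    if a != 0:
        y = -b / (3 * a)  # the unique zero of f''(y) = 6 a y + 2 b
        return 3 * a * y * y + 2 * b * y + c == 0
    if b != 0:
        return False  # f'' is the non-zero constant 2 b
    return c == 0  # f'' vanishes identically and f' is the constant c


def criterion(a, b, c):
    """The condition stated in the exercise."""
    return (a != 0 and b * b == 3 * a * c) or (a == 0 and b == 0 and c == 0)


coefficients = range(-4, 5)
agree = all(
    has_common_zero(a, b, c) == criterion(a, b, c)
    for a in coefficients
    for b in coefficients
    for c in coefficients
)
print(agree)  # True
```

Of the $9^3 = 729$ triples examined only a handful satisfy the criterion, which fits the remark that the bad case is rather unusual. Agreement on a finite grid is, naturally, evidence and not proof.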

Faced with this kind of situation mathematicians tend to use the word
generic and say ‘in the generic case, the Hessian is non-singular at the critical
points’. This is a useful way of thinking but we must remember that:-
(1) If we leave the word generic undefined, any sentence containing the
word generic is, strictly speaking, meaningless.
(2) In any case, if we look at any particular function, it ceases to be
generic. (A generic function is one without any particular properties. Any
particular function that we look at has the particular property that we are
interested in it.)
(3) The generic case may be a lot worse than we expect. Most mathematicians
would agree that the generic function $f : \mathbb{R} \to \mathbb{R}$ is unbounded on every
interval $(a, b)$ with $a < b$, that the generic bounded function $f : \mathbb{R} \to \mathbb{R}$ is
discontinuous at every point and that the generic continuous function $f : \mathbb{R} \to \mathbb{R}$
is nowhere differentiable. We should have said something more precise like
‘the generic 3 times differentiable function $f : \mathbb{R}^n \to \mathbb{R}$ has a non-singular
Hessian at its critical points’.

So far in this section we have looked at stationary points of $f$ by studying
the local behaviour of the function. In this we have remained true to our
17th and 18th century predecessors. In a paper entitled On Hills and Dales,
Maxwell (see footnote 7) raises our eyes from the local and shows us the
prospect of a global theory.

Plausible statement 7.3.21. (Hill and dale theorem.) Suppose the
surface of the moon has a finite number $S$ of summits, $B$ of bottoms and
$P$ of passes (all heights being measured from the moon’s centre). Then
$S + B - P = 2$.

Plausible Proof. By digging out pits and piling up soil we may ensure that
all the bottoms are at the same height, that all the passes are at different
heights, but all higher than the bottoms, and that all the summits are at the
same height which is greater than the height of any pass. Now suppose that
it begins to rain and that the water level rises steadily (and that the level is
the same for each body of water). We write $L(h)$ for the number of lakes (a
lake is the largest body of water that a swimmer can cover without going on
to dry land), $I(h)$ for the number of islands (an island is the largest body of
dry land that a walker can cover without going into the water) and $P(h)$ for
the number of passes visible when the height of the water is $h$.

[Footnote 7: Maxwell notes that he was anticipated by Cayley.]

Figure 7.4: A pass vanishes under water

When the rain has just begun and the height $h_0$, say, of the water is
higher than the bottoms, but lower than the lowest pass, we have

\[ L(h_0) = B, \quad I(h_0) = 1, \quad P(h_0) = P. \tag{1} \]

(Observe that there is a single body of dry land that a walker can get to
without going into the water so $I(h_0) = 1$ even if the man in the street would
object to calling the surface of the moon with a few puddles an island.) Every
time the water rises just high enough to drown a pass, then either
(a) two arms of a lake join so an island appears, a pass vanishes and the
number of lakes remains the same, or
(b) two lakes come together so the number of lakes diminishes by one, a
pass vanishes and the number of islands remains the same.
We illustrate this in Figure 7.4. In either case, we see that

\[ I(h) - L(h) + P(h) \text{ remains constant} \]

and so, by equation (1),

\[ I(h) - L(h) + P(h) = I(h_0) - L(h_0) + P(h_0) = 1 - B + P. \tag{2} \]

When the water is at a height $h_1$, higher than the highest pass but lower
than the summits, we have

\[ L(h_1) = 1, \quad I(h_1) = S, \quad P(h_1) = 0. \tag{3} \]

(Though the man in the street would now object to us calling something a
lake when it is obviously an ocean with $S$ isolated islands.) Using equations
(2) and (3), we now have

\[ 1 - B + P = I(h_1) - L(h_1) + P(h_1) = S - 1 \]

and so $B + S - P = 2$.
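The bookkeeping in the plausible proof can be mimicked in a few lines. The sketch below is purely illustrative: it models no geometry at all, only the counts, and the function name and the encoding of events as the letters "a" and "b" (for the two cases above) are our own. It checks at each step that $I - L + P$ is unchanged.

```python
def flood(bottoms, events):
    """Track (lakes, islands, passes) as each pass in turn is drowned.

    events is a sequence of "a" (two arms of one lake join, creating an island)
    and "b" (two lakes merge); one pass vanishes at each event.
    """
    lakes, islands, passes = bottoms, 1, len(events)
    invariant = islands - lakes + passes
    for event in events:
        if event == "a":
            islands += 1
        elif event == "b":
            lakes -= 1
        else:
            raise ValueError("event must be 'a' or 'b'")
        passes -= 1
        # The quantity I - L + P is unchanged by either kind of event.
        assert islands - lakes + passes == invariant
    return lakes, islands, passes


B = 3
events = list("ababa")  # five passes; exactly B - 1 merges, so one ocean remains
P = len(events)
lakes, islands, passes = flood(B, events)
S = islands  # once every pass is drowned, each island carries one summit
print(lakes, S, passes)  # 1 4 0
print(S + B - P)         # 2
```

Note that we had to choose a sequence with exactly $B - 1$ events of type (b) in order to finish with a single ocean. That the moon forces this on us is part of what the plausible argument takes for granted.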


Figure 7.5: One- and two-holed doughnuts

Exercise 7.3.22. State and provide plausible arguments for plausible results

corresponding to Plausible Statement 7.3.21 when the moon is in the shape

of a one-holed doughnut, two-holed doughnut and an n-holed doughnut (see

Figure 7.5).

Notice that local information about the nature of a function at special
points provides global ‘topological’ information about the number of holes in
a doughnut.

If you know Euler’s theorem (memory jogger ‘V-E+F=2’), can you connect
it with this discussion?

Exercise 7.3.23. The function $f : \mathbb{R}^2 \to \mathbb{R}$ is well behaved (say 3 times
differentiable). We have $f(x, y) = 0$ for $x^2 + y^2 = 1$ and $f(x, y) > 0$ for
$x^2 + y^2 < 1$. State and provide a plausible argument for a plausible result
concerning the number of maxima, minima and saddle points $(x, y)$ for $f$
with $x^2 + y^2 < 1$.

I find the plausible argument just used very convincing but it is not clear
how we would go about converting it into an argument from first principles
(in effect, from the fundamental axiom of analysis). Here are some of the
problems we must face.
(1) Do contour lines actually exist (that is do the points $(x, y)$ with
$f(x, y) = h$ actually lie on nice curves)? (See footnote 8.) We shall answer this
question locally by the implicit function theorem (Theorem 13.2.4) and our discussion
of the solution of differential equations in Section 12.3 will shed some light
on the global problem.
(2) ‘The largest body of water that a swimmer can cover without going
on to dry land’ is a vivid but not a mathematical expression. In later work
this problem is resolved by giving a formal definition of a connected set.
(3) Implicit in our argument is the idea that a loop divides a sphere
into two parts. A result called the Jordan curve theorem gives the formal
statement of this idea but the proof turns out to be unexpectedly hard.

[Footnote 8: The reader will note that though we have used contour lines as a heuristic tool we have
not used them in proofs. Note that, in specific cases, we do not need a general theorem to
tell us that contour lines exist. For example, the contour lines of $f(x, y) = a^{-2} x^2 + b^{-2} y^2$
are given parametrically by $(x, y) = (a h^{1/2} \cos\theta, b h^{1/2} \sin\theta)$ for $h \ge 0$.]

Another, less important, problem is to show that the hypothesis that
there are only a ‘finite number $S$ of summits, $B$ of bottoms and $P$ of passes’
applies to an interesting variety of cases. It is certainly not the case that a
function $f : \mathbb{R} \to \mathbb{R}$ will always have only a finite number of maxima in a
closed bounded interval. In the same way, it is not true that a moon need
have only a finite number of summits.

Exercise 7.3.24. Reread Example 7.1.5. Define $f : \mathbb{R} \to \mathbb{R}$ by

\[ f(x) = (\cos(1/x) - 1) \exp(-1/x^2) \quad \text{if } x \neq 0, \qquad f(0) = 0. \]

Show that $f$ is infinitely differentiable everywhere and that $f$ has an infinite
number of distinct strict local maxima in the interval $[-1, 1]$.

(Exercise K.42 belongs to the same circle of ideas.)

The answer, once again, is to develop a suitable notion of genericity but

we shall not do so here.

Some will say that there is no need to answer these questions since
the plausible argument which establishes Plausible Statement 7.3.21 is in
some sense ‘obviously correct’. I would reply that the reason for attacking
these questions is their intrinsic interest. Plausible Statement 7.3.21 and the
accompanying discussion are the occasion for us to ask these questions, not
the reason for trying to answer them. I would add that we cannot claim to
understand Maxwell’s result fully unless we can see either how it generalises
to higher dimensions or why it does not.

Students often feel that multidimensional calculus is just a question of
generalising results from one dimension to many. Maxwell’s result shows that
the change from one to many dimensions introduces genuinely new phenomena,
whose existence cannot be guessed from a one dimensional perspective.

Chapter 8

The Riemann integral

8.1 Where is the problem ?

Everybody knows what area is, but then everybody knows what honey tastes
like. But does honey taste the same to you as it does to me? Perhaps the
question is unanswerable but, for many practical purposes, it is sufficient
that we agree on what we call honey. In the same way, it is important that,
when two mathematicians talk about area, they should agree on the answers
to the following questions:-

(1) Which sets E actually have area?

(2) When a set E has area, what is that area?

One of the discoveries of 20th century mathematics is that decisions on (1)

and (2) are linked in rather subtle ways to the question:-

(3) What properties should area have?

As an indication of the ideas involved, consider the following desirable

properties for area.

(a) Every bounded set $E$ in $\mathbb{R}^2$ has an area $|E|$ with $|E| \ge 0$.
(b) Suppose that $E$ is a bounded set in $\mathbb{R}^2$. If $E$ is congruent to $F$ (that
is $E$ can be obtained from $F$ by translation and rotation), then $|E| = |F|$.
(c) Any square $E$ of side $a$ has area $|E| = a^2$.
(d) If $E_1, E_2, \ldots$ are disjoint bounded sets in $\mathbb{R}^2$ whose union
$F = \bigcup_{j=1}^{\infty} E_j$ is also bounded, then $|F| = \sum_{j=1}^{\infty} |E_j|$
(so ‘the whole is equal to the sum of its parts’).

Exercise 8.1.1. Suppose that conditions (a) to (d) all hold.
(i) Let $A$ be a bounded set in $\mathbb{R}^2$ and $B \subseteq A$. By writing $A = B \cup (A \setminus B)$
and using condition (d) together with other conditions, show that $|A| \ge |B|$.
(ii) By using (i) and condition (c), show that, if $A$ is a non-empty bounded
open set in $\mathbb{R}^2$, then $|A| > 0$.


We now show that assuming all of conditions (a) to (d) leads to a
contradiction. We start with an easy remark.

Exercise 8.1.2. If $0 \le x, y < 1$, write $x \sim y$ whenever $x - y \in \mathbb{Q}$. Show
that if $x, y, z \in [0, 1)$ we have
(i) $x \sim x$,
(ii) $x \sim y$ implies $y \sim x$,
(iii) $x \sim y$ and $y \sim z$ together imply $x \sim z$.
(In other words, $\sim$ is an equivalence relation.)
Write

\[ [x] = \{ y \in [0, 1) : y \sim x \}. \]

(In other words, write $[x]$ for the equivalence class of $x$.) By quoting the
appropriate theorem or direct proof, show that
(iv) $\bigcup_{x \in [0,1)} [x] = [0, 1)$,
(v) if $x, y \in [0, 1)$, then either $[x] = [y]$ or $[x] \cap [y] = \emptyset$.

We now consider a set A which contains exactly one element from each

equivalence class.

Exercise 8.1.3. (This is easy.) Show that if $t \in [0, 1)$ then the equation

\[ t \equiv a + q \mod 1 \]

has exactly one solution with $a \in A$, $q$ rational and $q \in [0, 1)$.
[Here $t \equiv x + q \mod 1$ means $t - x - q \in \mathbb{Z}$.]

We are now in a position to produce our example. It will be easiest to
work in $\mathbb{C}$ identified with $\mathbb{R}^2$ in the usual way and to define

\[ E = \{ r \exp 2\pi i a : 1 > r > 0, \ a \in A \}. \]

Since $\mathbb{Q}$ is countable, it follows that its subset $\mathbb{Q} \cap [0, 1)$ is countable and we
can write

\[ \mathbb{Q} \cap [0, 1) = \{ q_j : j \ge 1 \} \]

with $q_1, q_2, \ldots$ all distinct. Set

\[ E_j = \{ r \exp 2\pi i (a + q_j) : 1 > r > 0, \ a \in A \}. \]


Exercise 8.1.4. Suppose that conditions (a) to (d) all hold.
(i) Describe the geometric relation of $E$ and $E_j$. Deduce that $|E| = |E_j|$.
(ii) Use Exercise 8.1.3 to show that $E_j \cap E_k = \emptyset$ if $j \neq k$.
(iii) Use Exercise 8.1.3 to show that

\[ \bigcup_{j=1}^{\infty} E_j = U \]

where $U = \{ z : 0 < |z| < 1 \}$.
(iv) Deduce that

\[ \sum_{j=1}^{\infty} |E_j| = |U|. \]

Show from Exercise 8.1.1 (ii) that $0 < |U|$.
(v) Show that (i) and (iv) lead to a contradiction if $|E| = 0$ and if $|E| > 0$.
Thus (i) and (iv) lead to a contradiction whatever we assume. It follows that
conditions (a) to (d) cannot all hold simultaneously.

Exercise 8.1.5. Define $E$ and $E_j$ as subsets of $\mathbb{R}^2$ without using complex
numbers.

The example just given is due to Vitali. It might be hoped that the
problems raised by Vitali’s example are due to the fact that condition (d)
involves infinite sums. This hope is dashed by the following theorem of
Banach and Tarski.

Theorem 8.1.6. The unit ball in $\mathbb{R}^3$ can be decomposed into a finite number
of pieces which may be reassembled, using only translation and rotation, to
form 2 disjoint copies of the unit ball.

Exercise 8.1.7. Use Theorem 8.1.6 to show that the following four conditions
are not mutually consistent.
(a) Every bounded set $E$ in $\mathbb{R}^3$ has a volume $|E|$ with $|E| \ge 0$.
(b) Suppose that $E$ is a bounded set in $\mathbb{R}^3$. If $E$ is congruent to $F$ (that
is $E$ can be obtained from $F$ by translation and rotation), then $|E| = |F|$.
(c) Any cube $E$ of side $a$ has volume $|E| = a^3$.
(d) If $E_1$ and $E_2$ are disjoint bounded sets in $\mathbb{R}^3$, then
$|E_1 \cup E_2| = |E_1| + |E_2|$.

The proof of Theorem 8.1.6, which is a lineal descendant of Vitali’s example,
is too long to be given here. It is beautifully and simply explained in
a book [46] devoted entirely to ideas generated by the result of Banach and
Tarski (see footnote 1).

The examples of Vitali and Banach and Tarski show that if we want a
well behaved notion of area we will have to say that only certain sets have
area. Since the notion of an integral is closely linked to that of area (‘the
integral is the area under the curve’), this means that we must accept that
only certain functions have integrals. It also means that we must make
sure that our definition does not allow paradoxes of the type discussed here.

8.2 Riemann integration

In this section we introduce a notion of the integral due to Riemann. For
the moment we only attempt to define our integral for bounded functions on
bounded intervals.

Let $f : [a, b] \to \mathbb{R}$ be a function such that there exists a $K$ with $|f(x)| \le K$
for all $x \in [a, b]$. [To see the connection with ‘the area under the curve’ it is
helpful to suppose initially that $0 \le f(x) \le K$. However, all the definitions
and proofs work more generally for $-K \le f(x) \le K$. The point is discussed
in Exercise K.114.] A dissection (also called a partition) $D$ of $[a, b]$ is a finite
subset of $[a, b]$ containing the end points $a$ and $b$. By convention, we write

\[ D = \{x_0, x_1, \ldots, x_n\} \quad \text{with } a = x_0 \le x_1 \le x_2 \le \cdots \le x_n = b. \]

We define the upper sum and lower sum associated with $D$ by

\[ S(f, D) = \sum_{j=1}^{n} (x_j - x_{j-1}) \sup_{x \in [x_{j-1}, x_j]} f(x), \]

\[ s(f, D) = \sum_{j=1}^{n} (x_j - x_{j-1}) \inf_{x \in [x_{j-1}, x_j]} f(x). \]

[Observe that, if the integral $\int_a^b f(t)\,dt$ exists, then the upper sum ought to
provide an upper bound and the lower sum a lower bound for that integral.]
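To make the definitions concrete, here is a small sketch that computes upper and lower sums exactly for an increasing function, where the supremum and infimum on each subinterval are simply the values at the right and left end points. The helper names are our own, and the restriction to increasing functions is an assumption made so that the supremum can be computed at all; for a general bounded function no finite computation will do.

```python
from fractions import Fraction


def upper_sum_increasing(f, dissection):
    """S(f, D) for increasing f: the sup over [x_{j-1}, x_j] is f(x_j)."""
    return sum((x1 - x0) * f(x1) for x0, x1 in zip(dissection, dissection[1:]))


def lower_sum_increasing(f, dissection):
    """s(f, D) for increasing f: the inf over [x_{j-1}, x_j] is f(x_{j-1})."""
    return sum((x1 - x0) * f(x0) for x0, x1 in zip(dissection, dissection[1:]))


def square(x):
    return x * x  # increasing on [0, 1]


D = [Fraction(0), Fraction(1, 2), Fraction(1)]
S = upper_sum_increasing(square, D)
s = lower_sum_increasing(square, D)
print(S, s)  # 5/8 1/8
```

The value $1/3$ that elementary calculus assigns to $\int_0^1 x^2\,dx$ does indeed lie between $1/8$ and $5/8$.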

Exercise 8.2.1. (i) Suppose that $a \le c \le b$. If $D = \{a, b\}$ and
$D' = \{a, c, b\}$, show that

\[ S(f, D) \ge S(f, D') \ge s(f, D') \ge s(f, D). \]

(ii) Let $c \neq a, b$. Show by examples that, in (i), we can have either
$S(f, D) = S(f, D')$ or $S(f, D) > S(f, D')$.
(iii) Suppose that $a \le c \le b$ and $D$ is a dissection. Show that, if
$D' = D \cup \{c\}$, then

\[ S(f, D) \ge S(f, D') \ge s(f, D') \ge s(f, D). \]

(iv) Suppose that $D$ and $D'$ are dissections with $D' \supseteq D$. Show, using
(iii), or otherwise, that

\[ S(f, D) \ge S(f, D') \ge s(f, D') \ge s(f, D). \]

[Footnote 1: In more advanced work it is observed that our discussion depends on a principle
called the ‘axiom of choice’. It is legitimate to doubt this principle. However, anyone who
doubts the axiom of choice but believes that every set has volume resembles someone who
disbelieves in Father Christmas but believes in flying reindeer.]

The result of Exercise 8.2.1 (iv) is so easy that it hardly requires proof.

None the less it is so important that we restate it as a lemma.

Lemma 8.2.2. If $D$ and $D'$ are dissections with $D' \supseteq D$ then

\[ S(f, D) \ge S(f, D') \ge s(f, D') \ge s(f, D). \]

The next lemma is again hardly more than an observation but it is the

key to the proper treatment of the integral.

Lemma 8.2.3 (Key integration property). If $f : [a, b] \to \mathbb{R}$ is bounded
and $D_1$ and $D_2$ are two dissections, then

\[ S(f, D_1) \ge S(f, D_1 \cup D_2) \ge s(f, D_1 \cup D_2) \ge s(f, D_2). \]

The inequalities tell us that, whatever dissection you pick and whatever
dissection I pick, your lower sum cannot exceed my upper sum. There is no
way we can put a quart into a pint pot (see footnote 2) and the Banach-Tarski
phenomenon is avoided.
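The chain of inequalities in Lemma 8.2.3 can be watched in action on a simple case. The following sketch (with our own helper names, and again restricted to an increasing function so that suprema and infima are end point values) takes two unrelated dissections of $[0, 1]$, forms their union and evaluates the four sums exactly.

```python
from fractions import Fraction


def upper_sum(f, dissection):
    """S(f, D), assuming f is increasing so that each sup is f(x_j)."""
    return sum((x1 - x0) * f(x1) for x0, x1 in zip(dissection, dissection[1:]))


def lower_sum(f, dissection):
    """s(f, D), assuming f is increasing so that each inf is f(x_{j-1})."""
    return sum((x1 - x0) * f(x0) for x0, x1 in zip(dissection, dissection[1:]))


def square(x):
    return x * x


D1 = [Fraction(0), Fraction(1, 3), Fraction(1)]
D2 = [Fraction(0), Fraction(1, 2), Fraction(1)]
D = sorted(set(D1) | set(D2))  # the common refinement, D1 union D2

chain = [upper_sum(square, D1), upper_sum(square, D),
         lower_sum(square, D), lower_sum(square, D2)]
print(chain)
print(all(x >= y for x, y in zip(chain, chain[1:])))  # True
```

Neither of the two dissections contains the other, so Lemma 8.2.2 does not apply to them directly; it is the passage through the union that links my upper sum to your lower sum.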

Since $S(f, D) \ge -(b - a)K$ for all dissections $D$ we can define the upper
integral as $I^*(f) = \inf_D S(f, D)$. We define the lower integral similarly as
$I_*(f) = \sup_D s(f, D)$. The inequalities tell us that these concepts behave
well.

Lemma 8.2.4. If $f : [a, b] \to \mathbb{R}$ is bounded, then $I^*(f) \ge I_*(f)$.

[Observe that, if the integral $\int_a^b f(t)\,dt$ exists, then the upper integral
ought to provide an upper bound and the lower integral a lower bound for
that integral.]

[Footnote 2: Or a litre into a half litre bottle. Any reader tempted to interpret such pictures
literally is directed to part (iv) of Exercise K.171.]

If $I^*(f) = I_*(f)$, we say that $f$ is Riemann integrable and we write

\[ \int_a^b f(x)\,dx = I^*(f). \]

We write $\mathcal{R}[a, b]$ or sometimes just $\mathcal{R}$ for the set of Riemann integrable
functions on $[a, b]$.

Exercise 8.2.5. If $k \in \mathbb{R}$ show that the constant function given by $f(t) = k$
for all $t$ is Riemann integrable and

\[ \int_a^b k\,dx = k(b - a). \]

The following lemma provides a convenient criterion for Riemann integrability.

Lemma 8.2.6. (i) A bounded function $f : [a, b] \to \mathbb{R}$ is Riemann integrable
if and only if, given any $\epsilon > 0$, we can find a dissection $D$ with

\[ S(f, D) - s(f, D) < \epsilon. \]

(ii) A bounded function $f : [a, b] \to \mathbb{R}$ is Riemann integrable with integral
$I$ if and only if, given any $\epsilon > 0$, we can find a dissection $D$ with

\[ S(f, D) - s(f, D) < \epsilon \quad \text{and} \quad |S(f, D) - I| \le \epsilon. \]

Proof. (i) We need to prove necessity and sufficiency. To prove necessity,
suppose that $f$ is Riemann integrable with Riemann integral $I$ (so that
$I = I^*(f) = I_*(f)$). If $\epsilon > 0$ then, by the definition of $I^*(f)$, we can find a
dissection $D_1$ such that

\[ I + \epsilon/2 > S(f, D_1) \ge I. \]

Similarly, by the definition of $I_*(f)$, we can find a dissection $D_2$ such that

\[ I \ge s(f, D_2) > I - \epsilon/2. \]

Setting $D = D_1 \cup D_2$ and using Lemmas 8.2.2 and 8.2.3, we have

\[ I + \epsilon/2 > S(f, D_1) \ge S(f, D) \ge s(f, D) \ge s(f, D_2) > I - \epsilon/2, \]

so $S(f, D) - s(f, D) < \epsilon$ as required.

To prove sufficiency suppose that, given any $\epsilon > 0$, we can find a dissection
$D$ with

\[ S(f, D) - s(f, D) < \epsilon. \]

Using the definition of the upper and lower integrals $I^*(f)$ and $I_*(f)$ together
with the fact that $I^*(f) \ge I_*(f)$ (a consequence of our key Lemma 8.2.3), we
already know that

\[ S(f, D) \ge I^*(f) \ge I_*(f) \ge s(f, D), \]

so we may conclude that $\epsilon \ge I^*(f) - I_*(f) \ge 0$. Since $\epsilon$ is arbitrary, we have
$I^*(f) - I_*(f) = 0$ so $I^*(f) = I_*(f)$ as required.
(ii) Left to the reader.

Exercise 8.2.7. Prove part (ii) of Lemma 8.2.6.

Many students are tempted to use Lemma 8.2.6 (ii) as the definition of
the Riemann integral. The reader should reflect that, without the inequality
$I^*(f) \ge I_*(f)$, it is not even clear that such a definition gives a unique value for $I$. (This
is only the first of a series of nasty problems that arise if we attempt to
develop the theory without first proving $I^*(f) \ge I_*(f)$, so I strongly advise the reader
not to take this path.) We give another equivalent definition of the Riemann
integral in Exercise K.113.

It is reasonably easy to show that the Riemann integral has the properties

which are normally assumed in elementary calculus.

Lemma 8.2.8. If $f, g : [a, b] \to \mathbb{R}$ are Riemann integrable, then so is $f + g$
and

\[ \int_a^b f(x) + g(x)\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx. \]

Proof. Let us write $I(f) = \int_a^b f(x)\,dx$ and $I(g) = \int_a^b g(x)\,dx$. Suppose $\epsilon > 0$
is given. By the definition of the Riemann integral, we can find dissections
$D_1$ and $D_2$ of $[a, b]$ such that

\[ I(f) + \epsilon/4 > S(f, D_1) \ge I(f) \ge s(f, D_1) > I(f) - \epsilon/4 \quad \text{and} \]
\[ I(g) + \epsilon/4 > S(g, D_2) \ge I(g) \ge s(g, D_2) > I(g) - \epsilon/4. \]

If we set $D = D_1 \cup D_2$, then our key inequality and the definition of $I^*(f)$
tell us that

\[ I(f) + \epsilon/4 > S(f, D_1) \ge S(f, D) \ge I(f). \]

Using this and corresponding results, we obtain

\[ I(f) + \epsilon/4 > S(f, D) \ge I(f) \ge s(f, D) > I(f) - \epsilon/4 \quad \text{and} \]
\[ I(g) + \epsilon/4 > S(g, D) \ge I(g) \ge s(g, D) > I(g) - \epsilon/4. \]

Now

\begin{align*}
S(f + g, D) &= \sum_{j=1}^{n} (x_j - x_{j-1}) \sup_{x \in [x_{j-1}, x_j]} (f(x) + g(x)) \\
&\le \sum_{j=1}^{n} (x_j - x_{j-1}) \Bigl( \sup_{x \in [x_{j-1}, x_j]} f(x) + \sup_{x \in [x_{j-1}, x_j]} g(x) \Bigr) \\
&= S(f, D) + S(g, D)
\end{align*}

and similarly $s(f + g, D) \ge s(f, D) + s(g, D)$. Thus, using the final inequalities
of the last paragraph,

\begin{align*}
I(f) + I(g) + \epsilon/2 &> S(f, D) + S(g, D) \ge S(f + g, D) \\
&\ge s(f + g, D) \ge s(f, D) + s(g, D) > I(f) + I(g) - \epsilon/2.
\end{align*}

Thus $S(f + g, D) - s(f + g, D) < \epsilon$ and $|S(f + g, D) - (I(f) + I(g))| < \epsilon$.
By Lemma 8.2.6 (ii), it follows that $f + g$ is Riemann integrable with integral
$I(f) + I(g)$.
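The inequality $S(f + g, D) \le S(f, D) + S(g, D)$ used in the proof can be strict, and a tiny example shows how. In the sketch below (helper names are our own) we take $f(x) = x$ and $g(x) = -x$ on $[0, 1]$, so that $f + g$ is identically zero. All three functions are monotone, which is the assumption that lets us read off each supremum and infimum from the end points of the subinterval.

```python
from fractions import Fraction


def upper_sum_monotone(f, dissection):
    """S(f, D) for monotone f: the sup on each subinterval is at an end point."""
    return sum((x1 - x0) * max(f(x0), f(x1))
               for x0, x1 in zip(dissection, dissection[1:]))


def lower_sum_monotone(f, dissection):
    """s(f, D) for monotone f: the inf on each subinterval is at an end point."""
    return sum((x1 - x0) * min(f(x0), f(x1))
               for x0, x1 in zip(dissection, dissection[1:]))


def f(x):
    return x


def g(x):
    return -x


def f_plus_g(x):
    return f(x) + g(x)  # identically zero


D = [Fraction(0), Fraction(1, 2), Fraction(1)]
upper_gap = upper_sum_monotone(f, D) + upper_sum_monotone(g, D) - upper_sum_monotone(f_plus_g, D)
lower_gap = lower_sum_monotone(f_plus_g, D) - lower_sum_monotone(f, D) - lower_sum_monotone(g, D)
print(upper_gap, lower_gap)  # 1/2 1/2
```

Both gaps are strictly positive here because $f$ is largest where $g$ is smallest. The proof survives because the gaps shrink as the dissection is refined.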

Exercise 8.2.9. How would you explain (NB explain, not prove) to someone
who had not done calculus but had a good grasp of geometry why the result

\[ \int_a^b f(x) + g(x)\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx \]

is true for well behaved functions? (I hope that you will agree with me that,
obvious as this result now seems to us, the first mathematicians to grasp this
fact had genuine insight.)

Exercise 8.2.10. (i) If $f : [a, b] \to \mathbb{R}$ is bounded and $D$ is a dissection of
$[a, b]$, show that $S(-f, D) = -s(f, D)$.
(ii) If $f : [a, b] \to \mathbb{R}$ is Riemann integrable, show that $-f$ is Riemann
integrable and

\[ \int_a^b (-f(x))\,dx = -\int_a^b f(x)\,dx. \]

(iii) If $\lambda \in \mathbb{R}$, $\lambda \ge 0$, $f : [a, b] \to \mathbb{R}$ is bounded and $D$ is a dissection of
$[a, b]$, show that $S(\lambda f, D) = \lambda S(f, D)$.
(iv) If $\lambda \in \mathbb{R}$, $\lambda \ge 0$ and $f : [a, b] \to \mathbb{R}$ is Riemann integrable, show that
$\lambda f$ is Riemann integrable and

\[ \int_a^b \lambda f(x)\,dx = \lambda \int_a^b f(x)\,dx. \]

(v) If $\lambda \in \mathbb{R}$ and $f : [a, b] \to \mathbb{R}$ is Riemann integrable, show that $\lambda f$ is
Riemann integrable and

\[ \int_a^b \lambda f(x)\,dx = \lambda \int_a^b f(x)\,dx. \]

Combining Lemma 8.2.8 with Exercise 8.2.10, we get the following result.

Lemma 8.2.11. If $\lambda, \mu \in \mathbb{R}$ and $f, g : [a, b] \to \mathbb{R}$ are Riemann integrable,
then $\lambda f + \mu g$ is Riemann integrable and

\[ \int_a^b \lambda f(x) + \mu g(x)\,dx = \lambda \int_a^b f(x)\,dx + \mu \int_a^b g(x)\,dx. \]

In the language of linear algebra, $\mathcal{R}[a, b]$ (the set of Riemann integrable
functions on $[a, b]$) is a vector space and the integral is a linear functional
(i.e. a linear map from $\mathcal{R}[a, b]$ to $\mathbb{R}$).

Exercise 8.2.12. (i) If $E$ is a subset of $[a, b]$, we define the indicator function
$I_E : [a, b] \to \mathbb{R}$ by $I_E(x) = 1$ if $x \in E$, $I_E(x) = 0$ otherwise. Show
directly from the definition that, if $a \le c \le d \le b$, then $I_{[c,d]}$ is Riemann
integrable and

\[ \int_a^b I_{[c,d]}(x)\,dx = d - c. \]

(ii) If $a \le c \le d \le b$, we say that the intervals $(c, d)$, $(c, d]$, $[c, d)$, $[c, d]$ all
have length $d - c$. If $I(j)$ is a subinterval of $[a, b]$ of length $|I(j)|$ and $\lambda_j \in \mathbb{R}$
show that the step function $\sum_{j=1}^{n} \lambda_j I_{I(j)}$ is Riemann integrable and

\[ \int_a^b \sum_{j=1}^{n} \lambda_j I_{I(j)}\,dx = \sum_{j=1}^{n} \lambda_j |I(j)|. \]

a j=1 j=1

Exercise 8.2.13. (i) If $f, g : [a, b] \to \mathbb{R}$ are bounded functions with
$f(t) \ge g(t)$ for all $t \in [a, b]$ and $D$ is a dissection of $[a, b]$, show that
$S(f, D) \ge S(g, D)$.
(ii) If $f, g : [a, b] \to \mathbb{R}$ are Riemann integrable functions with $f(t) \ge g(t)$
for all $t \in [a, b]$, show that

\[ \int_a^b f(x)\,dx \ge \int_a^b g(x)\,dx. \]

(iii) Suppose that $f : [a, b] \to \mathbb{R}$ is a Riemann integrable function, $K \in \mathbb{R}$
and $f(t) \ge K$ for all $t \in [a, b]$. Show that

\[ \int_a^b f(x)\,dx \ge K(b - a). \]

State and prove a similar result involving upper bounds.
(iv) Suppose that $f : [a, b] \to \mathbb{R}$ is a Riemann integrable function, $K \in \mathbb{R}$,
$K \ge 0$ and $|f(t)| \le K$ for all $t \in [a, b]$. Show that

\[ \left| \int_a^b f(x)\,dx \right| \le K(b - a). \]

Although part (iv) is weaker than part (iii), it generalises more easily and
we shall use it frequently in the form

\[ |\text{integral}| \le \text{length} \times \text{sup}. \]

Exercise 8.2.14. (i) Let $M$ be a positive real number and $f : [a, b] \to \mathbb{R}$
a function with $|f(t)| \le M$ for all $t \in [a, b]$. Show that
$|f(s)^2 - f(t)^2| \le 2M |f(s) - f(t)|$ and deduce that

\[ \sup_{x \in [a,b]} f(x)^2 - \inf_{x \in [a,b]} f(x)^2 \le 2M \Bigl( \sup_{x \in [a,b]} f(x) - \inf_{x \in [a,b]} f(x) \Bigr). \]

(ii) Let $f : [a, b] \to \mathbb{R}$ be a bounded function. Show that, if $D$ is a
dissection of $[a, b]$,

\[ S(f^2, D) - s(f^2, D) \le 2M (S(f, D) - s(f, D)). \]

Deduce that, if $f$ is Riemann integrable, so is $f^2$.
(iii) By using the formula $fg = \frac{1}{4}((f + g)^2 - (f - g)^2)$, or otherwise,
deduce that if $f, g : [a, b] \to \mathbb{R}$ are Riemann integrable, so is $fg$ (the
product of $f$ and $g$). (Compare Exercise 1.2.6.)

Exercise 8.2.15. (i) Consider a function $f : [a, b] \to \mathbb{R}$. We define
$f_+, f_- : [a, b] \to \mathbb{R}$ by

\[ f_+(t) = f(t), \quad f_-(t) = 0 \quad \text{if } f(t) \ge 0, \]
\[ f_+(t) = 0, \quad f_-(t) = -f(t) \quad \text{if } f(t) \le 0. \]

Check that $f(t) = f_+(t) - f_-(t)$ and $|f(t)| = f_+(t) + f_-(t)$.
(ii) If $f : [a, b] \to \mathbb{R}$ is bounded and $D$ is a dissection of $[a, b]$, show that

\[ S(f, D) - s(f, D) \ge S(f_+, D) - s(f_+, D) \ge 0. \]

(iii) If $f : [a, b] \to \mathbb{R}$ is Riemann integrable, show that $f_+$ and $f_-$ are
Riemann integrable.
(iv) If $f : [a, b] \to \mathbb{R}$ is Riemann integrable, show that $|f|$ is Riemann
integrable and

\[ \int_a^b |f(x)|\,dx \ge \left| \int_a^b f(x)\,dx \right|. \]

Exercise 8.2.16. In each of Exercises 8.2.10, 8.2.14 and 8.2.15 we used
a roundabout route to our result. For example, in Exercise 8.2.14 we first
proved that $f^2$ is Riemann integrable whenever $f$ is and then used this
result to prove that $fg$ is Riemann integrable whenever $f$ and $g$ are. It is
natural to ask whether we can give a direct proof in each case. The reader
should try to do so. (In my opinion, the direct proofs are not much harder,
though they do require more care in writing out.)

Exercise 8.2.17. (i) Suppose that $a \le c \le b$ and that $f : [a, b] \to \mathbb{R}$ is a
bounded function. Consider a dissection $D_1$ of $[a, c]$ given by

\[ D_1 = \{x_0, x_1, \ldots, x_m\} \quad \text{with } a = x_0 \le x_1 \le x_2 \le \cdots \le x_m = c, \]

and a dissection $D_2$ of $[c, b]$ given by

\[ D_2 = \{x_{m+1}, x_{m+2}, \ldots, x_n\} \quad \text{with } c = x_{m+1} \le x_{m+2} \le x_{m+3} \le \cdots \le x_n = b. \]

If $D$ is the dissection of $[a, b]$ given by

\[ D = \{x_0, x_1, \ldots, x_n\}, \]

show that $S(f, D) = S(f|_{[a,c]}, D_1) + S(f|_{[c,b]}, D_2)$. (Here $f|_{[a,c]}$ means the
restriction of $f$ to $[a, c]$.)
(ii) Show that $f \in \mathcal{R}[a, b]$ if and only if $f|_{[a,c]} \in \mathcal{R}[a, c]$ and $f|_{[c,b]} \in \mathcal{R}[c, b]$.
Show also that, if $f \in \mathcal{R}[a, b]$, then

\[ \int_a^b f(x)\,dx = \int_a^c f|_{[a,c]}(x)\,dx + \int_c^b f|_{[c,b]}(x)\,dx. \]

In a very slightly less precise and very much more usual notation we write

\[ \int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx. \]


There is a standard convention that we shall follow which says that, if
$b \ge a$ and $f$ is Riemann integrable on $[a, b]$, we define

\[ \int_b^a f(x)\,dx = -\int_a^b f(x)\,dx. \]

Exercise 8.2.18. Suppose $\beta \ge \alpha$ and $f$ is Riemann integrable on $[\alpha, \beta]$.
Show that if $a, b, c \in [\alpha, \beta]$ then

\[ \int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx. \]

[Note that $a$, $b$ and $c$ may occur in any of six orders.]

However, this convention must be used with caution.

Exercise 8.2.19. Suppose that $b \ge a$, $\lambda, \mu \in \mathbb{R}$, and $f$ and $g$ are Riemann
integrable. Which of the following statements are always true and which are
not? Give a proof or counterexample. If the statement is not always true,
find an appropriate correction and prove it.

(i) $\displaystyle \int_b^a \lambda f(x) + \mu g(x)\,dx = \lambda \int_b^a f(x)\,dx + \mu \int_b^a g(x)\,dx$.

(ii) If $f(x) \ge g(x)$ for all $x \in [a, b]$, then $\displaystyle \int_b^a f(x)\,dx \ge \int_b^a g(x)\,dx$.

Riemann was unable to show that all continuous functions were integrable
(we have a key concept that Riemann did not and we shall be able to fill
this gap in the next section). He did, however, have the result of the next
exercise. (Note that an increasing function need not be continuous. Consider
the Heaviside function $H : \mathbb{R} \to \mathbb{R}$ given by $H(x) = 0$ for $x < 0$, $H(x) = 1$
for $x \ge 0$.)

Exercise 8.2.20. Suppose $f : [a, b] \to \mathbb{R}$ is increasing. Let $N$ be a strictly positive integer and consider the dissection

$D = \{x_0, x_1, \dots, x_N\}$ with $x_j = a + j(b - a)/N$.

Show that

\[ S(f, D) = \sum_{j=1}^{N} f(x_j)(b - a)/N, \]

find $s(f, D)$ and deduce that

\[ S(f, D) - s(f, D) = (f(b) - f(a))(b - a)/N. \]

Conclude that $f$ is Riemann integrable.
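The identity in Exercise 8.2.20 can be checked numerically before it is proved. The following is a minimal sketch, not part of the text: the function names and the particular increasing function (a cubic plus a Heaviside jump) are chosen purely for illustration, and the code relies on the fact that for an increasing function the supremum on each piece is the value at the right end point and the infimum is the value at the left end point.

```python
def upper_lower_sums_increasing(f, a, b, n):
    """Upper and lower sums of an increasing f on the uniform dissection with n pieces.

    For increasing f, sup over [x_{j-1}, x_j] is f(x_j) and inf is f(x_{j-1}).
    """
    width = (b - a) / n
    points = [a + j * width for j in range(n + 1)]
    upper = sum(f(points[j]) for j in range(1, n + 1)) * width
    lower = sum(f(points[j]) for j in range(0, n)) * width
    return upper, lower


def heaviside(x):
    return 0.0 if x < 0 else 1.0


def increasing_example(x):
    # Illustrative choice: increasing, but discontinuous at 0.
    return x ** 3 + heaviside(x)


a, b = -1.0, 2.0
gaps = {}
for n in (10, 100, 1000):
    upper, lower = upper_lower_sums_increasing(increasing_example, a, b, n)
    gaps[n] = upper - lower
    predicted = (increasing_example(b) - increasing_example(a)) * (b - a) / n
    print(n, gaps[n], predicted)
```

The printed gap and the predicted value $(f(b) - f(a))(b - a)/N$ should agree up to rounding error, and both shrink like $1/N$ even though the function has a jump.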


Using Lemma 8.2.11 this gives the following result.

Lemma 8.2.21. If $f : [a, b] \to \mathbb{R}$ can be written as $f = f_1 - f_2$ with $f_1, f_2 : [a, b] \to \mathbb{R}$ increasing, then $f$ is Riemann integrable.

At first sight, Lemma 8.2.21 looks rather uninteresting but, in fact, it covers most of the functions we normally meet.

Exercise 8.2.22. (i) If

$f_1(t) = 0$, $f_2(t) = -t^2$ if $t < 0$,

$f_1(t) = t^2$, $f_2(t) = 0$ if $t \ge 0$,

show that $f_1$ and $f_2$ are increasing functions with $t^2 = f_1(t) - f_2(t)$.

(ii) Show that, if $f : [a, b] \to \mathbb{R}$ has only a finite number of local maxima and minima, then it can be written in the form $f = f_1 - f_2$ with $f_1, f_2 : [a, b] \to \mathbb{R}$ increasing.

Functions which are the difference of two increasing functions are discussed in Exercise K.158, Exercises K.162 to K.166 and more generally in the next chapter as 'functions of bounded variation'. We conclude this section with an important example of Dirichlet.

Exercise 8.2.23. If $f : [0, 1] \to \mathbb{R}$ is given by

f (x) = 1 when x is rational,

f (x) = 0 when x is irrational,

show that, whenever D is a dissection of [0, 1], we have S(f, D) = 1 and

s(f, D) = 0. Conclude that f is not Riemann integrable.

Exercise 8.2.24. (i) If $f$ is as in Exercise 8.2.23, show that

\[ \frac{1}{N} \sum_{r=1}^{N} f(r/N) = 1 \quad\text{and so}\quad \frac{1}{N} \sum_{r=1}^{N} f(r/N) \to 1 \text{ as } N \to \infty. \]

(ii) Let $g : [0, 1] \to \mathbb{R}$ be given by

$g(r/2^n) = 1$ when $1 \le r \le 2^n - 1$, $n \ge 1$, and $r$ and $n$ are integers,

$g(s/3^n) = -1$ when $1 \le s \le 3^n - 1$, $n \ge 1$, and $s$ and $n$ are integers,

$g(x) = 0$ otherwise.

Discuss the behaviour of

\[ \frac{1}{N} \sum_{r=1}^{N} g(r/N) \]

as $N \to \infty$ in as much detail as you consider desirable.
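For readers who like to experiment before discussing part (ii), the averages can be computed exactly. The sketch below is only an illustration under stated assumptions: the helper names are invented here, and the code uses the observation that a rational $x$ in $(0, 1)$ has the form $r/2^n$ exactly when its denominator in lowest terms is a power of 2 (and similarly for 3).

```python
from fractions import Fraction


def is_power_of(q, base):
    """True if q = base**n for some integer n >= 1."""
    if q < base:
        return False
    while q % base == 0:
        q //= base
    return q == 1


def g(x):
    """The function g of Exercise 8.2.24 (ii) for a rational x in [0, 1]."""
    q = x.denominator  # Fraction keeps x in lowest terms
    if is_power_of(q, 2):
        return 1
    if is_power_of(q, 3):
        return -1
    return 0  # includes x = 1, whose denominator is 1


def average(n):
    """(1/n) * sum of g(r/n) for r = 1, ..., n, as an exact fraction."""
    return Fraction(sum(g(Fraction(r, n)) for r in range(1, n + 1)), n)


for n in (5, 6, 8, 9, 16, 27):
    print(n, average(n))
```

Comparing the values along $N = 2^k$, along $N = 3^k$ and along other sequences suggests what kind of answer the exercise is looking for.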


8.3 Integrals of continuous functions

The key to showing that continuous functions are integrable, which we have and Riemann did not, is the notion of uniform continuity and the theorem (Theorem 4.5.5) which tells us that a continuous function on a closed bounded subset of $\mathbb{R}^n$, and so, in particular, on a closed interval, is uniformly continuous.[3]

Theorem 8.3.1. Any continuous function $f : [a, b] \to \mathbb{R}$ is Riemann integrable.

Proof. If $b = a$ the result is obvious, so suppose $b > a$. We shall show that $f$ is Riemann integrable by using the standard criterion given in Lemma 8.2.6. To this end, suppose that $\epsilon > 0$ is given. Since a continuous function on a closed bounded interval is uniformly continuous, we can find a $\delta > 0$ such that

\[ |f(x) - f(y)| \le \frac{\epsilon}{b - a} \quad\text{whenever } x, y \in [a, b] \text{ and } |x - y| < \delta. \]

Choose an integer $N > (b - a)/\delta$ and consider the dissection

$D = \{x_0, x_1, \dots, x_N\}$ with $x_j = a + j(b - a)/N$.

If $x, y \in [x_j, x_{j+1}]$, then $|x - y| \le (b - a)/N < \delta$ and so

\[ |f(x) - f(y)| \le \frac{\epsilon}{b - a}. \]

It follows that

\[ \sup_{x \in [x_j, x_{j+1}]} f(x) - \inf_{x \in [x_j, x_{j+1}]} f(x) \le \frac{\epsilon}{b - a} \]

for all $0 \le j \le N - 1$ and so

\[ S(f, D) - s(f, D) = \sum_{j=0}^{N-1} (x_{j+1} - x_j) \Bigl( \sup_{x \in [x_j, x_{j+1}]} f(x) - \inf_{x \in [x_j, x_{j+1}]} f(x) \Bigr) \le \sum_{j=0}^{N-1} \frac{b - a}{N} \cdot \frac{\epsilon}{b - a} = \epsilon, \]

as required.

[3] This is a natural way to proceed but Exercise K.118 shows that it is not the only one.


Slight extensions of this result are given in Exercise I.11. In Exercise K.122 we consider a rather different way of looking at integrals of continuous functions.

Although there are many functions which are integrable besides the con-

tinuous functions, there are various theorems on integration which demand

that the functions involved be continuous or even better behaved. Most of

the results of this section have this character.

Lemma 8.3.2. If $f : [a, b] \to \mathbb{R}$ is continuous, $f(t) \ge 0$ for all $t \in [a, b]$ and

\[ \int_a^b f(t)\,dt = 0, \]

it follows that $f(t) = 0$ for all $t \in [a, b]$.

Proof. If $f$ is a positive continuous function which is not identically zero, then we can find an $x \in [a, b]$ with $f(x) > 0$. Setting $\epsilon = f(x)/2$, the continuity of $f$ tells us that there exists a $\delta > 0$ such that $|f(x) - f(y)| < \epsilon$ whenever $|x - y| \le \delta$ and $y \in [a, b]$. We observe that

\[ f(y) \ge f(x) - |f(x) - f(y)| > f(x) - \epsilon = f(x)/2 \]

whenever $|x - y| \le \delta$ and $y \in [a, b]$. If we define $h : [a, b] \to \mathbb{R}$ by $h(y) = f(x)/2$ whenever $|x - y| \le \delta$ and $y \in [a, b]$ and $h(y) = 0$ otherwise, then $f(t) \ge h(t)$ for all $t \in [a, b]$ and so

\[ \int_a^b f(t)\,dt \ge \int_a^b h(t)\,dt > 0. \]

Exercise 8.3.3. (i) Let $a \le c \le b$. Give an example of a Riemann integrable function $f : [a, b] \to \mathbb{R}$ such that $f(t) \ge 0$ for all $t \in [a, b]$ and

\[ \int_a^b f(t)\,dt = 0, \]

but $f(c) \ne 0$.

(ii) If $f : [a, b] \to \mathbb{R}$ is Riemann integrable, $f(t) \ge 0$ for all $t \in [a, b]$ and

\[ \int_a^b f(t)\,dt = 0, \]

show that $f(t) = 0$ at every point $t \in [a, b]$ where $f$ is continuous.


(iii) We say that $f : [a, b] \to \mathbb{R}$ is right continuous at $t \in [a, b]$ if $f(s) \to f(t)$ as $s \to t$ through values of $s$ with $b \ge s > t$. Suppose $f$ is Riemann integrable and is right continuous at every point $t \in [a, b]$. Show that if $f(t) \ge 0$ for all $t \in [a, b]$ and

\[ \int_a^b f(t)\,dt = 0, \]

it follows that $f(t) = 0$ for all $t \in [a, b]$ with at most one exception. Give an example to show that this exception may occur.

The reader should have little difficulty in proving the following useful related results.

Exercise 8.3.4. (i) If $f : [a, b] \to \mathbb{R}$ is continuous and

\[ \int_a^b f(t)g(t)\,dt = 0, \]

whenever $g : [a, b] \to \mathbb{R}$ is continuous, show that $f(t) = 0$ for all $t \in [a, b]$.

(ii) If $f : [a, b] \to \mathbb{R}$ is continuous and

\[ \int_a^b f(t)g(t)\,dt = 0, \]

whenever $g : [a, b] \to \mathbb{R}$ is continuous and $g(a) = g(b) = 0$, show that $f(t) = 0$ for all $t \in [a, b]$. (We prove a slightly stronger result in Lemma 8.4.7.)

We now prove the fundamental theorem of the calculus which links the processes of integration and differentiation. Since the result is an important one it is worth listing the properties of the integral that we use in the proof.

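Lemma 8.3.5. Suppose $\lambda, \mu \in \mathbb{R}$, $f, g : [\alpha, \beta] \to \mathbb{R}$ are Riemann integrable and $a, b, c \in [\alpha, \beta]$. The following results hold.

(i) $\displaystyle \int_a^b 1\,dt = b - a.$

(ii) $\displaystyle \int_a^b \bigl(\lambda f(t) + \mu g(t)\bigr)\,dt = \lambda \int_a^b f(t)\,dt + \mu \int_a^b g(t)\,dt.$

(iii) $\displaystyle \int_a^b f(t)\,dt + \int_b^c f(t)\,dt = \int_a^c f(t)\,dt.$

(iv) $\displaystyle \left| \int_a^b f(t)\,dt \right| \le |b - a| \sup_{0 \le \theta \le 1} |f(a + \theta(b - a))|.$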


The reader should run through these results in her mind and make sure

that she can prove them (note that a, b and c can be in any order).

Theorem 8.3.6. (The fundamental theorem of the calculus.) Suppose that $f : (a, b) \to \mathbb{R}$ is a continuous function and that $u \in (a, b)$. If we set

\[ F(t) = \int_u^t f(x)\,dx, \]

then $F$ is differentiable on $(a, b)$ and $F'(t) = f(t)$ for all $t \in (a, b)$.

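Proof. Observe that, if $t + h \in (a, b)$ and $h \ne 0$ then

\[
\begin{aligned}
\left| \frac{F(t + h) - F(t)}{h} - f(t) \right|
&= \frac{1}{|h|} \left| \int_u^{t+h} f(x)\,dx - \int_u^t f(x)\,dx - h f(t) \right| \\
&= \frac{1}{|h|} \left| \int_t^{t+h} f(x)\,dx - \int_t^{t+h} f(t)\,dx \right| \\
&= \frac{1}{|h|} \left| \int_t^{t+h} \bigl(f(x) - f(t)\bigr)\,dx \right| \\
&\le \sup_{0 \le \theta \le 1} |f(t + \theta h) - f(t)| \to 0
\end{aligned}
\]

as $h \to 0$ since $f$ is continuous at $t$. (Notice that $f(t)$ remains constant as $x$ varies.)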
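The estimate in this proof can be watched at work numerically. The following is a rough sketch only: it replaces the Riemann integral by a midpoint-rule approximation, takes $f = \cos$ and $t = 0.7$ as arbitrary illustrative choices, and estimates the supremum by sampling $\theta$ on a grid (which in general gives only a lower estimate, although here $\cos$ is monotone on the intervals used, so the sampled value at $\theta = 1$ is the true supremum).

```python
import math


def midpoint_integral(f, lo, hi, pieces=2000):
    """Midpoint-rule approximation to the integral of f from lo to hi (hi < lo allowed)."""
    width = (hi - lo) / pieces
    return width * sum(f(lo + (k + 0.5) * width) for k in range(pieces))


def difference_quotient_error(f, t, h):
    """|(F(t+h) - F(t))/h - f(t)|, written as |(1/h) * integral_t^{t+h} (f(x) - f(t)) dx|."""
    return abs(midpoint_integral(lambda x: f(x) - f(t), t, t + h) / h)


def sampled_sup(f, t, h, samples=200):
    """Maximum of |f(t + theta*h) - f(t)| over theta = 0, 1/samples, ..., 1."""
    return max(abs(f(t + k / samples * h) - f(t)) for k in range(samples + 1))


f, t = math.cos, 0.7
steps = (0.1, 0.01, 0.001)
errors = [difference_quotient_error(f, t, h) for h in steps]
bounds = [sampled_sup(f, t, h) for h in steps]
for h, err, bound in zip(steps, errors, bounds):
    print(h, err, bound)
```

Each error should sit below the corresponding bound, and both should tend to zero with $h$, which is exactly the content of the last line of the displayed chain of inequalities.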

Exercise 8.3.7. (i) Using the idea of the integral as the area under a curve, draw diagrams illustrating the proof of Theorem 8.3.6.

(ii) Point out, explicitly, each use of Lemma 8.3.5 in our proof of Theorem 8.3.6.

(iii) Let $H$ be the Heaviside function $H : \mathbb{R} \to \mathbb{R}$ given by $H(x) = 0$ for $x < 0$, $H(x) = 1$ for $x \ge 0$. Calculate $F(t) = \int_0^t H(x)\,dx$ and show that $F$ is not differentiable at 0. Where does our proof of Theorem 8.3.6 break down?

(iv) Let $f(0) = 1$, $f(t) = 0$ otherwise. Calculate $F(t) = \int_0^t f(x)\,dx$ and show that $F$ is differentiable at 0 but $F'(0) \ne f(0)$. Where does our proof of Theorem 8.3.6 break down?

Exercise 8.3.8. Suppose that $f : (a, b) \to \mathbb{R}$ is a function such that $f$ is Riemann integrable on every interval $[c, d] \subseteq (a, b)$. Let $u \in (a, b)$. If we set

\[ F(t) = \int_u^t f(x)\,dx \]

show that $F$ is continuous on $(a, b)$ and that, if $f$ is continuous at some point $t \in (a, b)$, then $F$ is differentiable at $t$ and $F'(t) = f(t)$.


Sometimes we think of the fundamental theorem in a slightly different way.

Theorem 8.3.9. Suppose that $f : (a, b) \to \mathbb{R}$ is continuous, that $u \in (a, b)$ and $c \in \mathbb{R}$. Then there is a unique solution to the differential equation $g'(t) = f(t)$ [$t \in (a, b)$] such that $g(u) = c$.

Exercise 8.3.10. Prove Theorem 8.3.9. Make clear how you use Theorem 8.3.6 and the mean value theorem. Reread section 1.1.

We call the solutions of $g'(t) = f(t)$ indefinite integrals (or, simply, integrals) of $f$.

Yet another version of the fundamental theorem is given by the next

theorem.

Theorem 8.3.11. Suppose that $g : (\alpha, \beta) \to \mathbb{R}$ has continuous derivative and $[a, b] \subseteq (\alpha, \beta)$. Then

\[ \int_a^b g'(t)\,dt = g(b) - g(a). \]

Proof. Define $U : (\alpha, \beta) \to \mathbb{R}$ by

\[ U(t) = \int_a^t g'(x)\,dx - g(t) + g(a). \]

By the fundamental theorem of the calculus and earlier results on differentiation, $U$ is everywhere differentiable with

\[ U'(t) = g'(t) - g'(t) = 0 \]

so, by the mean value theorem, $U$ is constant. But $U(a) = 0$, so $U(t) = 0$ for all $t$ and, in particular, $U(b) = 0$ as required.

[Remark: In one dimension, Theorems 8.3.6, 8.3.9 and 8.3.11 are so closely linked that mathematicians tend to refer to them all as 'The fundamental theorem of the calculus'. However they generalise in different ways.

(1) Theorem 8.3.6 shows that, under suitable circumstances, we can recover a function from its 'local average' (see Exercise K.130).

(2) Theorem 8.3.9 says that we can solve a certain kind of differential equation. We shall obtain substantial generalisations of this result in Section 12.2.

(3) Theorem 8.3.11 links the value of the derivative $f'$ on the whole of $[a, b]$ with the value of $f$ on the boundary (that is to say, the set $\{a, b\}$). If


you have done a mathematical methods course you will already have seen a similar idea expressed by the divergence theorem

\[ \int_V \nabla \cdot \mathbf{u}\,dV = \int_{\partial V} \mathbf{u} \cdot d\mathbf{S}. \]

This result and similar ones like Stokes' theorem turn out to be special cases of a master theorem[4] which links the behaviour of the derivative of a certain mathematical object over the whole of some body with the behaviour of the object on the boundary of that body.]

Theorems 8.3.6 and 8.3.11 show that (under appropriate circumstances) integration and differentiation are inverse operations and that the theories of differentiation and integration are subsumed in the greater theory of the calculus. Under appropriate circumstances, if the graph of $F$ has tangent with slope $f(x)$ at $x$,

\[
\begin{aligned}
\text{area under the graph of slope of tangent of } F
&= \text{area under the graph of } f \\
&= \int_a^b f(x)\,dx = \int_a^b F'(x)\,dx = F(b) - F(a).
\end{aligned}
\]

Exercise 8.3.12. Most books give a slightly stronger version of Theorem 8.3.11 in the following form.

If $f : [a, b] \to \mathbb{R}$ has continuous derivative, then

\[ \int_a^b f'(t)\,dt = f(b) - f(a). \]

Explain what this means (you will need to talk about 'left' and 'right' derivatives) and prove it.

Recalling the chain rule (Lemma 6.2.10) which tells us that $(\Phi \circ g)'(t) = g'(t)\Phi'(g(t))$, the same form of proof gives us a very important theorem.

Theorem 8.3.13. (Change of variables for integrals.) Suppose that $f : (\alpha, \beta) \to \mathbb{R}$ is continuous and $g : (\gamma, \delta) \to \mathbb{R}$ is differentiable with continuous derivative. Suppose further that $g\bigl((\gamma, \delta)\bigr) \subseteq (\alpha, \beta)$. Then, if $c, d \in (\gamma, \delta)$, we have

\[ \int_{g(c)}^{g(d)} f(s)\,ds = \int_c^d f(g(x))g'(x)\,dx. \]
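A quick numerical check may help fix the statement in mind. The sketch below is an illustration only and rests on assumptions not in the text: the integrals are approximated by a hand-written composite Simpson rule, and the choices $f = \cos$, $g(x) = x^2$, $c = -1$, $d = 2$ are arbitrary (they are picked so that $g$ is not injective on the interval between $c$ and $d$).

```python
import math


def simpson(f, lo, hi, pieces=2000):
    """Composite Simpson approximation to the integral of f from lo to hi (pieces even)."""
    width = (hi - lo) / pieces
    total = f(lo) + f(hi)
    for k in range(1, pieces):
        total += (4 if k % 2 else 2) * f(lo + k * width)
    return total * width / 3


def f(s):
    return math.cos(s)


def g(x):
    return x * x


def g_prime(x):
    return 2 * x


c, d = -1.0, 2.0
left = simpson(f, g(c), g(d))                           # integral of f(s) ds from g(c) to g(d)
right = simpson(lambda x: f(g(x)) * g_prime(x), c, d)   # integral of f(g(x)) g'(x) dx from c to d
exact = math.sin(g(d)) - math.sin(g(c))                 # an antiderivative of cos is sin
print(left, right, exact)
```

All three printed numbers should agree to many decimal places, even though $g$ runs down from 1 to 0 and back up to 4 as $x$ runs from $-1$ to 2.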

[4] Arnol'd calls it the Newton-Leibniz-Gauss-Green-Ostrogradskii-Stokes-Poincaré theorem but most mathematicians call it the generalised Stokes' theorem or just Stokes' theorem.


Exercise 8.3.14. (i) Prove Theorem 8.3.13 by considering

\[ U(t) = \int_{g(c)}^{g(t)} f(s)\,ds - \int_c^t f(g(x))g'(x)\,dx. \]

(ii) Derive Theorem 8.3.11 from Theorem 8.3.13 by choosing $f$ appropriately.

(iii) Strengthen Theorem 8.3.13 along the lines of Exercise 8.3.12.

(iv) (An alternative proof.) If $f$ is as in Theorem 8.3.13 explain why we can find an $F : (\alpha, \beta) \to \mathbb{R}$ with $F' = f$. Obtain Theorem 8.3.13 by applying the chain rule to $F'(g(x))g'(x) = f(g(x))g'(x)$.

Because the proof of Theorem 8.3.13 is so simple and because the main use of the result in elementary calculus is to evaluate integrals, there is a tendency to underestimate the importance of this result. However, it is important for later developments that the reader has an intuitive grasp of this result.

Exercise 8.3.15. (i) Suppose that $f : \mathbb{R} \to \mathbb{R}$ is the constant function $f(t) = K$ and that $g : \mathbb{R} \to \mathbb{R}$ is the linear function $g(t) = \lambda t + \mu$. Show by direct calculation that

\[ \int_{g(c)}^{g(d)} f(s)\,ds = \int_c^d f(g(x))g'(x)\,dx, \]

and describe the geometric content of this result in words.

(ii) Suppose now that $f : \mathbb{R} \to \mathbb{R}$ and $g : \mathbb{R} \to \mathbb{R}$ are well behaved functions. By splitting $[c, d]$ into small intervals on which $f$ is 'almost constant' and $g$ is 'almost linear', give a heuristic argument for the truth of Theorem 8.3.13. To see how this heuristic argument can be converted into a rigorous one, consult Exercise K.118.

Exercise 8.3.16. There is one peculiarity in our statement of Theorem 8.3.13 which is worth noting. We do not demand that $g$ be bijective. Suppose that $f : \mathbb{R} \to \mathbb{R}$ is continuous and $g(t) = \sin t$. Show that, by choosing different intervals $(c, d)$, we obtain

\[
\begin{aligned}
\int_0^{\sin \alpha} f(s)\,ds &= \int_0^{\alpha} f(\sin x) \cos x\,dx \\
&= \int_0^{\alpha + 2\pi} f(\sin x) \cos x\,dx = \int_0^{\pi - \alpha} f(\sin x) \cos x\,dx.
\end{aligned}
\]

Explain what is going on.

The extra flexibility given by allowing $g$ not to be bijective is one we are usually happy to sacrifice in the interests of generalising Theorem 8.3.13.


Exercise 8.3.17. The following exercise is traditional.

(i) Show that integration by substitution, using $x = 1/t$, gives

\[ \int_a^b \frac{dx}{1 + x^2} = \int_{1/b}^{1/a} \frac{dt}{1 + t^2} \]

when $b > a > 0$.

(ii) If we set $a = -1$, $b = 1$ in the formula of (i), we obtain

\[ \int_{-1}^{1} \frac{dx}{1 + x^2} \stackrel{?}{=} -\int_{-1}^{1} \frac{dt}{1 + t^2}. \]

Explain this apparent failure of the method of integration by substitution.

(iii) Write the result of (i) in terms of $\tan^{-1}$ and prove it using standard trigonometric identities.

In sections 5.4 and 5.6 we gave a treatment of the exponential and logarithmic functions based on differentiation. The reader may wish to look at Exercise K.126 in which we use integration instead.

Another result which can be proved in much the same manner as Theorems 8.3.11 and 8.3.13 is the lemma which justifies integration by parts. (Recall the notation $[h(x)]_a^b = h(b) - h(a)$.)

Lemma 8.3.18. Suppose that $f : (\alpha, \beta) \to \mathbb{R}$ has continuous derivative and $g : (\alpha, \beta) \to \mathbb{R}$ is continuous. Let $G : (\alpha, \beta) \to \mathbb{R}$ be an indefinite integral of $g$. Then, if $[a, b] \subseteq (\alpha, \beta)$, we have

\[ \int_a^b f(x)g(x)\,dx = [f(x)G(x)]_a^b - \int_a^b f'(x)G(x)\,dx. \]

Exercise 8.3.19. (i) Obtain Lemma 8.3.18 by differentiating an appropriate $U$ in the style of the proofs of Theorems 8.3.11 and 8.3.13. Quote carefully the results that you use.

(ii) Obtain Lemma 8.3.18 by integrating both sides of the equality $(uv)' = u'v + uv'$ and choosing appropriate $u$ and $v$. Quote carefully the results that you use.

(iii) Strengthen Lemma 8.3.18 along the lines of Exercise 8.3.12.

Integration by parts gives a global Taylor theorem with a form that is

easily remembered and proved for examination.

Theorem 8.3.20. (A global Taylor's theorem with integral remainder.) If $f : (u, v) \to \mathbb{R}$ is $n$ times continuously differentiable and $0 \in (u, v)$, then

\[ f(t) = \sum_{j=0}^{n-1} \frac{f^{(j)}(0)}{j!} t^j + R_n(f, t) \]


where

\[ R_n(f, t) = \frac{1}{(n - 1)!} \int_0^t (t - x)^{n-1} f^{(n)}(x)\,dx. \]
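As a sanity check on the formula for the remainder, one can compare both sides numerically for a function whose derivatives are known. The sketch below makes several assumptions of its own: it takes $f = \exp$ (so that $f^{(j)}(0) = 1$ for every $j$), approximates the integral by a hand-written composite Simpson rule, and uses helper names invented for this illustration.

```python
import math


def simpson(f, lo, hi, pieces=2000):
    """Composite Simpson approximation to the integral of f from lo to hi (pieces even)."""
    width = (hi - lo) / pieces
    total = f(lo) + f(hi)
    for k in range(1, pieces):
        total += (4 if k % 2 else 2) * f(lo + k * width)
    return total * width / 3


def taylor_polynomial_exp(t, n):
    """Sum over j < n of f^(j)(0) t^j / j! for f = exp, where every f^(j)(0) equals 1."""
    return sum(t ** j / math.factorial(j) for j in range(n))


def integral_remainder_exp(t, n):
    """R_n(exp, t) = (1/(n-1)!) * integral from 0 to t of (t - x)^(n-1) exp(x) dx."""
    integrand = lambda x: (t - x) ** (n - 1) * math.exp(x)
    return simpson(integrand, 0.0, t) / math.factorial(n - 1)


discrepancies = []
for t, n in [(1.0, 4), (-0.5, 3), (2.0, 6)]:
    reconstructed = taylor_polynomial_exp(t, n) + integral_remainder_exp(t, n)
    discrepancies.append(abs(reconstructed - math.exp(t)))
    print(t, n, reconstructed, math.exp(t))
```

The polynomial part plus the integral remainder should reproduce $\exp(t)$ up to the accuracy of the quadrature, for negative $t$ as well as positive $t$.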

Exercise 8.3.21. By integration by parts, show that

\[ R_n(f, t) = -\frac{f^{(n-1)}(0)}{(n - 1)!} t^{n-1} + R_{n-1}(f, t). \]

Use repeated integration by parts to obtain Theorem 8.3.20.

Exercise 8.3.22. Reread Example 7.1.5. If $F$ is as in that example, identify $R_{n-1}(F, t)$.

Exercise 8.3.23. If $f : (-a, a) \to \mathbb{R}$ is $n$ times continuously differentiable with $|f^{(n)}(t)| \le M$ for all $t \in (-a, a)$, show that

\[ \left| f(t) - \sum_{j=0}^{n-1} \frac{f^{(j)}(0)}{j!} t^j \right| \le \frac{M |t|^n}{n!}. \]

Explain why this result is slightly weaker than that of Exercise 7.1.1 (v).

There are several variants of Theorem 8.3.20 with different expressions for $R_n(f, t)$ (see, for example, Exercise K.49 (vi)). However, although the theory of the Taylor expansion is very important (see, for example, Exercise K.125 and Exercise K.266), these global theorems are not much used in relation to specific functions outside the examination hall. We discuss two of the reasons why at the end of Section 11.5. In Exercises 11.5.20 and 11.5.22 I suggest that it is usually easier to obtain Taylor series by power series solutions rather than by using theorems like Theorem 8.3.20. In Exercise 11.5.23 I suggest that power series are often not very suitable for numerical computation.

8.4 First steps in the calculus of variations ♥

The most famous early problem in the calculus of variations is that of the brachistochrone. It asks for the equation $y = f(x)$ of the wire down which a frictionless particle with initial velocity $v$ will slide from one point $(a, \alpha)$ to another $(b, \beta)$ (so $f(a) = \alpha$, $f(b) = \beta$, $a \ne b$ and $\alpha > \beta$) in the shortest time. It turns out that the time taken by the particle is

\[ J(f) = \frac{1}{(2g)^{1/2}} \int_a^b \left( \frac{1 + f'(x)^2}{\kappa - f(x)} \right)^{1/2} dx \]

where $\kappa = v^2/(2g) + \alpha$ and $g$ is the acceleration due to gravity.
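To get a feeling for the functional $J$, it can be evaluated numerically for particular wires. The sketch below is an illustration under assumptions of its own: the end points $(0, 1)$ and $(2, 0)$, the initial speed $v = 1$, the value $9.81$ for $g$, the Simpson-rule quadrature and the two trial wires (a straight line and a sagging parabola through the same end points) are all arbitrary choices, and neither wire is claimed to be the minimiser. For the straight wire the answer is compared with the time given by the elementary constant-acceleration formula for a particle on an inclined plane.

```python
import math


def simpson(f, lo, hi, pieces=2000):
    """Composite Simpson approximation to the integral of f from lo to hi (pieces even)."""
    width = (hi - lo) / pieces
    total = f(lo) + f(hi)
    for k in range(1, pieces):
        total += (4 if k % 2 else 2) * f(lo + k * width)
    return total * width / 3


def travel_time(f, f_prime, a, alpha, b, v, gravity=9.81):
    """J(f) = (2g)^(-1/2) * integral_a^b sqrt((1 + f'(x)^2) / (kappa - f(x))) dx."""
    kappa = v * v / (2 * gravity) + alpha
    integrand = lambda x: math.sqrt((1 + f_prime(x) ** 2) / (kappa - f(x)))
    return simpson(integrand, a, b) / math.sqrt(2 * gravity)


a, alpha, b, beta, v, gravity = 0.0, 1.0, 2.0, 0.0, 1.0, 9.81

# Straight wire from (a, alpha) to (b, beta).
slope = (beta - alpha) / (b - a)
line_time = travel_time(lambda x: alpha + slope * (x - a), lambda x: slope, a, alpha, b, v)

# Inclined-plane check: constant acceleration g*H/D along a wire of length D and drop H.
drop, length = alpha - beta, math.hypot(b - a, alpha - beta)
incline_time = (math.sqrt(v * v + 2 * gravity * drop) - v) / (gravity * drop / length)

# A sagging parabola with the same end points: f(x) = 1 - 1.1 x + 0.3 x^2.
sagging_time = travel_time(lambda x: 1 - 1.1 * x + 0.3 * x * x,
                           lambda x: -1.1 + 0.6 * x, a, alpha, b, v)
print(line_time, incline_time, sagging_time)
```

The first two numbers should agree closely, which is a small consistency check on the formula for $J$, and the sagging wire turns out to be quicker than the straight one, which hints that the minimisation problem is not trivial.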


Exercise 8.4.1. If you know sufficient mechanics, verify this. (Your argument will presumably involve arc length which has not yet been mentioned in this book.)

This is a problem of minimising which is very different from those dealt with in elementary calculus. Those problems ask us to choose a point $x_0$ from a one-dimensional space which minimises some function $g(x)$. In section 7.3 we considered problems in which we sought to choose a point $\mathbf{x}_0$ from an $n$-dimensional space which minimises some function $g(\mathbf{x})$. Here we seek to choose a function $f_0$ from an infinite dimensional space to minimise a function $J(f)$ of functions $f$.

Exercise 8.4.2. In the previous sentence we used the words 'infinite dimensional' somewhat loosely. However we can make precise statements along the same lines.

(i) Show that the collection $\mathcal{P}$ of polynomials $P$ with $P(0) = P(1) = 0$ forms a vector space over $\mathbb{R}$ with the obvious operations. Show that $\mathcal{P}$ is infinite dimensional (in other words, has no finite spanning set).

(ii) Show that the collection $\mathcal{E}$ of infinitely differentiable functions $f : [0, 1] \to \mathbb{R}$ with $f(0) = f(1)$ forms a vector space over $\mathbb{R}$ with the obvious operations. Show that $\mathcal{E}$ is infinite dimensional.

John Bernoulli published the brachistochrone problem as a challenge in 1696. Newton, Leibniz, L'Hôpital, John Bernoulli and James Bernoulli all found solutions within a year.[5] However, it is one thing to solve a particular problem and quite another to find a method of attack for the general class of problems to which it belongs. Such a method was developed by Euler and Lagrange. We shall see that it does not resolve all difficulties but it represents a marvelous leap of imagination.

We begin by proving that, under certain circumstances, we can interchange the order of integration and differentiation. (We will extend the result in Theorem 11.4.21.)

Theorem 8.4.3. (Differentiation under the integral.) Let $(a', b') \times (c', d') \supseteq [a, b] \times [c, d]$. Suppose that $g : (a', b') \times (c', d') \to \mathbb{R}$ is continuous and that the partial derivative $g_{,2}$ exists and is continuous. Then writing $G(y) = \int_a^b g(x, y)\,dx$ we have $G$ differentiable on $(c, d)$ with

\[ G'(y) = \int_a^b g_{,2}(x, y)\,dx. \]

[5] They were giants in those days. Newton had retired from mathematics and submitted his solution anonymously. 'But' John Bernoulli said 'one recognises the lion by his paw.'


This result is more frequently written as

\[ \frac{d}{dy} \int_a^b g(x, y)\,dx = \int_a^b \frac{\partial g}{\partial y}(x, y)\,dx, \]

and interpreted as 'the d clambers through the integral and curls up'. If we use the $D$ notation we get

\[ G'(y) = \int_a^b D_2 g(x, y)\,dx. \]

It may, in the end, be more helpful to note that $\int_a^b g(x, y)\,dx$ is a function of the single variable $y$, but $g(x, y)$ is a function of the two variables $x$ and $y$.

Proof. We use a proof technique which is often useful in this kind of situation (we have already used a simple version in Theorem 8.3.6, when we proved the fundamental theorem of the calculus).

We first put everything under one integral sign. Suppose $y, y + h \in (c, d)$ and $h \ne 0$. Then

\[
\begin{aligned}
\left| \frac{G(y + h) - G(y)}{h} - \int_a^b g_{,2}(x, y)\,dx \right|
&= \frac{1}{|h|} \left| G(y + h) - G(y) - \int_a^b h g_{,2}(x, y)\,dx \right| \\
&= \frac{1}{|h|} \left| \int_a^b \bigl( g(x, y + h) - g(x, y) - h g_{,2}(x, y) \bigr)\,dx \right|.
\end{aligned}
\]

In order to estimate the last integral we use the simple result (Exercise 8.2.13 (iv))

|integral| $\le$ length $\times$ sup

which gives us

\[
\frac{1}{|h|} \left| \int_a^b \bigl( g(x, y + h) - g(x, y) - h g_{,2}(x, y) \bigr)\,dx \right|
\le \frac{b - a}{|h|} \sup_{x \in [a, b]} |g(x, y + h) - g(x, y) - h g_{,2}(x, y)|.
\]

We expect $|g(x, y + h) - g(x, y) - h g_{,2}(x, y)|$ to be small when $h$ is small because the definition of the partial derivative tells us that $g(x, y + h) - g(x, y) \approx h g_{,2}(x, y)$. In such circumstances, the mean value theorem is frequently useful. In this case, setting $f(t) = g(x, y + t) - g(x, y) - t g_{,2}(x, y)$, the mean value theorem tells us that

\[ |f(h)| = |f(h) - f(0)| \le |h| \sup_{0 \le \theta \le 1} |f'(\theta h)| \]


and so

\[ |g(x, y + h) - g(x, y) - h g_{,2}(x, y)| \le |h| \sup_{0 \le \theta \le 1} |g_{,2}(x, y + \theta h) - g_{,2}(x, y)|. \]

There is one further point to notice. Since we are taking a supremum over all $x \in [a, b]$, we shall need to know, not merely that we can make $|g_{,2}(x, y + \theta h) - g_{,2}(x, y)|$ small at a particular $x$ by taking $h$ sufficiently small, but that we can make $|g_{,2}(x, y + \theta h) - g_{,2}(x, y)|$ uniformly small for all $x$. However, we know that $g_{,2}$ is continuous on $[a, b] \times [c, d]$ and that a function which is continuous on a closed bounded set is uniformly continuous and this will enable us to complete the proof.

Let $\epsilon > 0$. By Theorem 4.5.5, $g_{,2}$ is uniformly continuous on $[a, b] \times [c, d]$ and so we can find a $\delta(\epsilon) > 0$ such that

\[ |g_{,2}(x, y) - g_{,2}(u, v)| \le \epsilon/(b - a) \]

whenever $\bigl((x - u)^2 + (y - v)^2\bigr)^{1/2} < \delta(\epsilon)$ and $(x, y), (u, v) \in [a, b] \times [c, d]$. It follows that, if $y, y + h \in (c, d)$ and $|h| < \delta(\epsilon)$, then

\[ \sup_{0 \le \theta \le 1} |g_{,2}(x, y + \theta h) - g_{,2}(x, y)| \le \epsilon/(b - a) \]

for all $x \in [a, b]$. Putting all our results together, we have shown that

\[ \left| \frac{G(y + h) - G(y)}{h} - \int_a^b g_{,2}(x, y)\,dx \right| \le \epsilon \]

whenever $y, y + h \in (c, d)$ and $0 < |h| < \delta(\epsilon)$ and the result follows.
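The conclusion of Theorem 8.4.3 is easy to test numerically on a concrete example. The sketch below is illustrative only and rests on its own assumptions: it takes $g(x, y) = \sin(xy)$ on $[0, 1]$ in the $x$ variable, approximates the integrals by a hand-written composite Simpson rule, and approximates $G'(y)$ by a central difference quotient with an arbitrarily chosen step.

```python
import math


def simpson(f, lo, hi, pieces=2000):
    """Composite Simpson approximation to the integral of f from lo to hi (pieces even)."""
    width = (hi - lo) / pieces
    total = f(lo) + f(hi)
    for k in range(1, pieces):
        total += (4 if k % 2 else 2) * f(lo + k * width)
    return total * width / 3


def g(x, y):
    return math.sin(x * y)


def g_2(x, y):
    """Partial derivative of g with respect to its second variable."""
    return x * math.cos(x * y)


a, b = 0.0, 1.0


def G(y):
    """G(y) = integral over [a, b] of g(x, y) dx."""
    return simpson(lambda x: g(x, y), a, b)


y, step = 1.0, 1e-4
difference_quotient = (G(y + step) - G(y - step)) / (2 * step)  # approximates G'(y)
integral_of_partial = simpson(lambda x: g_2(x, y), a, b)        # integral of g_{,2}(x, y) dx
closed_form = math.sin(1.0) + math.cos(1.0) - 1.0               # derivative of (1 - cos y)/y at y = 1
print(difference_quotient, integral_of_partial, closed_form)
```

Here $G(y) = (1 - \cos y)/y$ can be found by hand, so all three printed numbers should agree: the difference quotient for $G$, the integral of the partial derivative, and the derivative computed from the closed form.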

Exercise 8.4.4. Because I have tried to show where the proof comes from, the proof above is not written in a very economical way. Rewrite it more economically.

A favourite examiner's variation on the theme of Theorem 8.4.3 is given in Exercise K.132.

Exercise 8.4.5. In what follows we will use a slightly different version of Theorem 8.4.3.

Suppose $g : [a, b] \times [c, d] \to \mathbb{R}$ is continuous and that the partial derivative $g_{,2}$ exists and is continuous. Then, writing $G(y) = \int_a^b g(x, y)\,dx$, we have $G$ differentiable on $[c, d]$ with

\[ G'(y) = \int_a^b g_{,2}(x, y)\,dx. \]

Explain what this means in terms of left and right derivatives and prove it.


The method of Euler and Lagrange applies to the following class of problems. Suppose that $F : \mathbb{R}^3 \to \mathbb{R}$ has continuous second partial derivatives. We consider the set $\mathcal{A}$ of functions $f : [a, b] \to \mathbb{R}$ which are differentiable with continuous derivative and are such that $f(a) = \alpha$ and $f(b) = \beta$. We write

\[ J(f) = \int_a^b F(t, f(t), f'(t))\,dt \]

and seek to minimise $J$, that is to find an $f_0 \in \mathcal{A}$ such that

\[ J(f_0) \le J(f) \]

whenever $f \in \mathcal{A}$.

In section 7.3, when we asked if a particular point $\mathbf{x}_0$ from an $n$-dimensional space minimised $g : \mathbb{R}^n \to \mathbb{R}$, we examined the behaviour of $g$ close to $\mathbf{x}_0$. In other words, we looked at $g(\mathbf{x}_0 + \eta \mathbf{u})$ when $\mathbf{u}$ was an arbitrary vector and $\eta$ was small. The idea of Euler and Lagrange is to look at

\[ G_h(\eta) = J(f_0 + \eta h) \]

where $h : [a, b] \to \mathbb{R}$ is differentiable with continuous derivative and is such that $h(a) = 0$ and $h(b) = 0$ (we shall call the set of such functions $\mathcal{E}$). We observe that $G_h$ is a function from $\mathbb{R}$ to $\mathbb{R}$ and that $G_h$ has a minimum at 0 if $J$ is minimised by $f_0$. This observation, combined with some very clever, but elementary, calculus gives the celebrated Euler-Lagrange equation.

Theorem 8.4.6. Suppose that $F : \mathbb{R}^3 \to \mathbb{R}$ has continuous second partial derivatives. Consider the set $\mathcal{A}$ of functions $f : [a, b] \to \mathbb{R}$ which are differentiable with continuous derivative and are such that $f(a) = \alpha$ and $f(b) = \beta$. We write

\[ J(f) = \int_a^b F(t, f(t), f'(t))\,dt. \]

If $f \in \mathcal{A}$ is such that

\[ J(f) \le J(g) \]

whenever $g \in \mathcal{A}$ then

\[ F_{,2}(t, f(t), f'(t)) = \frac{d}{dt} F_{,3}(t, f(t), f'(t)). \]
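To see what the Euler-Lagrange equation says in the simplest possible case, here is a short worked example. It is an illustration chosen for this purpose rather than one taken from the text, and it uses only the statement of Theorem 8.4.6. Take $F(t, u, w) = w^2$, so that $J(f) = \int_a^b f'(t)^2\,dt$.

```latex
% Worked example (illustrative choice of F, not from the text): F(t, u, w) = w^2.
\[
F_{,2}(t, u, w) = 0, \qquad F_{,3}(t, u, w) = 2w ,
\]
so the Euler-Lagrange equation
\[
F_{,2}(t, f(t), f'(t)) = \frac{d}{dt} F_{,3}(t, f(t), f'(t))
\quad\text{reads}\quad
0 = \frac{d}{dt}\bigl( 2 f'(t) \bigr).
\]
Hence $f'$ is constant on $[a, b]$ and, using $f(a) = \alpha$ and $f(b) = \beta$,
\[
f(t) = \alpha + (\beta - \alpha)\,\frac{t - a}{b - a}.
\]
```

Note that the theorem only tells us that a minimiser, if one exists, must be this straight line; it does not by itself show that the straight line actually minimises $J$.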


Proof. We use the notation of the paragraph preceding the statement of the theorem. If $h \in \mathcal{E}$ (that is to say $h : [a, b] \to \mathbb{R}$ is differentiable with continuous derivative and is such that $h(a) = 0$ and $h(b) = 0$) then the chain rule tells us that the function $g_h : \mathbb{R}^2 \to \mathbb{R}$ given by

\[ g_h(\eta, t) = F(t, f(t) + \eta h(t), f'(t) + \eta h'(t)) \]

has continuous partial derivative

\[ g_{h,1}(\eta, t) = h(t) F_{,2}(t, f(t) + \eta h(t), f'(t) + \eta h'(t)) + h'(t) F_{,3}(t, f(t) + \eta h(t), f'(t) + \eta h'(t)). \]

Thus by Theorem 8.4.3, we may differentiate under the integral to show that $G_h$ is differentiable everywhere with

\[ G_h'(\eta) = \int_a^b \Bigl( h(t) F_{,2}(t, f(t) + \eta h(t), f'(t) + \eta h'(t)) + h'(t) F_{,3}(t, f(t) + \eta h(t), f'(t) + \eta h'(t)) \Bigr)\,dt. \]

If $f$ minimises $J$, then 0 minimises $G_h$ and so $G_h'(0) = 0$. We deduce that

\[
\begin{aligned}
0 &= \int_a^b \Bigl( h(t) F_{,2}(t, f(t), f'(t)) + h'(t) F_{,3}(t, f(t), f'(t)) \Bigr)\,dt \\
&= \int_a^b h(t) F_{,2}(t, f(t), f'(t))\,dt + \int_a^b h'(t) F_{,3}(t, f(t), f'(t))\,dt.
\end{aligned}
\]

Using integration by parts and the fact that $h(a) = h(b) = 0$ we obtain

\[ \int_a^b h'(t) F_{,3}(t, f(t), f'(t))\,dt = \bigl[ h(t) F_{,3}(t, f(t), f'(t)) \bigr]_a^b - \int_a^b h(t) \frac{d}{dt} F_{,3}(t, f(t), f'(t))\,dt \]
