. 7
( 19)


\[ = -\int_a^b h(t)\,\frac{d}{dt}F_{,3}(t, f(t), f'(t))\,dt. \]

Combining the results of the last two sentences, we see that
\[ 0 = \int_a^b h(t)\left(F_{,2}(t, f(t), f'(t)) - \frac{d}{dt}F_{,3}(t, f(t), f'(t))\right) dt. \]

Since this result must hold for all h ∈ A, we see that
\[ F_{,2}(t, f(t), f'(t)) - \frac{d}{dt}F_{,3}(t, f(t), f'(t)) = 0 \]
for all t ∈ [a, b] (for details see Lemma 8.4.7 below) and this is the result we
set out to prove.

In order to tie up loose ends, we need the following lemma.

Lemma 8.4.7. Suppose f : [a, b] → ℝ is continuous and
\[ \int_a^b f(t)h(t)\,dt = 0, \]
whenever h : [a, b] → ℝ is an infinitely differentiable function with h(a) =
h(b) = 0. Then f(t) = 0 for all t ∈ [a, b].

Proof. By continuity, we need only prove that f(t) = 0 for all t ∈ (a, b).
Suppose that, in fact, f(x) ≠ 0 for some x ∈ (a, b). Without loss of generality
we may suppose that f(x) > 0 (otherwise, consider −f). Since (a, b) is open
and f is continuous we can find a δ > 0 such that [x − δ, x + δ] ⊆ (a, b) and
|f(t) − f(x)| < f(x)/2 for t ∈ [x − δ, x + δ]. This last condition tells us that
f(t) > f(x)/2 for t ∈ [x − δ, x + δ].
In Example 7.1.6 we constructed an infinitely differentiable function E :
ℝ → ℝ with E(t) = 0 for t ≤ 0 and E(t) > 0 for t > 0. Setting h(t) =
E(t − x + δ)E(−t + x + δ) when t ∈ [a, b], we see that h is an infinitely
differentiable function with h(t) > 0 for t ∈ (x − δ, x + δ) and h(t) = 0
otherwise (so that, in particular, h(a) = h(b) = 0). By standard results on
the integral,
\[ \int_a^b f(t)h(t)\,dt = \int_{x-\delta}^{x+\delta} f(t)h(t)\,dt \ge \int_{x-\delta}^{x+\delta} \frac{f(x)}{2}h(t)\,dt = \frac{f(x)}{2}\int_{x-\delta}^{x+\delta} h(t)\,dt > 0, \]
contradicting our hypothesis, so we are done.
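
As a numerical sanity check on this proof, here is a minimal Python sketch of the bump function h. It assumes that E(t) = exp(−1/t) for t > 0, which is one standard choice with the stated properties (the precise E of Example 7.1.6 is not reproduced here). The names E, bump and midpoint_integral, and the sample data f(t) = cos 3t, x = 0.2, δ = 0.1, are illustrative assumptions only.

```python
import math

def E(t):
    """Smooth function with E(t) = 0 for t <= 0 and E(t) > 0 for t > 0 (assumed form)."""
    return math.exp(-1.0 / t) if t > 0 else 0.0

def bump(t, x, delta):
    """h(t) = E(t - x + delta) E(-t + x + delta): positive exactly on (x - delta, x + delta)."""
    return E(t - x + delta) * E(-t + x + delta)

def midpoint_integral(g, a, b, n=20000):
    """Midpoint-rule approximation to the integral of g over [a, b]."""
    step = (b - a) / n
    return step * sum(g(a + (i + 0.5) * step) for i in range(n))

# Illustrative data: f is continuous on [0, 1], changes sign there, and f(0.2) > 0.
a, b, x, delta = 0.0, 1.0, 0.2, 0.1
f = lambda t: math.cos(3.0 * t)

# Integral of f h over [a, b], and the lower bound (f(x)/2) * integral of h.
integral = midpoint_integral(lambda t: f(t) * bump(t, x, delta), a, b)
lower_bound = (f(x) / 2) * midpoint_integral(lambda t: bump(t, x, delta),
                                             x - delta, x + delta)
```

Although f takes both signs on [0, 1], the bump h only 'sees' f near x, so the integral of f h is strictly positive, as the chain of inequalities predicts.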

Exercise 8.4.8. State explicitly the ‘standard results on the integral’ used in
the last sentence of the previous proof and show how they are applied.

Theorem 8.4.6 is often stated in the following form. If the function y :
[a, b] → ℝ minimises J then
\[ \frac{\partial F}{\partial y} = \frac{d}{dx}\frac{\partial F}{\partial y'}. \]

This is concise but can be confusing to the novice⁶.
The Euler-Lagrange equation can only be solved explicitly in a small
number of special cases. The next exercise (which should be treated as an
exercise in calculus rather than analysis) shows how, with the exercise of
some ingenuity, we can solve the brachistochrone problem with which we
started. Recall that this asked us to minimise
\[ J(f) = \frac{1}{(2g)^{1/2}}\int \left(\frac{1 + f'(x)^2}{\kappa - f(x)}\right)^{1/2} dx. \]

⁶ It certainly confused me when I met it for the first time.
Please send corrections however trivial to twk@dpmms.cam.ac.uk

Exercise 8.4.9. We use the notation and assumptions of Theorem 8.4.6.
(i) Suppose that F(u, v, w) = G(v, w) (often stated as ‘t does not appear
explicitly in F = F(t, y, y')’). Show that the Euler-Lagrange equation becomes
\[ G_{,1}(f(t), f'(t)) = \frac{d}{dt}G_{,2}(f(t), f'(t)) \]
and may be rewritten
\[ \frac{d}{dt}\Bigl(G(f(t), f'(t)) - f'(t)G_{,2}(f(t), f'(t))\Bigr) = 0. \]
Deduce that
\[ G(f(t), f'(t)) - f'(t)G_{,2}(f(t), f'(t)) = c \]
where c is a constant. (This last result is often stated as
F − y' ∂F/∂y' = c.)
(ii) (This is not used in the rest of the question.) Suppose that F(u, v, w) =
G(u, w). Show that
\[ G_{,2}(t, f'(t)) = c \]
where c is a constant.
(iii) By applying (i), show that solutions of the Euler-Lagrange equation
associated with the brachistochrone are solutions of
\[ \frac{1}{\bigl((\kappa - f(x))(1 + f'(x)^2)\bigr)^{1/2}} = c \]
where c is a constant. Show that this equation can be rewritten as
\[ f'(x) = \left(\frac{B + f(x)}{A - f(x)}\right)^{1/2}. \]
(iv) We are now faced with finding the curve
\[ \frac{dy}{dx} = \left(\frac{B + y}{A - y}\right)^{1/2}. \]
If we are sufficiently ingenious (or we know the answer), we may be led to
try and express this curve in parametric form by setting
\[ y = \frac{A - B}{2} - \frac{A + B}{2}\cos\theta. \]
Show that
\[ \frac{dx}{d\theta} = \frac{A + B}{2}(1 + \cos\theta), \]
and conclude that our curve is (in parametric form)
\[ x = a + k(\theta + \sin\theta), \qquad y = b - k\cos\theta \]
for appropriate constants a, b and k. Thus any curve which minimises the
time of descent must be a cycloid.
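
The substitution in part (iv) can be checked numerically. The following minimal sketch assumes that the substitution is y = (A − B)/2 − ((A + B)/2) cos θ and uses the stated formula dx/dθ = ((A + B)/2)(1 + cos θ); the function name check_cycloid and the sample values A = 3, B = 1 are illustrative assumptions. It compares dy/dx computed from the parametrisation with the right hand side ((B + y)/(A − y))^{1/2}.

```python
import math

def check_cycloid(A, B, theta):
    """Return (dy/dx from the parametrisation, right hand side of the ODE) at theta."""
    k = (A + B) / 2
    y = (A - B) / 2 - k * math.cos(theta)    # assumed substitution
    dy_dtheta = k * math.sin(theta)
    dx_dtheta = k * (1 + math.cos(theta))    # formula stated in part (iv)
    lhs = dy_dtheta / dx_dtheta
    rhs = math.sqrt((B + y) / (A - y))
    return lhs, rhs

# Sample parameter values in (0, pi), where sin(theta) > 0.
pairs = [check_cycloid(3.0, 1.0, theta) for theta in (0.3, 1.0, 2.0, 3.0)]
```

Both sides equal tan(θ/2) for 0 < θ < π, which is why the two computed values agree to rounding error.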

It is important to observe that we have shown that any minimising function
satisfies the Euler-Lagrange equation and not that any function satisfying
the Euler-Lagrange equation is a minimising function. Exactly the
same argument (or replacing J by −J) shows that any maximising function
satisfies the Euler-Lagrange equation. Further, if we reflect on the simpler
problem discussed in section 7.3, we see that the Euler-Lagrange equation
will be satisfied by functions f such that

G_h(η) = J(f + ηh)

has a minimum at η = 0 for some h ∈ E and a maximum at η = 0 for others.

Exercise 8.4.10. With the notation of this section show that, if f satisfies
the Euler-Lagrange equations, then G_h'(0) = 0.

To get round this problem, examiners ask you to ‘find the values of f
which make J stationary’ where the phrase is equivalent to ‘find the values
of f which satisfy the Euler-Lagrange equations’. In real life, we use
physical intuition or extra knowledge about the nature of the problem to
find which solutions of the Euler-Lagrange equations represent maxima and
which minima.
Mathematicians spent over a century seeking to find an extension to the
Euler-Lagrange method which would enable them to distinguish true maxima
and minima. However, they were guided by analogy with the one dimensional
(if f'(0) = 0 and f''(0) > 0 then 0 is a minimum) and finite dimensional
case and it turns out that the analogy is seriously defective. In the end,

Figure 8.1: A problem for the calculus of variations

Weierstrass produced examples which made it plain what was going on. We
discuss a version of one of them.
Consider the problem of minimising
\[ I(f) = \int_0^1 \bigl(1 - (f'(x))^4\bigr)^2 + f(x)^2 \, dx \]

where f : [0, 1] → ℝ is once continuously differentiable and f(0) = f(1) = 0.

Exercise 8.4.11. We look at

G_h(η) = I(ηh).

Show that G_h(η) = 1 + A_h η² + B_h η⁴ + C_h η⁸ where A_h, B_h, C_h depend on
h. Show that, if h is not identically zero, A_h > 0 and deduce that G_h has a
strict minimum at 0 for all non-zero h ∈ E.

We are tempted to claim that ‘I(f) has a local minimum at f = 0’.
Now look at the function g_n [n a strictly positive integer] illustrated in
Figure 8.1 and defined by
\[ g_n(x) = x - \frac{2r}{2n} \quad\text{for } \left|x - \frac{2r}{2n}\right| \le \frac{1}{4n}, \]
\[ g_n(x) = \frac{2r+1}{2n} - x \quad\text{for } \left|x - \frac{2r+1}{2n}\right| \le \frac{1}{4n}, \]
whenever r is an integer and x ∈ [0, 1]. Ignoring the finite number of points
where g_n is not differentiable, we see that g_n'(x) = ±1 at all other points,
and so
\[ I(g_n) = \int_0^1 g_n(x)^2\,dx \to 0 \quad\text{as } n \to \infty. \]
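
The contrast between small multiples ηh and the zigzag functions g_n can be seen numerically. The sketch below is illustrative only: the function names I and sawtooth, the midpoint rule, and the choices h(x) = sin πx and η = 0.01 are assumptions made here, not part of the text. It approximates I(0), I(ηh) and I(g_n), ignoring the finitely many points where g_n is not differentiable.

```python
import math

def I(f, df, n_steps=20000):
    """Midpoint-rule approximation to the integral over [0, 1] of (1 - f'(x)^4)^2 + f(x)^2."""
    step = 1.0 / n_steps
    total = 0.0
    for i in range(n_steps):
        x = (i + 0.5) * step
        total += (1 - df(x) ** 4) ** 2 + f(x) ** 2
    return total * step

def sawtooth(n):
    """Return (g_n, g_n') for the zigzag function with slopes +1 and -1."""
    def nearest(x):
        return math.floor(2 * n * x + 0.5)   # nearest integer m to 2nx
    def g(x):
        m = nearest(x)
        d = x - m / (2 * n)                  # |d| <= 1/(4n)
        return d if m % 2 == 0 else -d       # m = 2r gives d, m = 2r + 1 gives -d
    def dg(x):
        return 1.0 if nearest(x) % 2 == 0 else -1.0
    return g, dg

I_zero = I(lambda x: 0.0, lambda x: 0.0)                       # I(0) = 1
eta = 0.01
I_small = I(lambda x: eta * math.sin(math.pi * x),
            lambda x: eta * math.pi * math.cos(math.pi * x))   # I(eta h), h(x) = sin(pi x)
I_saw = [I(*sawtooth(n)) for n in (1, 4, 16)]                  # about 1/(48 n^2)
```

Along the straight path ηh the value of I rises slightly above I(0) = 1 for small η, while I(g_n) is close to 1/(48n²) and so falls far below 1, even though g_n itself is uniformly small.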

Figure 8.2: The same problem, smoothed

It is clear that we can find a similar sequence of functions f_n which is
continuously differentiable by ‘rounding the sharp bits’ as in Figure 8.2.
The reader who wishes to dot the i's and cross the t's can do the next
exercise.

Exercise 8.4.12. (i) Let 1/2 > ε > 0 and let k : [0, 1] → ℝ be the function
such that

k(x) = ε⁻¹x for 0 ≤ x ≤ ε,
k(x) = 1 for ε ≤ x ≤ 1 − ε,
k(x) = ε⁻¹(1 − x) for 1 − ε ≤ x ≤ 1.

Sketch k.
(ii) Let k_n(x) = (−1)^[2nx] k(2nx − [2nx]) for x ∈ [0, 1]. (Here [2nx] means
the integer part of 2nx.) Sketch the function k_n.
(iii) Let
\[ K_n(x) = \int_0^x k_n(t)\,dt \]
for x ∈ [0, 1]. Sketch K_n. Show that 0 ≤ K_n(x) ≤ 1/(2n) for all x. Show that
K_n is once differentiable with continuous derivative. Show that |K_n'(x)| ≤ 1
for all x and identify the set of points where |K_n'(x)| = 1.
(iv) Show that there exists a sequence of continuously differentiable
functions f_n : [0, 1] → ℝ, with f_n(0) = f_n(1) = 0, such that

I(f_n) → 0 as n → ∞.

[This result is slightly improved in Example K.134.]

This example poses two problems. The first is that in some sense the f_n
are close to f_0 = 0 with I(f_n) < I(f_0), yet the Euler-Lagrange approach of
Exercise 8.4.11 seemed to show that I(f_0) was smaller than those I(f) with
f close to f_0. One answer to this seeming paradox is that, in Exercise 8.4.11,
we only looked at G_h(η) = I(ηh) as η became small, so we only looked at
certain paths approaching f_0 and not at all possible modes of approach. As
η becomes small, not only does ηh become small but so does ηh'. However,
as n becomes large, f_n becomes small but f_n' does not. In general when
the Euler-Lagrange method looks at a function f it compares it only with
functions which are close to f and have derivative close to f'. This does
not affect the truth of Theorem 8.4.6 (which says that the Euler-Lagrange
equation is a necessary condition for a minimum) but makes it unlikely that
the same ideas can produce even a partial converse.
Once we have the notion of a metric space we can make matters even
clearer. (See Exercises K.199 to K.201.)

Exercise 8.4.13. This exercise looks back to Section 7.3. Let U be an open
subset of ℝ² containing (0, 0). Suppose that f : U → ℝ has second order
partial derivatives on U and these partial derivatives are continuous at (0, 0).
Suppose further that f_{,1}(0, 0) = f_{,2}(0, 0) = 0. If u ∈ ℝ² we write G_u(η) =
f(ηu).
(i) Show that G_u'(0) = 0 for all u ∈ ℝ².
(ii) Let e_1 = (1, 0) and e_2 = (0, 1). Suppose that G_{e_1}''(0) > 0 and
G_{e_2}''(0) > 0. Show, by means of an example, that (0, 0) need not be a local
minimum for f. Does there exist an f with the properties given which attains
a local minimum at (0, 0)? Does there exist an f with the properties given
which attains a local maximum at (0, 0)?
(iii) Suppose that G_u''(0) > 0 whenever u is a unit vector. Show that f
attains a local minimum at (0, 0).

The second problem raised by results like Exercise 8.4.12 is also very
important.

Exercise 8.4.14. Use Exercise 8.3.4 to show that I(f) > 0 whenever f :
[0, 1] → ℝ is a continuously differentiable function.
Conclude, using the discussion above, that the set

{I(f) : f continuously differentiable}

has an infimum (to be identified) but no minimum.

Exercise 8.1. Here is a simpler (but less interesting) example of a
variational problem with no solution, also due to Weierstrass. Consider the set
E of functions f : [−1, 1] → ℝ with continuous derivative and such that
f(−1) = −1, f(1) = 1. Show that
\[ \inf_{f \in E}\int_{-1}^{1} x^2 f'(x)^2\,dx = 0 \]
but there does not exist any f_0 ∈ E with
\[ \int_{-1}^{1} x^2 f_0'(x)^2\,dx = 0. \]

The discovery that they had been talking about solutions to problems
which might have no solutions came as a severe shock to the pure
mathematical community. Of course, examples like the one we have been
discussing are ‘artificial’ in the sense that they have been constructed for
the purpose but unless we can come up with some criterion for distinguishing
‘artificial’ problems from ‘real’ problems this takes us nowhere. ‘If we
have actually seen one tiger, is not the jungle immediately filled with tigers,
and who knows where the next one lurks.’ The care with which we proved
Theorem 4.3.4 (a continuous function on a closed bounded set is bounded
and attains its bounds) and Theorem 4.4.4 (Rolle's theorem, considered as
the statement that, if a differentiable function f on an open interval (a, b)
attains a maximum at x, then f'(x) = 0) are distant echoes of that shock. On
the other hand, the new understanding which resulted revivified the study
of problems of maximisation and led to much new mathematics.
It is always possible to claim that Nature (with a capital N) will never set
‘artificial’ problems and so the applied mathematician need not worry about
these things. ‘Nature is not troubled by mathematical difficulties.’ However,
a physical theory is not a description of nature (with a small n) but a model
of nature which may well be troubled by mathematical difficulties. There
are at least two problems in physics where the model has the characteristic
features of our ‘artificial’ problem. In the first, which asks for a description
of the electric field near a very sharp charged needle, the actual experiment
produces sparking. In the second, which deals with crystallisation as a system
for minimising an energy function not too far removed from I, photographs
reveal patterns not too far removed from Figure 8.1!

8.5 Vector-valued integrals
So far we have dealt only with the integration of functions f : [a, b] → ℝ.
The general programme that we wish to follow would direct us to consider
the integration of functions f : E → ℝ^m where E is a well behaved subset of
ℝ^n. In this section we shall take the first step by considering the special case
of a well behaved function f : [a, b] → ℝ^m. Since ℂ can be identified with ℝ²,
our special case contains, as a still more special (but very important) case,
the integration of well behaved complex-valued functions f : [a, b] → ℂ.
The definition is simple.
Definition 8.5.1. If f : [a, b] → ℝ^m is such that f_j : [a, b] → ℝ is
Riemann integrable for each j, then we say that f is Riemann integrable and
∫_a^b f(x) dx = y where y ∈ ℝ^m and
\[ y_j = \int_a^b f_j(x)\,dx \]
for each j.
In other words,
\[ \left(\int_a^b f(x)\,dx\right)_j = \int_a^b f_j(x)\,dx. \]

It is easy to obtain the properties of this integral directly from its
definition and the properties of the one dimensional integral. Here is an example.
Lemma 8.5.2. If α : ℝ^m → ℝ^p is linear and f : [a, b] → ℝ^m is Riemann
integrable, then so is αf and
\[ \int_a^b (\alpha f)(x)\,dx = \alpha\left(\int_a^b f(x)\,dx\right). \]

Proof. Let α have matrix representation (a_ij). By Lemma 8.2.11,
\[ (\alpha f)_i = \sum_{j=1}^m a_{ij} f_j \]
is Riemann integrable and
\[ \int_a^b \sum_{j=1}^m a_{ij} f_j(x)\,dx = \sum_{j=1}^m a_{ij}\int_a^b f_j(x)\,dx. \]

Comparing this with Definition 8.5.1, we see that we have the required result.

Taking α to be any orthogonal transformation of ℝ^m to itself, we see
that our definition of the integral is, in fact, coordinate independent.
(Remember, it is part of our programme that nothing should depend on the
particular choice of coordinates we use. The reader may also wish to look at
Exercise K.137.)
Choosing a particular orthogonal transformation, we obtain the following
nice result.

Theorem 8.5.3. If f : [a, b] → ℝ^m is Riemann integrable then
\[ \left\|\int_a^b f(x)\,dx\right\| \le (b - a)\sup_{x \in [a,b]}\|f(x)\|. \]

This result falls into the standard pattern

size of integral ≤ length × sup.

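Proof. If y is a vector in ℝ^m, we can always find a rotation α of ℝ^m such
that αy lies along the x_1 axis, that is to say, (αy)_1 ≥ 0 and (αy)_j = 0 for
2 ≤ j ≤ m. Let y = ∫_a^b f(x) dx. Then
\[ \begin{aligned}
\left\|\int_a^b f(x)\,dx\right\| &= \left\|\alpha\int_a^b f(x)\,dx\right\| \\
&= \left(\alpha\int_a^b f(x)\,dx\right)_1 \\
&= \int_a^b (\alpha f(x))_1\,dx \\
&\le (b - a)\sup_{x \in [a,b]} |(\alpha f(x))_1| \\
&\le (b - a)\sup_{x \in [a,b]} \|\alpha f(x)\| \\
&= (b - a)\sup_{x \in [a,b]} \|f(x)\|.
\end{aligned} \]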
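
The componentwise definition is easy to experiment with. The following minimal sketch is an illustration under stated assumptions, not part of the text: the helper names integrate_vector, norm and apply_linear, the midpoint rule, the sample function f(t) = (cos t, sin t) on [0, π] and the sample matrix alpha are all hypothetical choices. It approximates the integral of f component by component, then checks Lemma 8.5.2 and the bound of Theorem 8.5.3 on this example.

```python
import math

def integrate_vector(f, a, b, n=20000):
    """Componentwise midpoint-rule approximation to the integral of f : [a, b] -> R^m."""
    step = (b - a) / n
    m = len(f(a))
    total = [0.0] * m
    for i in range(n):
        value = f(a + (i + 0.5) * step)
        for j in range(m):
            total[j] += value[j]
    return [step * t for t in total]

def norm(v):
    """Euclidean norm of the vector v."""
    return math.sqrt(sum(c * c for c in v))

def apply_linear(matrix, v):
    """Apply the linear map with matrix (a_ij) to the vector v."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in matrix]

a, b = 0.0, math.pi
f = lambda t: [math.cos(t), math.sin(t)]        # ||f(t)|| = 1 for every t
alpha = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]]   # a linear map from R^2 to R^3

integral_f = integrate_vector(f, a, b)           # close to (0, 2)
integral_alpha_f = integrate_vector(lambda t: apply_linear(alpha, f(t)), a, b)
alpha_integral_f = apply_linear(alpha, integral_f)
bound = (b - a) * 1.0                            # (b - a) sup ||f(x)||
```

Here the norm of the integral is about 2, comfortably below the bound π, and integrating αf gives the same vector as applying α to the integral of f, up to rounding.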

Exercise 8.5.4. Justify each step in the chain of equalities and inequalities
which concluded the preceding proof.

Exercise 8.5.5. Show that the collection ℛ of Riemann integrable functions
f : [a, b] → ℝ^m forms a real vector space with the natural operations. If we
define
\[ Tf = \int_a^b f(x)\,dx \]
and ‖f‖_∞ = sup_{t∈[a,b]} ‖f(t)‖, show that T : ℛ → ℝ^m is a linear map and
‖Tf‖ ≤ (b − a)‖f‖_∞.
Chapter 9

Developments and limitations
of the Riemann integral ♥

9.1 Why go further?
Let us imagine a conversation in the 1880's between a mathematician opposed
to the ‘new rigour’ and a mathematician who supported it. The opponent
might claim that the definition of the Riemann integral given in section 8.2
was dull and gave rise to no new theorems. The supporter might say, as
this book does, that definitions are necessary in order that we know when
we have proved something and to understand what we have proved when we
have proved it. He would, however, have to admit both the dullness and the
lack of theorems. Both sides would regretfully agree that there was probably
little more to say about the matter.
Twenty years later, Lebesgue, building on work of Borel and others,
produced a radically new theory of integration. From the point of view
of Lebesgue's theory, Riemann integration has a profound weakness. We
saw in Lemma 8.2.11 and Exercise 8.2.14 that we cannot leave the class of
Riemann integrable functions if we only perform algebraic operations (for
example the product of two Riemann integrable functions is again Riemann
integrable). However we can leave the class of Riemann integrable functions
by performing limiting operations.

Exercise 9.1.1. Let f_n : [0, 1] → ℝ be defined by f_n(r2^{−n}) = 1 if r is an
integer with 0 ≤ r ≤ 2^n, f_n(x) = 0 otherwise.
(i) Show that f_n is Riemann integrable.
(ii) Show that there exists an f : [0, 1] → ℝ, which you should define
explicitly, such that f_n(x) → f(x) as n → ∞, for each x ∈ [0, 1].
(iii) Show, however, that f is not Riemann integrable.


[See also Exercise K.138.]
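
To get a feel for the functions f_n of Exercise 9.1.1, one can evaluate them exactly at rational points. The sketch below is only an experiment, using Python's Fraction type to avoid rounding; the function name f_n and the sample points 3/8 and 1/3 are illustrative choices.

```python
from fractions import Fraction

def f_n(n, x):
    """f_n(x) = 1 if x = r * 2**(-n) for an integer r with 0 <= r <= 2**n, else 0."""
    x = Fraction(x)
    return 1 if 0 <= x <= 1 and (x * 2 ** n).denominator == 1 else 0

# Values of f_0, ..., f_5 at two sample points.
values_dyadic = [f_n(n, Fraction(3, 8)) for n in range(6)]   # 0, 0, 0, 1, 1, 1
values_third = [f_n(n, Fraction(1, 3)) for n in range(6)]    # 0 for every n
```

At the point 3/8 the sequence is eventually constant at 1, while at 1/3 it is constantly 0, which suggests how the pointwise limit f should be described.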

The class of Lebesgue integrable functions includes every Riemann
integrable function but behaves much better when we perform limiting
operations. As an example, which does not give the whole picture but shows the
kind of result that can be obtained, contrast Exercise 9.1.1 with the following

Lemma 9.1.2. Let f_n : [a, b] → ℝ be a sequence of Lebesgue integrable
functions with |f_n(x)| ≤ M for all x ∈ [a, b] and all n. If f_n(x) → f(x) as
n → ∞ for each x ∈ [a, b], then f is Lebesgue integrable and
\[ \int_a^b f_n(x)\,dx \to \int_a^b f(x)\,dx. \]

It is important to realise that mathematicians prize the Lebesgue
integral, not because it integrates more functions (most functions that we meet
explicitly are Riemann integrable), but because it gives rise to beautiful
theorems and, at a deeper level, to beautiful theories way beyond the reach of
the Riemann integral.
Dieudonné dismisses the Riemann integral with scorn in [13],
Chapter VIII.

It may well be suspected that, had it not been for its
prestigious name, this [topic] would have been dropped long ago
[from elementary analysis courses], for (with due reverence to
Riemann's genius) it is certainly clear to any working mathematician
that nowadays such a ‘theory’ has at best the importance of
a mildly interesting exercise in the general theory of measure and
integration. Only the stubborn conservatism of academic tradition
could freeze it into a regular part of the curriculum, long
after it had outlived its historical importance.

Stubborn academic conservatives like the present writer would reply that,
as a matter of observation, many working mathematicians¹ do not use and
have never studied Lebesgue integration and its generalisation to measure
theory. Although measure theory is now essential for the study of all branches
of analysis and probability, it is not needed for most of number theory,
algebra, geometry and applied mathematics.

¹ Of course, it depends on who you consider to be a mathematician. A particular French
academic tradition begins by excluding all applied mathematicians, continues by excluding
all supporters of the foreign policy of the United States and ends by restricting the title
to pupils of the École Normale Supérieure.

It is frequently claimed that Lebesgue integration is as easy to teach as
Riemann integration. This is probably true, but I have yet to be convinced
that it is as easy to learn. Under these circumstances, it is reasonable to
introduce Riemann integration as an ad hoc tool to be replaced later by a
more powerful theory, if required. If we only have to walk 50 metres, it makes
no sense to buy a car.
On the other hand, as the distance to be traveled becomes longer, walking
becomes less attractive. We could walk from London to Cambridge but few
people wish to do so. This chapter contains a series of short sections showing
how the notion of the integral can be extended in various directions. I hope
that the reader will find them interesting and instructive but, for the reasons
just given, she should not invest too much time and effort in their contents
which, in many cases, can be given a more elegant, inclusive and efficient
exposition using measure theory.
I believe that, provided it is not taken too seriously, this chapter will be
useful to those who do not go on to do measure theory by showing that
the theory of integration is richer than most elementary treatments would
suggest and to those who will go on to do measure theory by opening their
minds to some of the issues involved.

9.2 Improper integrals ♥
We have defined Riemann integration for bounded functions on bounded
intervals. However, the reader will already have evaluated, as a matter of
routine, so called ‘improper integrals’² in the following manner
\[ \int_0^1 x^{-1/2}\,dx = \lim_{\epsilon \to 0+}\int_\epsilon^1 x^{-1/2}\,dx = \lim_{\epsilon \to 0+}\left[2x^{1/2}\right]_\epsilon^1 = 2, \]
\[ \int_1^\infty x^{-2}\,dx = \lim_{R \to \infty}\int_1^R x^{-2}\,dx = \lim_{R \to \infty}\left[-x^{-1}\right]_1^R = 1. \]

A full theoretical treatment of such integrals with the tools at our disposal
is apt to lead into a howling wilderness of ‘improper integrals of the first
kind’, ‘Cauchy principal values’ and so on. Instead, I shall give a few typical
theorems, definitions and counterexamples from which the reader should be
able to construct any theory that she needs to justify results in elementary
calculus.

² There is nothing particularly improper about improper integrals (at least, if they are
absolutely convergent, see page 211), but this is what they are traditionally called. Their
other traditional name ‘infinite integrals’ removes the imputation of moral obliquity but
is liable to cause confusion in other directions.

Definition 9.2.1. If f : [a, ∞) → ℝ is such that f|[a,X] ∈ ℛ[a, X] for each
X > a and ∫_a^X f(x) dx → L as X → ∞, then we say that ∫_a^∞ f(x) dx exists
with value L.
Lemma 9.2.2. Suppose f : [a, ∞) → ℝ is such that f|[a,X] ∈ ℛ[a, X] for
each X > a. If f(x) ≥ 0 for all x, then ∫_a^∞ f(x) dx exists if and only if there
exists a K such that ∫_a^X f(x) dx ≤ K for all X.
Proof. As usual we split the proof into two parts dealing with ‘if’ and ‘only
if’ separately.
Suppose first that ∫_a^∞ f(x) dx exists, that is to say ∫_a^X f(x) dx tends to
a limit as X → ∞. Let u_n = ∫_a^n f(x) dx when n is an integer with n ≥ a.
Since f is positive, u_n is an increasing sequence. Since u_n tends to a limit, it
must be bounded, that is to say, there exists a K such that u_n ≤ K for all
n ≥ a. If X ≥ a we choose an integer N ≥ X and observe that
\[ \int_a^X f(x)\,dx \le \int_a^N f(x)\,dx = u_N \le K \]
as required.
Suppose, conversely, that there exists a K such that ∫_a^X f(x) dx ≤ K for
all X ≥ a. Defining u_n = ∫_a^n f(x) dx as before, we observe that the u_n form
an increasing sequence bounded above by K. By the fundamental axiom it
follows that u_n tends to a limit L, say. In particular, given ε > 0, we can
find an n_0(ε) such that L − ε < u_n ≤ L for all n ≥ n_0(ε).
If X is any real number with X > n_0(ε) + 1, we can find an integer n
with n + 1 ≥ X > n. Since n ≥ n_0(ε), we have
\[ L - \epsilon < u_n \le \int_a^X f(x)\,dx \le u_{n+1} \le L \]
and |L − ∫_a^X f(x) dx| < ε. Thus ∫_a^X f(x) dx → L as X → ∞, as required.
Exercise 9.2.3. Show that ∫_0^n sin(2πx) dx tends to a limit as n → ∞ through
integer values, but ∫_0^X sin(2πx) dx does not tend to a limit as X → ∞.
We use Lemma 9.2.2 to prove the integral comparison test.
Lemma 9.2.4. Suppose f : [1, ∞) → ℝ is a decreasing continuous positive
function. Then Σ_{n=1}^∞ f(n) exists if and only if ∫_1^∞ f(x) dx does.

Just as with sums we sometimes say that ‘∫_1^∞ f(x) dx converges’ rather
than ‘∫_1^∞ f(x) dx exists’. The lemma then says ‘Σ_{n=1}^∞ f(n) converges if and
only if ∫_1^∞ f(x) dx does’.
The proof of Lemma 9.2.4 is set out in the next exercise.

Exercise 9.2.5. Suppose f : [1, ∞) → ℝ is a decreasing continuous positive
function.
(i) Show that
\[ f(n) \ge \int_n^{n+1} f(x)\,dx \ge f(n+1). \]

(ii) Deduce that
\[ \sum_{n=1}^{N} f(n) \ge \int_1^{N+1} f(x)\,dx \ge \sum_{n=2}^{N+1} f(n). \]

(iii) By using Lemma 9.2.2 and the corresponding result for sums, deduce
Lemma 9.2.4.

Exercise 9.2.6. (i) Use Lemma 9.2.4 to show that Σ_{n=1}^∞ n^{−α} converges if
α > 1 and diverges if α ≤ 1.
(ii) Use the inequality established in Exercise 9.2.5 to give a rough
estimate of the size of N required to give Σ_{n=1}^N n^{−1} > 100.
(iii) Use the methods just discussed to do Exercise 5.1.10.

Exercise 9.2.7. (Simple version of Stirling's formula.) The ideas of
Exercise 9.2.5 have many applications.
(i) Suppose g : [1, ∞) → ℝ is an increasing continuous positive function.
Obtain inequalities for g corresponding to those for f in parts (i) and (ii) of
Exercise 9.2.5.
(ii) By taking g(x) = log x in part (i), show that
\[ \log (N-1)! \le \int_1^N \log x\,dx \le \log N! \]

and use integration by parts to conclude that
\[ \log (N-1)! \le N\log N - N + 1 \le \log N!. \]

(iii) Show that log N! = N log N − N + θ(N)N where θ(N) → 0 as
N → ∞.
[A stronger result is proved in Exercise K.141.]
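
The inequalities of Exercise 9.2.7 can be checked numerically for particular N. The sketch below is an illustration only; it assumes that log N! may be computed as math.lgamma(N + 1), and the function names log_factorial, stirling_bounds and theta are choices made here.

```python
import math

def log_factorial(n):
    """log(n!) computed via the log-gamma function, since n! = Gamma(n + 1)."""
    return math.lgamma(n + 1)

def stirling_bounds(N):
    """Return (log (N-1)!, N log N - N + 1, log N!)."""
    return log_factorial(N - 1), N * math.log(N) - N + 1, log_factorial(N)

def theta(N):
    """theta(N) defined by log N! = N log N - N + theta(N) N."""
    return (log_factorial(N) - N * math.log(N) + N) / N

checks = [stirling_bounds(N) for N in (2, 10, 100, 1000)]
thetas = [theta(N) for N in (10, 100, 1000, 10000)]
```

For each N tried, the middle quantity lies between the two logarithms of factorials, and the computed values of θ(N) decrease towards 0, in line with part (iii).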

We have a result corresponding to Theorem 4.6.12.

Lemma 9.2.8. Suppose f : [a, ∞) → ℝ is such that f|[a,X] ∈ ℛ[a, X] for
each X > a. If ∫_a^∞ |f(x)| dx exists, then ∫_a^∞ f(x) dx exists.

It is natural to state Lemma 9.2.8 in the form ‘absolute convergence of
the integral implies convergence’.

Exercise 9.2.9. Prove Lemma 9.2.8 by using the argument of Exercise 4.6.14 (i).

Exercise 9.2.10. Prove the following general principle of convergence for
integrals.
Suppose f : [a, ∞) → ℝ is such that f|[a,X] ∈ ℛ[a, X] for each X > a.
Show that ∫_a^∞ f(x) dx exists if and only if, given any ε > 0, we can find an
X_0(ε) > a such that
\[ \left|\int_X^Y f(x)\,dx\right| < \epsilon \]
whenever Y ≥ X ≥ X_0(ε).

Exercise 9.2.11. (i) Following the ideas of this section and Section 8.5,
provide the appropriate definition of ∫_a^∞ f(x) dx for a function f : [a, ∞) →
ℝ^m.
(ii) By taking components and using Exercise 9.2.10, or otherwise, prove
a general principle of convergence for such integrals.
(iii) Use part (ii) and the method of proof of Theorem 4.6.12 to prove the
following generalisation of Lemma 9.2.8.
Suppose f : [a, ∞) → ℝ^m is such that f|[a,X] ∈ ℛ[a, X] for each X > a. If
∫_a^∞ ‖f(x)‖ dx exists then ∫_a^∞ f(x) dx exists.

Exercise 9.2.12. Suppose f : [a, b) → ℝ is such that f|[a,c] ∈ ℛ[a, c] for
each a < c < b. Produce a definition along the lines of Definition 9.2.1 of
what it should mean for ∫_a^b f(x) dx to exist with value L.
State and prove results analogous to Lemma 9.2.2 and Lemma 9.2.8.

Additional problems arise when there are two limits involved.

Example 9.2.13. If λ, μ > 0 then
\[ \int_{-\mu R}^{\lambda R}\frac{x}{1 + x^2}\,dx \to \log\frac{\lambda}{\mu} \]
as R → ∞.

Proof. Direct calculation, which is left to the reader.
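
The dependence of the limit on how the two ends go to infinity can be seen numerically. This minimal sketch assumes that the limits of integration in Example 9.2.13 are −μR and λR, and uses the antiderivative (1/2) log(1 + x²) of x/(1 + x²); the function name integral and the sample values λ = 3, μ = 2 are illustrative.

```python
import math

def integral(lam, mu, R):
    """Integral of x/(1 + x^2) from -mu*R to lam*R, via the antiderivative (1/2) log(1 + x^2)."""
    return 0.5 * (math.log1p((lam * R) ** 2) - math.log1p((mu * R) ** 2))

lam, mu = 3.0, 2.0
values = [integral(lam, mu, R) for R in (1e1, 1e3, 1e6)]
limit = math.log(lam / mu)
symmetric = integral(1.0, 1.0, 1e6)    # lam = mu: the symmetric range gives 0
```

With λ = 3 and μ = 2 the values approach log(3/2), while the symmetric choice λ = μ gives 0 for every R, so different ways of letting the two ends tend to infinity produce different limits.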
A pure mathematician gets round this problem by making a definition along
these lines.
Definition 9.2.14. If f : ℝ → ℝ is such that f|[−X,Y] ∈ ℛ[−X, Y] for each
X, Y > 0, then ∫_{−∞}^∞ f(x) dx exists with value L if and only if the following
condition holds. Given ε > 0 we can find an X_0(ε) > 0 such that
\[ \left|\int_{-X}^{Y} f(x)\,dx - L\right| < \epsilon \]
for all X, Y > X_0(ε).
Exercise 9.2.15. Let f : ℝ → ℝ be such that f|[−X,Y] ∈ ℛ[−X, Y] for
each X, Y > 0. Show that ∫_{−∞}^∞ f(x) dx exists if and only if
\[ \int_0^\infty f(x)\,dx = \lim_{R \to \infty}\int_0^R f(x)\,dx \quad\text{and}\quad \int_{-\infty}^0 f(x)\,dx = \lim_{S \to \infty}\int_{-S}^0 f(x)\,dx \]
exist. If the integrals exist, show that
\[ \int_{-\infty}^{\infty} f(x)\,dx = \int_{-\infty}^{0} f(x)\,dx + \int_0^{\infty} f(x)\,dx. \]

The physicist gets round the problem by ignoring it. If she is a real
physicist with correct physical intuition this works splendidly³ but if not,
not.
Speaking broadly, improper integrals ∫_E f(x) dx work well when they are
absolutely convergent, that is to say, ∫_E |f(x)| dx < ∞, but are full of traps
for the unwary otherwise. This is not a weakness of the Riemann integral but
inherent in any mathematical situation where an object only exists ‘by virtue
of the cancellation of two infinite objects’. (Recall Littlewood's example on
page 81.)
Example 9.2.16. Suppose we define the PV (principal value) integral by
\[ PV\int_{-\infty}^{\infty} f(x)\,dx = \lim_{R \to \infty}\int_{-R}^{R} f(x)\,dx \]
whenever the right hand side exists. Show, by considering Example 9.2.13, or
otherwise, that the standard rule for change of variables fails for PV integrals.

³ In [8], Boas reports the story of a friend visiting the Princeton common room ‘. . .
where Einstein was talking to another man, who would shake his head and stop him;
Einstein then thought for a while, then started talking again; was stopped again; and so
on. After a while, . . . my friend was introduced to Einstein. He asked Einstein who the
other man was. “Oh,” said Einstein, “that's my mathematician.” ’

9.3 Integrals over areas ♥
At first sight, the extension of the idea of Riemann integration from functions
defined on ℝ to functions defined on ℝ^n looks like child's play. We shall do
the case n = 2 since the general case is a trivial extension.
Let R = [a, b] × [c, d] and consider f : R → ℝ such that there exists a K
with |f(x)| ≤ K for all x ∈ R. We define a dissection D of R to be a finite
collection of rectangles I_j = [a_j, b_j] × [c_j, d_j] [1 ≤ j ≤ N] such that
(i) ∪_{j=1}^N I_j = R,
(ii) I_i ∩ I_j is either empty or consists of a segment of a straight line
[1 ≤ j < i ≤ N].
If D = {I_j : 1 ≤ j ≤ N} and D' = {I'_k : 1 ≤ k ≤ N'} are dissections we
write D ∧ D' for the set of non-empty rectangles of the form I_j ∩ I'_k. If every
I'_k ∈ D' is contained in some I_j ∈ D we write D' ≻ D.
We define the upper sum and lower sum associated with D by
\[ S(f, D) = \sum_{j=1}^{N} |I_j| \sup_{x \in I_j} f(x), \]
\[ s(f, D) = \sum_{j=1}^{N} |I_j| \inf_{x \in I_j} f(x) \]
where |I_j| = (b_j − a_j)(d_j − c_j), the area of I_j.
Exercise 9.3.1. (i) Suppose that D and D' are dissections with D' ≻ D.
Show, using the method of Exercise 8.2.1, or otherwise, that

S(f, D) ≥ S(f, D') ≥ s(f, D') ≥ s(f, D).

(ii) State and prove a result corresponding to Lemma 8.2.3.
(iii) Explain how this enables us to define upper and lower integrals and
hence complete the definition of Riemann integration. We write the integral
\[ \int_R f(x)\,dA \]
when it exists.
(iv) Develop the theory of Riemann integration on R as far as you can.
(You should be able to obtain results like those in Section 8.2 as far as the
end of Exercise 8.2.15.) You should prove that if f is continuous on R then
it is Riemann integrable.
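
The upper and lower sums just defined are easy to compute for a simple example. The following minimal sketch takes f(x, y) = xy on [0, 1] × [0, 1] with the n-by-n grid of equal squares as dissection; the function name upper_lower_sums and this choice of f are illustrative assumptions. Because this f is increasing in each variable on the square, the sup and inf over a grid cell are attained at corners, which is what the code relies on (it would not be correct for a general f).

```python
def upper_lower_sums(n):
    """Upper and lower sums of f(x, y) = x*y for the n-by-n grid dissection of [0,1] x [0,1].

    Since f is increasing in each variable on the square, its sup over a
    grid cell is attained at the top right corner and its inf at the
    bottom left corner.
    """
    f = lambda x, y: x * y
    area = (1.0 / n) ** 2                    # |I_j| for every cell
    upper = lower = 0.0
    for i in range(n):
        for j in range(n):
            upper += area * f((i + 1) / n, (j + 1) / n)
            lower += area * f(i / n, j / n)
    return upper, lower

# Each grid in this list refines the previous one.
sums = {n: upper_lower_sums(n) for n in (2, 4, 8, 64)}
```

As the grid is refined the upper sums decrease and the lower sums increase, squeezing the value 1/4 of the integral, in line with part (i) of Exercise 9.3.1.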

We can do rather more than just prove the existence of
\[ \int_R f(x)\,dA \]
when f is continuous on the rectangle R.
Theorem 9.3.2. (Fubini's theorem for continuous functions.) Let
R = [a, b] × [c, d]. If f : R → ℝ is continuous, then the functions F_1 : [a, b] →
ℝ and F_2 : [c, d] → ℝ defined by
\[ F_1(x) = \int_c^d f(x, s)\,ds \quad\text{and}\quad F_2(y) = \int_a^b f(t, y)\,dt \]
are continuous and
\[ \int_a^b F_1(x)\,dx = \int_c^d F_2(y)\,dy = \int_R f(x)\,dA. \]

This result is more usually written as
\[ \int_a^b\left(\int_c^d f(x, y)\,dy\right)dx = \int_c^d\left(\int_a^b f(x, y)\,dx\right)dy = \int_R f(x)\,dA, \]
or, simply,
\[ \int_a^b\int_c^d f(x, y)\,dy\,dx = \int_c^d\int_a^b f(x, y)\,dx\,dy = \int_{[a,b]\times[c,d]} f(x)\,dA. \]

(See also Exercises K.152, K.154 and K.155.)
We prove Theorem 9.3.2 in two exercises.
We prove Theorem 9.3.2 in two exercises.
Exercise 9.3.3. (We use the notation of Theorem 9.3.2.) If |f(x, s) −
f(w, s)| ≤ ε for all s ∈ [c, d] show that |F_1(x) − F_1(w)| ≤ ε(d − c). Use
the uniform continuity of f to conclude that F_1 is continuous.
For the next exercise we recall the notion of an indicator function I_E for a
set E. If E ⊆ R, then I_E : R → ℝ is defined by I_E(a) = 1 if a ∈ E, I_E(a) = 0
otherwise.
Exercise 9.3.4. We use the notation of Theorem 9.3.2. In this exercise
interval will mean open, half open or closed interval (that is, intervals of the
form (α, β), [α, β), (α, β] or [α, β]) and rectangle will mean the product of
two intervals. We say that g satisfies the Fubini condition if
\[ \int_a^b\left(\int_c^d g(x, y)\,dy\right)dx = \int_c^d\left(\int_a^b g(x, y)\,dx\right)dy = \int_R g(x)\,dA. \]

(i) Show that, given ε > 0, we can find rectangles R_j ⊆ R and λ_j ∈ ℝ
such that, writing
\[ H = \sum_{j=1}^{N} \lambda_j I_{R_j}, \]
we have H(x) − ε ≤ f(x) ≤ H(x) + ε for all x ∈ R.
(ii) Show by direct calculation that I_B satisfies the Fubini condition
whenever B is a rectangle. Deduce that H satisfies the Fubini condition and use
(i) (carefully) to show that f does.
All this looks very satisfactory, but our treatment hides a problem. If we
look at how mathematicians actually use integrals we find that they want
to integrate over sets which are more complicated than rectangles with sides
parallel to coordinate axes. (Indeed one of the guiding principles of this book
is that coordinate axes should not have a special role.) If you have studied
mathematical methods you will have come across the formula for change of
variables⁴
\[ \iint_{E'} f(u, v)\,du\,dv = \iint_{E} f(u(x, y), v(x, y))\left|\frac{\partial(u, v)}{\partial(x, y)}\right| dx\,dy, \]
where

E' = {(u(x, y), v(x, y)) : (x, y) ∈ E}.

Even if you do not recognise the formula, you should see easily that any
change of variable formula will involve changing not only the integrand but
the set over which we integrate.
It is not hard to come up with an appropriate definition for integrals over
a set E.
Definition 9.3.5. Let E be a bounded set and f : E → ℝ a bounded
function. Choose a < b and c < d such that R = [a, b] × [c, d] contains E and
define f̃ : R → ℝ by f̃(x) = f(x) if x ∈ E, f̃(x) = 0 otherwise. If ∫_R f̃(x) dA
exists, we say that ∫_E f(x) dA exists and
\[ \int_E f(x)\,dA = \int_R \tilde{f}(x)\,dA. \]

Exercise 9.3.6. Explain briefly why the definition is independent of the
choice of R.

⁴ This formula is included as a memory jogger only. It would require substantial
supporting discussion to explain the underlying conventions and assumptions.

The most important consequence of this definition is laid bare in the next
exercise.
Exercise 9.3.7. Let R = [a, b] × [c, d] and E ⊆ R. Let ℛ be the set of
functions f : R → ℝ which are Riemann integrable. Then $\int_E f(x)\,dA$ exists
for all f ∈ ℛ if and only if I_E ∈ ℛ.
If we think about the meaning of $\int_R I_E(x)\,dA$ we are led to the following
definition⁵.
Definition 9.3.8. A bounded set E in ℝ² has Riemann area $\int_E 1\,dA$ if that
integral exists.
Recall that, if R = [a, b] × [c, d], we write |R| = (b − a)(d − c).
Exercise 9.3.9. Show that a bounded set E has Riemann area |E| if and
only if, given any ε > 0, we can find disjoint rectangles R_i = [a_i, b_i] × [c_i, d_i]
[1 ≤ i ≤ N] and (not necessarily disjoint) rectangles R′_j = [a′_j, b′_j] × [c′_j, d′_j]
[1 ≤ j ≤ M] such that

\[ \bigcup_{i=1}^{N} R_i \subseteq E \subseteq \bigcup_{j=1}^{M} R'_j, \quad \sum_{i=1}^{N} |R_i| \ge |E| - \varepsilon \quad \text{and} \quad \sum_{j=1}^{M} |R'_j| \le |E| + \varepsilon. \]
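
The criterion of Exercise 9.3.9 can be tried out numerically. The following Python sketch is an illustration only and is not taken from the text: the choice of the closed unit disc for E, the square grid and the name disc_area_bounds are all assumptions. It adds up the areas of the grid squares lying inside E (which overlap only along edges) and of the grid squares meeting E, so that the two totals trap the area between them.

```python
import math

def disc_area_bounds(n):
    """Inner and outer area sums for the closed unit disc, using an
    n-by-n grid of closed squares of side h = 2/n on [-1, 1] x [-1, 1]."""
    h = 2.0 / n
    inner = 0  # number of squares lying entirely inside the disc
    outer = 0  # number of squares meeting the disc
    for i in range(n):
        x0, x1 = -1.0 + i * h, -1.0 + (i + 1) * h
        for j in range(n):
            y0, y1 = -1.0 + j * h, -1.0 + (j + 1) * h
            # The point of the square farthest from the origin is a corner.
            far2 = max(x0 * x0, x1 * x1) + max(y0 * y0, y1 * y1)
            # The nearest point is the origin clamped into the square.
            nx = min(max(0.0, x0), x1)
            ny = min(max(0.0, y0), y1)
            near2 = nx * nx + ny * ny
            if far2 <= 1.0:
                inner += 1
            if near2 <= 1.0:
                outer += 1
    return inner * h * h, outer * h * h

inner_area, outer_area = disc_area_bounds(200)
```

As n grows the two totals close in on each other, which is the behaviour the exercise asks for; the sketch proves nothing, but it shows what the rectangles R_i and R′_j might look like in a simple case.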

Exercise 9.3.10. Show that, if E has Riemann area and f is defined and
Riemann integrable on some rectangle R = [a, b] × [c, d] containing E, then
$\int_E f(x)\,dA$ exists and

\[ \left| \int_E f(x)\,dA \right| \le \sup_{x \in E} |f(x)|\,|E|. \]

In other words
size of integral ≤ area × sup.
Our discussion tells us that in order to talk about

\[ \int_E f(x)\,dA \]

we need to know not only that f is well behaved (Riemann integrable) but
that E is well behaved (has Riemann area). Just as the functions f which
occur in ‘first mathematical methods’ courses are Riemann integrable, so the
sets E which appear in such courses have Riemann area, though the process
of showing this may be tedious.
⁵ Like most of the rest of this chapter, this is not meant to be taken too seriously. What
we call ‘Riemann area’ is traditionally called ‘content’. The theory of content is pretty
but was rendered obsolete by the theory of measure.

Exercise 9.3.11. (The reader may wish to think about how to do this exercise
without actually writing down all the details.)
(i) Show that a rectangle whose sides are not necessarily parallel to the
axes has Riemann area and that this area is what we expect.
(ii) Show that a triangle has Riemann area and that this area is what we
expect.
(iii) Show that a polygon has Riemann area and that this area is what
we expect. (Of course, the answer is to cut it up into a finite number of
triangles, but can this always be done?)
However, if we want to go further, it becomes rather hard to decide which
sets are nice and which are not. The problem is already present in the
one-dimensional case, but hidden by our insistence on only integrating over
intervals.
Definition 9.3.12. A bounded set E in ℝ has Riemann length if, taking any
[a, b] ⊇ E, we have I_E ∈ ℛ([a, b]). We say then that E has Riemann length

\[ |E| = \int_a^b I_E(t)\,dt. \]

Exercise 9.3.13. (i) Explain why the definition just given is independent of
the choice of [a, b].
(ii) Show that

I_{A∪B} = I_A + I_B − I_A I_B.

Hence show that, if A and B have Riemann length, so does A ∪ B. Prove
similar results for A ∩ B and A \ B.
(iii) By reinterpreting Exercise 9.1.1 show that we can find A_n ⊆ [0, 1]
such that A_n has Riemann length for each n but $\bigcup_{n=1}^{\infty} A_n$ does not.
(iv) Obtain results like (ii) and (iii) for Riemann area.
It also turns out that the kind of sets we have begun to think of as nice,
that is open and closed sets, need not have Riemann area.
Lemma 9.3.14. There exist bounded closed and open sets in ℝ which do not
have Riemann length. There exist bounded closed and open sets in ℝ² which
do not have Riemann area.
The proof of this result is a little complicated so we have relegated it to
Exercise K.156.
Any belief we may have that we have a ‘natural feeling’ for how area
behaves under complicated maps is finally removed by an example of Peano.

Theorem 9.3.15. There exists a continuous surjective map f : [0, 1] →
[0, 1] × [0, 1].
Thus there exists a curve which passes through every point of a square!
A proof depending on the notion of uniform convergence is given in
Exercise K.224.
Fortunately all these difficulties vanish like early morning mist in the light
of Lebesgue’s theory.

The Riemann-Stieltjes integral ♥
In this section we discuss a remarkable extension of the notion of integral due
to Stieltjes. The reader should find the discussion gives an excellent revision
of many of the ideas of Chapter 8.
Before doing so, we must dispose of a technical point. When authors talk
about the Heaviside step function H : ℝ → ℝ they all agree that H(t) = 0
for t < 0 and H(t) = 1 for t > 0. However, some take H(0) = 0, some take
H(0) = 1 and some take H(0) = 1/2. Usually this does not matter but it is
helpful to have consistency.
Definition 9.4.1. Let E ⊆ ℝ. We say that a function f : E → ℝ is a right
continuous function if, for all x ∈ E, f(t) → f(x) whenever t → x through
values of t ∈ E with t > x.
Exercise 9.4.2. Which definition of the Heaviside step function makes H
right continuous?
In the discussion that follows, G : ℝ → ℝ will be a right continuous
increasing function. (Exercise K.158 sheds some light on the nature of such
functions, but is not needed for our discussion.) We assume further that
there exist A and B with G(t) → A as t → −∞ and G(t) → B as t → ∞.
Exercise 9.4.3. If F : ℝ → ℝ is an increasing function show that the following
two statements are equivalent.
(i) F is bounded.
(ii) F(t) tends to (finite) limits as t → −∞ and as t → ∞.
We shall say that any finite set D containing at least two points is a
dissection of ℝ. By convention we write

D = {x_0, x_1, . . . , x_n} with x_0 < x_1 < x_2 < · · · < x_n.

(Note that we now demand that the x_j are distinct.)

Now suppose f : ℝ → ℝ is a bounded function. We define the upper
Stieltjes G sum of f associated with D by

\[ S_G(f, D) = (G(x_0) - A) \sup_{t \le x_0} f(t) + \sum_{j=1}^{n} (G(x_j) - G(x_{j-1})) \sup_{t \in (x_{j-1}, x_j]} f(t) + (B - G(x_n)) \sup_{t > x_n} f(t). \]
(Note that we use half open intervals, since we have to be more careful about
overlap than when we dealt with Riemann integration.)
Exercise 9.4.4. (i) Define the lower Stieltjes G sum s_G(f, D) in the appropriate
way.
(ii) Show that, if D and D′ are dissections of ℝ, then S_G(f, D) ≥ s_G(f, D′).
(iii) Define the upper Stieltjes G integral by I^*(G, f) = inf_D S_G(f, D).
Give a similar definition for the lower Stieltjes G integral I_*(G, f) and show
that I^*(G, f) ≥ I_*(G, f).
If I^*(G, f) = I_*(G, f), we say that f is Riemann-Stieltjes integrable with
respect to G and we write

\[ \int_{\mathbb{R}} f(x)\,dG(x) = I^*(G, f). \]
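
To see the upper and lower sums at work, here is a minimal Python sketch under assumptions that are ours rather than the text's. We take f continuous and increasing, so that each supremum and infimum in the sums is an end point value or a limit at infinity, and we choose f = tanh and G the logistic function purely for illustration. The names stieltjes_sums, logistic and dissection are hypothetical.

```python
import math

def stieltjes_sums(f, f_lo, f_hi, G, A, B, xs):
    """Upper and lower Stieltjes G sums for a continuous increasing f.

    f_lo, f_hi: limits of f at -infinity and +infinity.
    A, B: limits of G at -infinity and +infinity.
    xs: the dissection x_0 < x_1 < ... < x_n.
    Because f is increasing and continuous, its sup on (x_{j-1}, x_j]
    is f(x_j) and its inf there is f(x_{j-1}).
    """
    upper = (G(xs[0]) - A) * f(xs[0]) + (B - G(xs[-1])) * f_hi
    lower = (G(xs[0]) - A) * f_lo + (B - G(xs[-1])) * f(xs[-1])
    for left, right in zip(xs, xs[1:]):
        weight = G(right) - G(left)
        upper += weight * f(right)
        lower += weight * f(left)
    return upper, lower

def logistic(t):
    """An increasing continuous G with A = 0 and B = 1."""
    return 1.0 / (1.0 + math.exp(-t))

def dissection(radius, n):
    """n + 1 equally spaced points on [-radius, radius]."""
    return [-radius + 2.0 * radius * k / n for k in range(n + 1)]

coarse = stieltjes_sums(math.tanh, -1.0, 1.0, logistic, 0.0, 1.0,
                        dissection(5.0, 10))
fine = stieltjes_sums(math.tanh, -1.0, 1.0, logistic, 0.0, 1.0,
                      dissection(20.0, 4000))
```

With these choices the integrand is odd and G(t) − 1/2 is odd, so we expect the common value of the upper and lower integrals to be 0. The finer dissection should give a pair of sums that trap 0 much more tightly than the coarse one.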

Exercise 9.4.5. (i) State and prove a criterion for Riemann-Stieltjes integrability
along the lines of Lemma 8.2.6.
(ii) Show that the set ℛ_G of functions which are Riemann-Stieltjes integrable
with respect to G forms a vector space and the integral is a linear
functional (i.e. a linear map from ℛ_G to ℝ).
(iii) Suppose that f : ℝ → ℝ is Riemann-Stieltjes integrable with respect
to G, that K ∈ ℝ and |f(t)| ≤ K for all t ∈ ℝ. Show that

\[ \left| \int_{\mathbb{R}} f(x)\,dG(x) \right| \le K(B - A). \]

(iv) Show that, if f, g : ℝ → ℝ are Riemann-Stieltjes integrable with
respect to G, so is fg (the product of f and g).
(v) If f : ℝ → ℝ is Riemann-Stieltjes integrable with respect to G, show
that |f| is also and that

\[ \int_{\mathbb{R}} |f(x)|\,dG(x) \ge \left| \int_{\mathbb{R}} f(x)\,dG(x) \right|. \]

(vi) Prove that, if f : ℝ → ℝ is a bounded continuous function, then f is
Riemann-Stieltjes integrable with respect to G. [Hint: Use the fact that f is
uniformly continuous on any [−R, R]. Choose R sufficiently large.]

The next result is more novel, although its proof is routine (it resembles
that of Exercise 9.4.5 (ii)).
Exercise 9.4.6. Suppose that F, G : ℝ → ℝ are right continuous increasing
bounded functions and λ, μ ≥ 0. Show that, if f : ℝ → ℝ is Riemann-Stieltjes
integrable with respect to both F and G, then f is Riemann-Stieltjes
integrable with respect to λF + μG and

\[ \int_{\mathbb{R}} f(x)\,d(\lambda F + \mu G)(x) = \lambda \int_{\mathbb{R}} f(x)\,dF(x) + \mu \int_{\mathbb{R}} f(x)\,dG(x). \]

Exercise 9.4.7. (i) If a ∈ ℝ, show, by choosing appropriate dissections, that
I_{(−∞,a]} is Riemann-Stieltjes integrable with respect to G and

\[ \int_{\mathbb{R}} I_{(-\infty,a]}(x)\,dG(x) = G(a) - A. \]

(ii) If a ∈ ℝ, show that I_{(−∞,a)} is Riemann-Stieltjes integrable with respect
to G if and only if G is continuous at a. If G is continuous at a show that

\[ \int_{\mathbb{R}} I_{(-\infty,a)}(x)\,dG(x) = G(a) - A. \]

(iii) If a < b, show that I_{(a,b]} is Riemann-Stieltjes integrable with respect
to G and

\[ \int_{\mathbb{R}} I_{(a,b]}(x)\,dG(x) = G(b) - G(a). \]

(iv) If a < b, show that I_{(a,b)} is Riemann-Stieltjes integrable with respect
to G if and only if G is continuous at b.
Combining the results of Exercise 9.4.7 with Exercise 9.4.5, we see that,
if f is Riemann-Stieltjes integrable with respect to G, we may define

\[ \int_{(a,b]} f(x)\,dG(x) = \int_{\mathbb{R}} I_{(a,b]}(x) f(x)\,dG(x) \]

and make similar definitions for integrals like $\int_{(-\infty,a]} f(x)\,dG(x)$.

Exercise 9.4.8. Show that, if G is continuous and f is Riemann-Stieltjes
integrable with respect to G, then we can define $\int_{[a,b]} f(x)\,dG(x)$ and that

\[ \int_{(a,b]} f(x)\,dG(x) = \int_{[a,b]} f(x)\,dG(x). \]

Remark: When we discussed Riemann integration, I said that, in mathe-
matical practice, it was unusual to come across a function that was Lebesgue
integrable but not Riemann integrable. In Exercise 9.4.7 (iv) we saw that the
function I(a,b) , which we come across very frequently in mathematical prac-
tice, is not Riemann-Stieltjes integrable with respect to any right continuous
increasing function G which has a discontinuity at b. In the Lebesgue-Stieltjes
theory, I(a,b) is always Lebesgue-Stieltjes integrable with respect to G. (Ex-
ercise K.161 extends Exercise 9.4.7 a little.)
The next result has an analogous proof to the fundamental theorem of
the calculus (Theorem 8.3.6).
Exercise 9.4.9. Suppose that G : ℝ → ℝ is an increasing function with continuous
derivative. Suppose further that f : ℝ → ℝ is a bounded continuous
function. If we set

\[ I(t) = \int_{(-\infty,t]} f(x)\,dG(x), \]

show that I is differentiable and I′(t) = f(t)G′(t) for all t ∈ ℝ.
Using the mean value theorem, in the form which states that the only
function with derivative 0 is a constant, we get the following result.
Exercise 9.4.10. Suppose that G : ℝ → ℝ is an increasing function with
continuous derivative. If f : ℝ → ℝ is a bounded continuous function, show
that

\[ \int_{(a,b]} f(x)\,dG(x) = \int_a^b f(x) G'(x)\,dx. \]

Show also that

\[ \int_{\mathbb{R}} f(x)\,dG(x) = \int_{-\infty}^{\infty} f(x) G'(x)\,dx, \]

explaining carefully the meaning of the right hand side of the equation.
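
A numerical check may make the first formula of Exercise 9.4.10 more concrete. In the following Python sketch everything specific is an assumption made for illustration: G is the logistic function (increasing, bounded, with continuous derivative G(1 − G)), f is the cosine, the interval is (−1, 2], and the function names are hypothetical. We compare a tagged sum of f(x_j)(G(x_j) − G(x_{j−1})) with a midpoint Riemann sum for f G′.

```python
import math

def G(t):
    """A smooth increasing bounded function (the logistic function)."""
    return 1.0 / (1.0 + math.exp(-t))

def G_prime(t):
    """Derivative of G, namely G(t)(1 - G(t))."""
    g = G(t)
    return g * (1.0 - g)

def stieltjes_sum(f, a, b, n):
    """Sum of f(x_j)(G(x_j) - G(x_{j-1})) over n equal steps of (a, b]."""
    h = (b - a) / n
    return sum(f(a + j * h) * (G(a + j * h) - G(a + (j - 1) * h))
               for j in range(1, n + 1))

def riemann_sum(f, a, b, n):
    """Midpoint Riemann sum for the integral of f(x) G'(x) over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (j - 0.5) * h) * G_prime(a + (j - 0.5) * h) * h
               for j in range(1, n + 1))

lhs = stieltjes_sum(math.cos, -1.0, 2.0, 20000)
rhs = riemann_sum(math.cos, -1.0, 2.0, 20000)
```

The two numbers lhs and rhs should agree to several decimal places. Agreement of this kind is evidence, not proof; the exercise asks for the argument via Exercise 9.4.9 and the mean value theorem.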
However, there is no reason why we should restrict ourselves even to
continuous functions when considering Riemann-Stieltjes integration.
Exercise 9.4.11. (i) If c ∈ ℝ, define H_c : ℝ → ℝ by H_c(t) = 0 if t < c,
H_c(t) = 1 if t ≥ c. Show, by finding appropriate dissections, that, if f : ℝ → ℝ
is a bounded continuous function, we have

\[ \int_{(a,b]} f(x)\,dH_c(x) = f(c) \]

when c ∈ (a, b]. What happens if c ∉ (a, b]?
(ii) If a < c_1 < c_2 < · · · < c_m < b and λ_1, λ_2, . . . , λ_m ≥ 0, find a right
continuous function G : [a, b] → ℝ such that, if f : (a, b] → ℝ is a bounded
continuous function, we have

\[ \int_{(a,b]} f(x)\,dG(x) = \sum_{j=1}^{m} \lambda_j f(c_j). \]
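
The following Python sketch illustrates the point mass behaviour numerically. It assumes, as one natural candidate for part (ii) rather than as the text's own answer, that G is a sum of scaled step functions λ_j H_{c_j}. The particular masses, the choice f = cos and the function names are hypothetical.

```python
import math

def step_sum(masses):
    """Return G(t) = sum of lam * H_c(t), where H_c(t) = 1 for t >= c."""
    def G(t):
        return sum(lam for c, lam in masses if t >= c)
    return G

def stieltjes_sum(f, G, a, b, n):
    """Sum of f(x_k)(G(x_k) - G(x_{k-1})) over n equal steps of (a, b]."""
    h = (b - a) / n
    total = 0.0
    for k in range(1, n + 1):
        left, right = a + (k - 1) * h, a + k * h
        total += f(right) * (G(right) - G(left))
    return total

# Hypothetical pairs (c_j, lambda_j) with 0 < c_1 < c_2 < c_3 < 2.
masses = [(0.3, 2.0), (1.1, 0.5), (1.7, 1.25)]
G = step_sum(masses)
approx = stieltjes_sum(math.cos, G, 0.0, 2.0, 100000)
exact = sum(lam * math.cos(c) for c, lam in masses)
```

Only the intervals (x_{k−1}, x_k] containing some c_j contribute to the sum, and each contributes λ_j times a value of f taken close to c_j, so approx should be close to exact.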

Exercise 9.4.11 shows that Riemann-Stieltjes integration provides a framework
in which point masses may be considered along with continuous densities⁶.
The reader may agree with this but still doubt the usefulness of the
Riemann-Stieltjes point of view. The following discussion may help change her mind.
What is a real-valued random variable? It is rather hard to give a proper
mathematical definition with the mathematical apparatus available in 1880⁷.
However any real-valued random variable X is associated with a function

P(x) = Pr{X ≤ x}.

Exercise 9.4.12. Convince yourself that P : ℝ → ℝ is a right continuous
increasing function with P(t) → 0 as t → −∞ and P(t) → 1 as t → ∞.
(Note that, as we have no proper definitions, we can give no proper proofs.)

Even if we have no definition of a random variable, we do have a definition
of a Riemann-Stieltjes integral. So, in a typical mathematician’s trick, we
turn everything upside down.
Suppose P : ℝ → ℝ is a right continuous increasing function with P(t) →
0 as t → −∞ and P(t) → 1 as t → ∞. We say that P is associated with a
real-valued random variable X if

\[ \Pr\{X \in E\} = \int_{\mathbb{R}} I_E(x)\,dP(x) \]

when I_E is Riemann-Stieltjes integrable with respect to P. (Thus, for example,
E could be (−∞, a] or (a, b].) If the reader chooses to read Pr{X ∈ E}
as ‘the probability that X ∈ E’ that is up to her. So far as we are concerned,
Pr{X ∈ E} is an abbreviation for $\int_{\mathbb{R}} I_E(x)\,dP(x)$.
⁶ Note that, although we have justified the concept of a ‘delta function’, we have not
justified the concept of ‘the derivative of the delta function’. This requires a further
generalisation of our point of view to that of distributions.
⁷ The Holy Roman Empire was neither holy nor Roman nor an empire. A random
variable is neither random nor a variable.

In the same way we define the expectation Ef(X) by

\[ Ef(X) = \int_{\mathbb{R}} f(x)\,dP(x) \]

when f is Riemann-Stieltjes integrable with respect to P. The utility of
this definition is greatly increased if we allow improper Riemann-Stieltjes
integrals somewhat along the lines of Definition 9.2.14.

Definition 9.4.13. Let G be as throughout this section. If f : ℝ → ℝ, and
R, S > 0 we define f_{RS} : ℝ → ℝ by

\[ f_{RS}(t) = \begin{cases} f(t) & \text{if } R \ge f(t) \ge -S, \\ -S & \text{if } -S > f(t), \\ R & \text{if } f(t) > R. \end{cases} \]

If f_{RS} is Riemann-Stieltjes integrable with respect to G for all R, S > 0, and
we can find an L such that, given ε > 0, we can find an R_0(ε) > 0 such that

\[ \left| \int_{\mathbb{R}} f_{RS}(x)\,dG(x) - L \right| < \varepsilon \]

for all R, S > R_0(ε), then we say that f is Riemann-Stieltjes integrable with
respect to G with Riemann-Stieltjes integral

\[ \int_{\mathbb{R}} f(x)\,dG(x) = L. \]

(As before, we add a warning that care must be exercised if $\int_{\mathbb{R}} |f(x)|\,dG(x)$
fails to converge.)

Lemma 9.4.14. (Tchebychev’s inequality.) If P is associated with a
real-valued random variable X and EX² exists then, for all a > 0,

\[ \Pr\{X > a \text{ or } -a \ge X\} \le \frac{EX^2}{a^2}. \]

Proof. Observe that

\[ x^2 \ge a^2 I_{\mathbb{R} \setminus (-a,a]}(x) \]

for all x and so

\[ \int_{\mathbb{R}} x^2\,dP(x) \ge \int_{\mathbb{R}} a^2 I_{\mathbb{R} \setminus (-a,a]}(x)\,dP(x). \]

Thus

\[ \int_{\mathbb{R}} x^2\,dP(x) \ge a^2 \int_{\mathbb{R}} I_{\mathbb{R} \setminus (-a,a]}(x)\,dP(x). \]

In other words,
EX² ≥ a² Pr{X ∉ (−a, a]},
which is what we want to prove.
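
For a random variable taking finitely many values the integrals reduce to finite sums (compare Exercise 9.4.11), and the inequality can be checked directly. The Python sketch below does this for one made-up distribution; the atoms, the values of a and the name tchebychev_check are assumptions chosen for illustration.

```python
def tchebychev_check(atoms, a):
    """atoms: list of (value, probability) pairs with probabilities summing to 1.
    Returns the pair (Pr{X > a or -a >= X}, E X^2 / a^2)."""
    tail = sum(p for x, p in atoms if x > a or x <= -a)
    second_moment = sum(p * x * x for x, p in atoms)
    return tail, second_moment / (a * a)

# A hypothetical distribution with five point masses.
atoms = [(-3.0, 0.1), (-1.0, 0.2), (0.0, 0.3), (1.0, 0.25), (2.0, 0.15)]
results = [tchebychev_check(atoms, a) for a in (0.5, 1.0, 2.0, 3.0)]
```

In each pair of results the first entry should not exceed the second. Note that the value −a itself counts towards the tail, in agreement with the half open interval (−a, a] used in the proof.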
Exercise 9.4.15. (i) In the proof of Tchebychev’s inequality we used various
simple results on improper Riemann-Stieltjes integrals without proof. Identify
these results and prove them.
(ii) If P(t) = (π/2 + tan⁻¹ t)/π, show that EX² does not exist. Show that
this is also the case if we choose P given by
P(t) = 0 if t < 1,
P(t) = 1 − 2^{−n} if 2^n ≤ t < 2^{n+1}, n ≥ 0 an integer.
Exercise 9.4.16. (Probabilists call this result ‘Markov’s inequality’. Analysts
simply call it a ‘Tchebychev type inequality’.) Suppose φ : [0, ∞) → ℝ
is an increasing continuous positive function. If P is associated with a
real-valued random variable X and Eφ(|X|) exists, show that, for all a > 0,

\[ \Pr\{X \notin (-a, a]\} \le \frac{E\varphi(|X|)}{\varphi(a)}. \]
In elementary courses we deal separately with discrete random variables
(typically, in our notation, P is constant on each interval [n, n + 1)) and
continuous random variables⁸ (in our notation, P has continuous derivative, this
derivative is the ‘density function’). It is easy to construct mixed examples.
Exercise 9.4.17. The height of water in a river is a random variable Y with
Pr{Y ≤ y} = 1 − e^{−y} for y ≥ 0. The height is measured by a gauge which
registers X = min(Y, 1). Find Pr{X ≤ x} for all x.
Are there real-valued random variables which are not just a simple mix
of discrete and continuous? In Exercise K.225 (which depends on uniform
convergence) we shall show that there are.
The Riemann-Stieltjes formalism can easily be extended to deal with two
random variables X and Y by using a two dimensional Riemann-Stieltjes
integral with respect to a function
P(x, y) = Pr{X ≤ x, Y ≤ y}.
⁸ See the previous footnote on the Holy Roman Empire.

In the same way we can deal with n random variables X_1, X_2, . . . , X_n. However,
we cannot deal with infinite sequences X_1, X_2, . . . of random variables
in the same way. Modern probability theory depends on measure theory.
In the series of exercises starting with Exercise K.162 and ending with
Exercise K.168 we see that the Riemann-Stieltjes integral can be generalised.

How long is a piece of string? ♥
The topic of line integrals is dealt with quickly and efficiently in many texts.
The object of this section is to show why the texts deal with the matter in
the way they do. The reader should not worry too much about the details
and reserve such matters as ‘learning definitions’ for when she studies a more
efficient text.
The first problem that meets us when we ask for the length of a curve
is that it is not clear what a curve is. One natural way of defining a curve
is that it is a continuous map γ : [a, b] → ℝ^m. If we do this it is helpful to
consider the following examples.
γ_1 : [0, 1] → ℝ² with γ_1(t) = (cos 2πt, sin 2πt)
γ_2 : [1, 2] → ℝ² with γ_2(t) = (cos 2πt, sin 2πt)
γ_3 : [0, 2] → ℝ² with γ_3(t) = (cos πt, sin πt)
γ_4 : [0, 1] → ℝ² with γ_4(t) = (cos 2πt², sin 2πt²)
γ_5 : [0, 1] → ℝ² with γ_5(t) = (cos 2πt, − sin 2πt)
γ_6 : [0, 1] → ℝ² with γ_6(t) = (cos 4πt, sin 4πt)

Exercise 9.5.1. Trace out the curves γ_1 to γ_6. State in words how the
curves γ_1, γ_4, γ_5 and γ_6 differ.
Exercise 9.5.2. (i) Which of the curves γ_1 to γ_6 are equivalent and which
are not, under the following definitions?
(a) Two curves τ_1 : [a, b] → ℝ² and τ_2 : [c, d] → ℝ² are equivalent
if there exist real numbers A and B with A > 0 such that Ac + B = a,
Ad + B = b and τ_1(At + B) = τ_2(t) for all t ∈ [c, d].
(b) Two curves τ_1 : [a, b] → ℝ² and τ_2 : [c, d] → ℝ² are equivalent if
there exists a strictly increasing continuous surjective function θ : [c, d] →
[a, b] such that τ_1(θ(t)) = τ_2(t) for all t ∈ [c, d].
(c) Two curves τ_1 : [a, b] → ℝ² and τ_2 : [c, d] → ℝ² are equivalent
if there exists a continuous bijective function θ : [c, d] → [a, b] such that
τ_1(θ(t)) = τ_2(t) for all t ∈ [c, d].
(d) Two curves τ_1 : [a, b] → ℝ² and τ_2 : [c, d] → ℝ² are equivalent if
τ_1([a, b]) = τ_2([c, d]).
(ii) If you know the definition of an equivalence relation verify that
conditions (a) to (d) do indeed give equivalence relations.
Naturally we demand that ‘equivalent curves’ (that is curves which we
consider ‘identical’) should have the same length. I think, for example, that
a definition which gave different lengths to the curves described by γ_1 and
γ_2 would be obviously unsatisfactory. However, opinions may differ as to
when two curves are ‘equivalent’. At a secondary school level, most people
would say that the appropriate notion of equivalence is that given as (d) in
Exercise 9.5.2 and thus the curves γ_1 and γ_6 should have the same length.
Most of the time, most mathematicians⁹ would say that the curves γ_1 and
γ_6 are not equivalent and that, ‘since γ_6 is really γ_1 done twice’, γ_6 should
have twice the length of γ_1. If the reader is dubious she should replace the
phrase ‘length of curve’ by ‘distance traveled along the curve’.
The following chain of ideas leads to a natural definition of length. Suppose
γ : [a, b] → ℝ^m is a curve (in other words γ is continuous). As usual,
we consider dissections

D = {t_0, t_1, t_2, . . . , t_n}

with a = t_0 ≤ t_1 ≤ t_2 ≤ · · · ≤ t_n = b. We write

\[ L(\gamma, D) = \sum_{j=1}^{n} \| \gamma(t_{j-1}) - \gamma(t_j) \|, \]

where ‖a − b‖ is the usual Euclidean distance between a and b.
Exercise 9.5.3. (i) Explain why L(γ, D) may be considered as the ‘length
of the approximating curve obtained by taking straight line segments joining
each γ(t_{j−1}) to γ(t_j)’.
(ii) Show that, if D_1 and D_2 are dissections with D_1 ⊆ D_2,

L(γ, D_2) ≥ L(γ, D_1).

Deduce that, if D_3 and D_4 are dissections, then

L(γ, D_3 ∪ D_4) ≥ max(L(γ, D_3), L(γ, D_4)).

The two parts of Exercise 9.5.3 suggest the following definition.
⁹ But not all mathematicians and not all the time. One very important definition of
length associated with the name Hausdorff agrees with the school level view.

Definition 9.5.4. We say that a curve γ : [a, b] → ℝ^m is rectifiable if
there exists a K such that L(γ, D) ≤ K for all dissections D. If a curve is
rectifiable, we write

\[ \operatorname{length}(\gamma) = \sup_{D} L(\gamma, D), \]

the supremum being taken over all dissections of [a, b].
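
The polygonal sums L(γ, D) are easy to compute. The Python sketch below is an illustration under our own assumptions: it uses only dissections into equal parts, so it yields lower bounds for the supremum rather than the supremum itself, and the function names are hypothetical. It is applied to the circles γ_1 and γ_6 above and to the curve t ↦ (sin πt, 0) on [−1, 1], which runs back and forth along a line segment.

```python
import math

def polygonal_length(gamma, a, b, n):
    """L(gamma, D) for the dissection of [a, b] into n equal parts."""
    points = [gamma(a + (b - a) * j / n) for j in range(n + 1)]
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def gamma1(t):
    """Unit circle traversed once for t in [0, 1]."""
    return (math.cos(2 * math.pi * t), math.sin(2 * math.pi * t))

def gamma6(t):
    """Unit circle traversed twice for t in [0, 1]."""
    return (math.cos(4 * math.pi * t), math.sin(4 * math.pi * t))

def tau(t):
    """A curve moving back and forth along the segment from (-1, 0) to (1, 0)."""
    return (math.sin(math.pi * t), 0.0)

len1 = polygonal_length(gamma1, 0.0, 1.0, 10000)
len6 = polygonal_length(gamma6, 0.0, 1.0, 10000)
len_tau = polygonal_length(tau, -1.0, 1.0, 10000)
```

We expect len1 to be close to 2π and len6 to be close to 4π, in line with the view that γ_6 is γ_1 done twice. The value of len_tau should be close to 4 although the image of the curve is a segment of length 2, so this definition measures distance traveled and not the size of the image.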

Not all curves are rectifiable.

Exercise 9.5.5. (i) Let f : [0, 1] → ℝ be the function given by the conditions
f(0) = 0, f is linear on [3·2^{−n−2}, 2^{−n}] with f(3·2^{−n−2}) = 0 and f(2^{−n}) =
(n + 1)^{−1}, f is linear on [2^{−n−1}, 3·2^{−n−2}] with f(3·2^{−n−2}) = 0 and f(2^{−n−1}) =
(n + 2)^{−1} [n ≥ 0].
Sketch the graph of f and check that f is continuous. Show that the curve
γ : [0, 1] → ℝ² given by γ(t) = (t, f(t)) is not rectifiable.
(ii) Let g : [−1, 1] → ℝ be the function given by the conditions g(0) = 0,
g(t) = t² sin |t|^α for t ≠ 0, where α is real. Show that g is differentiable
everywhere, but that, for an appropriate choice of α, the curve τ : [−1, 1] →
ℝ² given by τ(t) = (t, g(t)) is not rectifiable.
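
The failure of rectifiability in part (i) can be watched numerically. The Python sketch below is only an illustration and is no substitute for the proof the exercise asks for. It assumes the reading of f given above (peaks of height 1/(n + 1) at t = 2^{−n} and zeros at t = 3·2^{−n−2}) and the function names are hypothetical. It computes L(γ, D) for the dissection D consisting of 0 and the corners of the graph down to 2^{−N−1}.

```python
import math

def zigzag_vertices(N):
    """Corner points (t, f(t)) of the graph of f on [2**(-N-1), 1],
    listed from right to left, followed by the origin."""
    pts = []
    for n in range(N + 1):
        pts.append((2.0 ** (-n), 1.0 / (n + 1)))    # peak at t = 2^(-n)
        pts.append((3.0 * 2.0 ** (-n - 2), 0.0))    # zero at t = 3 * 2^(-n-2)
    pts.append((2.0 ** (-N - 1), 1.0 / (N + 2)))    # last peak used
    pts.append((0.0, 0.0))                          # gamma(0) = (0, f(0))
    return pts

def zigzag_length(N):
    """L(gamma, D) where D consists of 0 and the corners down to 2^(-N-1)."""
    pts = zigzag_vertices(N)
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

lengths = {N: zigzag_length(N) for N in (10, 100, 1000)}
```

Each descent and ascent contributes at least its height to the sum, so zigzag_length(N) is at least the sum of 1/(n + 1) + 1/(n + 2) over 0 ≤ n ≤ N. These sums behave like the harmonic series, and the computed values should keep growing as N increases.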

Exercise 9.5.6. (i) By using the intermediate value theorem, show that a
continuous bijective function θ : [c, d] → [a, b] is either strictly increasing or
strictly decreasing.
(ii) Suppose that γ : [a, b] → ℝ^m is a rectifiable curve and θ : [c, d] → [a, b]
is a continuous bijection. Show that γ ∘ θ (where ∘ denotes composition) is
a rectifiable curve and

length(γ ∘ θ) = length(γ).

(iii) Let τ : [−1, 1] → ℝ² be given by τ(t) = (sin πt, 0). Show that length(τ) =
4. Comment briefly.

The next exercise is a fairly obvious but very useful observation.

Exercise 9.5.7. Suppose that γ : [a, b] → ℝ^m is rectifiable. Show that, if
a ≤ t ≤ b, then the restriction γ|_{[a,t]} : [a, t] → ℝ^m is rectifiable. If we write

l_γ(t) = length(γ|_{[a,t]}),

show that l_γ : [a, b] → ℝ is an increasing function with l_γ(a) = 0.

With a little extra effort we can say rather more about l_γ.

Exercise 9.5.8. We use the hypotheses and notation of Exercise 9.5.7.
(i) Suppose that γ has length L and that

D = {t_0, t_1, t_2, . . . , t_n}

with a = t_0 ≤ t_1 ≤ t_2 ≤ · · · ≤ t_n = b is a dissection such that

