    ... ≤ Chτ ( Θ|u'(0)|_{1,∞} + Σ_{j=1}^{n} |u'(t_j)|_{1,∞} + Θ|u'(t_{n+1})|_{1,∞} + Σ_{j=0}^{n} |f^{j+Θ}|_{1,∞} )
        + C ( ∫_0^{t_{n+1}} ‖(R_h − I)u'(s)‖_{0,h} ds + τ ∫_0^{t_{n+1}} ‖u''(s)‖_{0,h} ds )
        + Ch ( sup_{s∈(0,t_{n+1})} |u'(s)|_{1,∞} + sup_{s∈(0,t_{n+1})} |f(s)|_{1,∞} ) .

This is the basic estimate. The final estimate is easily obtained by the same approach as in the finite element method. In summary, we have the following result.

7.6. Order of Convergence Estimates 341

Theorem 7.33 In addition to the assumptions of Theorem 6.15, consider the finite volume method on Donald diagrams. Furthermore, let u_{0h} ∈ V_h, u_0 ∈ V ∩ H²(Ω), f ∈ C([0, T], C¹(Ω)), Θ ∈ [1/2, 1]. Then if u(t) is sufficiently smooth, the following estimate is valid:

    ‖u(t_n) − U^n‖_{0,h} ≤ ‖u_{0h} − u_0‖_{0,h}
        + Ch ( ‖u_0‖_2 + ‖u(t_n)‖_2 + ∫_0^{t_n} ‖u'(s)‖_2 ds + sup_{s∈(0,t_n)} |u'(s)|_{1,∞} + sup_{s∈(0,t_n)} |f(s)|_{1,∞} )
        + Cτ ∫_0^{t_n} ‖u''(s)‖_{0,h} ds .

Exercise

7.17 Verify Remark 7.32.

8
Iterative Methods for Nonlinear Equations

Just as the discretization techniques discussed in this book turn linear (initial-) boundary value problems into (sequences of) systems of linear equations, nonlinear problems give rise to nonlinear systems of equations of similar type. Two of them will be treated in this chapter. As in Sections 1.2, 3.4, 7.3, and 6.2.4, we have to answer the question of the quality of the approximation, and as in Section 2.5 and Chapter 5, the question of the approximate resolution of the systems of equations. We will focus on the latter in this chapter.

In general, the problem may be formulated in different equivalent settings, namely:

    Find x ∈ U with f(x) = b .    (8.1)

    Find x ∈ U with f(x) = 0 .    (8.2)

Then x is called a root of (8.2) and a zero of f.

    Find x ∈ U with f(x) = x .    (8.3)

Then x is called a fixed point.

Here U ⊂ ℝ^m, f : U → ℝ^m is a mapping, and b ∈ ℝ^m. The transition from one formulation to another follows by redefining f in evident ways.

In most cases, a root or a fixed point cannot be calculated (with exact arithmetic) in a finite number of operations, but only by an iterative method, i.e., by a mapping

    Φ : U → U ,

so that (as in (5.7)) for the sequence

    x^{(k+1)} := Φ(x^{(k)})    (8.4)

with given x^{(0)} we get

    x^{(k)} → x for k → ∞ .    (8.5)

Here x is the solution of (8.1), (8.2), or (8.3).

As we already stated in Section 5.1, in the case of a continuous Φ it follows from (8.4), (8.5) that the limit x satisfies

    x = Φ(x) .    (8.6)

This means that (8.6) should imply that x is a solution of (8.1), (8.2), or (8.3). The extension of the definition of consistency in Section 5.1 requires the inverse implication.

Concerning the error level that we should achieve in relation to the approximation error of the discretization, the statements in the introduction of Chapter 5 still hold. In addition to the criteria of comparison for linear stationary methods, we now have to take into account the following: methods may, if they converge at all, converge only locally, which leads to the following definition:

Definition 8.1 If in the above situation (8.5) holds for all x^{(0)} ∈ U (i.e., for arbitrary starting values), then (x^{(k)})_k is called globally convergent. If an open Ũ ⊂ U exists such that (8.5) holds for all x^{(0)} ∈ Ũ, then (x^{(k)})_k is called locally convergent. In the latter case Ũ is called the range of the iteration.

On the other hand, we may observe faster convergence than the linear convergence introduced in (5.3):

Definition 8.2 Let (x^{(k)})_k be a sequence in ℝ^m, x ∈ ℝ^m, and ‖·‖ a norm on ℝ^m. The sequence (x^{(k)})_k converges linearly to x with respect to ‖·‖ if there exists a C with 0 < C < 1 such that

    ‖x^{(k+1)} − x‖ ≤ C ‖x^{(k)} − x‖ for all k ∈ ℕ .

The sequence (x^{(k)})_k converges with order of convergence p > 1 to x if x^{(k)} → x for k → ∞ and if there exists a C > 0 such that

    ‖x^{(k+1)} − x‖ ≤ C ‖x^{(k)} − x‖^p for all k ∈ ℕ .

The sequence (x^{(k)})_k converges superlinearly to x if

    lim_{k→∞} ‖x^{(k+1)} − x‖ / ‖x^{(k)} − x‖ = 0 .

The case p = 2 is also called quadratic convergence. Thus, while a linearly converging method guarantees a reduction of the error by a constant factor C, this reduction improves step by step in the case of superlinear or higher-order convergence. When we encounter quadratic convergence, for example, the number of significant digits is doubled in every step (minus a fixed number), so that usually only a small number of iterations will be necessary. For this reason variants of the quadratically converging Newton method (Section 8.2) are attractive. But the restriction of local convergence may require modifications to enlarge the range of convergence.
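The digit-doubling behaviour can be checked numerically: from three successive error norms e_{k−1}, e_k, e_{k+1} the observed order is p ≈ log(e_{k+1}/e_k) / log(e_k/e_{k−1}). A minimal sketch (the test equation, the starting value, and the iteration count are illustrative choices, not taken from the text):

```python
import math

def observed_order(errors):
    """Estimate the convergence order p from the last three error norms
    via p ~ log(e_{k+1}/e_k) / log(e_k/e_{k-1})."""
    e0, e1, e2 = errors[-3:]
    return math.log(e2 / e1) / math.log(e1 / e0)

# Newton's method for f(x) = x^2 - 2 converges quadratically to sqrt(2);
# the number of significant digits roughly doubles in every step.
xbar = math.sqrt(2.0)
x, errors = 3.0, []
for _ in range(4):
    x = x - (x * x - 2.0) / (2.0 * x)   # Newton step for x^2 - 2 = 0
    errors.append(abs(x - xbar))

p = observed_order(errors)   # close to 2
```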

To evaluate the complexity of a numerical method, the number of elementary operations per iteration has to be considered. By an elementary operation we also want to understand the evaluation of functions like the sine, although this is much more costly than an ordinary floating-point operation. A typical subproblem during an iteration cycle is the solution of a system of linear equations, analogously to the simpler systems of the form (5.10) occurring in linear stationary problems. Besides the effort to assemble this system of equations, we have to account for the work to solve it, which can be done with one of the methods described in Section 2.5 and Chapter 5, i.e., in particular, again with an iterative method. We call this a secondary or inner iteration, which is attractive because of the sparse structure of the matrices originating from the discretization, as already discussed in Chapter 5. Here an inexact variant may be useful, in which the inner iteration is performed only up to a precision that conserves the convergence properties of the outer iteration. The numerical cost of the assembling may, in fact, exceed the cost of the inner iteration. Hence methods with low assembling cost (but worse convergence) should also be considered. Keeping this in mind, we devote an introductory section to fixed-point iterations, which are, roughly speaking, methods in which the iteration function Φ coincides with the mapping f.

8.1 Fixed-Point Iterations

For the fixed-point formulation (8.3) the choice Φ := f is evident according to (8.6); in other words, the fixed-point iteration reads

    x^{(k+1)} := f(x^{(k)}) .    (8.7)

To diminish the distance of two succeeding members of the sequence, i.e., to have

    ‖Φ(x^{(k+1)}) − Φ(x^{(k)})‖ = ‖x^{(k+2)} − x^{(k+1)}‖ < ‖x^{(k+1)} − x^{(k)}‖ ,

it is sufficient that the iteration function (here Φ = f) be contractive (see Appendix A.4).
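The iteration (8.7) is a one-liner in code. A minimal sketch for m = 1 (the contraction f = cos, the tolerance, and the iteration cap are illustrative choices; the stopping test compares two succeeding members of the sequence):

```python
import math

def fixed_point_iteration(f, x0, tol=1e-12, maxit=200):
    """Fixed-point iteration x^(k+1) := f(x^(k)) as in (8.7); stops when
    two succeeding members of the sequence are closer than tol."""
    x = x0
    for _ in range(maxit):
        x_new = f(x)
        if abs(x_new - x) <= tol:
            return x_new
        x = x_new
    raise RuntimeError("no convergence within maxit iterations")

# f = cos is contractive near its unique fixed point x ~ 0.739085.
x = fixed_point_iteration(math.cos, 1.0)
```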

Sufficient conditions for a contraction are given by the following lemma:

Lemma 8.3 Let U ⊂ ℝ^m be open and convex, and g : U → ℝ^m continuously differentiable. If

    sup_{x∈U} ‖Dg(x)‖ =: L < 1

holds, where ‖·‖ in ℝ^{m,m} is compatible with ‖·‖ in ℝ^m, then g is contracting in U.

Proof: Exercise 8.1.    □

Therefore, if U ⊂ ℝ^m is open, f : U ⊂ ℝ^m → ℝ^m is continuously differentiable, and if there exists some x̃ ∈ U with ‖Df(x̃)‖ < 1, then there exists a closed convex neighbourhood Ũ of x̃ with

    ‖Df(x)‖ ≤ L < 1 for x ∈ Ũ

and, for example, L = ‖Df(x̃)‖ + (1/2)(1 − ‖Df(x̃)‖), guaranteeing the contractivity of f in Ũ.

The unique existence of a fixed point and the convergence of (8.7) are guaranteed if the set U on which f is a contraction is mapped into itself:

Theorem 8.4 (Banach's fixed-point theorem) Let U ⊂ ℝ^m, U ≠ ∅, and U be closed. Let f : U → ℝ^m be contractive with Lipschitz constant L < 1 and f[U] ⊂ U. Then we have:

(1) There exists one and only one fixed point x ∈ U of f.

(2) For arbitrary x^{(0)} ∈ U the fixed-point iteration (8.7) converges to x, and we have

    ‖x^{(k)} − x‖ ≤ (L / (1 − L)) ‖x^{(k)} − x^{(k−1)}‖    (a posteriori error estimate)

                 ≤ (L^k / (1 − L)) ‖x^{(1)} − x^{(0)}‖    (a priori error estimate).

Proof: The sequence x^{(k+1)} := f(x^{(k)}) is well-defined because of f[U] ⊂ U. We prove that (x^{(k)})_k is a Cauchy sequence (see Appendix A.4). We have

    ‖x^{(k+1)} − x^{(k)}‖ = ‖f(x^{(k)}) − f(x^{(k−1)})‖ ≤ L ‖x^{(k)} − x^{(k−1)}‖
        ≤ L² ‖x^{(k−1)} − x^{(k−2)}‖ ≤ ··· ≤ L^k ‖x^{(1)} − x^{(0)}‖ ,    (8.8)

so that for any k, l ∈ ℕ

    ‖x^{(k+l)} − x^{(k)}‖ ≤ ‖x^{(k+l)} − x^{(k+l−1)}‖ + ‖x^{(k+l−1)} − x^{(k+l−2)}‖ + ··· + ‖x^{(k+1)} − x^{(k)}‖
        ≤ (L^{k+l−1} + L^{k+l−2} + ··· + L^k) ‖x^{(1)} − x^{(0)}‖
        = L^k (1 + L + ··· + L^{l−1}) ‖x^{(1)} − x^{(0)}‖
        ≤ L^k Σ_{l=0}^{∞} L^l ‖x^{(1)} − x^{(0)}‖ = L^k (1/(1 − L)) ‖x^{(1)} − x^{(0)}‖ .

Thus we have ‖x^{(k+l)} − x^{(k)}‖ → 0 for k → ∞; i.e., (x^{(k)})_k is a Cauchy sequence and thus converges to some x ∈ ℝ^m because of the completeness of ℝ^m. Due to the closedness of U we conclude that x ∈ U. Since we have

    x^{(k+1)} → x ,  f(x^{(k)}) → f(x) for k → ∞ ,

x is also a fixed point of f.

The fixed point is unique, because for fixed points x, x̄,

    ‖x − x̄‖ = ‖f(x) − f(x̄)‖ ≤ L ‖x − x̄‖ ,

which immediately implies x = x̄ because of L < 1. Moreover, we have

    ‖x^{(k)} − x‖ = ‖f(x^{(k−1)}) − f(x)‖ ≤ L ‖x^{(k−1)} − x‖ ≤ L ( ‖x^{(k−1)} − x^{(k)}‖ + ‖x^{(k)} − x‖ ) ,

and thus from (8.8),

    ‖x^{(k)} − x‖ ≤ (L/(1 − L)) ‖x^{(k)} − x^{(k−1)}‖ ≤ (L^k/(1 − L)) ‖x^{(1)} − x^{(0)}‖ .    □
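Both error estimates of Theorem 8.4 can be observed numerically. A small sketch, again for the contraction f = cos, here on U = [0, 1], where f[U] ⊂ [cos 1, 1] ⊂ U and |f'(x)| = sin x ≤ L := sin 1 < 1 (the fixed-point value is quoted only to measure the true error):

```python
import math

# Contraction f = cos on the closed set U = [0, 1]: f[U] is contained in U
# and the Lipschitz constant is L = sin(1) < 1, so Theorem 8.4 applies.
f, L = math.cos, math.sin(1.0)
xbar = 0.7390851332151607        # the unique fixed point in U

x = 1.0                          # x^(0)
first_step = abs(f(x) - x)       # ||x^(1) - x^(0)||
errors, a_post, a_pri = [], [], []
for k in range(1, 21):
    x_new = f(x)                 # x^(k)
    errors.append(abs(x_new - xbar))
    a_post.append(L / (1 - L) * abs(x_new - x))    # a posteriori bound
    a_pri.append(L ** k / (1 - L) * first_step)    # a priori bound
    x = x_new
```

In every step the true error is dominated by the a posteriori bound, which in turn is dominated by the a priori bound.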

Remark 8.5 The theorem can be generalized: Since we used only the completeness of ℝ^m, the proposition holds even in a Banach space (X, ‖·‖), where U ⊂ X is a closed subset.

This enables us to define iterative schemes directly in the function space for nonlinear boundary value problems, which means that the resulting (linear) problems in the iteration step are to be discretized. So instead of proceeding in the order discretization-iteration, we can apply the sequence iteration-discretization. This leads in general to different schemes, even if the approaches have been the same. We will always refer to the first strategy.

According to Lemma 8.3 we can often construct a closed U such that f is contractive on U. It remains to verify that f[U] ⊂ U. For this, the following lemma is helpful:

Lemma 8.6 Let U ⊂ ℝ^m, f : U → ℝ^m. If there exist a y ∈ U and an r > 0 with

    B̄_r(y) ⊂ U ,

such that f is contractive on B̄_r(y) with Lipschitz constant L < 1 and

    ‖y − f(y)‖ ≤ r(1 − L) ,

then f has one and only one fixed point in B̄_r(y), and (8.7) converges.

Proof: Exercise 8.2.    □

In the setting of Theorem 8.4 the fixed-point iteration is thus globally convergent in U. In the setting of Lemma 8.6 it is locally convergent in U (globally in B̄_r(y)). We see that in the situation of Theorem 8.4 the sequence (x^{(k)}) has, because of

    ‖x^{(k+1)} − x‖ = ‖f(x^{(k)}) − f(x)‖ ≤ L ‖x^{(k)} − x‖ ,

a linear order of convergence (and in general not better).

A sufficient condition for local convergence of the corresponding order is given by the following theorem:

Theorem 8.7 Let U ⊂ ℝ^m be open, Φ : U → U continuous, and the sequence (x^{(k)}) defined by x^{(k+1)} := Φ(x^{(k)}) for a given x^{(0)} ∈ U. If there exist some x̄ ∈ U, an open V ⊂ U with x̄ ∈ V, and constants C, p with p ≥ 1, C ≥ 0, and C < 1 for p = 1, such that for all x ∈ V,

    ‖Φ(x) − x̄‖ ≤ C ‖x − x̄‖^p

holds, then the iteration defined by Φ converges locally to x̄ of order at least p, and x̄ is a fixed point of Φ.

Proof: Choose W = B_r(x̄), with r > 0 sufficiently small, such that W ⊂ V and

    C r^{p−1} =: L < 1 .

If x^{(k)} ∈ W, then we conclude because of

    ‖x^{(k+1)} − x̄‖ = ‖Φ(x^{(k)}) − x̄‖ ≤ C ‖x^{(k)} − x̄‖^p < C r^p < r

that x^{(k+1)} ∈ W, too. This means that for x^{(0)} ∈ W we have x^{(k)} ∈ W for all k ∈ ℕ. Furthermore, we have

    ‖x^{(k+1)} − x̄‖ ≤ C ‖x^{(k)} − x̄‖^p < C r^{p−1} ‖x^{(k)} − x̄‖ = L ‖x^{(k)} − x̄‖ ,

i.e.,

    x^{(k)} → x̄ for k → ∞ ,

and consequently,

    x̄ = lim_{k→∞} x^{(k+1)} = lim_{k→∞} Φ(x^{(k)}) = Φ(x̄) .    □

The special case of a scalar equation shows that we can expect at most linear convergence for Φ = f:

Corollary 8.8 Let U ⊂ ℝ be an open subset, Φ on U p-times continuously differentiable, and x̄ ∈ U a fixed point of Φ. If Φ'(x̄) ≠ 0, |Φ'(x̄)| < 1 for p = 1, and Φ'(x̄) = ··· = Φ^{(p−1)}(x̄) = 0, Φ^{(p)}(x̄) ≠ 0 for p > 1, then the iteration defined by Φ is locally convergent to x̄ with order of convergence p, but not better.

Proof: Taylor's expansion of Φ at x̄ gives, for x ∈ U,

    Φ(x) = Φ(x̄) + (Φ^{(p)}(ξ)/p!) (x − x̄)^p with ξ ∈ (x, x̄) ,

and in the case p = 1 we have |Φ'(ξ)| < 1 for sufficiently small |x − x̄|. Thus, there exists a neighbourhood V of x̄ such that |Φ(x) − x̄| ≤ C|x − x̄|^p for all x ∈ V, and C < 1 for p = 1. Theorem 8.7 implies order of convergence p. The example Φ(x) = Lx^p, with L < 1 for p = 1 and with the fixed point x̄ = 0, shows that no improvement is possible.    □
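The closing example Φ(x) = Lx^p can be checked directly: for p = 2 the quotient e_{k+1}/e_k² stays at the constant C = L, so the order is exactly 2 and no better. A minimal sketch (L = 0.5 and the starting value are illustrative choices):

```python
# Phi(x) = L*x^2 has the fixed point 0 with Phi'(0) = 0 and Phi''(0) = 2L != 0,
# so by Corollary 8.8 the iteration converges with order exactly p = 2.
L, x = 0.5, 0.8
ratios = []
for _ in range(5):
    x_new = L * x * x
    ratios.append(x_new / (x * x))   # e_{k+1} / e_k^2, constant = L
    x = x_new
```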

Exercises

8.1 Prove Lemma 8.3 with the help of the mean value theorem.

8.2 Prove Lemma 8.6.

8.2 Newton's Method and Its Variants

8.2.1 The Standard Form of Newton's Method

In the following we want to study the formulation stated in (8.2), i.e., the problem of finding the solutions of

    f(x) = 0 .

The simplest method of Chapter 5, the Richardson iteration (cf. (5.28)), suggests the direct application of the fixed-point iteration for, e.g., Φ(x) := −f(x) + x. This approach succeeds only if, in the case of a differentiable f, the Jacobian I − Df(x) is small in the sense of Lemma 8.3 close to the solution. Here we denote by Df(x) = (∂_j f_i(x))_{ij} the Jacobi or functional matrix of f. A relaxation method similar to (5.30) leads to the damped variants, which will be treated later.

The method in its corrector formulation, analogously to (5.10) with

    δ^{(k)} := x^{(k+1)} − x^{(k)} ,

is

    δ^{(k)} = −f(x^{(k)}) ,    (8.9)

or in its relaxation formulation with relaxation parameter ω > 0,

    δ^{(k)} = −ω f(x^{(k)}) .

Now we want to introduce another approach to define Φ: Let x^{(0)} be an approximation of a zero. An improved approximation is probably given by the following:

• Replace f by a simple function g that approximates f near x^{(0)} and whose zero is to be determined.

• Find x^{(1)} as the solution of g(x) = 0.

Newton's method needs the differentiability of f, and one chooses the approximating affine-linear function given by Df(x^{(0)}), i.e.,

    g(x) = f(x^{(0)}) + Df(x^{(0)}) (x − x^{(0)}) .

Under the assumption that Df(x^{(0)}) is nonsingular, the new iterate x^{(1)} is determined by solving the system of linear equations

    Df(x^{(0)}) (x^{(1)} − x^{(0)}) = −f(x^{(0)}) ,    (8.10)

or formally by

    x^{(1)} := x^{(0)} − Df(x^{(0)})^{−1} f(x^{(0)}) .

This suggests the following definition:

    Φ(x) = x − Df(x)^{−1} f(x) .    (8.11)

Here Φ is well-defined only if Df(x) is nonsingular. Then x ∈ ℝ^m is a zero of f if and only if x is a fixed point of Φ. When executing the iteration, we do not calculate Df(x^{(k)})^{−1}, but only solve a system of equations similar to (8.10).

Thus, the kth iteration of Newton's method reads as follows: Solve

    Df(x^{(k)}) δ^{(k)} = −f(x^{(k)})    (8.12)

and set

    x^{(k+1)} := x^{(k)} + δ^{(k)} .    (8.13)

Equation (8.13) has the same form as (5.10) with W = Df(x^{(k)}) and with the residual at x^{(k)}

    d^{(k)} := f(x^{(k)}) .

Thus the subproblem of the kth iteration is easier in the sense that it consists of a system of linear equations (with the same structure of dependence as f; see Exercise 8.6). In the same sense the system of equations (5.10) in the case of linear stationary methods is "easier" to solve than the original problem of the same type. Furthermore, W is in general different for different k.

An application of (8.12), (8.13) to Ax = b, i.e., Df(x) = A for all x ∈ ℝ^m, results in (5.10) with W = A, a method converging in one step, which just reformulates the original problem:

    A (x − x^{(0)}) = −(Ax^{(0)} − b) .
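The iteration (8.12), (8.13) translates directly into code: assemble the Jacobian, solve the linear correction equation, update. A minimal sketch with a hypothetical 2×2 test system (the system, starting value, and tolerances are illustrative assumptions, not from the text):

```python
import numpy as np

def newton(f, Df, x0, tol=1e-10, maxit=20):
    """Newton's method: solve Df(x^(k)) delta^(k) = -f(x^(k)) as in (8.12)
    and update x^(k+1) := x^(k) + delta^(k) as in (8.13)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(maxit):
        delta = np.linalg.solve(Df(x), -f(x))
        x = x + delta
        if np.linalg.norm(delta) <= tol:
            return x
    raise RuntimeError("Newton iteration did not converge")

# Hypothetical test system: x1^2 + x2^2 = 4, x1*x2 = 1.
f = lambda x: np.array([x[0]**2 + x[1]**2 - 4.0, x[0]*x[1] - 1.0])
Df = lambda x: np.array([[2*x[0], 2*x[1]], [x[1], x[0]]])
x = newton(f, Df, [2.0, 0.5])
```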

The range of the iteration may be very small, as can be shown already by one-dimensional examples. But in this neighbourhood of the solution we have, e.g., for m = 1, the following:

Corollary 8.9 Let f ∈ C³(ℝ) and let x̄ be a simple zero of f (i.e., f'(x̄) ≠ 0). Then Newton's method converges locally to x̄, of order at least 2.

Proof: There exists an open neighbourhood V of x̄ such that f'(x) ≠ 0 for all x ∈ V; i.e., Φ is well-defined by (8.11), continuous on V, and x̄ is a fixed point of Φ. According to Corollary 8.8 it suffices to show that Φ'(x̄) = 0:

    Φ'(x) = 1 − (f'(x)² − f(x) f''(x)) / f'(x)² = f(x) f''(x) / f'(x)² = 0 for x = x̄ ,

and Φ'' exists continuously, because f ∈ C³(ℝ).    □

In the following we want to develop a general local theorem of convergence for Newton's method (according to L.V. Kantorovich). It necessitates only the Lipschitz continuity of Df and ensures the existence of a zero, too. Here we always suppose a fixed norm on ℝ^m and consider a compatible norm on ℝ^{m,m}. As a prerequisite we need the following lemma:

Lemma 8.10 Let C₀ ⊂ ℝ^m be convex and open, f : C₀ → ℝ^m differentiable, and suppose there exists γ > 0 such that

    ‖Df(x) − Df(y)‖ ≤ γ ‖x − y‖ for all x, y ∈ C₀ .    (8.14)

Then for all x, y ∈ C₀,

    ‖f(x) − f(y) − Df(y)(x − y)‖ ≤ (γ/2) ‖x − y‖² .

Proof: Let φ : [0, 1] → ℝ^m be defined by φ(t) := f(y + t(x − y)), for arbitrary, fixed x, y ∈ C₀. Then φ is differentiable on [0, 1] and

    φ'(t) = Df(y + t(x − y)) (x − y) .

Thus for t ∈ [0, 1] we have

    ‖φ'(t) − φ'(0)‖ = ‖(Df(y + t(x − y)) − Df(y)) (x − y)‖
        ≤ ‖Df(y + t(x − y)) − Df(y)‖ ‖x − y‖ ≤ γ t ‖x − y‖² .

For

    ∆ := f(x) − f(y) − Df(y)(x − y) = φ(1) − φ(0) − φ'(0) = ∫_0^1 (φ'(t) − φ'(0)) dt

we also get

    ‖∆‖ ≤ ∫_0^1 ‖φ'(t) − φ'(0)‖ dt ≤ γ ‖x − y‖² ∫_0^1 t dt = (γ/2) ‖x − y‖² .    □

Now we are able to conclude local, quadratic convergence:

Theorem 8.11 Let C ⊂ ℝ^m be convex and open, and f : C → ℝ^m differentiable. For x^{(0)} ∈ C let there exist α, β, γ > 0 such that

    h := α β γ / 2 < 1 ,
    r := α / (1 − h) ,
    B̄_r(x^{(0)}) ⊂ C .

Furthermore, we require:

(i) Df is Lipschitz continuous on C₀ = B_{r+ε}(x^{(0)}) for some ε > 0 with constant γ in the sense of (8.14).

(ii) For all x ∈ B_r(x^{(0)}) there exists Df(x)^{−1}, and ‖Df(x)^{−1}‖ ≤ β.

(iii) ‖Df(x^{(0)})^{−1} f(x^{(0)})‖ ≤ α.

Then:

(1) The Newton iteration

    x^{(k+1)} := x^{(k)} − Df(x^{(k)})^{−1} f(x^{(k)})

is well-defined, and x^{(k)} ∈ B_r(x^{(0)}) for all k ∈ ℕ.

(2) x^{(k)} → x̄ for k → ∞ and f(x̄) = 0.

(3) ‖x^{(k+1)} − x̄‖ ≤ (βγ/2) ‖x^{(k)} − x̄‖²  and  ‖x^{(k)} − x̄‖ ≤ α h^{2^k − 1} / (1 − h^{2^k})  for k ∈ ℕ.

Proof: (1): To show that x^{(k+1)} is well-defined it is sufficient to verify

    x^{(k)} ∈ B_r(x^{(0)}) (⊂ C) for all k ∈ ℕ .

By induction we prove the extended proposition

    x^{(k)} ∈ B_r(x^{(0)}) and ‖x^{(k)} − x^{(k−1)}‖ ≤ α h^{2^{k−1} − 1} for all k ∈ ℕ .    (8.15)

The proposition (8.15) holds for k = 1, because according to (iii),

    ‖x^{(1)} − x^{(0)}‖ = ‖Df(x^{(0)})^{−1} f(x^{(0)})‖ ≤ α < r .

Let (8.15) be valid for l = 1, ..., k. Then x^{(k+1)} is well-defined, and by the application of the Newton iteration for k − 1 we get

    ‖x^{(k+1)} − x^{(k)}‖ = ‖Df(x^{(k)})^{−1} f(x^{(k)})‖ ≤ β ‖f(x^{(k)})‖
        = β ‖f(x^{(k)}) − f(x^{(k−1)}) − Df(x^{(k−1)}) (x^{(k)} − x^{(k−1)})‖
        ≤ (βγ/2) ‖x^{(k)} − x^{(k−1)}‖²

according to Lemma 8.10 with C₀ = B_r(x^{(0)}), and

    ‖x^{(k+1)} − x^{(k)}‖ ≤ (βγ/2) ‖x^{(k)} − x^{(k−1)}‖² ≤ (βγ/2) α² h^{2^k − 2} = α h^{2^k − 1} .

Thus the second part of (8.15) holds for k + 1, and also

    ‖x^{(k+1)} − x^{(0)}‖ ≤ ‖x^{(k+1)} − x^{(k)}‖ + ‖x^{(k)} − x^{(k−1)}‖ + ··· + ‖x^{(1)} − x^{(0)}‖
        ≤ α ( h^{2^k − 1} + h^{2^{k−1} − 1} + ··· + h^7 + h^3 + h + 1 )
        < α / (1 − h) = r .

Hence (8.15) holds for k + 1.

(2): Using (8.15) we are able to verify that (x^{(k)})_k is a Cauchy sequence, because for l ≥ k we have

    ‖x^{(l+1)} − x^{(k)}‖ ≤ ‖x^{(l+1)} − x^{(l)}‖ + ‖x^{(l)} − x^{(l−1)}‖ + ··· + ‖x^{(k+1)} − x^{(k)}‖
        ≤ α h^{2^k − 1} ( 1 + h^{2^k} + (h^{2^k})³ + ··· )    (8.16)
        < α h^{2^k − 1} / (1 − h^{2^k}) → 0 for k → ∞ ,

since h < 1. Hence there exists x̄ = lim_{k→∞} x^{(k)}, and x̄ ∈ B̄_r(x^{(0)}). Furthermore, f(x̄) = 0, because we can conclude from x^{(k)} ∈ B_r(x^{(0)}) that

    ‖Df(x^{(k)}) − Df(x^{(0)})‖ ≤ γ ‖x^{(k)} − x^{(0)}‖ < γ r ;

thus

    ‖Df(x^{(k)})‖ ≤ γ r + ‖Df(x^{(0)})‖ =: K ,

and from f(x^{(k)}) = −Df(x^{(k)}) (x^{(k+1)} − x^{(k)}) we obtain

    ‖f(x^{(k)})‖ ≤ K ‖x^{(k+1)} − x^{(k)}‖ → 0

for k → ∞. Thus we also have

    f(x̄) = lim_{k→∞} f(x^{(k)}) = 0 .

(3): With l → ∞ in (8.16) we can prove the second part of (3); the first part follows from

    x^{(k+1)} − x̄ = x^{(k)} − Df(x^{(k)})^{−1} f(x^{(k)}) − x̄
        = x^{(k)} − x̄ − Df(x^{(k)})^{−1} (f(x^{(k)}) − f(x̄))
        = Df(x^{(k)})^{−1} ( f(x̄) − f(x^{(k)}) − Df(x^{(k)}) (x̄ − x^{(k)}) ) ,

which implies, according to Lemma 8.10 with C₀ = B_{r+ε}(x^{(0)}) ⊂ C,

    ‖x^{(k+1)} − x̄‖ ≤ β (γ/2) ‖x^{(k)} − x̄‖² .    □

The termination criterion (5.15), which is oriented at the residual, may also be used for the nonlinear problem (and not just for the Newton iteration). This can be deduced in analogy to (5.16):

Theorem 8.12 Let the following be valid:

    There exists a zero x̄ of f such that Df(x̄) is nonsingular and
    Df is Lipschitz continuous in an open neighbourhood C of x̄.    (8.17)

Then for every ε > 0 there exists a δ > 0 such that for x, y ∈ B_δ(x̄),

    ‖f(y)‖ ‖x − x̄‖ ≤ (1 + ε) κ(Df(x̄)) ‖f(x)‖ ‖y − x̄‖ .

Proof: See [22, p. 69, p. 72] and Exercise 8.4.    □

Here κ is the condition number in a matrix norm that is consistent with the chosen vector norm. For x = x^{(k)} and y = x^{(0)} we get (locally) the generalization of (5.16).

8.2.2 Modifications of Newton's Method

Modifications of Newton's method aim in two directions:

• Reduction of the cost of the assembling and the solution of the system of equations (8.12) (without a significant deterioration of the convergence properties).

• Enlargement of the range of convergence.

We can account for the first aspect by simplifying the matrix in (8.12) (modified or simplified Newton's method). The extreme case is the replacement of Df(x^{(k)}) by the identity matrix; this leads us to the fixed-point iteration (8.9). If the mapping f consists of a nonlinear and a linear part,

    f(x) := Ax + g(x) = 0 ,    (8.18)

then the system of equations (8.12) of the Newton iteration reads

    (A + Dg(x^{(k)})) δ^{(k)} = −f(x^{(k)}) .

A straightforward simplification in this case is the fixed-point iteration

    A δ^{(k)} = −f(x^{(k)}) .    (8.19)

It may be interpreted as the fixed-point iteration (8.9) of the system that is preconditioned with A, i.e., of

    A^{−1} f(x) = 0 .

In (8.19) the matrix is identical in every iteration step; therefore, it has to be assembled only once, and if we use a direct method (cf. Section 2.5), the LU factorization has to be carried out only once. In every subsequent step only the forward and backward substitutions remain, at low computational cost. For iterative methods we cannot rely on this advantage, but we can expect that x^{(k+1)} is close to x^{(k)}, and consequently δ^{(k,0)} = 0 constitutes a good initial guess. Accordingly, the assembling of the matrix becomes more important with respect to the overall computational cost, and savings during the assembling become relevant.

We get a system of equations similar to (8.19) by applying the chord method (see Exercise 8.3), where the linear approximation of the initial iterate is maintained, i.e.,

    Df(x^{(0)}) δ^{(k)} = −f(x^{(k)}) .    (8.20)
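A sketch of the chord method (8.20): the Jacobian of the initial iterate is assembled (and, with a direct solver, factored) once and then reused in every step, at the price of merely linear convergence. The 2×2 test system, starting value, and tolerances are illustrative assumptions:

```python
import numpy as np

def chord_method(f, Df, x0, tol=1e-10, maxit=200):
    """Chord method (8.20): keep the Jacobian of the initial iterate,
    Df(x^(0)) delta^(k) = -f(x^(k)). In practice one would compute an LU
    factorization of Df(x^(0)) once and reuse it; here the reuse is only
    indicated by fixing the matrix J0."""
    x = np.asarray(x0, dtype=float)
    J0 = Df(x)                    # assembled (and conceptually factored) once
    for _ in range(maxit):
        delta = np.linalg.solve(J0, -f(x))
        x = x + delta
        if np.linalg.norm(delta) <= tol:
            return x
    raise RuntimeError("chord method did not converge")

# Hypothetical test system: x1^2 + x2^2 = 4, x1*x2 = 1.
f = lambda x: np.array([x[0]**2 + x[1]**2 - 4.0, x[0]*x[1] - 1.0])
Df = lambda x: np.array([[2*x[0], 2*x[1]], [x[1], x[0]]])
x = chord_method(f, Df, [2.0, 0.5])
```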

If the matrix B(x^{(k)}), which approximates Df(x^{(k)}), changes in each iteration step, i.e.,

    B(x^{(k)}) δ^{(k)} = −f(x^{(k)}) ,    (8.21)

then the only advantage can be a possibly easier assembling or solvability of the system of equations. If the partial derivatives ∂_j f_i(x) are more difficult to evaluate than the function f_i(y) itself (or possibly not evaluable at all), then the approximation of Df(x^{(k)}) by difference quotients can be taken into consideration. This corresponds to

    B(x^{(k)}) e_j = (1/h) ( f(x + h e_j) − f(x) )    (8.22)

for column j of B(x^{(k)}) with a fixed h > 0. The number of computations for the assembling of the matrix remains the same: m² for the full matrix and analogously for the sparse matrix (see Exercise 8.6). Observe that numerical differentiation is an ill-posed problem, which means that we should ideally choose h ∼ δ^{1/2}, where δ > 0 is the error level in the evaluation of f. Even then we can merely expect

    ‖Df(x^{(k)}) − B(x^{(k)})‖ ≤ C δ^{1/2}

(see [22, pp. 80 f.]). Thus in the best case we can expect only half of the significant digits of the machine precision.

The second aspect of facilitated solvability of (8.21) occurs if there appear "small" entries in the Jacobian, due to a problem-dependent weak coupling of the components, and these entries may be skipped. Take, for example, a Df(x^{(k)}) with a block structure as in (5.39):

    Df(x^{(k)}) = (A_{ij})_{ij} ,  A_{ij} ∈ ℝ^{m_i, m_j} ,

such that the blocks A_{ij} may be neglected for j > i. Then there results a nested system of equations of the dimensions m₁, m₂, ..., m_p.
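A columnwise difference approximation of the Jacobian as in (8.22) is easy to sketch. Following the remark on ill-posedness, h is taken of the order of the square root of the error level in the evaluation of f, here the machine precision (the test function is a hypothetical 2×2 example):

```python
import numpy as np

def fd_jacobian(f, x, h=None):
    """Approximate Df(x) columnwise by forward differences as in (8.22):
    B e_j = (f(x + h e_j) - f(x)) / h, with h of the order delta^(1/2)."""
    x = np.asarray(x, dtype=float)
    if h is None:
        h = np.sqrt(np.finfo(float).eps)   # h ~ delta^(1/2)
    fx = f(x)
    B = np.empty((fx.size, x.size))
    for j in range(x.size):
        e = np.zeros_like(x)
        e[j] = 1.0
        B[:, j] = (f(x + h * e) - fx) / h
    return B

f = lambda x: np.array([x[0]**2 + x[1]**2 - 4.0, x[0]*x[1] - 1.0])
x0 = np.array([2.0, 0.5])
B = fd_jacobian(f, x0)
exact = np.array([[4.0, 1.0], [0.5, 2.0]])   # Df(x0) computed by hand
```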

The possible advantages of such simplified Newton's methods have to be weighed against the disadvantage of a deterioration in the order of convergence: Instead of an estimate like that in Theorem 8.11, (3), we have to expect an additional term

    ‖B(x^{(k)}) − Df(x^{(k)})‖ ‖x^{(k)} − x̄‖ .

This means only linear or, by successive improvement of the approximation, superlinear convergence (see [22, pp. 75 ff.]). If we have a good initial iterate, it may often be advantageous to perform a small number of steps of Newton's method. So in the following we will treat again Newton's method, although the subsequent considerations can also be transferred to its modifications.

If the linear problems (8.12) are solved with an iterative scheme, we have the possibility to adjust the accuracy of the algorithm in order to reduce the number of inner iterations, without a (severe) deterioration of the convergence of the outer Newton iteration. In such inexact Newton's methods we determine, instead of δ^{(k)} from (8.12), only a δ̃^{(k)} that fulfils (8.12) up to an inner residual r^{(k)}, i.e.,

    Df(x^{(k)}) δ̃^{(k)} = −f(x^{(k)}) + r^{(k)} .

The new iterate is given by

    x^{(k+1)} := x^{(k)} + δ̃^{(k)} .

The accuracy of δ̃^{(k)} is controlled by the requirement

    ‖r^{(k)}‖ ≤ η_k ‖f(x^{(k)})‖    (8.23)

with certain properties for the sequence (η_k)_k that still have to be determined. Since the natural choice of the initial iterate for solving (8.12) is δ^{(k,0)} = 0, (8.23) corresponds to the termination criterion (5.15).

Conditions for η_k can be deduced from the following theorem:

Theorem 8.13 Let (8.17) hold and consider compatible matrix and vector norms. Then for every ε > 0 there exists a δ > 0 such that for x^{(k)} ∈ B_δ(x̄),

    ‖x^{(k+1)} − x̄‖ ≤ ‖x^{(k)} − Df(x^{(k)})^{−1} f(x^{(k)}) − x̄‖
        + (1 + ε) κ(Df(x̄)) η_k ‖x^{(k)} − x̄‖ .    (8.24)

Proof: By the choice of δ we can ensure the nonsingularity of Df(x^{(k)}). From

    δ̃^{(k)} = −Df(x^{(k)})^{−1} f(x^{(k)}) + Df(x^{(k)})^{−1} r^{(k)}

it follows that

    ‖x^{(k+1)} − x̄‖ = ‖x^{(k)} − x̄ + δ̃^{(k)}‖
        ≤ ‖x^{(k)} − x̄ − Df(x^{(k)})^{−1} f(x^{(k)})‖ + ‖Df(x^{(k)})^{−1} r^{(k)}‖ .

The assertion can be deduced from the estimate

    ‖Df(x^{(k)})^{−1} r^{(k)}‖ ≤ (1 + ε)^{1/2} ‖Df(x̄)^{−1}‖ ‖r^{(k)}‖
        ≤ (1 + ε)^{1/2} ‖Df(x̄)^{−1}‖ η_k (1 + ε)^{1/2} ‖Df(x̄)‖ ‖x^{(k)} − x̄‖ .

Here we used Exercise 8.4 (2), (3) and (8.23).    □

The first part of the estimate corresponds to the error of the exact Newton step, which can be estimated, using the same argument as in Theorem 8.11, (3) (with Exercise 8.4, (2)), by

    ‖x^{(k)} − Df(x^{(k)})^{−1} f(x^{(k)}) − x̄‖ ≤ (1 + ε)^{1/2} ‖Df(x̄)^{−1}‖ (γ/2) ‖x^{(k)} − x̄‖² .

This implies the following result:

Corollary 8.14 Let the assumptions of Theorem 8.13 be satisfied. Then there exist δ > 0 and η̄ > 0 such that for x^{(0)} ∈ B_δ(x̄) and η_k ≤ η̄ for all k ∈ ℕ the inexact Newton's method satisfies the following:

(1) The sequence (x^{(k)})_k converges linearly to x̄.

(2) If η_k → 0 for k → ∞, then (x^{(k)})_k converges superlinearly.

(3) If η_k ≤ K ‖f(x^{(k)})‖ for a K > 0, then (x^{(k)})_k converges quadratically.

Proof: Exercise 8.5.    □

The estimate (8.24) suggests choosing a very fine level of accuracy η̄ for the inner iteration in order to guarantee the above convergence statements. This is particularly true for an ill-conditioned Df(x̄) (which is common for discretization matrices: see (5.34)). In fact, the analysis in the weighted norm ‖ · ‖_* := ‖Df(x̄) · ‖ shows that only η_k ≤ η̄ < 1 has to be ensured (cf. [22, pp. 97 ff.]). With this and on the basis of

    η_k = α ‖f(x^{(k)})‖² / ‖f(x^{(k−1)})‖²

for some α ≤ 1 we can construct the η_k in an adaptive way (see [22, p. 105]).

Most of the iterative methods introduced in Chapter 5 do not require the explicit knowledge of the matrix Df(x^{(k)}). It suffices that the operation Df(x^{(k)}) y be feasible for vectors y, in general for fewer than m of them; i.e., the directional derivative of f at x^{(k)} in the direction y is needed. Thus, in case a difference scheme for the derivatives of f should be necessary or reasonable, it is more convenient to choose directly a difference scheme for the directional derivative.

Since we cannot expect convergence of Newton's method in general, we require indicators for the convergence behaviour of the iteration. The solution x̄ is in particular also the solution of

    Minimize ‖f(x)‖² for x ∈ ℝ^m .

Let x^{(0)}, τ > 0, η₀, Θ̄ ∈ (0, 1), k = 0, i = 0 be given.

(1) δ̃^{(k,0)} := 0 , i := 1 .

(2) Determine the ith iterate δ̃^{(k,i)} for Df(x^{(k)}) δ̃^{(k)} = −f(x^{(k)}) and calculate

    r^{(i)} := Df(x^{(k)}) δ̃^{(k,i)} + f(x^{(k)}) .

(3) If ‖r^{(i)}‖ ≤ η_k ‖f(x^{(k)})‖, then go to (4), else set i := i + 1 and go to (2).

(4) δ̃^{(k)} := δ̃^{(k,i)} .

(5) x^{(k+1)} := x^{(k)} + δ̃^{(k)} .

(6) If ‖f(x^{(k+1)})‖ > Θ̄ ‖f(x^{(k)})‖, interrupt.

(7) If ‖f(x^{(k+1)})‖ ≤ τ ‖f(x^{(0)})‖, end. Else calculate η_{k+1}, set k := k + 1, and go to (1).

Table 8.1. Inexact Newton's method with monotonicity test.

Thus we could expect a descent of the sequence of iterates (x^{(k)}) in this functional, i.e.,

    ‖f(x^{(k+1)})‖ ≤ Θ̄ ‖f(x^{(k)})‖

for a Θ̄ < 1. If this monotonicity test is not fulfilled, the iteration is terminated. Such an example of an inexact Newton's method is given in Table 8.1.
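The scheme of Table 8.1 can be sketched as follows. The inner iteration is here a simple gradient iteration on the normal equations (any linear iterative solver could take its place), the forcing terms η_k are frozen at a constant value, and the test system, parameters, and tolerances are illustrative assumptions:

```python
import numpy as np

def inner_solve(A, b, eta, maxit=1000):
    """Inner iteration for A*delta = b (steps (1)-(3) of Table 8.1): a
    simple gradient iteration on the normal equations, run only until the
    inner residual satisfies ||A*delta - b|| <= eta*||b||, cf. (8.23)."""
    delta = np.zeros_like(b)          # natural initial guess delta^(k,0) = 0
    r = b - A @ delta
    for _ in range(maxit):
        if np.linalg.norm(r) <= eta * np.linalg.norm(b):
            return delta
        g = A.T @ r                   # descent direction for ||A*delta - b||^2
        alpha = (g @ g) / ((A @ g) @ (A @ g))
        delta = delta + alpha * g
        r = b - A @ delta
    raise RuntimeError("inner iteration failed")

def inexact_newton(f, Df, x0, tau=1e-8, eta=0.1, theta=0.9, maxit=50):
    """Inexact Newton's method with monotonicity test as in Table 8.1,
    with a constant forcing term eta_k := eta."""
    x = np.asarray(x0, dtype=float)
    f0 = np.linalg.norm(f(x))
    for _ in range(maxit):
        delta = inner_solve(Df(x), -f(x), eta)
        x_new = x + delta                               # step (5)
        if np.linalg.norm(f(x_new)) > theta * np.linalg.norm(f(x)):
            raise RuntimeError("monotonicity test violated")   # step (6)
        x = x_new
        if np.linalg.norm(f(x)) <= tau * f0:            # step (7)
            return x
    raise RuntimeError("no convergence")

# Hypothetical test system: x1^2 + x2^2 = 4, x1*x2 = 1.
f = lambda x: np.array([x[0]**2 + x[1]**2 - 4.0, x[0]*x[1] - 1.0])
Df = lambda x: np.array([[2*x[0], 2*x[1]], [x[1], x[0]]])
x = inexact_newton(f, Df, [2.0, 0.5])
```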

In order to avoid the termination of the method due to divergence, continuation methods have been developed. They attribute the problem f(x) = 0 to a family of problems in order to provide successively good initial iterates. The approach presented at the end of Section 8.3 for time-dependent problems is similar to the continuation methods. Another approach (which can be combined with the latter) modifies the (inexact) Newton's method so that the range of convergence is enlarged: Applying the damped (inexact) Newton's method means reducing the step length from x^{(k)} to x^{(k+1)} as long as we observe a decrease conformable to the monotonicity test. One strategy of damping, termed Armijo's rule, is described in Table 8.2 and replaces the steps (1), (5), and (6) in Table 8.1.

Thus damping Newton's method also means a relaxation similar to (5.30), where ω = λ_k is adjusted to the iteration step as in (5.41). In the formulation of Table 8.2 the iteration may eventually not terminate if in (5) λ_k is successively reduced. This must be avoided in a practical implementation of the method. But except for situations where divergence is obvious, this situation will not appear, because we have the following theorem:

Let additionally α, β ∈ (0, 1) be given.

(1) δ̃^{(k,0)} := 0 , i := 1 , λ_k := 1 .

(5) If ‖f(x^{(k)} + λ_k δ̃^{(k)})‖ ≥ (1 − α λ_k) ‖f(x^{(k)})‖, set λ_k := β λ_k and go to (5).

(6) x^{(k+1)} := x^{(k)} + λ_k δ̃^{(k)} .

Table 8.2. Damped inexact Newton step according to Armijo's rule.
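A sketch of the damped Newton's method with Armijo's rule (for simplicity with an exact inner solve, i.e., δ̃^{(k)} = δ^{(k)}). The scalar example f(x) = arctan x is a classical case where the undamped method diverges for starting values that are too large, while the damped variant converges; all parameter values are illustrative assumptions:

```python
import numpy as np

def damped_newton(f, Df, x0, alpha=1e-4, beta=0.5, tau=1e-10, maxit=50):
    """Damped Newton's method with Armijo's rule (Table 8.2): reduce the
    step length lambda_k by the factor beta until
    ||f(x^(k) + lambda_k*delta^(k))|| < (1 - alpha*lambda_k)*||f(x^(k))||."""
    x = np.asarray(x0, dtype=float)
    f0 = np.linalg.norm(f(x))
    for _ in range(maxit):
        fx = f(x)
        delta = np.linalg.solve(Df(x), -fx)
        lam = 1.0
        while np.linalg.norm(f(x + lam * delta)) >= (1 - alpha * lam) * np.linalg.norm(fx):
            lam = beta * lam                  # step (5) of Table 8.2
            if lam < 1e-12:
                raise RuntimeError("step size reduction failed")
        x = x + lam * delta                   # step (6)
        if np.linalg.norm(f(x)) <= tau * f0:
            return x
    raise RuntimeError("no convergence")

# For f(x) = arctan(x) the undamped Newton iteration diverges from
# x^(0) = 3, while the damped variant converges to the zero x = 0.
f = lambda x: np.array([np.arctan(x[0])])
Df = lambda x: np.array([[1.0 / (1.0 + x[0]**2)]])
x = damped_newton(f, Df, [3.0])
```

Note the safeguard on λ_k: in exact arithmetic the loop in step (5) could in principle fail to terminate, which a practical implementation must rule out, as remarked above.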

Theorem 8.15 Let α, β, γ > 0 exist such that conditions (i), (ii) of Theorem 8.11 hold on ∪_{k∈ℕ} B_r(x^{(k)}) for the sequence (x^{(k)})_k defined according to Table 8.2. Let η_k ≤ η̄ for an η̄ < 1 − α.

Then if f(x^{(0)}) ≠ 0, there exists a λ̄ > 0 such that λ_k ≥ λ̄ for all k ∈ ℕ. If furthermore (x^{(k)})_k is bounded, then there exists a zero x̄, satisfying (8.17), and

    x^{(k)} → x̄ for k → ∞ .

There exists a k₀ ∈ ℕ such that for k ≥ k₀ the relation

    λ_k = 1

holds.

Proof: See [22, pp. 139 ff.].    □

We see that in the final stage of the iteration we again deal with the (inexact) Newton's method with the previously described convergence behaviour.

Finally, the following should be mentioned: The problem f(x) = 0 and Newton's method are affine-invariant in the sense that a transition to Af(x) = 0 with a nonsingular A ∈ ℝ^{m,m} changes neither the problem nor the iteration method, since

    D(Af)(x)^{−1} Af(x) = Df(x)^{−1} f(x) .

Among the assumptions of Theorem 8.11, (8.14) is not affine-invariant. A possible alternative would be

    ‖Df(y)^{−1} (Df(x) − Df(y))‖ ≤ γ ‖x − y‖ ,

which fulfils the requirement. As in the proof of Lemma 8.10 it follows that

    ‖Df(y)^{−1} (f(x) − f(y) − Df(y)(x − y))‖ ≤ (γ/2) ‖x − y‖² .

With this argument a similar variant of Theorem 8.11 can be proven.


The test of monotonicity is not affine-invariant, so probably the natural test of monotonicity

\|Df(x^{(k)})^{-1} f(x^{(k+1)})\| \le \bar\Theta \, \|Df(x^{(k)})^{-1} f(x^{(k)})\|

has to be preferred. The vector on the right-hand side has already been calculated, being, except for the sign, the Newton correction \delta^{(k)}. But for the vector on the left-hand side, -\bar\delta^{(k+1)}, the system of equations

Df(x^{(k)})\, \bar\delta^{(k+1)} = -f(x^{(k+1)})

additionally has to be resolved.

Exercises

8.3 Consider the chord method as described in (8.20). Prove the convergence of this method to the solution \bar x under the following assumptions:

(1) Let (8.17) with \bar B_r(\bar x) \subset C hold,
(2) \|Df(x^{(0)})^{-1}\| \le \beta,
(3) 2\beta\gamma r < 1,
(4) x^{(0)} \in \bar B_r(\bar x).

8.4 Let assumption (8.17) hold. Prove for compatible matrix and vector norms that for every \varepsilon > 0 there exists a \delta > 0 such that for every x \in B_\delta(\bar x),

(1) \|Df(x)\| \le (1 + \varepsilon)^{1/2} \|Df(\bar x)\|,
(2) \|Df(x)^{-1}\| \le (1 + \varepsilon)^{1/2} \|Df(\bar x)^{-1}\| (employ \|(I - M)^{-1}\| \le 1/(1 - \|M\|) for \|M\| < 1),
(3) (1 + \varepsilon)^{-1/2} \|Df(\bar x)^{-1}\|^{-1} \|x - \bar x\| \le \|f(x)\| \le (1 + \varepsilon)^{1/2} \|Df(\bar x)\| \, \|x - \bar x\|,
(4) Theorem 8.12.

8.5 Prove Corollary 8.14.

8.6 Let U \subset \mathbb{R}^m be open and convex. Consider problem (8.2) with continuously differentiable f : U \to \mathbb{R}^m. For i = 1, \ldots, m let J_i \subset \{1, \ldots, m\} be defined by

\partial_j f_i(x) = 0 \quad \text{for } j \notin J_i \text{ and every } x \in U.

Then the operator f is sparsely occupied if l_i := |J_i| < m, or sparsely occupied in the strict sense if l_i \le l for all i = 1, \ldots, m and l < m is independent of m for a sequence of problems (8.2) of dimension m.

Then the evaluation of Df(x) and its approximation according to (8.22) both need \sum_{k=1}^m l_k evaluations of \partial_j f_i or of f_i, respectively. What is the computational effort for a difference approximation

\|\delta\| \, \frac{f(x + h\delta/\|\delta\|) - f(x)}{h}

of the directional derivative Df(x)\delta?

8.3 Semilinear Boundary Value Problems for Elliptic and Parabolic Equations

In this section we treat semilinear problems as the simplest nonlinear case, where nonlinearities do not occur in parts containing derivatives. Hence we want to examine differential equations of the form (0.33) that satisfy (0.42) and (0.43).

Stationary Problems

As a stationary problem we consider the differential equation

Lu(x) + \psi(u(x)) = 0 \quad \text{for } x \in \Omega \qquad (8.25)

with the linear elliptic differential operator L according to (3.12) and linear boundary conditions on \partial\Omega according to (3.18)–(3.20). Here \psi : \mathbb{R} \to \mathbb{R} denotes a mapping that is supposed to be continuously differentiable.

A Galerkin discretization in V_h \subset V with H_0^1(\Omega) \subset V \subset H^1(\Omega) according to the type of boundary condition and V_h = \operatorname{span}\{\varphi_1, \ldots, \varphi_M\}, with the approximative solution u_h \in V_h in the representation u_h = \sum_{i=1}^M \xi_i \varphi_i, gives

S\xi + G(\xi) = b \qquad (8.26)

with the stiffness matrix S = (a(\varphi_j, \varphi_i))_{i,j} and a vector b that contains the contributions of the inhomogeneous boundary conditions. Here the nonlinear mapping G : \mathbb{R}^M \to \mathbb{R}^M is defined by

G(\xi) := (G_j(\xi))_j \quad \text{with} \quad G_j(\xi) := \int_\Omega \psi\Big(\sum_{i=1}^M \xi_i \varphi_i\Big)\, \varphi_j \, dx.

Note that this notation differs from that in Section 2.2 and the subsequent chapters: There we denoted S by A_h and b - G(\xi) by q_h. For reasons of brevity we omit the index h.

For the moment we want to suppose that the mapping G can be evaluated exactly. The system of equations (8.26) with

A := S \quad \text{and} \quad g(\xi) := G(\xi) - b

is of the type introduced in (8.18) in the variable \xi. Thus we may apply, besides the Newton iteration, the fixed-point iteration introduced in (8.19), and the variants of Newton's method, namely the modified and inexact versions, with their already discussed advantages and drawbacks. We have to examine the question of how the properties of the matrix will change by the transition from A to A + DG(\bar\xi), where \bar\xi stands for the current iterate.

We have

DG(\bar\xi)_{ij} = \int_\Omega \psi'(\bar u)\, \varphi_i \varphi_j \, dx, \qquad (8.27)

where \bar u = P\bar\xi = \sum_{i=1}^M \bar\xi_i \varphi_i \in V_h denotes the function belonging to the representing vector \bar\xi. This means that DG(\bar\xi) is symmetric and positive semidefinite, respectively definite, if the following condition holds for \alpha = 0, respectively \alpha > 0:

There exists some \alpha \ge 0 such that \psi'(u) \ge \alpha for all u \in \mathbb{R}. \qquad (8.28)

More precisely, if (8.28) is valid, we have for \eta \in \mathbb{R}^M

\eta^T DG(\bar\xi)\, \eta = \int_\Omega \psi'(\bar u)\, |P\eta|^2 \, dx \ge \alpha \, \|P\eta\|_0^2.

For such a monotone nonlinearity the definiteness properties of the stiffness matrix S may be "enforced". If, on the other hand, we want to make use of the properties of an M-matrix that can be ensured by the conditions (1.32) or (1.32)*, then it is not clear whether these properties are conserved after addition of DG(\bar\xi). This is due to the fact that DG(\bar\xi) is a sparse matrix of the same structure as S, but it also entails a spatial coupling that is not contained in the continuous formulation (8.25).

Numerical Quadrature

Owing to the above reason, the use of a node-oriented quadrature rule for the approximation of G(\xi) is suggested, i.e., a quadrature formula of the type

Q(f) := \sum_{i=1}^M \omega_i f(a_i) \quad \text{for } f \in C(\bar\Omega) \qquad (8.29)

with weights \omega_i \in \mathbb{R}. Such a quadrature formula results from

Q(f) := \int_\Omega I(f) \, dx \quad \text{for } f \in C(\bar\Omega), \qquad (8.30)

where

I : C(\bar\Omega) \to V_h, \quad I(f) := \sum_{i=1}^M f(a_i)\, \varphi_i,

is the interpolation operator of the degrees of freedom. For this consideration we thus assume that only Lagrangian elements enter the definition of V_h. In the case of (8.30) the weights in (8.29) are hence given by

\omega_i = \int_\Omega \varphi_i \, dx.

This corresponds to the local description (3.116). More specifically, we get, for example, for the linear approach on simplices, as a generalization of the composite trapezoidal rule,

\omega_i = \sum_{K \in \mathcal{T}_h \text{ with } a_i \in K} \frac{1}{d+1}\, |K|, \qquad (8.31)

with d denoting the spatial dimension and \mathcal{T}_h the underlying triangulation.

Approximation of the mapping G by a quadrature rule of the type (8.29) gives

\tilde G(\xi) = (\tilde G_j(\xi))_j \quad \text{with} \quad \tilde G_j(\xi) = \omega_j \psi(\xi_j),

because of \varphi_j(a_i) = \delta_{ij}. We see that the approximation \tilde G has the property that \tilde G_j depends only on \xi_j. We call such a \tilde G a diagonal field. Qualitatively, this corresponds better to the continuous formulation (8.25) and leads to the fact that D\tilde G(\bar\xi) is diagonal:

D\tilde G(\bar\xi)_{ij} = \omega_j \psi'(\bar\xi_j)\, \delta_{ij}. \qquad (8.32)

If we impose that all quadrature weights \omega_i are positive, which is the case in (8.31) and also in other examples in Section 3.5.2, all of the above considerations about the properties of D\tilde G(\bar\xi) and S + D\tilde G(\bar\xi) remain valid; additionally, if S is an M-matrix, because the conditions (1.32) or (1.32)* are fulfilled, then S + D\tilde G(\bar\xi) remains an M-matrix, too. This is justified by the following fact (compare [34] and [5]; cf. (1.33) for the notation):

If A is an M-matrix and B \ge A with b_{ij} \le 0 for i \ne j, then B is an M-matrix as well. \qquad (8.33)

Conditions of Convergence

Comparing the requirements for the fixed-point iteration and Newton's method stated in the (convergence) Theorems 8.4 and 8.11, we observe that the conditions in Theorem 8.4 can be fulfilled only in special cases, where \|S^{-1} D\tilde G(\bar\xi)\| is small according to a suitable matrix norm (see Lemma 8.3). But it is also difficult to draw general conclusions about requirement (iii) in Theorem 8.11, which together with h < 1 quantifies the closeness of the initial iterate to the solution. The postulation (i), on the other hand, is met for (8.27) and (8.32) if \psi' is Lipschitz continuous (see Exercise 8.7). Concerning the postulation (ii) we have the following: Let \psi be monotone nondecreasing (i.e., (8.28) holds with \alpha \ge 0) and let S be symmetric and positive definite, which is true for a problem without convection terms (compare (3.27)). Then we have in the spectral norm

\|S^{-1}\|_2 = 1/\lambda_{\min}(S).

Here \lambda_{\min}(S) > 0 denotes the smallest eigenvalue of S. Hence

\|(S + DG(\bar\xi))^{-1}\|_2 = 1/\lambda_{\min}(S + DG(\bar\xi)) \le 1/\lambda_{\min}(S) = \|S^{-1}\|_2,

and consequently, requirement (ii) is valid with \beta = \|S^{-1}\|_2.
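To make the preceding discussion concrete, the following sketch applies Newton's method to the quadrature-based system S\xi + \tilde G(\xi) = b. The model problem -u'' + u^3 = 0 on (0,1) with u(0) = 0, u(1) = 1, a uniform P1 grid, and the trapezoidal rule are illustrative assumptions, not taken from the text; by (8.32) the Jacobian S + D\tilde G(\xi) is tridiagonal, and since \psi'(u) = 3u^2 \ge 0 it inherits the M-matrix property of S.

```python
def solve_tridiag(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system."""
    n = len(diag)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0] = upper[0] / diag[0] if n > 1 else 0.0
    dp[0] = rhs[0] / diag[0]
    for i in range(1, n):
        m = diag[i] - lower[i - 1] * cp[i - 1]
        cp[i] = upper[i] / m if i < n - 1 else 0.0
        dp[i] = (rhs[i] - lower[i - 1] * dp[i - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

M = 49
h = 1.0 / (M + 1)
psi, dpsi = (lambda u: u ** 3), (lambda u: 3.0 * u * u)
b = [0.0] * M
b[-1] = 1.0 / h                        # contribution of the boundary value u(1) = 1
xi = [(i + 1) * h for i in range(M)]   # initial iterate: linear interpolant
for _ in range(20):
    # residual F(xi) = S xi + G~(xi) - b with S = (1/h) tridiag(-1, 2, -1)
    F = [2.0 / h * xi[i]
         - (xi[i - 1] / h if i > 0 else 0.0)
         - (xi[i + 1] / h if i < M - 1 else 0.0)
         + h * psi(xi[i]) - b[i] for i in range(M)]
    if max(abs(v) for v in F) < 1e-12:
        break
    # Jacobian S + DG~(xi): only the diagonal changes, by (8.32)
    jac_diag = [2.0 / h + h * dpsi(xi[i]) for i in range(M)]
    delta = solve_tridiag([-1.0 / h] * (M - 1), jac_diag,
                          [-1.0 / h] * (M - 1), [-v for v in F])
    xi = [x + d for x, d in zip(xi, delta)]
```

Since the nonlinearity is monotone, the iterates stay between the boundary values 0 and 1, and a few Newton steps suffice.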

Concerning the choice of the initial iterate, there is no generally successful strategy. We may choose the solution of the linear subproblem, i.e.,

S\xi^{(0)} = b. \qquad (8.34)

Should this fail to yield convergence even with damping, then we may apply, as a generalization of (8.34), the continuation method to the family of problems

f(\lambda, \xi) := S\xi + \lambda G(\xi) - b = 0

with continuation parameter \lambda \in [0,1]. If all these problems have solutions \xi = \xi_\lambda such that D_\xi f(\lambda, \xi) exists and is nonsingular in a neighbourhood of \xi_\lambda, and if there exists a continuous solution trajectory without bifurcation, then [0,1] can be discretized by 0 = \lambda_0 < \lambda_1 < \cdots < \lambda_N = 1, and solutions \xi_{\lambda_i} of f(\lambda_i, \xi) = 0 can be obtained by performing a Newton iteration with the (approximative) solution for \lambda = \lambda_{i-1} as starting iterate. Since the \xi_{\lambda_i} for i < N are just auxiliary means, they should be obtained rather coarsely, i.e., with one or two Newton steps. The stated conditions are fulfilled under the supposition (8.28). If this condition of monotonicity does not hold, we may encounter a bifurcation of the continuous solution (see, for example, [29, pp. 28 ff.]).
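The continuation strategy can be sketched for an assumed scalar toy problem (the values s = 4, b = 1 and the nonlinearity \psi = exp are illustrative choices, not from the text): the family s\xi + \lambda\psi(\xi) - b = 0 is traversed from the linear subproblem at \lambda = 0 to the full problem at \lambda = 1, warm-starting Newton at each level.

```python
import math

s, b = 4.0, 1.0
psi, dpsi = math.exp, math.exp   # monotone: psi' > 0, so no bifurcation

xi = b / s                       # level lambda = 0: the linear subproblem (8.34)
for lam in [0.25, 0.5, 0.75, 1.0]:
    for _ in range(20):          # a few Newton steps per continuation level
        r = s * xi + lam * psi(xi) - b
        xi -= r / (s + lam * dpsi(xi))
residual = s * xi + psi(xi) - b  # residual of the full problem (lambda = 1)
```

In practice one or two Newton steps per intermediate level would suffice, since only the final level must be solved accurately.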

Instationary Problems

The elliptic boundary value problem (8.25) corresponds to the parabolic initial value problem

\partial_t u(x,t) + Lu(x,t) + \psi(u(x,t)) = 0 \quad \text{for } (x,t) \in Q_T \qquad (8.35)

with linear boundary conditions according to (3.18)–(3.20) and the initial condition

u(x,0) = u_0(x) \quad \text{for } x \in \Omega. \qquad (8.36)

We have already met an example for (8.35), (8.36) in (0.32). Analogously to (8.26) and (7.45), the Galerkin discretization in V_h (i.e., the semidiscretization) leads to the nonlinear system of ordinary differential equations

B \frac{d}{dt}\xi(t) + S\xi(t) + G(\xi(t)) = \beta(t) \quad \text{for } t \in (0,T], \qquad \xi(0) = \xi_0,

for the representing vector \xi(t) of the approximation u_h(\cdot, t) = \sum_{i=1}^M \xi_i(t)\varphi_i, where u_{0h} = \sum_{i=1}^M \xi_{0i}\varphi_i is an approximation of the initial value u_0 (see Section 7.2). The matrix B is the mass matrix

B = \big(\langle \varphi_j, \varphi_i \rangle_0\big)_{ij},

and \beta(t) contains the contributions of the inhomogeneous boundary conditions analogously to b in (8.26).

To obtain the fully discrete scheme we use the one-step-theta method as in Section 7.3. Here we allow the time step size \tau_n to vary in each step, in particular determined by a time step control before the execution of the nth time step. So, if the approximation U^n is known for t = t_n, then the approximation U^{n+1} for t = t_{n+1} := t_n + \tau_n is given, in generalization of (7.72), as the solution of

\frac{1}{\tau_n}\big\langle U^{n+1} - U^n, v_h \big\rangle_0 + a\big(\Theta U^{n+1} + (1 - \Theta)U^n, v_h\big) + \big\langle \psi^{n+\Theta}, v_h \big\rangle_0 = \Theta\beta(t_{n+1}) + (1 - \Theta)\beta(t_n). \qquad (8.37)

Here \Theta \in [0,1] is the fixed parameter of implicity. For the choice of \psi^{n+\Theta} we have two possibilities:

\psi^{n+\Theta} = \Theta\psi(U^{n+1}) + (1 - \Theta)\psi(U^n) \qquad (8.38)

or

\psi^{n+\Theta} = \psi\big(\Theta U^{n+1} + (1 - \Theta)U^n\big). \qquad (8.39)

In the explicit case, i.e., \Theta = 0, (8.37) represents a linear system of equations for U^{n+1} (with the system matrix B) and does not have to be treated further here. In the implicit case \Theta \in (0,1] we obtain again a nonlinear system of the type (8.18), i.e.,

A\xi + g(\xi) = 0,

in the variable \xi = \xi^{n+1}, where \xi^{n+1} is the representation vector of U^{n+1}: U^{n+1} = \sum_{i=1}^M \xi_i^{n+1} \varphi_i. Now we have for the variant (8.38),

A := B + \Theta\tau_n S, \qquad (8.40)

g(\xi) := \Theta\tau_n G(\xi) - b, \qquad (8.41)

with

b := \big(B - (1 - \Theta)\tau_n S\big)\xi^n - (1 - \Theta)\tau_n G(\xi^n) + \Theta\beta(t_{n+1}) + (1 - \Theta)\beta(t_n). \qquad (8.42)

For the variant (8.39), g changes to

g(\xi) := \tau_n G\big(\Theta\xi + (1 - \Theta)\xi^n\big) - b,

and in the definition of b the second summation term drops out. The vector \xi^n is the representation vector of the already known approximation U^n.

Numerical Quadrature

As in the stationary case we can approximate g by a quadrature rule of the form (8.29), which leads to

\tilde g(\xi) = \Theta\tau_n \tilde G(\xi) - \tilde b

in the case of (8.38) and to

\tilde g(\xi) = \tau_n \tilde G\big(\Theta\xi + (1 - \Theta)\xi^n\big) - \tilde b

in the case of (8.39). The functional matrices of \tilde g for (8.38) and for (8.39) are thus equal, except for the point at which \psi' is evaluated. Consequently, it suffices in the following to refer to (8.38). Based on the same motivation, a quadrature rule of the form (8.29) can be applied to the mass matrix B. Such a mass lumping results in a diagonal approximation of the mass matrix,

\tilde B = \operatorname{diag}(\omega_i).

In contrast to the stationary case we get the factor \Theta\tau_n in front of the nonlinearity, where the time step size \tau_n may be chosen arbitrarily small. Of course, we have to take into account that the number of time steps necessary to achieve a fixed time T is raised correspondingly. All of the above considerations about the matrix properties of A + D\tilde g(\bar\xi) are conserved, where A is no longer the stiffness matrix, but represents the linear combination (8.40) with the mass matrix. This reduces the requirements concerning the V-ellipticity of a (see (3.27)) and thus the positive definiteness of A. Admittedly, A is not necessarily an M-matrix if S is one, because the conditions (1.32) or (1.32)* are not valid. Here the approximation \tilde B is advantageous, because using nonnegative weights will conserve this property due to (8.33).

Conditions of Convergence

Clear differences arise in answering the question of how to ensure the convergence of the iteration schemes. Even for the fixed-point iteration it is true that the method converges globally if only the time step size \tau_n is chosen small enough. We want to demonstrate this in the following by an example of a quadrature with nonnegative weights in the mass matrix and the nonlinearity. To this end, the Lipschitz constant of A^{-1}\tilde g is estimated according to Lemma 8.3. Let the norm be a matrix norm induced by a p-norm |\cdot|_p and let A be nonsingular. We get

\sup_{\xi \in \mathbb{R}^M} \|A^{-1} D\tilde g(\xi)\| \le \big\|\big(I + \Theta\tau_n \tilde B^{-1} S\big)^{-1}\big\| \, \|\tilde B^{-1}\| \, \Theta\tau_n \sup_{s \in \mathbb{R}} |\psi'(s)| \, \|\tilde B\|

\le \Theta\tau_n \sup_{s \in \mathbb{R}} |\psi'(s)| \, \kappa(\tilde B)\, \big\|\big(I + \Theta\tau_n \tilde B^{-1} S\big)^{-1}\big\|

=: C\tau_n \big\|\big(I + \Theta\tau_n \tilde B^{-1} S\big)^{-1}\big\|.

Thus we assume the boundedness of \psi' on \mathbb{R} (which may even be weakened). For a given \vartheta \in (0,1) choose \tau_n sufficiently small such that

\Theta\tau_n \|\tilde B^{-1} S\| \le \vartheta

holds. With Lemma (A3.11) it follows that

\big\|\big(I + \Theta\tau_n \tilde B^{-1} S\big)^{-1}\big\| \le \frac{1}{1 - \vartheta},

and thus we obtain

\gamma = \frac{C\tau_n}{1 - \vartheta}

as a Lipschitz constant for A^{-1}\tilde g. We see that by choosing \tau_n sufficiently small, the contraction property of A^{-1}\tilde g can be guaranteed. From this fact a (heuristic) step size control can be deduced that reduces the step size and repeats the step when a lack of convergence is detected, and increases the time step size in case of satisfactory convergence.
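This heuristic can be sketched for an assumed scalar model \xi' = -\psi(\xi) with \Theta = 1, B = 1, S = 0 (all choices illustrative, not from the text): each backward step is solved by the fixed-point iteration \xi \leftarrow \xi^n - \tau\psi(\xi), which contracts only if \tau \sup|\psi'| < 1; on failure the step is halved and repeated, and on fast convergence it is enlarged.

```python
def adaptive_backward_euler(psi, xi0, tau0, t_end, tol=1e-10, max_inner=50):
    """Backward Euler with a heuristic step size control driven by the
    convergence behaviour of the inner fixed-point iteration."""
    t, tau, xi, steps = 0.0, tau0, xi0, []
    while t < t_end - 1e-14:
        tau = min(tau, t_end - t)
        while True:                    # repeat the step until it converges
            y = xi
            for it in range(max_inner):
                y_new = xi - tau * psi(y)
                if abs(y_new - y) < tol:
                    break
                y = y_new
            else:                      # lack of convergence:
                tau *= 0.5             # reduce the step size and redo the step
                continue
            break
        xi, t = y_new, t + tau
        steps.append(tau)
        if it < 5:                     # satisfactory convergence: enlarge
            tau = 2.0 * tau
    return xi, steps

xi_T, steps = adaptive_backward_euler(lambda u: u ** 3, xi0=1.0, tau0=1.0, t_end=1.0)
```

Starting from \tau_0 = 1 the inner iteration fails to contract and the control cuts the step until \tau \cdot 3\xi^2 is safely below one, after which the integration proceeds.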

Nevertheless, in general, Newton's method is to be preferred: Here we can expect that the quality of the initial iterate \xi^{(0)} = \xi^n for time step n+1 improves the smaller we choose \tau_n. The step size control mentioned above may thus be employed here, too (in conjunction with the enlargement of the range of convergence via damping). Nonetheless, a problem that can only be settled in numerical practice consists in coordinating the control parameters of the time step control, the damping strategy, and possibly the termination of the inner iteration in such a way that overall an efficient algorithm is obtained.

Exercises

8.7 Study the Lipschitz property of DG defined by (8.27) and of D\tilde G defined by (8.32), provided \psi' is Lipschitz continuous.

8.8 Decide whether A^{-1}g is contractive in the case of (8.40)–(8.42).

8.9 The boundary value problem

-u'' + e^u = 0 \quad \text{in } (0,1), \qquad u(0) = u(1) = 0,

is to be discretized by a finite element method using continuous, piecewise linear functions on equidistant grids. Quadrature is to be done with the trapezoidal rule.

(a) Compute the matrix A_h \in \mathbb{R}^{m,m} and the nonlinear vector-valued function F_h : \mathbb{R}^m \to \mathbb{R}^m in a matrix-vector notation

A_h U_h + F_h(U_h) = 0

of the discretization. Here U_h \in \mathbb{R}^m denotes the vector of unknown nodal values of the approximative solution and, for uniqueness of the representation, the elements of A_h are independent of the discretization parameter h.

(b) Study the convergence of the iterative procedures

(\alpha) \quad (2 + h^2)\, U_h^{(k+1)} = \big((2 + h^2)I - A_h\big) U_h^{(k)} - F_h\big(U_h^{(k)}\big),

(\beta) \quad 2\, U_h^{(k+1)} + F_h\big(U_h^{(k+1)}\big) = (2I - A_h)\, U_h^{(k)}.

9 Discretization Methods for Convection-Dominated Problems

9.1 Standard Methods and Convection-Dominated Problems

As we have seen in the introductory Chapter 0, the modelling of transport and reaction processes in porous media results in differential equations of the form

\partial_t u - \nabla \cdot (K\nabla u - cu) = f,

which is a special case of the form (0.33). Similar equations occur in the modelling of the heat transport in flowing water, the carrier transport in semiconductors, and the propagation of epidemics. These application-specific equations often share the property that their so-called global Péclet number

\mathrm{Pe} := \frac{\|c\|_\infty \operatorname{diam}(\Omega)}{\|K\|_\infty} \qquad (9.1)

is significantly larger than one. For example, representative values range from 25 (transport of a dissolved substance in ground water) up to about 10^7 (modelling of semiconductors). In such cases, the equations are called convection-dominated.

Therefore, in what follows, the Dirichlet boundary value problem introduced in Section 3.2 will be looked at from the point of view of large global Péclet numbers, whereas in Section 9.4, the initial boundary value problem from Chapter 7 will be considered from this aspect.


Let \Omega \subset \mathbb{R}^d denote a bounded domain with a Lipschitz continuous boundary. Given a function f : \Omega \to \mathbb{R}, a function u : \Omega \to \mathbb{R} is to be determined such that

Lu = f \quad \text{in } \Omega, \qquad u = 0 \quad \text{on } \Gamma, \qquad (9.2)

where again

Lu := -\nabla \cdot (K\nabla u) + c \cdot \nabla u + ru,

with sufficiently smooth coefficients

K : \Omega \to \mathbb{R}^{d,d}, \quad c : \Omega \to \mathbb{R}^d, \quad r : \Omega \to \mathbb{R}.

Unfortunately, standard discretization methods (finite difference, finite element, and finite volume methods) fail when applied to convection-dominated equations. At first glance, this seems to be a contradiction to the theory of these methods presented in the preceding chapters, because there we did not have any restriction on the global Péclet number. This apparent contradiction may be explained as follows: On the one hand, the theoretical results are still true for the convection-dominated case, but on the other hand, some assumptions of the statements therein (such as "for sufficiently small h") lack sharpness. This, in turn, may lead to practically unrealistic conditions (cf. the later discussion of the estimate (9.13)). For example, it may happen that the theoretically required step sizes are so small that the resulting discrete problems are too expensive or even untreatable.

So one can ask whether the theory is insufficient or not. The following example will show that this is not necessarily the case.

Example 9.1 Given a constant diffusion coefficient k > 0, consider the boundary value problem

(-ku' + u)' = 0 \quad \text{in } \Omega := (0,1), \qquad u(0) = u(1) - 1 = 0.

Its solution is

u(x) = \frac{1 - \exp(x/k)}{1 - \exp(1/k)}.

A rough sketch of the graph (Figure 9.1) shows that this function has a significant boundary layer at the right boundary of the interval even for the comparatively small global Péclet number Pe = 100. In the larger subinterval (about (0, 0.95)) it is very smooth (nearly constant), whereas in the remaining small subinterval (about (0.95, 1)) the absolute value of its first derivative is large.

[Figure 9.1. Solution for k = 0.01.]

Given an equidistant grid of width h = 1/(M+1), M \in \mathbb{N}, a discretization by means of symmetric difference quotients yields the difference equations

-k\, \frac{u_{i-1} - 2u_i + u_{i+1}}{h^2} + \frac{u_{i+1} - u_{i-1}}{2h} = 0, \quad i \in \{1, \ldots, M\} =: \Lambda,

u_0 = u_{M+1} - 1 = 0.

Collecting the coefficients and multiplying the result by 2h, we arrive at

-\Big(\frac{2k}{h} + 1\Big) u_{i-1} + \frac{4k}{h}\, u_i + \Big(\frac{2k}{h} - 1\Big)(-1)\,(-1)\, u_{i+1} = 0, \quad i \in \Lambda,

that is,

-\Big(\frac{2k}{h} + 1\Big) u_{i-1} + \frac{4k}{h}\, u_i + \Big(1 - \frac{2k}{h}\Big)(-1)\, u_{i+1} = 0

If we make the ansatz u_i = \lambda^i, the difference equations can be solved exactly:

u_i = \frac{1 - \Big(\dfrac{2k+h}{2k-h}\Big)^i}{1 - \Big(\dfrac{2k+h}{2k-h}\Big)^{M+1}}.

In the case 2k < h, which is by no means unrealistic (e.g., for the typical value k = 10^{-7}), the numerical solution oscillates considerably, in contrast to the behaviour of the exact solution u. These oscillations do not disappear until h < 2k is reached, but this condition is very restrictive for small values of k.
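The oscillatory behaviour can be checked directly from the closed-form discrete solution; the concrete values k = 0.005 and M = 9 below are illustrative choices:

```python
def discrete_solution(k, M):
    """Central-difference solution u_i = (1 - q^i)/(1 - q^(M+1)) with
    q = (2k + h)/(2k - h) on the grid of width h = 1/(M + 1)."""
    h = 1.0 / (M + 1)
    q = (2.0 * k + h) / (2.0 * k - h)
    return [(1.0 - q ** i) / (1.0 - q ** (M + 1)) for i in range(M + 2)]

u_bad = discrete_solution(k=0.005, M=9)   # h = 0.1 > 2k: q = -11/9 < 0
u_good = discrete_solution(k=0.5, M=9)    # h = 0.1 < 2k: q > 0

diffs = [u_bad[i + 1] - u_bad[i] for i in range(len(u_bad) - 1)]
oscillating = any(d1 * d2 < 0 for d1, d2 in zip(diffs, diffs[1:]))
monotone = all(u_good[i + 1] > u_good[i] for i in range(len(u_good) - 1))
```

For q < 0 the powers q^i alternate in sign, so neighbouring differences of u_i change sign, whereas for q > 0 the discrete solution increases monotonically from 0 to 1 like the exact solution.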

But even if the condition h < 2k is satisfied, undesirable effects can be observed. For example, in the special case h = k we have at the node a_M = Mh that

u(a_M) = \frac{1 - \exp(Mh/k)}{1 - \exp(1/k)} = \frac{1 - \exp(M)}{1 - \exp(M+1)} = \frac{\exp(-M) - 1}{\exp(-M) - \exp(1)} \to \exp(-1) \quad \text{for } h \to 0,

whereas the numerical solution at this point asymptotically behaves like (note that \lambda = (2k+h)/(2k-h) = 3)

u_M = \frac{1 - \lambda^M}{1 - \lambda^{M+1}} = \frac{\lambda^{-M} - 1}{\lambda^{-M} - \lambda}