square in Figure 3.14, which is affected by a severe nonuniformity of the triangle size. The grid consists of NT = 112 triangles and N = 73 vertices,

120 3. Direct Methods for the Solution of Linear Systems


FIGURE 3.14. Triangulation before (left) and after (right) the regularization

of which Nb = 32 are on the boundary. The size of each of the two linear systems (3.78) is thus equal to 41, and their solution is carried out by the LU factorization of matrix A in its original form (1) and using its sparse format (2), obtained using the reverse Cuthill-McKee reordering algorithm described in Section 3.9.1.

In Figure 3.15 the sparsity patterns of A are displayed, without and with reordering; the integer nz = 237 denotes the number of nonzero entries in the matrix. Notice that in the second case there is a decrease in the bandwidth of the matrix, to which corresponds a large reduction in the operation count, from 61623 to 5552. The final configuration of the grid is displayed in Figure 3.14 (right), which clearly shows the effectiveness of the regularization procedure.


FIGURE 3.15. Sparsity patterns of matrix A without and with reordering (left and right, respectively)


3.15 Exercises

1. For any square matrix A ∈ R^{n×n}, prove the following relations

   (1/n) K2(A) ≤ K1(A) ≤ n K2(A),   (1/n) K∞(A) ≤ K2(A) ≤ n K∞(A),

   (1/n²) K1(A) ≤ K∞(A) ≤ n² K1(A).

They allow us to conclude that if a matrix is ill-conditioned in a certain

norm it remains so even in another norm, up to a factor depending on n.

2. Check that the matrix B ∈ R^{n×n} with bii = 1, bij = −1 if i < j, bij = 0 if i > j, has determinant equal to 1, yet K∞(B) is large (equal to n·2^{n−1}).
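A quick numerical check of this growth (a Python sketch, not part of the book's programs) can invert B by back substitution and evaluate K∞(B) = ‖B‖∞‖B⁻¹‖∞ directly for small n:

```python
# Sketch: confirm K_inf(B) = ||B||_inf * ||B^{-1}||_inf = n * 2^(n-1)
# for b_ii = 1, b_ij = -1 (i < j), b_ij = 0 (i > j). Pure Python, small n.

def inf_norm(M):
    return max(sum(abs(v) for v in row) for row in M)

def inverse_upper(B):
    # Invert an upper triangular matrix by back substitution, column by column.
    n = len(B)
    inv = [[0.0] * n for _ in range(n)]
    for col in range(n):
        x = [0.0] * n
        for i in range(n - 1, -1, -1):
            rhs = 1.0 if i == col else 0.0
            rhs -= sum(B[i][j] * x[j] for j in range(i + 1, n))
            x[i] = rhs / B[i][i]
        for i in range(n):
            inv[i][col] = x[i]
    return inv

def cond_inf(n):
    B = [[1.0 if i == j else (-1.0 if j > i else 0.0) for j in range(n)]
         for i in range(n)]
    return inf_norm(B) * inf_norm(inverse_upper(B))

for n in (2, 5, 8):
    assert abs(cond_inf(n) - n * 2 ** (n - 1)) < 1e-9
```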

3. Prove that K(AB) ≤ K(A)K(B), for any two square nonsingular matrices A, B ∈ R^{n×n}.

4. Given the matrix A ∈ R^{2×2} with a11 = a22 = 1, a12 = γ, a21 = 0, check that for γ ≥ 0, K∞(A) = K1(A) = (1 + γ)². Next, consider the linear system Ax = b where b is such that x = (1 − γ, 1)^T is the solution. Find a bound for ‖δx‖∞/‖x‖∞ in terms of ‖δb‖∞/‖b‖∞ when δb = (δ1, δ2)^T. Is the problem well- or ill-conditioned?

5. Consider the matrix A ∈ R^{n×n}, with entries aij = 1 if i = j or j = n, aij = −1 if i > j, zero otherwise. Show that A admits an LU factorization, with |lij| ≤ 1 and unn = 2^{n−1}.

6. Consider matrix (3.68) in Example 3.7. Prove that the matrices L and U have entries that are very large in modulus. Check that using GEM with complete pivoting yields the exact solution.

7. Devise a variant of GEM that transforms a nonsingular matrix A ∈ R^{n×n} directly into a diagonal matrix D. This process is commonly known as the Gauss-Jordan method. Find the Gauss-Jordan transformation matrices Gi, i = 1, ..., n, such that Gn ... G1 A = D.

8. Let A be a sparse matrix of order n. Prove that the computational cost of the LU factorization of A is given by (3.61). Prove also that it is always less than

   (1/2) Σ_{k=1}^{n} mk(A) (mk(A) + 3).

9. Prove that, if A is a symmetric and positive definite matrix, solving the linear system Ax = b amounts to computing x = Σ_{i=1}^{n} (ci/λi) vi, where λi are the eigenvalues of A and vi are the corresponding eigenvectors.

10. (From [JM92]). Consider the following linear system

    [ 1001 1000 ] [ x1 ]   [ b1 ]
    [ 1000 1001 ] [ x2 ] = [ b2 ].

Using Exercise 9, explain why, when b = (2001, 2001)^T, a small change δb = (1, 0)^T produces large variations in the solution, while, conversely,


when b = (1, −1)^T, a small variation δx = (0.001, 0)^T in the solution induces a large change in b.
[Hint: expand the right-hand side on the basis of the eigenvectors of the matrix.]

11. Characterize the fill-in for a matrix A ∈ R^{n×n} having nonzero entries only on the main diagonal and on the first column and last row. Propose a permutation that minimizes the fill-in.
[Hint: it suffices to exchange the first row and the first column with the last row and the last column, respectively.]

12. Consider the linear system Hn x = b, where Hn is the Hilbert matrix of order n. Estimate, as a function of n, the maximum number of significant digits that can be expected when solving the system by GEM.

13. Given the vectors

    v1 = [1, 1, 1, −1]^T ,  v2 = [2, −1, −1, 1]^T ,
    v3 = [0, 3, 3, −3]^T ,  v4 = [−1, 2, 2, 1]^T ,

generate an orthonormal system using the Gram-Schmidt algorithm, in both its standard and modified versions, and compare the results obtained. What is the dimension of the space generated by the given vectors?
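A minimal modified Gram-Schmidt sketch in Python (not the book's solution; the standard version differs only in computing all coefficients against the original vector rather than the updated one) makes the rank deficiency visible:

```python
# Sketch: modified Gram-Schmidt on the four vectors of Exercise 13.
# Vectors whose orthogonal remainder is ~0 are dropped, so the size of
# the resulting orthonormal system is the dimension of the span.
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def mgs(vectors, tol=1e-10):
    basis = []
    for v in vectors:
        w = [float(x) for x in v]
        for q in basis:
            c = dot(q, w)                        # coefficient against the updated w
            w = [wi - c * qi for wi, qi in zip(w, q)]
        nw = math.sqrt(dot(w, w))
        if nw > tol:
            basis.append([wi / nw for wi in w])
    return basis

v1 = [1, 1, 1, -1]; v2 = [2, -1, -1, 1]
v3 = [0, 3, 3, -3]; v4 = [-1, 2, 2, 1]
Q = mgs([v1, v2, v3, v4])
assert len(Q) == 3                 # v3 = 2*v1 - v2, so the span has dimension 3
assert abs(dot(Q[0], Q[1])) < 1e-12
```

In exact arithmetic standard and modified Gram-Schmidt coincide; in floating point the modified version loses much less orthogonality, which is the point of the comparison asked for above.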

14. Prove that if A = QR then

    (1/n) K1(A) ≤ K1(R) ≤ n K1(A),

while K2(A) = K2(R).

15. Let A ∈ R^{n×n} be a nonsingular matrix. Determine the conditions under which the ratio ‖y‖2/‖x‖2, with x and y as in (3.70), approximates ‖A^{−1}‖2.
[Solution: let UΣV^T be the singular value decomposition of A. Denote by ui, vi the column vectors of U and V, respectively, and expand the vector d in (3.70) on the basis spanned by {vi}. Then d = Σ_{i=1}^{n} d̃i vi and, from (3.70), x = Σ_{i=1}^{n} (d̃i/σi) ui, y = Σ_{i=1}^{n} (d̃i/σi²) vi, having denoted the singular values of A by σ1, ..., σn. The ratio

    ‖y‖2/‖x‖2 = ( Σ_{i=1}^{n} (d̃i/σi²)² / Σ_{i=1}^{n} (d̃i/σi)² )^{1/2}

is about equal to σn^{−1} = ‖A^{−1}‖2 if: (i) y has a relevant component in the direction of vn (i.e., if d̃n is not excessively small), and (ii) the ratio d̃n/σn is not negligible with respect to the ratios d̃i/σi for i = 1, ..., n − 1. This last circumstance certainly occurs if A is ill-conditioned in the ‖·‖2-norm, since σn ≪ σ1.]

4
Iterative Methods for Solving Linear Systems

Iterative methods formally yield the solution x of a linear system after an infinite number of steps. At each step they require the computation of the residual of the system. In the case of a full matrix, their computational cost is therefore of the order of n² operations for each iteration, to be compared with an overall cost of the order of 2n³/3 operations needed by

direct methods. Iterative methods can therefore become competitive with

direct methods provided the number of iterations that are required to con-

verge (within a prescribed tolerance) is either independent of n or scales

sublinearly with respect to n.

In the case of large sparse matrices, as discussed in Section 3.9, direct methods may be inconvenient due to the dramatic fill-in, although extremely efficient direct solvers can be devised for sparse matrices featuring special structures like, for example, those encountered in the approximation of partial differential equations (see Chapters 12 and 13).

Finally, we notice that, when A is ill-conditioned, a combined use of direct

and iterative methods is made possible by preconditioning techniques that

will be addressed in Section 4.3.2.

4.1 On the Convergence of Iterative Methods

The basic idea of iterative methods is to construct a sequence of vectors

x(k) that enjoy the property of convergence

    x = lim_{k→∞} x(k),    (4.1)


where x is the solution to (3.2). In practice, the iterative process is stopped at the minimum value of n such that ‖x(n) − x‖ < ε, where ε is a fixed tolerance and ‖·‖ is any convenient vector norm. However, since the exact solution is obviously not available, it is necessary to introduce suitable stopping criteria to monitor the convergence of the iteration (see Section 4.6).

To start with, we consider iterative methods of the form

    x(0) given,  x(k+1) = Bx(k) + f,  k ≥ 0,    (4.2)

having denoted by B an n × n square matrix called the iteration matrix and by f a vector that is obtained from the right-hand side b.

Definition 4.1 An iterative method of the form (4.2) is said to be consistent with (3.2) if f and B are such that x = Bx + f. Equivalently, f = (I − B)A^{−1}b.

Having denoted by

    e(k) = x(k) − x    (4.3)

the error at the k-th step of the iteration, the condition for convergence (4.1) amounts to requiring that lim_{k→∞} e(k) = 0 for any choice of the initial datum x(0) (often called the initial guess).

Consistency alone does not suffice to ensure the convergence of the iterative method (4.2), as shown in the following example.

Example 4.1 To solve the linear system 2Ix = b, consider the iterative method

    x(k+1) = −x(k) + b,

which is obviously consistent. This scheme is not convergent for any choice of the initial guess. If, for instance, x(0) = 0, the method generates the sequence x(2k) = 0, x(2k+1) = b, k = 0, 1, .... On the other hand, if x(0) = (1/2)b the method is convergent. •
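The oscillation is easy to reproduce; the following Python sketch (not part of the book's programs) iterates the scheme from x(0) = 0 and checks the fixed point x(0) = b/2:

```python
# Sketch of Example 4.1: for 2Ix = b, the consistent scheme
# x(k+1) = -x(k) + b oscillates between 0 and b when x(0) = 0,
# while x(0) = b/2 is the fixed point (the exact solution).
b = [3.0, -1.0]   # hypothetical right-hand side

def step(x):
    return [-xi + bi for xi, bi in zip(x, b)]

x = [0.0, 0.0]
seq = [x]
for _ in range(4):
    x = step(x)
    seq.append(x)
# even iterates are 0, odd iterates are b: no convergence
assert seq[2] == [0.0, 0.0] and seq[3] == b

half = [bi / 2 for bi in b]        # exact solution of 2Ix = b
assert step(half) == half           # stationary: the method converges trivially
```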

Theorem 4.1 Let (4.2) be a consistent method. Then, the sequence of vectors x(k) converges to the solution of (3.2) for any choice of x(0) iff ρ(B) < 1.

Proof. From (4.3) and the consistency assumption, the recursive relation e(k+1) = Be(k) is obtained. Therefore,

    e(k) = B^k e(0),  ∀k = 0, 1, ...    (4.4)


Thus, thanks to Theorem 1.5, it follows that lim_{k→∞} B^k e(0) = 0 for any e(0) iff ρ(B) < 1.
Conversely, suppose that ρ(B) > 1; then there exists at least one eigenvalue λ(B) with modulus greater than 1. Let e(0) be an eigenvector associated with λ; then Be(0) = λe(0) and, therefore, e(k) = λ^k e(0). As a consequence, e(k) cannot tend to 0 as k → ∞, since |λ| > 1. □

From (1.23) and Theorem 1.5 it follows that a sufficient condition for convergence to hold is that ‖B‖ < 1, for any matrix norm. It is reasonable to expect that the convergence is faster when ρ(B) is smaller, so that an estimate of ρ(B) might provide a sound indication of the convergence of the algorithm. Other remarkable quantities in convergence analysis are contained in the following definition.

Definition 4.2 Let B be the iteration matrix. We call:

1. ‖B^m‖ the convergence factor after m steps of the iteration;
2. ‖B^m‖^{1/m} the average convergence factor after m steps;
3. Rm(B) = −(1/m) log ‖B^m‖ the average convergence rate after m steps.

These quantities are too expensive to compute since they require evaluating B^m. Therefore, it is usually preferred to estimate the asymptotic convergence rate, which is defined as

    R(B) = lim_{k→∞} Rk(B) = −log ρ(B),    (4.5)

where Property 1.13 has been accounted for. In particular, if B were symmetric, we would have

    Rm(B) = −(1/m) log ‖B^m‖2 = −log ρ(B).

In the case of nonsymmetric matrices, ρ(B) sometimes provides an overoptimistic estimate of ‖B^m‖^{1/m} (see [Axe94], Section 5.1). Indeed, although ρ(B) < 1, the convergence to zero of the sequence ‖B^m‖ might be nonmonotone (see Exercise 1). We finally notice that, due to (4.5), ρ(B) is the asymptotic convergence factor. Criteria for estimating the quantities defined so far will be addressed in Section 4.6.

Remark 4.1 The iterations introduced in (4.2) are a special instance of iterative methods of the form

    x(0) = f0(A, b),
    x(n+1) = f_{n+1}(x(n), x(n−1), ..., x(n−m), A, b),  for n ≥ m,


where fi and x(m), ..., x(1) are given functions and vectors, respectively. The number of steps which the current iteration depends on is called the order of the method. If the functions fi are independent of the step index i, the method is called stationary, otherwise it is nonstationary. Finally, if fi depends linearly on x(0), ..., x(m), the method is called linear, otherwise it is nonlinear.
In the light of these definitions, the methods considered so far are therefore stationary linear iterative methods of first order. In Section 4.3, examples of nonstationary linear methods will be provided.

4.2 Linear Iterative Methods

A general technique to devise consistent linear iterative methods is based on an additive splitting of the matrix A of the form A = P − N, where P and N are two suitable matrices and P is nonsingular. For reasons that will be clear in the later sections, P is called the preconditioning matrix or preconditioner.

Precisely, given x(0), one can compute x(k) for k ≥ 1 by solving the systems

    Px(k+1) = Nx(k) + b,  k ≥ 0.    (4.6)

The iteration matrix of method (4.6) is B = P^{−1}N, while f = P^{−1}b. Alternatively, (4.6) can be written in the form

    x(k+1) = x(k) + P^{−1}r(k),    (4.7)

where

    r(k) = b − Ax(k)    (4.8)

denotes the residual vector at step k. Relation (4.7) outlines the fact that a linear system, with coefficient matrix P, must be solved to update the solution at step k + 1. Thus P, besides being nonsingular, ought to be easily invertible, in order to keep the overall computational cost low. (Notice that, if P were equal to A and N = 0, method (4.7) would converge in one iteration, but at the same cost as a direct method.)

Let us mention two results that ensure convergence of the iteration (4.7), provided suitable conditions on the splitting of A are fulfilled (for their proof, we refer to [Hac94]).

Property 4.1 Let A = P − N, with A and P symmetric and positive definite. If the matrix 2P − A is positive definite, then the iterative method defined in (4.7) is convergent for any choice of the initial datum x(0) and

    ρ(B) = ‖B‖A = ‖B‖P < 1.


Moreover, the convergence of the iteration is monotone with respect to the norms ‖·‖P and ‖·‖A (i.e., ‖e(k+1)‖P < ‖e(k)‖P and ‖e(k+1)‖A < ‖e(k)‖A, k = 0, 1, ...).

Property 4.2 Let A = P − N with A symmetric and positive definite. If the matrix P + P^T − A is positive definite, then P is invertible, the iterative method defined in (4.7) is monotonically convergent with respect to the norm ‖·‖A, and ρ(B) ≤ ‖B‖A < 1.

4.2.1 Jacobi, Gauss-Seidel and Relaxation Methods

In this section we consider some classical linear iterative methods.

If the diagonal entries of A are nonzero, we can single out in each equation the corresponding unknown, obtaining the equivalent linear system

    xi = (1/aii) [ bi − Σ_{j=1, j≠i}^{n} aij xj ],  i = 1, ..., n.    (4.9)

In the Jacobi method, once an arbitrary initial guess x(0) has been chosen, x(k+1) is computed by the formulae

    xi(k+1) = (1/aii) [ bi − Σ_{j=1, j≠i}^{n} aij xj(k) ],  i = 1, ..., n.    (4.10)

This amounts to performing the following splitting for A:

    P = D,  N = D − A = E + F,

where D is the diagonal matrix of the diagonal entries of A, E is the lower triangular matrix of entries eij = −aij if i > j, eij = 0 if i ≤ j, and F is the upper triangular matrix of entries fij = −aij if j > i, fij = 0 if j ≤ i. As a consequence, A = D − (E + F).
The iteration matrix of the Jacobi method is thus given by

    BJ = D^{−1}(E + F) = I − D^{−1}A.    (4.11)

A generalization of the Jacobi method is the over-relaxation method (or JOR), in which, having introduced a relaxation parameter ω, (4.10) is replaced by

    xi(k+1) = (ω/aii) [ bi − Σ_{j=1, j≠i}^{n} aij xj(k) ] + (1 − ω) xi(k),  i = 1, ..., n.


The corresponding iteration matrix is

    BJω = ωBJ + (1 − ω)I.    (4.12)

In the form (4.7), the JOR method corresponds to

    x(k+1) = x(k) + ωD^{−1}r(k).

This method is consistent for any ω ≠ 0, and for ω = 1 it coincides with the Jacobi method.
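The componentwise formula (4.10), with the JOR relaxation included, can be sketched as follows in Python (the book's own Program 15 is in MATLAB); the test system is a hypothetical diagonally dominant one, not taken from the text:

```python
def jor_step(A, b, x, omega=1.0):
    # One sweep of (4.10)/JOR: every component uses only the old iterate x,
    # so all components could be updated in parallel.
    n = len(A)
    return [omega * (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            + (1.0 - omega) * x[i]
            for i in range(n)]

# Hypothetical strictly diagonally dominant system (not from the text);
# the exact solution is (1, 1, 1).
A = [[4.0, 1.0, 0.0], [1.0, 5.0, 2.0], [0.0, 2.0, 6.0]]
b = [5.0, 8.0, 8.0]
x = [0.0, 0.0, 0.0]
for _ in range(60):
    x = jor_step(A, b, x)           # omega = 1: plain Jacobi
assert all(abs(xi - 1.0) < 1e-8 for xi in x)
```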

The Gauss-Seidel method differs from the Jacobi method in the fact that at the k + 1-th step the available values of xi(k+1) are used to update the solution, so that, instead of (4.10), one has

    xi(k+1) = (1/aii) [ bi − Σ_{j=1}^{i−1} aij xj(k+1) − Σ_{j=i+1}^{n} aij xj(k) ],  i = 1, ..., n.    (4.13)

This method amounts to performing the following splitting for A:

    P = D − E,  N = F,

and the associated iteration matrix is

    BGS = (D − E)^{−1}F.    (4.14)

Starting from the Gauss-Seidel method, in analogy to what was done for the Jacobi iterations, we introduce the successive over-relaxation method (or SOR method)

    xi(k+1) = (ω/aii) [ bi − Σ_{j=1}^{i−1} aij xj(k+1) − Σ_{j=i+1}^{n} aij xj(k) ] + (1 − ω) xi(k),    (4.15)

for i = 1, ..., n. The method (4.15) can be written in vector form as

    (I − ωD^{−1}E) x(k+1) = [(1 − ω)I + ωD^{−1}F] x(k) + ωD^{−1}b,    (4.16)

from which the iteration matrix is

    B(ω) = (I − ωD^{−1}E)^{−1} [(1 − ω)I + ωD^{−1}F].    (4.17)

Multiplying by D both sides of (4.16) and recalling that A = D − (E + F) yields the following form (4.7) of the SOR method

    x(k+1) = x(k) + ( (1/ω)D − E )^{−1} r(k).

It is consistent for any ω ≠ 0, and for ω = 1 it coincides with the Gauss-Seidel method. In particular, if ω ∈ (0, 1) the method is called under-relaxation, while if ω > 1 it is called over-relaxation.
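A corresponding Python sketch of one SOR sweep (4.15), again on a hypothetical symmetric positive definite system rather than one from the text:

```python
def sor_step(A, b, x, omega):
    # One sweep of (4.15): components j < i already hold new values,
    # so the update is inherently sequential.
    n = len(A)
    x = list(x)
    for i in range(n):
        s = sum(A[i][j] * x[j] for j in range(n) if j != i)
        x[i] = omega * (b[i] - s) / A[i][i] + (1.0 - omega) * x[i]
    return x

# Hypothetical SPD tridiagonal system (not from the text); solution (1, 1, 1).
A = [[4.0, 1.0, 0.0], [1.0, 5.0, 2.0], [0.0, 2.0, 6.0]]
b = [5.0, 8.0, 8.0]
x = [0.0, 0.0, 0.0]
for _ in range(40):
    x = sor_step(A, b, x, omega=1.2)   # over-relaxation
assert all(abs(xi - 1.0) < 1e-8 for xi in x)
```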


4.2.2 Convergence Results for Jacobi and Gauss-Seidel Methods

There exist special classes of matrices for which it is possible to state a priori some convergence results for the methods examined in the previous section. The first result in this direction is the following.

Theorem 4.2 If A is a strictly diagonally dominant matrix by rows, the

Jacobi and Gauss-Seidel methods are convergent.

Proof. Let us prove the part of the theorem concerning the Jacobi method, while for the Gauss-Seidel method we refer to [Axe94]. Since A is strictly diagonally dominant by rows, |aii| > Σ_{j=1, j≠i}^{n} |aij| for i = 1, ..., n. As a consequence,

    ‖BJ‖∞ = max_{i=1,...,n} Σ_{j=1, j≠i}^{n} |aij| / |aii| < 1,

so that the Jacobi method is convergent. □
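The bound used in the proof is cheap to evaluate; here is a small Python sketch (with a hypothetical strictly diagonally dominant matrix, not from the text):

```python
def jacobi_inf_norm(A):
    # ||B_J||_inf = max_i sum_{j != i} |a_ij| / |a_ii|, as in the proof above.
    n = len(A)
    return max(sum(abs(A[i][j]) for j in range(n) if j != i) / abs(A[i][i])
               for i in range(n))

# Hypothetical strictly row diagonally dominant matrix (not from the text).
A = [[10.0, 2.0, 3.0], [1.0, -8.0, 4.0], [2.0, 2.0, 9.0]]
assert jacobi_inf_norm(A) < 1.0    # hence the Jacobi method converges
```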

Theorem 4.3 If A and 2D − A are symmetric and positive definite matrices, then the Jacobi method is convergent and ρ(BJ) = ‖BJ‖A = ‖BJ‖D.

Proof. The theorem follows from Property 4.1 taking P = D. □

In the case of the JOR method, the assumption on 2D − A can be removed, yielding the following result.

Theorem 4.4 If A is symmetric positive definite, then the JOR method is convergent if 0 < ω < 2/ρ(D^{−1}A).

Proof. The result immediately follows from (4.12), noting that A has real positive eigenvalues. □

Concerning the Gauss-Seidel method, the following result holds.

Theorem 4.5 If A is symmetric positive definite, the Gauss-Seidel method is monotonically convergent with respect to the norm ‖·‖A.

Proof. We can apply Property 4.2 to the matrix P = D − E, upon checking that P + P^T − A is positive definite. Indeed

    P + P^T − A = 2D − E − F − A = D,

having observed that (D − E)^T = D − F. We conclude by noticing that D is positive definite, since it is the diagonal of A. □

Finally, if A is positive definite and tridiagonal, it can be shown that also the Jacobi method is convergent and

    ρ(BGS) = ρ²(BJ).    (4.18)


In this case, the Gauss-Seidel method is more rapidly convergent than the Jacobi method. Relation (4.18) holds even if A enjoys the following A-property.

Definition 4.3 A consistently ordered matrix M ∈ R^{n×n} (that is, a matrix such that αD^{−1}E + α^{−1}D^{−1}F, for α ≠ 0, has eigenvalues that do not depend on α, where M = D − E − F, D = diag(m11, ..., mnn), and E and F are strictly lower and upper triangular matrices, respectively) enjoys the A-property if it can be partitioned in the 2 × 2 block form

    M = [ D̃1   M12 ]
        [ M21  D̃2  ],

where D̃1 and D̃2 are diagonal matrices.

When dealing with general matrices, no a priori conclusions on the convergence properties of the Jacobi and Gauss-Seidel methods can be drawn, as shown in Example 4.2.

Example 4.2 Consider the 3 × 3 linear systems of the form Ai x = bi, where bi is always taken in such a way that the solution of the system is the unit vector, and the matrices Ai are

    A1 = [ 3 0 4 ]      A2 = [ −3 3 −6 ]
         [ 7 4 2 ]           [ −4 7 −8 ]
         [ 1 1 2 ],          [  5 7 −9 ],

    A3 = [ 4  1  1 ]    A4 = [  7  6  9 ]
         [ 2 −9  0 ]         [  4  5 −4 ]
         [ 0 −8 −6 ],        [ −7 −3  8 ].

It can be checked that the Jacobi method fails to converge for A1 (ρ(BJ) = 1.33), while the Gauss-Seidel scheme is convergent. Conversely, in the case of A2, the Jacobi method is convergent, while the Gauss-Seidel method fails to converge (ρ(BGS) = 1.¯1). In the remaining two cases, the Jacobi method is more slowly convergent than the Gauss-Seidel method for matrix A3 (ρ(BJ) = 0.44 against ρ(BGS) = 0.018), and the converse is true for A4 (ρ(BJ) = 0.64 while ρ(BGS) = 0.77). •

We conclude the section with the following result.

Theorem 4.6 If the Jacobi method is convergent, then the JOR method converges if 0 < ω ≤ 1.

Proof. From (4.12) we obtain that the eigenvalues of BJω are

    µk = ωλk + 1 − ω,  k = 1, ..., n,

where λk are the eigenvalues of BJ. Then, recalling the Euler formula for the representation of a complex number, we let λk = rk e^{iθk} and get

    |µk|² = ω²rk² + 2ωrk cos(θk)(1 − ω) + (1 − ω)² ≤ (ωrk + 1 − ω)²,

which is less than 1 if 0 < ω ≤ 1. □

4.2.3 Convergence Results for the Relaxation Method

The following result provides a necessary condition on ω in order the SOR

method to be convergent.

Theorem 4.7 For any ω ∈ R we have ρ(B(ω)) ≥ |ω ’ 1|; therefore, the

SOR method fails to converge if ω ¤ 0 or ω ≥ 2.

Proof. If {λi} denote the eigenvalues of the SOR iteration matrix, then

    | ∏_{i=1}^{n} λi | = | det[ (1 − ω)I + ωD^{−1}F ] | = |1 − ω|^n.

Therefore, at least one eigenvalue λi must exist such that |λi| ≥ |1 − ω| and thus, in order for convergence to hold, we must have |1 − ω| < 1, that is 0 < ω < 2. □

Assuming that A is symmetric and positive definite, the condition 0 < ω < 2, besides being necessary, becomes also sufficient for convergence. Indeed, the following result holds (for the proof, see [Hac94]).

Property 4.3 (Ostrowski) If A is symmetric and positive definite, then the SOR method is convergent iff 0 < ω < 2. Moreover, its convergence is monotone with respect to ‖·‖A.

Finally, if A is strictly diagonally dominant by rows, the SOR method converges if 0 < ω ≤ 1.

The results above show that the SOR method is more or less rapidly convergent, depending on the choice of the relaxation parameter ω. The question of how to determine the value ωopt for which the convergence rate is the highest possible can be given a satisfactory answer only in special cases (see, for instance, [Axe94], [You71], [Var62] or [Wac66]). Here we limit ourselves to quoting the following result (whose proof is in [Axe94]).

Property 4.4 If the matrix A enjoys the A-property and if BJ has real eigenvalues, then the SOR method converges for any choice of x(0) iff ρ(BJ) < 1 and 0 < ω < 2. Moreover,

    ωopt = 2 / ( 1 + √(1 − ρ(BJ)²) )    (4.19)

and the corresponding asymptotic convergence factor is

    ρ(B(ωopt)) = ( 1 − √(1 − ρ(BJ)²) ) / ( 1 + √(1 − ρ(BJ)²) ).
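A small Python sketch of (4.19); it assumes the standard fact that ρ(BJ) = cos(π/(n+1)) for the matrix tridiagn(−1, 2, −1) of Figure 4.1, and checks the algebraic identity ρ(B(ωopt)) = ωopt − 1:

```python
import math

def omega_opt(rho_j):
    # (4.19): optimal SOR parameter from the Jacobi spectral radius.
    return 2.0 / (1.0 + math.sqrt(1.0 - rho_j ** 2))

def rho_sor_opt(rho_j):
    # Corresponding asymptotic convergence factor rho(B(omega_opt)).
    s = math.sqrt(1.0 - rho_j ** 2)
    return (1.0 - s) / (1.0 + s)

# Standard fact (assumed here): for tridiag_n(-1, 2, -1),
# rho(B_J) = cos(pi/(n+1)).
n = 10
rho_j = math.cos(math.pi / (n + 1))
w = omega_opt(rho_j)
assert 1.0 < w < 2.0
# At omega_opt the SOR spectral radius equals omega_opt - 1.
assert abs(rho_sor_opt(rho_j) - (w - 1.0)) < 1e-12
```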

4.2.4 A priori Forward Analysis

In the previous analysis we have neglected the rounding errors. However, as shown in the following example (taken from [HW76]), they can dramatically affect the convergence rate of the iterative method.

Example 4.3 Let A be a lower bidiagonal matrix of order 100 with entries aii = 1.5 and ai,i−1 = 1, and let b ∈ R^100 be the right-hand side with bi = 2.5. The exact solution of the system Ax = b has components xi = 1 − (−2/3)^i. The SOR method with ω = 1.5 should be convergent, working in exact arithmetic, since ρ(B(1.5)) = 0.5 (far below one). However, running Program 16 with x(0) = fl(x) + εM, which is extremely close to the exact value, the sequence x(k) diverges and after 100 iterations the algorithm yields a solution with ‖x(100)‖∞ ≃ 10^13. The flaw is due to rounding error propagation and must not be ascribed to a possible ill-conditioning of the matrix, since K∞(A) ≃ 5. •

To account for rounding errors, let us denote by x(k) the solution (in finite arithmetic) generated by an iterative method of the form (4.6) after k steps. Due to rounding errors, x(k) can be regarded as the exact solution to the problem

    Px(k+1) = Nx(k) + b − ζk,    (4.20)

with

    ζk = δPk+1 x(k+1) − gk.

The matrix δPk+1 accounts for the rounding errors in the solution of (4.6), while the vector gk includes the errors made in the evaluation of Nx(k) + b. From (4.20), we obtain

    x(k+1) = B^{k+1} x(0) + Σ_{j=0}^{k} B^j P^{−1} (b − ζ_{k−j}),

and for the absolute error e(k+1) = x − x(k+1)

    e(k+1) = B^{k+1} e(0) + Σ_{j=0}^{k} B^j P^{−1} ζ_{k−j}.

The first term represents the error that is made by the iterative method in exact arithmetic; if the method is convergent, this error is negligible for sufficiently large values of k. The second term refers instead to rounding error propagation; its analysis is quite technical and is carried out, for instance, in [Hig88] in the case of the Jacobi, Gauss-Seidel and SOR methods.


4.2.5 Block Matrices

The methods of the previous sections are also referred to as point (or line) iterative methods, since they act on single entries of matrix A. It is possible to devise block versions of the algorithms, provided that D denotes the block diagonal matrix whose entries are the m × m diagonal blocks of matrix A (see Section 1.6).
The block Jacobi method is obtained taking again P = D and N = D − A. The method is well-defined only if the diagonal blocks of D are nonsingular. If

A is decomposed in p × p square blocks, the block Jacobi method is

    Aii xi(k+1) = bi − Σ_{j=1, j≠i}^{p} Aij xj(k),  i = 1, ..., p,

having also decomposed the solution vector and the right-hand side in blocks of size p, denoted by xi and bi, respectively. As a result, at each step, the block Jacobi method requires solving p linear systems with matrices Aii. Theorem 4.3 is still valid, provided that D is substituted by the corresponding block diagonal matrix.
In a similar manner, the block Gauss-Seidel and block SOR methods can be introduced.

4.2.6 Symmetric Form of the Gauss-Seidel and SOR Methods

Even if A is a symmetric matrix, the Gauss-Seidel and SOR methods generate iteration matrices that are not necessarily symmetric. For that reason, we introduce in this section a technique that allows for symmetrizing these schemes. The final aim is to provide an approach for generating symmetric preconditioners (see Section 4.3.2).

Firstly, let us remark that an analogue of the Gauss-Seidel method can be constructed by simply exchanging E with F. The following iteration can thus be defined, called the backward Gauss-Seidel method,

    (D − F)x(k+1) = Ex(k) + b,

with iteration matrix given by BGSb = (D − F)^{−1}E.

The symmetric Gauss-Seidel method is obtained by combining an iteration of the Gauss-Seidel method with an iteration of the backward Gauss-Seidel method. Precisely, the k-th iteration of the symmetric Gauss-Seidel method is

    (D − E)x(k+1/2) = Fx(k) + b,  (D − F)x(k+1) = Ex(k+1/2) + b.


Eliminating x(k+1/2), the following scheme is obtained

    x(k+1) = BSGS x(k) + bSGS,

    BSGS = (D − F)^{−1} E (D − E)^{−1} F,    (4.21)

    bSGS = (D − F)^{−1} [ E(D − E)^{−1} + I ] b.

The preconditioning matrix associated with (4.21) is

    PSGS = (D − E) D^{−1} (D − F).
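Since (D − E)^T = D − F when A is symmetric, PSGS is itself symmetric; a pure-Python sketch on a small hypothetical SPD matrix (not from the text) confirms this:

```python
# Sketch: build P_SGS = (D - E) D^{-1} (D - F) for a symmetric A and
# check symmetry. Here D - E is the lower triangle of A (diagonal included)
# and D - F its upper triangle, per the splitting defined in Section 4.2.1.
def matmul(X, Y):
    n, m, p = len(X), len(Y), len(Y[0])
    return [[sum(X[i][k] * Y[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

A = [[4.0, 1.0, 0.0], [1.0, 5.0, 2.0], [0.0, 2.0, 6.0]]   # hypothetical SPD matrix
n = len(A)
DmE = [[A[i][j] if j <= i else 0.0 for j in range(n)] for i in range(n)]  # D - E
DmF = [[A[i][j] if j >= i else 0.0 for j in range(n)] for i in range(n)]  # D - F
Dinv = [[1.0 / A[i][i] if i == j else 0.0 for j in range(n)] for i in range(n)]
P = matmul(matmul(DmE, Dinv), DmF)
assert all(abs(P[i][j] - P[j][i]) < 1e-12 for i in range(n) for j in range(n))
```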

The following result can be proved (see [Hac94]).

Property 4.5 If A is a symmetric positive definite matrix, the symmetric Gauss-Seidel method is convergent, and, moreover, BSGS is symmetric positive definite.

In a similar manner, defining the backward SOR method

    (D − ωF)x(k+1) = [ωE + (1 − ω)D] x(k) + ωb,

and combining it with a step of the SOR method, the following symmetric SOR method, or SSOR, is obtained

    x(k+1) = Bs(ω)x(k) + bω,

where

    Bs(ω) = (D − ωF)^{−1} (ωE + (1 − ω)D) (D − ωE)^{−1} (ωF + (1 − ω)D),

    bω = ω(2 − ω) (D − ωF)^{−1} D (D − ωE)^{−1} b.

The preconditioning matrix of this scheme is

    PSSOR(ω) = ( (1/ω)D − E ) ( ω/(2 − ω) ) D^{−1} ( (1/ω)D − F ).    (4.22)

If A is symmetric and positive definite, the SSOR method is convergent if 0 < ω < 2 (see [Hac94] for the proof). Typically, the SSOR method with an optimal choice of the relaxation parameter converges more slowly than the corresponding SOR method. However, the value of ρ(Bs(ω)) is less sensitive to the choice of ω around the optimal value (in this respect, see the behavior of the spectral radii of the two iteration matrices in Figure 4.1). For this reason, the optimal value of ω that is chosen in the case of the SSOR method is usually the same as that used for the SOR method (for further details, we refer to [You71]).



FIGURE 4.1. Spectral radius of the iteration matrix of the SOR and SSOR methods, as a function of the relaxation parameter ω for the matrix tridiag10(−1, 2, −1)

4.2.7 Implementation Issues

We provide the programs implementing the Jacobi and Gauss-Seidel methods in their point form and with relaxation.

In Program 15 the JOR method is implemented (the Jacobi method is obtained as a special case by setting omega = 1). The stopping test monitors the Euclidean norm of the residual at each iteration, normalized to the value of the initial residual.
Notice that each component x(i) of the solution vector can be computed independently; this method can thus be easily parallelized.

Program 15 - jor : JOR method

function [x, iter] = jor(a, b, x0, nmax, toll, omega)
[n,n] = size(a);
iter = 0; r = b - a * x0; r0 = norm(r); err = norm(r); x = x0; xold = x0;
while err > toll & iter < nmax
  iter = iter + 1;
  for i = 1:n
    s = 0;
    % all components are computed from the old iterate xold only
    for j = 1:i-1, s = s + a(i,j) * xold(j); end
    for j = i+1:n, s = s + a(i,j) * xold(j); end
    x(i) = omega * (b(i) - s) / a(i,i) + (1 - omega) * xold(i);
  end
  xold = x; r = b - a * x; err = norm(r) / r0;
end

Program 16 implements the SOR method. Taking omega=1 yields the

Gauss-Seidel method.


Unlike the Jacobi method, this scheme is fully sequential. However, it can be efficiently implemented without storing the solution of the previous step, with a saving of memory storage.

Program 16 - sor : SOR method

function [x, iter] = sor(a, b, x0, nmax, toll, omega)
[n,n] = size(a);
iter = 0; r = b - a * x0; r0 = norm(r); err = norm(r); x = x0; xold = x0;
while err > toll & iter < nmax
  iter = iter + 1;
  for i = 1:n
    s = 0;
    % components j < i already hold the new values
    for j = 1:i-1, s = s + a(i,j) * x(j); end
    for j = i+1:n, s = s + a(i,j) * xold(j); end
    x(i) = omega * (b(i) - s) / a(i,i) + (1 - omega) * xold(i);
  end
  xold = x; r = b - a * x; err = norm(r) / r0;
end

4.3 Stationary and Nonstationary Iterative Methods

Denote by

    RP = I − P^{−1}A

the iteration matrix associated with (4.7). Proceeding as in the case of the relaxation methods, (4.7) can be generalized by introducing a relaxation (or acceleration) parameter α. This leads to the following stationary Richardson method

    x(k+1) = x(k) + αP^{−1}r(k),  k ≥ 0.    (4.23)

More generally, allowing α to depend on the iteration index, the nonstationary Richardson method or semi-iterative method is given by

    x(k+1) = x(k) + αk P^{−1}r(k),  k ≥ 0.    (4.24)

The iteration matrix at the k-th step for these methods (depending on k) is

    R(αk) = I − αk P^{−1}A,


with αk = α in the stationary case. If P = I, the methods will be called nonpreconditioned. The Jacobi and Gauss-Seidel methods can be regarded as stationary Richardson methods with α = 1, P = D and P = D − E, respectively.

We can rewrite (4.24) (and, thus, also (4.23)) in a form of greater interest for computation. Letting z(k) = P^{−1}r(k) (the so-called preconditioned residual), we get x(k+1) = x(k) + αk z(k) and r(k+1) = b − Ax(k+1) = r(k) − αk Az(k). To summarize, a nonstationary Richardson method requires at each k + 1-th step the following operations:

    solve the linear system Pz(k) = r(k);
    compute the acceleration parameter αk;
    update the solution x(k+1) = x(k) + αk z(k);    (4.25)
    update the residual r(k+1) = r(k) − αk Az(k).
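The four steps (4.25) translate directly into code; the following Python sketch (not one of the book's programs) uses a fixed αk = α and a diagonal preconditioner, so that with P = D it amounts to a damped Jacobi iteration on a hypothetical SPD system:

```python
def richardson(A, b, x, alpha, nsteps, pdiag=None):
    # Stationary Richardson iteration following the steps in (4.25),
    # with a diagonal preconditioner P (pdiag = diagonal of P; P = I if None).
    n = len(A)
    pdiag = pdiag or [1.0] * n
    r = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    for _ in range(nsteps):
        z = [r[i] / pdiag[i] for i in range(n)]          # solve P z = r
        x = [x[i] + alpha * z[i] for i in range(n)]      # update the solution
        Az = [sum(A[i][j] * z[j] for j in range(n)) for i in range(n)]
        r = [r[i] - alpha * Az[i] for i in range(n)]     # update the residual
    return x

# Hypothetical SPD system (not from the text); exact solution (1, 1, 1).
A = [[4.0, 1.0, 0.0], [1.0, 5.0, 2.0], [0.0, 2.0, 6.0]]
b = [5.0, 8.0, 8.0]
x = richardson(A, b, [0.0, 0.0, 0.0], alpha=0.9, nsteps=200,
               pdiag=[4.0, 5.0, 6.0])                    # P = D: damped Jacobi
assert all(abs(xi - 1.0) < 1e-8 for xi in x)
```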

4.3.1 Convergence Analysis of the Richardson Method

Let us first consider the stationary Richardson methods, for which αk = α for k ≥ 0. The following convergence result holds.

Theorem 4.8 For any nonsingular matrix P, the stationary Richardson method (4.23) is convergent iff

    2 Re λi / (α|λi|²) > 1  ∀i = 1, ..., n,    (4.26)

where λi ∈ C are the eigenvalues of P^{−1}A.

Proof. Let us apply Theorem 4.1 to the iteration matrix Rα = I − αP^{−1}A. The condition |1 − αλi| < 1 for i = 1, ..., n yields the inequality

    (1 − α Re λi)² + α²(Im λi)² < 1,

from which (4.26) immediately follows. □

Let us notice that, if the sign of the real parts of the eigenvalues of P⁻¹A

is not constant, the stationary Richardson method cannot converge.
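Condition (4.26) is a restatement of |1 − αλ_i| < 1: expanding the square gives 1 − 2αReλ_i + α²|λ_i|² < 1, i.e. 2Reλ_i/(α|λ_i|²) > 1 whenever α > 0. The equivalence can be cross-checked numerically (the sample eigenvalues and parameters below are assumptions for illustration):

```python
def contracts(alpha, lam):
    """Condition of Theorem 4.1 applied to R_alpha: |1 - alpha*lam| < 1."""
    return abs(1 - alpha * lam) < 1

def condition_426(alpha, lam):
    """Condition (4.26): 2*Re(lam) / (alpha*|lam|^2) > 1."""
    return 2 * lam.real / (alpha * abs(lam) ** 2) > 1

# assumed sample eigenvalues of P^{-1}A and parameters alpha > 0
samples = [0.5 + 0.3j, 2.0 + 0.0j, 1.0 - 1.0j, 0.1 + 2.0j]
for alpha in (0.1, 0.5, 1.0, 1.9):
    for lam in samples:
        assert contracts(alpha, lam) == condition_426(alpha, lam)
```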

More speci¬c results can be obtained provided that suitable assumptions

are made on the spectrum of P⁻¹A.

Theorem 4.9 Assume that P is a nonsingular matrix and that P⁻¹A has positive real eigenvalues, ordered in such a way that λ_1 ≥ λ_2 ≥ . . . ≥ λ_n > 0. Then, the stationary Richardson method (4.23) is convergent iff 0 < α < 2/λ_1. Moreover, letting

α_opt = 2 / (λ_1 + λ_n),   (4.27)

138 4. Iterative Methods for Solving Linear Systems

the spectral radius of the iteration matrix R_α is minimum if α = α_opt, with

ρ_opt = min_α ρ(R_α) = (λ_1 − λ_n) / (λ_1 + λ_n).   (4.28)

Proof. The eigenvalues of R_α are given by λ_i(R_α) = 1 − αλ_i, so that (4.23) is convergent iff |λ_i(R_α)| < 1 for i = 1, . . . , n, that is, if 0 < α < 2/λ_1. It follows (see Figure 4.2) that ρ(R_α) is minimum when 1 − αλ_n = αλ_1 − 1, that is, for α = 2/(λ_1 + λ_n), which furnishes the desired value of α_opt. By substitution, the desired value of ρ_opt is obtained. □
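When the spectrum of P⁻¹A is known, ρ(R_α) = max_i |1 − αλ_i| and (4.27)–(4.28) can be verified directly; a short sketch (the sample spectrum is an assumption for illustration):

```python
# assumed sample spectrum of P^{-1}A, ordered lambda_1 >= ... >= lambda_n > 0
lams = [10.0, 6.0, 3.0, 1.0]
lam1, lamn = lams[0], lams[-1]

def rho(alpha):
    """Spectral radius of R_alpha = I - alpha P^{-1}A on this spectrum."""
    return max(abs(1 - alpha * lam) for lam in lams)

alpha_opt = 2 / (lam1 + lamn)                # (4.27)
rho_opt = (lam1 - lamn) / (lam1 + lamn)      # (4.28)

assert abs(rho(alpha_opt) - rho_opt) < 1e-12
# alpha_opt minimizes rho over a sweep of the admissible interval (0, 2/lam1)
for k in range(1, 200):
    alpha = k * (2 / lam1) / 200
    assert rho(alpha) >= rho(alpha_opt) - 1e-12
```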

[Figure 4.2 plots |1 − αλ_1|, |1 − αλ_k| and |1 − αλ_n| as functions of α, with the levels ρ = 1 and ρ_opt and the abscissae 1/λ_1, α_opt and 1/λ_n marked.]

FIGURE 4.2. Spectral radius of R_α as a function of the eigenvalues of P⁻¹A

If P⁻¹A is symmetric positive definite, it can be shown that the convergence of the Richardson method is monotone with respect to both ‖·‖_2 and ‖·‖_A. In such a case, using (4.28), we can also relate ρ_opt to K_2(P⁻¹A) as follows

ρ_opt = (K_2(P⁻¹A) − 1) / (K_2(P⁻¹A) + 1),   α_opt = 2 ‖A⁻¹P‖_2 / (K_2(P⁻¹A) + 1).   (4.29)

The choice of a suitable preconditioner P is, therefore, of paramount importance for improving the convergence of a Richardson method. Of course, such a choice should also account for the need of keeping the computational effort as low as possible. In Section 4.3.2, some preconditioners of common use in practice will be described.

Corollary 4.1 Let A be a symmetric positive definite matrix. Then, the nonpreconditioned stationary Richardson method is convergent and

‖e^(k+1)‖_A ≤ ρ(R_α) ‖e^(k)‖_A,   k ≥ 0.   (4.30)


The same result holds for the preconditioned Richardson method, provided that the matrices P, A and P⁻¹A are symmetric positive definite.

Proof. The convergence is a consequence of Theorem 4.8. Moreover, we notice that

‖e^(k+1)‖_A = ‖R_α e^(k)‖_A = ‖A^{1/2} R_α e^(k)‖_2 ≤ ‖A^{1/2} R_α A^{−1/2}‖_2 ‖A^{1/2} e^(k)‖_2.

The matrix R_α is symmetric and is similar to A^{1/2} R_α A^{−1/2}. Therefore,

‖A^{1/2} R_α A^{−1/2}‖_2 = ρ(R_α).

The result (4.30) follows by noting that ‖A^{1/2} e^(k)‖_2 = ‖e^(k)‖_A. A similar proof can be carried out in the preconditioned case, provided we replace A with P⁻¹A. □

Finally, the inequality (4.30) holds even if only P and A are symmetric positive definite (for the proof, see [QV94], Chapter 2).

4.3.2 Preconditioning Matrices

All the methods introduced in the previous sections can be cast in the form (4.2), so that they can be regarded as methods for solving the system

(I − B)x = f = P⁻¹b.

On the other hand, since B = P⁻¹N, system (3.2) can be equivalently reformulated as

P⁻¹Ax = P⁻¹b.   (4.31)

The latter is the preconditioned system, P being the preconditioning matrix or left preconditioner. Right and centered preconditioners can be introduced as well, if system (3.2) is transformed, respectively, as

AP⁻¹y = b,   y = Px,

or

P_L⁻¹ A P_R⁻¹ y = P_L⁻¹ b,   y = P_R x.
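Left and right preconditioning both leave the solution unchanged: the x recovered from each transformed system coincides with the solution of Ax = b. A small self-contained check (the 2×2 data and the diagonal choice of P are assumptions for illustration):

```python
def solve2(M, c):
    """Solve a 2x2 system M x = c by Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(c[0] * M[1][1] - M[0][1] * c[1]) / det,
            (M[0][0] * c[1] - c[0] * M[1][0]) / det]

def matmul2(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matvec2(M, v):
    return [sum(M[i][k] * v[k] for k in range(2)) for i in range(2)]

def inv2(M):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]

# assumed sample data: A x = b with a diagonal preconditioner P
A = [[4.0, 1.0], [2.0, 3.0]]
b = [9.0, 11.0]
P = [[4.0, 0.0], [0.0, 3.0]]

x = solve2(A, b)
Pinv = inv2(P)
# left preconditioning: P^{-1} A x = P^{-1} b
x_left = solve2(matmul2(Pinv, A), matvec2(Pinv, b))
# right preconditioning: A P^{-1} y = b, then x = P^{-1} y
y = solve2(matmul2(A, Pinv), b)
x_right = matvec2(Pinv, y)
```

For the centered form one would take P = P_L P_R and apply the two factors on either side in the same way.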

There are point preconditioners or block preconditioners, depending on whether they are applied to the single entries of A or to the blocks of a partition of A. The iterative methods considered so far correspond to fixed-point iterations on a left-preconditioned system. As stressed by (4.25), computing the inverse of P is not mandatory; actually, the role of P is to “precondition” the residual r^(k) through the solution of the additional system Pz^(k) = r^(k).


Since the preconditioner acts on the spectral radius of the iteration matrix, it would be useful to pick, for a given linear system, an optimal preconditioner, i.e., a preconditioner which is able to make the number of iterations required for convergence independent of the size of the system. Notice that the choice P = A is optimal but, trivially, “inefficient”; some alternatives of greater computational interest will be examined below.

There is a lack of general theoretical results that allow us to devise optimal preconditioners. However, an established “rule of thumb” is that P is a good preconditioner for A if P⁻¹A is close to being a normal matrix and if its eigenvalues are clustered within a sufficiently small region of the complex field. The choice of a preconditioner must also be guided by practical considerations, notably its computational cost and its memory requirements.

Preconditioners can be divided into two main categories: algebraic and functional preconditioners. The difference is that algebraic preconditioners are independent of the problem that originated the system to be solved, and are actually constructed via algebraic procedures, while functional preconditioners take advantage of the knowledge of the problem and are constructed as a function of it. In addition to the preconditioners already introduced in Section 4.2.6, we give a description of other algebraic preconditioners of common use.

1. Diagonal preconditioners: choosing P as the diagonal of A is generally effective if A is symmetric positive definite. A usual choice in the nonsymmetric case is to set

p_ii = ( Σ_{j=1}^{n} a_ij² )^{1/2}.

Block diagonal preconditioners can be constructed in a similar manner. We remark that devising an optimal diagonal preconditioner is far from trivial, as previously noticed in Section 3.12.1 when dealing with the scaling of a matrix.
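The entry p_ii above is simply the Euclidean norm of the i-th row of A. In code (the sample matrix is an assumption for illustration):

```python
def diag_precond(A):
    """Diagonal preconditioner entries p_ii = (sum_j a_ij^2)^{1/2}
    for a nonsymmetric A stored as a list of rows."""
    return [sum(a * a for a in row) ** 0.5 for row in A]

A = [[3.0, 4.0], [0.0, 5.0]]   # assumed sample matrix
p = diag_precond(A)            # Euclidean norms of the rows
```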

2. Incomplete LU factorization (shortly ILU) and Incomplete Cholesky factorization (shortly IC).

An incomplete factorization of A is a process that computes P = L_in U_in, where L_in is a lower triangular matrix and U_in is an upper triangular matrix. These matrices are approximations of the exact factors L, U of the LU factorization of A and are chosen in such a way that the residual matrix R = A − L_in U_in satisfies some prescribed requirements, such as having zero entries in specified locations.

For a given matrix M, the L-part (U-part) of M will mean henceforth

the lower (upper) triangular part of M. Moreover, we assume that the

factorization process can be carried out without resorting to pivoting.


The basic approach to incomplete factorization consists of requiring the approximate factors L_in and U_in to have the same sparsity pattern as the L-part and U-part of A, respectively. A general algorithm for constructing an incomplete factorization is to perform Gaussian elimination as follows: at each step k, compute m_ik = a_ik^(k) / a_kk^(k) only if a_ik ≠ 0, for i = k + 1, . . . , n; then compute a_ij^(k+1), for j = k + 1, . . . , n, only if a_ij ≠ 0. This algorithm is implemented in Program 17, where the matrices L_in and U_in are progressively overwritten onto the L-part and U-part of A.

Program 17 - basicILU : Incomplete LU factorization

function [a] = basicILU(a)
[n,n]=size(a);
for k=1:n-1, for i=k+1:n,
  if a(i,k) ~= 0
    a(i,k) = a(i,k) / a(k,k);
    for j=k+1:n
      if a(i,j) ~= 0
        a(i,j) = a(i,j) - a(i,k)*a(k,j);
      end
    end
  end
end, end

We notice that having L_in and U_in with the same patterns as the L-part and U-part of A, respectively, does not necessarily imply that R has the same sparsity pattern as A, but it guarantees that r_ij = 0 if a_ij ≠ 0, as is shown in Figure 4.3.
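The property is easy to check on a small example by transcribing Program 17 into Python and forming the residual R = A − L_in U_in explicitly (the 3×3 test matrix is an assumption for illustration):

```python
def basic_ilu(A):
    """ILU(0): Gaussian elimination restricted to the nonzero pattern of A
    (a transcription of Program 17; L_in and U_in are returned overwritten
    onto a copy of A, the unit diagonal of L_in being implied)."""
    a = [row[:] for row in A]
    n = len(a)
    for k in range(n - 1):
        for i in range(k + 1, n):
            if a[i][k] != 0:
                a[i][k] /= a[k][k]
                for j in range(k + 1, n):
                    if a[i][j] != 0:
                        a[i][j] -= a[i][k] * a[k][j]
    return a

# assumed small sparse test matrix
A = [[4.0, 1.0, 0.0],
     [1.0, 4.0, 1.0],
     [1.0, 0.0, 4.0]]
n = len(A)
lu = basic_ilu(A)
L = [[1.0 if i == j else (lu[i][j] if i > j else 0.0) for j in range(n)]
     for i in range(n)]
U = [[lu[i][j] if i <= j else 0.0 for j in range(n)] for i in range(n)]
R = [[A[i][j] - sum(L[i][k] * U[k][j] for k in range(n)) for j in range(n)]
     for i in range(n)]

# r_ij vanishes wherever a_ij != 0 ...
assert all(abs(R[i][j]) < 1e-12
           for i in range(n) for j in range(n) if A[i][j] != 0)
# ... but R need not vanish outside the pattern (here at position (2,1))
assert abs(R[2][1]) > 0.1
```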

The resulting incomplete factorization is known as ILU(0), where “0” means that no fill-in has been introduced in the factorization process. An alternative strategy might be to fix the structure of L_in and U_in irrespective of that of A, in such a way that some computational criteria are satisfied (for example, that the incomplete factors have the simplest possible structure).

The accuracy of the ILU(0) factorization can obviously be improved by allowing some fill-in to arise, that is, by accepting nonzero entries in the factors at locations where A has elements equal to zero. To this purpose, it is convenient to introduce a function, which we call the fill-in level, that is associated with each entry of A and that is modified during the factorization process. If the fill-in level of an

FIGURE 4.3. The sparsity pattern of the original matrix A is represented by the squares, while the pattern of R = A − L_in U_in, computed by Program 17, is drawn by the bullets

element is greater than an admissible value p ∈ N, the corresponding entry in U_in or L_in is set equal to zero.

Let us explain how this procedure works, assuming that the matrices L_in and U_in are progressively overwritten onto A (as happens in Program 4). The fill-in level of an entry a_ij^(k) is denoted by lev_ij, where the dependence on k is understood, and it should provide a reasonable estimate of the size of the entry during the factorization process. Actually, we are assuming that if lev_ij = q then |a_ij^(k)| ≃ δ^q with δ ∈ (0, 1), so that q is greater when |a_ij^(k)| is smaller.

At the starting step of the procedure, the level of the nonzero entries of A and of the diagonal entries is set equal to 0, while the level of the null entries is set equal to infinity. For any row i = 2, . . . , n, the following operations are performed: if lev_ik ≤ p, k = 1, . . . , i − 1, the entry m_ik of L_in and the entries a_ij^(k+1) of U_in, j = i + 1, . . . , n, are updated. Moreover, if a_ij^(k+1) ≠ 0, the value lev_ij is updated as the minimum between the available value of lev_ij and lev_ik + lev_kj + 1. The reason for this choice is that |a_ij^(k+1)| = |a_ij^(k) − m_ik a_kj^(k)| ≃ |δ^lev_ij − δ^(lev_ik + lev_kj + 1)|, so that one can assume that the size of |a_ij^(k+1)| is the maximum between δ^lev_ij and δ^(lev_ik + lev_kj + 1).

The above factorization process is called ILU(p) and turns out to be extremely efficient (with small p) provided that it is coupled with a suitable matrix reordering (see Section 3.9).

Program 18 implements the ILU(p) factorization; it returns in output the approximate matrices L_in and U_in (overwritten onto the input matrix a), with the diagonal entries of L_in equal to 1, and the matrix lev containing the fill-in level of each entry at the end of the factorization.

Program 18 - ilup : ILU(p) factorization

function [a,lev] = ilup (a,p)
[n,n]=size(a);
for i=1:n, for j=1:n
  if (a(i,j) ~= 0) | (i==j)
    lev(i,j)=0;
  else
    lev(i,j)=Inf;
  end
end, end
for i=2:n,
  for k=1:i-1
    if lev(i,k) <= p
      a(i,k)=a(i,k)/a(k,k);
      for j=k+1:n
        a(i,j)=a(i,j)-a(i,k)*a(k,j);
        if a(i,j) ~= 0
          lev(i,j)=min(lev(i,j),lev(i,k)+lev(k,j)+1);
        end
      end
    end
  end
  for j=1:n, if lev(i,j) > p, a(i,j) = 0; end, end
end

Example 4.4 Consider the matrix A ∈ R^{46×46} associated with the finite difference approximation of the Laplace operator Δ· = ∂²·/∂x² + ∂²·/∂y² (see Section 12.6). This matrix can be generated with the following MATLAB commands: G=numgrid('B',10); A=delsq(G), and corresponds to the discretization of the differential operator on a domain having the shape of the exterior of a butterfly, included in the square [−1, 1]² (see Section 12.6). The number of nonzero entries of A is 174. Figure 4.4 shows the pattern of matrix A (drawn by the bullets) and the entries added to the pattern by the ILU(1) and ILU(2) factorizations due to fill-in (denoted by the squares and the triangles, respectively). Notice that these entries are all contained within the envelope of A, since no pivoting has been performed. •

The ILU(p) process can be carried out without knowing the actual values of the entries of A, working only on their fill-in levels. Therefore, we can distinguish between a symbolic factorization (the generation of the levels) and an actual factorization (the computation of the entries of ILU(p) starting from the information contained in

FIGURE 4.4. Pattern of the matrix A in Example 4.4 (bullets); entries added by the ILU(1) and ILU(2) factorizations (squares and triangles, respectively)

the level function). The scheme is thus particularly effective when several linear systems must be solved, with matrices having the same structure but different entries.

On the other hand, for certain classes of matrices, the fill-in level does not always provide a sound indication of the actual size attained by the entries. In such cases, it is better to monitor the size of the entries of R, neglecting at each step the entries that are too small. For instance, one can drop the entries a_ij^(k+1) such that

|a_ij^(k+1)| ≤ c |a_ii^(k+1) a_jj^(k+1)|^{1/2},   i, j = 1, . . . , n,

with 0 < c < 1 (see [Axe94]).

In the strategies considered so far, the entries of the matrix that are dropped can no longer be recovered in the incomplete factorization process. Some remedies exist for this drawback: for instance, at the end of each k-th step of the factorization, one can sum, row by row, the discarded entries to the diagonal entries of U_in. By doing so, an incomplete factorization known as MILU (Modified ILU) is obtained, which enjoys the property of being exact with respect to constant vectors, i.e., such that R1 = 0, where 1 = (1, . . . , 1)^T (see [Axe94] for other formulations). In practice, this simple trick provides, for a wide class of matrices, a better preconditioner than the one obtained with the ILU method. In the case of symmetric positive definite matrices one can resort to the Modified Incomplete Cholesky factorization (MICh).

We conclude by mentioning the ILUT factorization, which collects the features of ILU(p) and MILU. This factorization can also include partial pivoting by columns with a slight increase of the computational cost. For an efficient implementation of incomplete factorizations, we refer to the MATLAB function luinc in the toolbox sparfun.

The existence of the ILU factorization is not guaranteed for all nonsingular matrices (see [Elm86] for an example) and the process stops if zero pivotal entries arise. Existence theorems can be proved if A is an M-matrix [MdV77] or diagonally dominant [Man80]. It is worth noting that sometimes the ILU factorization turns out to be more stable than the complete LU factorization [GM83].

3. Polynomial preconditioners: the preconditioning matrix is defined as

P⁻¹ = p(A),

where p is a polynomial in A, usually of low degree.

A remarkable example is given by Neumann polynomial preconditioners. Letting A = D − C, we have A = (I − CD⁻¹)D, from which

A⁻¹ = D⁻¹(I − CD⁻¹)⁻¹ = D⁻¹(I + CD⁻¹ + (CD⁻¹)² + . . .).

A preconditioner can then be obtained by truncating the series above at a certain power p. This method is actually effective only if ρ(CD⁻¹) < 1, which is the necessary condition for the series to converge.
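A sketch of the truncated Neumann preconditioner on dense Python lists (the 2×2 sample matrix, chosen so that ρ(CD⁻¹) < 1, is an assumption for illustration); as the truncation power p grows, P⁻¹A approaches the identity:

```python
def matmul(M, N):
    n = len(M)
    return [[sum(M[i][k] * N[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def neumann_pinv(A, p):
    """P^{-1} = D^{-1}(I + CD^{-1} + ... + (CD^{-1})^p), where A = D - C
    and D = diag(A): the Neumann series truncated at power p."""
    n = len(A)
    Dinv = [[1.0 / A[i][i] if i == j else 0.0 for j in range(n)]
            for i in range(n)]
    C = [[0.0 if i == j else -A[i][j] for j in range(n)] for i in range(n)]
    CD = matmul(C, Dinv)
    S = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in S]
    for _ in range(p):
        term = matmul(CD, term)
        S = [[S[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return matmul(Dinv, S)

A = [[4.0, 1.0], [1.0, 3.0]]   # assumed sample with rho(C D^{-1}) < 1

def err(p):
    """Distance of P^{-1}A from the identity (max entrywise)."""
    PA = matmul(neumann_pinv(A, p), A)
    return max(abs(PA[i][j] - (1.0 if i == j else 0.0))
               for i in range(2) for j in range(2))
```

For this matrix err(p) decays roughly like ρ(CD⁻¹)^(p+1).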

4. Least-squares preconditioners: A⁻¹ is approximated by a least-squares polynomial p_s(A) (see Section 3.13). Since the aim is to make the matrix I − P⁻¹A as close as possible to the null matrix, the least-squares approximant p_s(A) is chosen in such a way that the function φ(x) = 1 − p_s(x)x is minimized. This preconditioning technique works effectively only if A is symmetric and positive definite.

For further results on preconditioners, see [dV89] and [Axe94].

Example 4.5 Consider the matrix A ∈ R^{324×324} associated with the finite difference approximation of the Laplace operator on the square [−1, 1]². This matrix can be generated with the following MATLAB commands: G=numgrid('N',20); A=delsq(G). The condition number of the matrix is K_2(A) = 211.3. In Table 4.1 we show the values of K_2(P⁻¹A) computed using the ILU(p) and Neumann preconditioners, with p = 0, 1, 2, 3. In the last case, D is the diagonal part of A. •

Remark 4.2 Let A and P be real symmetric matrices of order n, with P positive definite. The eigenvalues of the preconditioned matrix P⁻¹A are solutions of the algebraic equation

Ax = λPx,   (4.32)

p   ILU(p)   Neumann
0   22.3     211.3
1   12       36.91
2   8.6      48.55
3   5.6      18.7

TABLE 4.1. Spectral condition numbers of the preconditioned matrix A of Example 4.5 as a function of p

where x is an eigenvector associated with the eigenvalue λ. Equation (4.32) is an example of a generalized eigenvalue problem (see Section 5.9 for a thorough discussion) and the eigenvalue λ can be computed through the following generalized Rayleigh quotient

λ = (Ax, x) / (Px, x).

Applying the Courant-Fisher Theorem (see Section 5.11) yields

λ_min(A) / λ_max(P) ≤ λ ≤ λ_max(A) / λ_min(P).   (4.33)

Relation (4.33) provides lower and upper bounds for the eigenvalues of the preconditioned matrix as a function of the extremal eigenvalues of A and P, and can therefore be profitably used to estimate the condition number of P⁻¹A.
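For a 2×2 pair the generalized eigenvalues of (4.32) can be obtained by hand from det(A − λP) = 0 and the bound (4.33) checked directly (the sample matrices, with P diagonal to keep the quadratic simple, are assumptions for illustration):

```python
import math

# assumed sample pair: A symmetric, P symmetric positive definite (diagonal)
A = [[4.0, 1.0], [1.0, 3.0]]
P = [[2.0, 0.0], [0.0, 1.0]]

def eig2_sym(M):
    """Eigenvalues (min, max) of a symmetric 2x2 matrix."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    d = math.sqrt(tr * tr / 4.0 - det)
    return tr / 2.0 - d, tr / 2.0 + d

# det(A - lam*P) = 0; since P is diagonal this reduces to the quadratic
# (P00*P11) lam^2 - (A00*P11 + A11*P00) lam + det(A) = 0
a2 = P[0][0] * P[1][1]
a1 = -(A[0][0] * P[1][1] + A[1][1] * P[0][0])
a0 = A[0][0] * A[1][1] - A[0][1] * A[1][0]
disc = math.sqrt(a1 * a1 - 4.0 * a2 * a0)
gen = sorted([(-a1 - disc) / (2 * a2), (-a1 + disc) / (2 * a2)])

amin, amax = eig2_sym(A)
pmin, pmax = eig2_sym(P)
# bound (4.33): lambda_min(A)/lambda_max(P) <= lam <= lambda_max(A)/lambda_min(P)
assert all(amin / pmax <= lam <= amax / pmin for lam in gen)
```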

4.3.3 The Gradient Method

The expression of the optimal parameter that has been provided in Theorem 4.9 is of limited usefulness in practical computations, since it requires the knowledge of the extremal eigenvalues of the matrix P⁻¹A. In the special case of symmetric and positive definite matrices, however, the optimal acceleration parameter can be dynamically computed at each step k as follows.

We first notice that, for such matrices, solving system (3.2) is equivalent to finding the minimizer x ∈ R^n of the quadratic form

Φ(y) = (1/2) y^T A y − y^T b,

which is called the energy of system (3.2). Indeed, the gradient of Φ is given by

∇Φ(y) = (1/2)(A^T + A)y − b = Ay − b.   (4.34)

As a consequence, if ∇Φ(x) = 0 then x is a solution of the original system. Conversely, if x is a solution, then

Φ(y) = Φ(x + (y − x)) = Φ(x) + (1/2)(y − x)^T A(y − x)   ∀y ∈ R^n,


and thus Φ(y) > Φ(x) if y ≠ x, i.e., x is a minimizer of the functional Φ. Notice that the previous relation is equivalent to

(1/2) ‖y − x‖_A² = Φ(y) − Φ(x),   (4.35)

where ‖·‖_A is the A-norm, or energy norm, defined in (1.28).

The problem is thus to determine the minimizer x of Φ starting from a point x^(0) ∈ R^n and, consequently, to select suitable directions along which to move so as to get as close as possible to the solution x. The optimal direction, joining the starting point x^(0) to the solution point x, is obviously unknown a priori. Therefore, we must take a step from x^(0) along some direction d^(0), then fix along this direction a new point x^(1), and iterate the process until convergence.

Thus, at the generic step k, x^(k+1) is computed as

x^(k+1) = x^(k) + α_k d^(k),   (4.36)

where α_k is the value which fixes the length of the step along d^(k). The most natural idea is to take as descent direction the one of maximum slope, −∇Φ(x^(k)), which yields the gradient method or steepest descent method.

On the other hand, due to (4.34), ∇Φ(x^(k)) = Ax^(k) − b = −r^(k), so that the descent direction −∇Φ(x^(k)) coincides with the residual and can be immediately computed using the current iterate. This shows that the gradient method, as well as the Richardson method, moves at each step k along the direction d^(k) = r^(k).
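The resulting scheme can be sketched as follows (Python; the step α_k is the exact line-search value (4.37) derived next, and the s.p.d. test system is an assumption for illustration):

```python
def steepest_descent(A, b, x, n_iter):
    """Gradient (steepest descent) method for s.p.d. A: move along
    d^(k) = r^(k) with the exact line-search step (4.37)."""
    n = len(b)
    for _ in range(n_iter):
        # residual r^(k) = b - A x^(k), which is the descent direction
        r = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        rr = sum(ri * ri for ri in r)
        if rr == 0.0:
            break
        Ar = [sum(A[i][j] * r[j] for j in range(n)) for i in range(n)]
        # alpha_k = (r.r)/(r.Ar), the minimizer of Phi along r^(k)
        alpha = rr / sum(r[i] * Ar[i] for i in range(n))
        x = [x[i] + alpha * r[i] for i in range(n)]
    return x

# assumed s.p.d. test system with exact solution (1, 2)
A = [[4.0, 1.0], [1.0, 3.0]]
b = [6.0, 7.0]
x = steepest_descent(A, b, [0.0, 0.0], 50)
```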

To compute the parameter α_k, let us write explicitly Φ(x^(k+1)) as a function of the parameter α:

Φ(x^(k+1)) = (1/2)(x^(k) + αr^(k))^T A (x^(k) + αr^(k)) − (x^(k) + αr^(k))^T b.

Differentiating with respect to α and setting the derivative equal to zero yields the desired value of α_k

α_k = ((r^(k))^T r^(k)) / ((r^(k))^T A r^(k)),   (4.37)

which depends only on the residual at the k-th step. For this reason, the