

Let us apply the regularization technique to the triangulation of the unit
square in Figure 3.14, which is affected by a severe nonuniformity of the
triangle size. The grid consists of NT = 112 triangles and N = 73 vertices,
120 3. Direct Methods for the Solution of Linear Systems

FIGURE 3.14. Triangulation before (left) and after (right) the regularization

of which Nb = 32 are on the boundary. The size of each of the two linear
systems (3.78) is thus equal to 41, and their solution is carried out by the
LU factorization of matrix A in its original form (1) and using its sparse
format (2), obtained using the Cuthill-McKee inverse reordering algorithm
described in Section 3.9.1.
In Figure 3.15 the sparsity patterns of A are displayed, without and with
reordering; the integer nz = 237 denotes the number of nonzero entries
in the matrix. Notice that in the second case there is a decrease in the
bandwidth of the matrix, to which corresponds a large reduction in the
operation count, from 61623 to 5552. The final configuration of the grid is
displayed in Figure 3.14 (right), which clearly shows the effectiveness of the
regularization procedure.

FIGURE 3.15. Sparsity patterns of matrix A without and with reordering (left
and right, respectively)

3.15 Exercises
1. For any square matrix A ∈ Rn×n, prove the following relations:

   (1/n) K2(A) ≤ K1(A) ≤ n K2(A),   (1/n) K∞(A) ≤ K2(A) ≤ n K∞(A),

   (1/n²) K1(A) ≤ K∞(A) ≤ n² K1(A).

They allow us to conclude that if a matrix is ill-conditioned in a certain
norm it remains so in any other norm, up to a factor depending on n.
2. Check that the matrix B ∈ Rn×n with bii = 1, bij = −1 if i < j, bij = 0 if
i > j, has determinant equal to 1, yet K∞(B) is large (equal to n2^(n−1)).
3. Prove that K(AB) ≤ K(A)K(B) for any two square nonsingular matrices
A, B ∈ Rn×n.
4. Given the matrix A ∈ R2×2 with a11 = a22 = 1, a12 = γ, a21 = 0, check that
for γ ≥ 0, K∞(A) = K1(A) = (1 + γ)². Next, consider the linear system
Ax = b where b is such that x = (1 − γ, 1)^T is the solution. Find a bound
for ‖δx‖∞/‖x‖∞ in terms of ‖δb‖∞/‖b‖∞ when δb = (δ1, δ2)^T. Is the
problem well- or ill-conditioned?
5. Consider the matrix A ∈ Rn×n with entries aij = 1 if i = j or j = n,
aij = −1 if i > j, zero otherwise. Show that A admits an LU factorization
with |lij| ≤ 1 and unn = 2^(n−1).
6. Consider matrix (3.68) in Example 3.7. Prove that the matrices L and U
have entries very large in modulus. Check that using GEM with complete
pivoting yields the exact solution.
7. Devise a variant of GEM that transforms a nonsingular matrix A ∈ Rn×n
directly into a diagonal matrix D. This process is commonly known as the
Gauss-Jordan method. Find the Gauss-Jordan transformation matrices Gi,
i = 1, …, n, such that Gn … G1 A = D.
8. Let A be a sparse matrix of order n. Prove that the computational cost of
the LU factorization of A is given by (3.61). Prove also that it is always
less than

   Σ_k mk(A) (mk(A) + 3).

9. Prove that, if A is a symmetric and positive definite matrix, solving the
linear system Ax = b amounts to computing x = Σ_{i=1}^n (ci/λi) vi, where λi
are the eigenvalues of A, vi are the corresponding eigenvectors, and ci are
the coefficients of b on the basis of the vi.
10. (From [JM92]). Consider the following linear system

    [ 1001  1000 ] [ x1 ]   [ b1 ]
    [ 1000  1001 ] [ x2 ] = [ b2 ].

Using Exercise 9, explain why, when b = (2001, 2001)^T, a small change
δb = (1, 0)^T produces large variations in the solution, while, conversely,

when b = (1, −1)^T, a small variation δx = (0.001, 0)^T in the solution
induces a large change in b.
[Hint: expand the right-hand side on the basis of the eigenvectors of the
matrix.]
11. Characterize the fill-in for a matrix A ∈ Rn×n having nonzero entries only
on the main diagonal and on the first column and last row. Propose a
permutation that minimizes the fill-in.
[Hint: it suffices to exchange the first row and the first column with the
last row and the last column, respectively.]
12. Consider the linear system Hn x = b, where Hn is the Hilbert matrix of
order n. Estimate, as a function of n, the maximum number of significant
digits that are expected when solving the system by GEM.
13. Given the vectors

    v1 = [1, 1, 1, −1]^T,  v2 = [2, −1, −1, 1]^T,
    v3 = [0, 3, 3, −3]^T,  v4 = [−1, 2, 2, 1]^T,

generate an orthonormal system using the Gram-Schmidt algorithm, in
both its standard and modified versions, and compare the obtained results.
What is the dimension of the space generated by the given vectors?
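The comparison asked for in Exercise 13 can be carried out with a short script (an illustrative NumPy sketch of the two Gram-Schmidt variants; the tolerance used to discard a dependent vector is an arbitrary choice):

```python
import numpy as np

V = np.array([[1.0, 1, 1, -1], [2, -1, -1, 1], [0, 3, 3, -3], [-1, 2, 2, 1]])

def cgs(V, tol=1e-10):
    # classical Gram-Schmidt: project the original vector against all previous q's
    Q = []
    for v in V:
        w = v - sum((q @ v) * q for q in Q)
        if np.linalg.norm(w) > tol:
            Q.append(w / np.linalg.norm(w))
    return np.array(Q)

def mgs(V, tol=1e-10):
    # modified Gram-Schmidt: re-project the working vector after each subtraction
    Q = []
    for v in V:
        w = v.copy()
        for q in Q:
            w = w - (q @ w) * q
        if np.linalg.norm(w) > tol:
            Q.append(w / np.linalg.norm(w))
    return np.array(Q)

Qc, Qm = cgs(V), mgs(V)
# note that v3 = 2 v1 - v2, so the spanned space has dimension 3
assert np.linalg.matrix_rank(V) == 3
assert Qc.shape[0] == 3 and Qm.shape[0] == 3
assert np.allclose(Qm @ Qm.T, np.eye(3), atol=1e-10)
```

On well-separated vectors like these the two variants behave almost identically; the modified version pays off when the input vectors are nearly dependent in finite arithmetic.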
14. Prove that if A = QR then

    K1(A) ≤ K1(R) ≤ n K1(A),

while K2(A) = K2(R).
15. Let A ∈ Rn×n be a nonsingular matrix. Determine the conditions under
which the ratio ‖y‖2/‖x‖2, with x and y as in (3.70), approximates ‖A⁻¹‖2.
[Solution: let UΣV^T be the singular value decomposition of A. Denote
by ui, vi the column vectors of U and V, respectively, and expand the
vector d in (3.70) on the basis spanned by {vi}. Then d = Σ_{i=1}^n d̃i vi and,
from (3.70), x = Σ_{i=1}^n (d̃i/σi) ui, y = Σ_{i=1}^n (d̃i/σi²) vi, having denoted the
singular values of A by σ1, …, σn.
The ratio

    ‖y‖2/‖x‖2 = [ Σ_{i=1}^n (d̃i/σi²)² ]^{1/2} / [ Σ_{i=1}^n (d̃i/σi)² ]^{1/2}

is about equal to σn⁻¹ = ‖A⁻¹‖2 if: (i) y has a relevant component in the
direction of vn (i.e., if d̃n is not excessively small), and (ii) the ratio d̃n/σn
is not negligible with respect to the ratios d̃i/σi for i = 1, …, n − 1. This
last circumstance certainly occurs if A is ill-conditioned in the ‖·‖2-norm,
since σn ≪ σ1.]
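Under the reading of (3.70) used in the solution above, namely that x and y solve Ax = d and A^T y = x, the estimate is easy to try out (an illustrative sketch; the diagonal test matrix and the vector d are arbitrary choices, not from the text):

```python
import numpy as np

# diagonal test matrix with singular values 1, 1e-1, ..., 1e-5, so ||A^{-1}||_2 = 1e5
A = np.diag(10.0 ** -np.arange(6))
d = np.ones(6)                    # d has a nonnegligible component along v_n
x = np.linalg.solve(A, d)         # x as in (3.70), assumed reading: A x = d
y = np.linalg.solve(A.T, x)       # y as in (3.70), assumed reading: A^T y = x
estimate = np.linalg.norm(y) / np.linalg.norm(x)
norm_inv = np.linalg.norm(np.linalg.inv(A), 2)
# the ratio recovers ||A^{-1}||_2 to within a few percent on this example
assert abs(estimate / norm_inv - 1) < 0.02
```

Because the matrix is strongly ill-conditioned, condition (ii) of the solution holds and the ratio lands very close to 1/σn.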
4
Iterative Methods for Solving Linear Systems

Iterative methods formally yield the solution x of a linear system after an
infinite number of steps. At each step they require the computation of the
residual of the system. In the case of a full matrix, their computational
cost is therefore of the order of n² operations for each iteration, to be
compared with an overall cost of the order of 2n³/3 operations needed by
direct methods. Iterative methods can therefore become competitive with
direct methods provided the number of iterations that are required to con-
verge (within a prescribed tolerance) is either independent of n or scales
sublinearly with respect to n.
In the case of large sparse matrices, as discussed in Section 3.9, direct
methods may be inconvenient due to the dramatic fill-in, although ex-
tremely efficient direct solvers can be devised on sparse matrices featuring
special structures like, for example, those encountered in the approximation
of partial differential equations (see Chapters 12 and 13).
Finally, we notice that, when A is ill-conditioned, a combined use of direct
and iterative methods is made possible by preconditioning techniques that
will be addressed in Section 4.3.2.

4.1 On the Convergence of Iterative Methods
The basic idea of iterative methods is to construct a sequence of vectors
x(k) that enjoy the property of convergence
    x = lim_{k→∞} x(k),    (4.1)

where x is the solution to (3.2). In practice, the iterative process is stopped
at the minimum value of n such that ‖x(n) − x‖ < ε, where ε is a fixed
tolerance and ‖·‖ is any convenient vector norm. However, since the exact
solution is obviously not available, it is necessary to introduce suitable
stopping criteria to monitor the convergence of the iteration (see Section
4.6).
To start with, we consider iterative methods of the form

    x(0) given,  x(k+1) = Bx(k) + f,  k ≥ 0,    (4.2)

having denoted by B an n × n square matrix called the iteration matrix
and by f a vector that is obtained from the right hand side b.

Definition 4.1 An iterative method of the form (4.2) is said to be consis-
tent with (3.2) if f and B are such that x = Bx + f. Equivalently,

    f = (I − B)A⁻¹b.

Having denoted by

    e(k) = x(k) − x    (4.3)

the error at the k-th step of the iteration, the condition for convergence
(4.1) amounts to requiring that lim_{k→∞} e(k) = 0 for any choice of the initial
datum x(0) (often called the initial guess).
Consistency alone does not suffice to ensure the convergence of the iter-
ative method (4.2), as shown in the following example.

Example 4.1 To solve the linear system 2Ix = b, consider the iterative method

    x(k+1) = −x(k) + b,

which is obviously consistent. This scheme is not convergent for any choice of
the initial guess. If, for instance, x(0) = 0, the method generates the sequence
x(2k) = 0, x(2k+1) = b, k = 0, 1, …
On the other hand, if x(0) = (1/2)b the method is convergent.
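A quick numeric check of this example (an illustrative NumPy sketch; the vector b is an arbitrary choice):

```python
import numpy as np

b = np.array([1.0, 2.0])
x = np.zeros(2)          # initial guess x(0) = 0
seq = []
for k in range(4):
    x = -x + b           # the consistent but nonconvergent scheme
    seq.append(x.copy())
# starting from 0 the iterates alternate: b, 0, b, 0, ...
assert np.allclose(seq[0], b) and np.allclose(seq[1], 0.0)
# starting from b/2 the scheme sits at its fixed point, the solution of 2Ix = b
y = b / 2
assert np.allclose(-y + b, y)
```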

Theorem 4.1 Let (4.2) be a consistent method. Then the sequence of vec-
tors x(k) converges to the solution of (3.2) for any choice of x(0) iff
ρ(B) < 1.
Proof. From (4.3) and the consistency assumption, the recursive relation e(k+1) =
Be(k) is obtained. Therefore,

    e(k) = B^k e(0),  ∀k = 0, 1, …    (4.4)

Thus, thanks to Theorem 1.5, it follows that lim_{k→∞} B^k e(0) = 0 for any e(0) iff
ρ(B) < 1.
Conversely, suppose that ρ(B) > 1; then there exists at least one eigenvalue
λ(B) with modulus greater than 1. Let e(0) be an eigenvector associated with λ;
then Be(0) = λe(0) and, therefore, e(k) = λ^k e(0). As a consequence, e(k) cannot
tend to 0 as k → ∞, since |λ| > 1. □
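The role of ρ(B) in Theorem 4.1 can be observed numerically (an illustrative sketch; the diagonal iteration matrices are arbitrary choices that make the spectral radii obvious):

```python
import numpy as np

x_exact = np.array([1.0, -2.0, 3.0])
B_good = np.diag([0.9, 0.5, -0.8])     # rho(B) = 0.9 < 1
B_bad = np.diag([1.1, 0.5, -0.8])      # rho(B) = 1.1 > 1
for B, should_converge in [(B_good, True), (B_bad, False)]:
    f = (np.eye(3) - B) @ x_exact      # consistency: x_exact = B x_exact + f
    x = np.zeros(3)
    for _ in range(500):
        x = B @ x + f
    # both schemes are consistent, but only the first one converges
    assert np.allclose(x, x_exact, atol=1e-6) == should_converge
```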

From (1.23) and Theorem 1.5 it follows that a sufficient condition for con-
vergence to hold is that ‖B‖ < 1 for some consistent matrix norm. It is reasonable
to expect that the convergence is faster when ρ(B) is smaller, so that an
estimate of ρ(B) might provide a sound indication of the convergence of
the algorithm. Other remarkable quantities in convergence analysis are con-
tained in the following definition.

Definition 4.2 Let B be the iteration matrix. We call:
1. ‖B^m‖ the convergence factor after m steps of the iteration;
2. ‖B^m‖^{1/m} the average convergence factor after m steps;
3. Rm(B) = −(1/m) log ‖B^m‖ the average convergence rate after m steps.

These quantities are too expensive to compute since they require evaluating
B^m. Therefore, it is usually preferred to estimate the asymptotic conver-
gence rate, which is defined as

    R(B) = lim_{k→∞} Rk(B) = −log ρ(B),    (4.5)

where Property 1.13 has been accounted for. In particular, if B were sym-
metric, we would have

    Rm(B) = −(1/m) log ‖B^m‖2 = −log ρ(B).

In the case of nonsymmetric matrices, ρ(B) sometimes provides an overop-
timistic estimate of ‖B^m‖^{1/m} (see [Axe94], Section 5.1). Indeed, although
ρ(B) < 1, the convergence to zero of the sequence ‖B^m‖ might be non-
monotone (see Exercise 1). We finally notice that, due to (4.5), ρ(B) is
the asymptotic convergence factor. Criteria for estimating the quantities
defined so far will be addressed in Section 4.6.
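The gap between ρ(B) and ‖B^m‖^{1/m} for a nonsymmetric B is easy to exhibit (an illustrative sketch; the strongly nonnormal 2×2 matrix is an arbitrary choice):

```python
import numpy as np

B = np.array([[0.5, 10.0], [0.0, 0.5]])   # rho(B) = 0.5, but ||B||_2 is about 10
rho = max(abs(np.linalg.eigvals(B)))
norms = [np.linalg.norm(np.linalg.matrix_power(B, m), 2) for m in range(1, 101)]
assert np.isclose(rho, 0.5)
assert norms[0] > 1            # ||B^1|| is far above 1 ...
assert norms[-1] < 1e-20       # ... yet ||B^m|| tends to zero
# the average convergence factor only slowly approaches rho(B)
assert abs(norms[-1] ** (1 / 100) - rho) < 0.1
```

A method with such an iteration matrix reduces the error dramatically in the long run even though the first sweeps can amplify it.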

Remark 4.1 The iterations introduced in (4.2) are a special instance of
iterative methods of the form

    x(0) = f0(A, b),
    x(n+1) = fn+1(x(n), x(n−1), …, x(n−m), A, b),  for n ≥ m,

where fi and x(1), …, x(m) are given functions and vectors, respectively.
The number of steps which the current iteration depends on is called the
order of the method. If the functions fi are independent of the step index i,
the method is called stationary, otherwise it is nonstationary. Finally, if fi
depends linearly on x(n), …, x(n−m), the method is called linear, otherwise
it is nonlinear.
In the light of these definitions, the methods considered so far are there-
fore stationary linear iterative methods of first order. In Section 4.3, exam-
ples of nonstationary linear methods will be provided.

4.2 Linear Iterative Methods
A general technique to devise consistent linear iterative methods is based
on an additive splitting of the matrix A of the form A = P − N, where P
and N are two suitable matrices and P is nonsingular. For reasons that
will be clear in the later sections, P is called the preconditioning matrix or
preconditioner.
Precisely, given x(0), one can compute x(k) for k ≥ 1 by solving the systems

    Px(k+1) = Nx(k) + b,  k ≥ 0.    (4.6)

The iteration matrix of method (4.6) is B = P⁻¹N, while f = P⁻¹b. Alter-
natively, (4.6) can be written in the form

    x(k+1) = x(k) + P⁻¹r(k),    (4.7)

where

    r(k) = b − Ax(k)    (4.8)

denotes the residual vector at step k. Relation (4.7) outlines the fact that
a linear system, with coefficient matrix P, must be solved to update the
solution at step k + 1. Thus P, besides being nonsingular, ought to be easily
invertible, in order to keep the overall computational cost low. (Notice that
if P were equal to A and N = 0, method (4.7) would converge in one iteration,
but at the same cost as a direct method.)
Let us mention two results that ensure convergence of the iteration (4.7),
provided suitable conditions on the splitting of A are fulfilled (for their
proof, we refer to [Hac94]).

Property 4.1 Let A = P − N, with A and P symmetric and positive def-
inite. If the matrix 2P − A is positive definite, then the iterative method
defined in (4.7) is convergent for any choice of the initial datum x(0) and

    ρ(B) = ‖B‖A = ‖B‖P < 1.

Moreover, the convergence of the iteration is monotone with respect to the
norms ‖·‖P and ‖·‖A (i.e., ‖e(k+1)‖P < ‖e(k)‖P and ‖e(k+1)‖A < ‖e(k)‖A,
k = 0, 1, …).

Property 4.2 Let A = P − N with A symmetric and positive definite. If
the matrix P + P^T − A is positive definite, then P is invertible, the iterative
method defined in (4.7) is monotonically convergent with respect to the norm
‖·‖A, and ρ(B) ≤ ‖B‖A < 1.

4.2.1 Jacobi, Gauss-Seidel and Relaxation Methods
In this section we consider some classical linear iterative methods.
If the diagonal entries of A are nonzero, we can single out in each equation
the corresponding unknown, obtaining the equivalent linear system

    xi = (1/aii) [ bi − Σ_{j=1, j≠i}^n aij xj ],  i = 1, …, n.    (4.9)

In the Jacobi method, once an arbitrary initial guess x(0) has been chosen,
x(k+1) is computed by the formulae

    xi(k+1) = (1/aii) [ bi − Σ_{j=1, j≠i}^n aij xj(k) ],  i = 1, …, n.    (4.10)

This amounts to performing the following splitting for A:

    P = D,  N = D − A = E + F,

where D is the diagonal matrix of the diagonal entries of A, E is the lower
triangular matrix of entries eij = −aij if i > j, eij = 0 if i ≤ j, and F is
the upper triangular matrix of entries fij = −aij if j > i, fij = 0 if j ≤ i.
As a consequence, A = D − (E + F).
The iteration matrix of the Jacobi method is thus given by

    BJ = D⁻¹(E + F) = I − D⁻¹A.    (4.11)
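The Jacobi splitting written out in NumPy (an illustrative sketch; the matrix is an arbitrary strictly diagonally dominant example, so the method converges):

```python
import numpy as np

A = np.array([[4.0, 1, 1], [1, 5.0, 2], [0, 1, 3.0]])
b = np.array([6.0, 8.0, 4.0])
D = np.diag(np.diag(A))
BJ = np.eye(3) - np.linalg.solve(D, A)    # BJ = I - D^{-1} A
f = np.linalg.solve(D, b)                 # f = D^{-1} b
x = np.zeros(3)
for _ in range(100):
    x = BJ @ x + f
assert max(abs(np.linalg.eigvals(BJ))) < 1   # rho(BJ) < 1 by diagonal dominance
assert np.allclose(A @ x, b, atol=1e-8)
```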
A generalization of the Jacobi method is the over-relaxation method
(or JOR), in which, having introduced a relaxation parameter ω, (4.10) is
replaced by

    xi(k+1) = (ω/aii) [ bi − Σ_{j=1, j≠i}^n aij xj(k) ] + (1 − ω) xi(k),  i = 1, …, n.

The corresponding iteration matrix is

    BJω = ωBJ + (1 − ω)I.    (4.12)

In the form (4.7), the JOR method corresponds to

    x(k+1) = x(k) + ωD⁻¹r(k).

This method is consistent for any ω ≠ 0, and for ω = 1 it coincides with
the Jacobi method.
The Gauss-Seidel method differs from the Jacobi method in the fact that
at the (k + 1)-th step the already available values xi(k+1) are used to update
the solution, so that, instead of (4.10), one has

    xi(k+1) = (1/aii) [ bi − Σ_{j=1}^{i−1} aij xj(k+1) − Σ_{j=i+1}^n aij xj(k) ],  i = 1, …, n.    (4.13)

This method amounts to performing the following splitting for A:

    P = D − E,  N = F,

and the associated iteration matrix is

    BGS = (D − E)⁻¹F.    (4.14)
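The Gauss-Seidel splitting in the same style (an illustrative sketch, reusing an arbitrary diagonally dominant matrix):

```python
import numpy as np

A = np.array([[4.0, 1, 1], [1, 5.0, 2], [0, 1, 3.0]])
b = np.array([6.0, 8.0, 4.0])
D = np.diag(np.diag(A))
E = -np.tril(A, -1)            # strictly lower part of A, negated
F = -np.triu(A, 1)             # strictly upper part of A, negated
P = D - E
BGS = np.linalg.solve(P, F)    # BGS = (D - E)^{-1} F
x = np.zeros(3)
for _ in range(50):
    x = np.linalg.solve(P, F @ x + b)    # P x^(k+1) = N x^(k) + b
assert np.allclose(A, D - E - F)
assert max(abs(np.linalg.eigvals(BGS))) < 1
assert np.allclose(A @ x, b, atol=1e-8)
```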

Starting from the Gauss-Seidel method, in analogy to what was done for
the Jacobi iterations, we introduce the successive over-relaxation method (or
SOR method)

    xi(k+1) = (ω/aii) [ bi − Σ_{j=1}^{i−1} aij xj(k+1) − Σ_{j=i+1}^n aij xj(k) ] + (1 − ω) xi(k),    (4.15)

for i = 1, …, n. The method (4.15) can be written in vector form as

    (I − ωD⁻¹E)x(k+1) = [(1 − ω)I + ωD⁻¹F]x(k) + ωD⁻¹b,    (4.16)

from which the iteration matrix is

    B(ω) = (I − ωD⁻¹E)⁻¹[(1 − ω)I + ωD⁻¹F].    (4.17)

Multiplying by D both sides of (4.16) and recalling that A = D − (E + F)
yields the following form (4.7) of the SOR method:

    x(k+1) = x(k) + ( (1/ω)D − E )⁻¹ r(k).

It is consistent for any ω ≠ 0, and for ω = 1 it coincides with the Gauss-Seidel
method. In particular, if ω ∈ (0, 1) the method is called under-relaxation,
while if ω > 1 it is called over-relaxation.

4.2.2 Convergence Results for Jacobi and Gauss-Seidel
There exist special classes of matrices for which it is possible to state a
priori some convergence results for the methods examined in the previous
section. The first result in this direction is the following.

Theorem 4.2 If A is a strictly diagonally dominant matrix by rows, the
Jacobi and Gauss-Seidel methods are convergent.
Proof. Let us prove the part of the theorem concerning the Jacobi method, while
for the Gauss-Seidel method we refer to [Axe94]. Since A is strictly diagonally
dominant by rows, |aii| > Σ_{j≠i} |aij| for i = 1, …, n. As a consequence,

    ‖BJ‖∞ = max_{i=1,…,n} Σ_{j≠i} |aij|/|aii| < 1,

so that the Jacobi method is convergent. □

Theorem 4.3 If A and 2D − A are symmetric and positive definite matri-
ces, then the Jacobi method is convergent and ρ(BJ) = ‖BJ‖A = ‖BJ‖D.
Proof. The theorem follows from Property 4.1 taking P = D. □

In the case of the JOR method, the assumption on 2D − A can be removed,
yielding the following result.

Theorem 4.4 If A is symmetric positive definite, then the JOR method is
convergent if 0 < ω < 2/ρ(D⁻¹A).
Proof. The result immediately follows from (4.12) and from noting that the
eigenvalues of D⁻¹A are real and positive. □

Concerning the Gauss-Seidel method, the following result holds.

Theorem 4.5 If A is symmetric positive definite, the Gauss-Seidel method
is monotonically convergent with respect to the norm ‖·‖A.
Proof. We can apply Property 4.2 to the matrix P = D − E, upon checking that
P + P^T − A is positive definite. Indeed,

    P + P^T − A = 2D − E − F − A = D,

having observed that (D − E)^T = D − F. We conclude by noticing that D is
positive definite, since it is the diagonal of A. □

Finally, if A is positive definite and tridiagonal, it can be shown that also
the Jacobi method is convergent and that

    ρ(BGS) = ρ²(BJ).    (4.18)
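Relation (4.18) can be verified numerically on an SPD tridiagonal matrix (an illustrative sketch using tridiag(−1, 2, −1), an arbitrary but classical choice):

```python
import numpy as np

n = 10
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # tridiag(-1, 2, -1)
D = np.diag(np.diag(A))
E = -np.tril(A, -1)
F = -np.triu(A, 1)
BJ = np.linalg.solve(D, E + F)
BGS = np.linalg.solve(D - E, F)
rho = lambda M: max(abs(np.linalg.eigvals(M)))
# Gauss-Seidel squares the Jacobi spectral radius on this matrix
assert np.isclose(rho(BGS), rho(BJ) ** 2, rtol=1e-6)
```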
In this case, the Gauss-Seidel method is more rapidly convergent than the
Jacobi method. Relation (4.18) also holds if A enjoys the following A-property.

Definition 4.3 A consistently ordered matrix M ∈ Rn×n (that is, a matrix
such that αD⁻¹E + α⁻¹D⁻¹F, for α ≠ 0, has eigenvalues that do not depend
on α, where M = D − E − F, D = diag(m11, …, mnn), E and F are strictly lower
and upper triangular matrices, respectively) enjoys the A-property if it can
be partitioned in the 2 × 2 block form

    M = [ D̃1   M12
          M21   D̃2 ],

where D̃1 and D̃2 are diagonal matrices.

When dealing with general matrices, no a priori conclusions on the conver-
gence properties of the Jacobi and Gauss-Seidel methods can be drawn, as
shown in Example 4.2.

Example 4.2 Consider the 3 × 3 linear systems of the form Ai x = bi, where bi
is always taken in such a way that the solution of the system is the unit vector,
and the matrices Ai are

    A1 = [ … ; 7 4 2 ; −1 1 2 ],      A2 = [ −3 3 −6 ; −4 7 −8 ; 5 7 −9 ],

    A3 = [ 4 1 1 ; 2 −9 0 ; 0 −8 −6 ],   A4 = [ 7 6 9 ; 4 5 −4 ; −7 −3 8 ].

It can be checked that the Jacobi method fails to converge for A1 (ρ(BJ) =
1.33), while the Gauss-Seidel scheme is convergent. Conversely, in the case of
A2, the Jacobi method is convergent, while the Gauss-Seidel method fails to
converge (ρ(BGS) = 1.1̄). In the remaining two cases, the Jacobi method is more
slowly convergent than the Gauss-Seidel method for matrix A3 (ρ(BJ) = 0.44
against ρ(BGS) = 0.018), and the converse is true for A4 (ρ(BJ) = 0.64 while
ρ(BGS) = 0.77).

We conclude the section with the following result.

Theorem 4.6 If the Jacobi method is convergent, then the JOR method
converges if 0 < ω ≤ 1.
Proof. From (4.12) we obtain that the eigenvalues of BJω are

    µk = ωλk + 1 − ω,  k = 1, …, n,

where λk are the eigenvalues of BJ. Then, recalling the Euler formula for the
representation of a complex number, we let λk = rk e^{iθk} and get

    |µk|² = ω²rk² + 2ωrk cos(θk)(1 − ω) + (1 − ω)² ≤ (ωrk + 1 − ω)²,

which is less than 1 if 0 < ω ≤ 1. □

4.2.3 Convergence Results for the Relaxation Method
The following result provides a necessary condition on ω for the SOR
method to be convergent.

Theorem 4.7 For any ω ∈ R we have ρ(B(ω)) ≥ |ω − 1|; therefore, the
SOR method fails to converge if ω ≤ 0 or ω ≥ 2.
Proof. If {λi} denote the eigenvalues of the SOR iteration matrix, then

    | ∏_{i=1}^n λi | = | det[ (1 − ω)I + ωD⁻¹F ] | = |1 − ω|^n,

since (I − ωD⁻¹E)⁻¹ has unit determinant. Therefore, at least one eigenvalue λi
must exist such that |λi| ≥ |1 − ω| and thus,
in order for convergence to hold, we must have |1 − ω| < 1, that is 0 < ω < 2. □

Assuming that A is symmetric and positive definite, the condition 0 < ω <
2, besides being necessary, becomes also sufficient for convergence. Indeed,
the following result holds (for the proof, see [Hac94]).

Property 4.3 (Ostrowski) If A is symmetric and positive definite, then
the SOR method is convergent iff 0 < ω < 2. Moreover, its convergence is
monotone with respect to ‖·‖A.
Finally, if A is strictly diagonally dominant by rows, the SOR method
converges if 0 < ω ≤ 1.

The results above show that the SOR method is more or less rapidly
convergent, depending on the choice of the relaxation parameter ω. The
question of how to determine the value ωopt for which the convergence rate
is the highest possible can be given a satisfactory answer only in special
cases (see, for instance, [Axe94], [You71], [Var62] or [Wac66]). Here we limit
ourselves to quoting the following result (whose proof is in [Axe94]).

Property 4.4 If the matrix A enjoys the A-property and if BJ has real
eigenvalues, then the SOR method converges for any choice of x(0) iff
ρ(BJ) < 1 and 0 < ω < 2. Moreover,

    ωopt = 2 / ( 1 + √(1 − ρ(BJ)²) )    (4.19)

and the corresponding asymptotic convergence factor is

    ρ(B(ωopt)) = ( 1 − √(1 − ρ(BJ)²) ) / ( 1 + √(1 − ρ(BJ)²) ).
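For the consistently ordered matrix tridiag(−1, 2, −1), formula (4.19) can be checked numerically (an illustrative sketch; the matrix size and the comparison value ω = 1 are arbitrary choices):

```python
import numpy as np

n = 10
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
D = np.diag(np.diag(A))
E = -np.tril(A, -1)
F = -np.triu(A, 1)
rho = lambda M: max(abs(np.linalg.eigvals(M)))
rhoJ = rho(np.linalg.solve(D, E + F))
w_opt = 2 / (1 + np.sqrt(1 - rhoJ ** 2))

def B(w):
    # SOR iteration matrix (4.17): (I - w D^{-1}E)^{-1} [(1-w)I + w D^{-1}F]
    return np.linalg.solve(np.eye(n) - w * np.linalg.solve(D, E),
                           (1 - w) * np.eye(n) + w * np.linalg.solve(D, F))

# at the optimal parameter the spectral radius equals omega_opt - 1
assert abs(rho(B(w_opt)) - (w_opt - 1)) < 1e-3
# and it beats Gauss-Seidel (omega = 1)
assert rho(B(w_opt)) <= rho(B(1.0))
```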

4.2.4 A priori Forward Analysis
In the previous analysis we have neglected the rounding errors. However, as
shown in the following example (taken from [HW76]), they can dramatically
affect the convergence rate of the iterative method.

Example 4.3 Let A be a lower bidiagonal matrix of order 100 with entries
aii = 1.5 and ai,i−1 = 1, and let b ∈ R100 be the right-hand side with bi = 2.5. The
exact solution of the system Ax = b has components xi = 1 − (−2/3)^i. The
SOR method with ω = 1.5 should be convergent, working in exact arithmetic,
since ρ(B(1.5)) = 0.5 (far below one). However, running Program 16 with x(0) =
fl(x) + εM, which is extremely close to the exact value, the sequence x(k) diverges
and after 100 iterations the algorithm yields a solution with ‖x(100)‖∞ ≃ 10¹³.
The flaw is due to rounding error propagation and must not be ascribed to a
possible ill-conditioning of the matrix, since K∞(A) ≃ 5.

To account for rounding errors, let us denote by x(k) the solution (in finite
arithmetic) generated by an iterative method of the form (4.6) after k steps.
Due to rounding errors, x(k) can be regarded as the exact solution to the
perturbed problem

    Px(k+1) = Nx(k) + b − ζk,    (4.20)

with

    ζk = δPk+1 x(k+1) − gk.

The matrix δPk+1 accounts for the rounding errors in the solution of (4.6),
while the vector gk includes the errors made in the evaluation of Nx(k) + b.
From (4.20), we obtain

    x(k+1) = B^{k+1} x(0) + Σ_{j=0}^{k} B^j P⁻¹ (b − ζ_{k−j}),

and for the absolute error e(k+1) = x − x(k+1)

    e(k+1) = B^{k+1} e(0) + Σ_{j=0}^{k} B^j P⁻¹ ζ_{k−j}.

The first term represents the error that is made by the iterative method
in exact arithmetic; if the method is convergent, this error is negligible for
sufficiently large values of k. The second term refers instead to rounding
error propagation; its analysis is quite technical and is carried out, for
instance, in [Hig88] in the case of the Jacobi, Gauss-Seidel and SOR methods.

4.2.5 Block Matrices
The methods of the previous sections are also referred to as point (or line)
iterative methods, since they act on single entries of matrix A. It is possible
to devise block versions of the algorithms, provided that D denotes the block
diagonal matrix whose entries are the m × m diagonal blocks of matrix A
(see Section 1.6).
The block Jacobi method is obtained taking again P = D and N = D − A. The
method is well-defined only if the diagonal blocks of D are nonsingular. If
A is decomposed into p × p square blocks, the block Jacobi method reads

    Aii xi(k+1) = bi − Σ_{j=1, j≠i}^p Aij xj(k),  i = 1, …, p,

having also decomposed the solution vector and the right-hand side into p
blocks, denoted by xi and bi, respectively. As a result, at each step, the block
Jacobi method requires solving p linear systems with matrices Aii. Theorem
4.3 is still valid, provided that D is substituted by the corresponding block
diagonal matrix.
In a similar manner, the block Gauss-Seidel and block SOR methods can
be introduced.
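A minimal block Jacobi sweep in NumPy (an illustrative sketch; the 4 × 4 SPD matrix and its partition into two 2 × 2 diagonal blocks are arbitrary choices):

```python
import numpy as np

A = np.array([[4.0, 1.0, 0.5, 0.0],
              [1.0, 4.0, 0.0, 0.5],
              [0.5, 0.0, 4.0, 1.0],
              [0.0, 0.5, 1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0, 4.0])
blocks = [slice(0, 2), slice(2, 4)]      # two 2x2 diagonal blocks
x = np.zeros(4)
for _ in range(60):
    x_new = x.copy()
    for i, bi in enumerate(blocks):
        # solve A_ii x_i^(k+1) = b_i - sum_{j != i} A_ij x_j^(k)
        rhs = b[bi].copy()
        for j, bj in enumerate(blocks):
            if j != i:
                rhs -= A[bi, bj] @ x[bj]
        x_new[bi] = np.linalg.solve(A[bi, bi], rhs)
    x = x_new
assert np.allclose(A @ x, b, atol=1e-8)
```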

4.2.6 Symmetric Form of the Gauss-Seidel and SOR Methods
Even if A is a symmetric matrix, the Gauss-Seidel and SOR methods gen-
erate iteration matrices that are not necessarily symmetric. For this reason,
we introduce in this section a technique that allows for symmetrizing these
schemes. The final aim is to provide an approach for generating symmetric
preconditioners (see Section 4.3.2).
Firstly, let us remark that an analogue of the Gauss-Seidel method can
be constructed by simply exchanging E with F. The following iteration
can thus be defined, called the backward Gauss-Seidel method,

    (D − F)x(k+1) = Ex(k) + b,

with iteration matrix given by BGSb = (D − F)⁻¹E.
The symmetric Gauss-Seidel method is obtained by combining an itera-
tion of the Gauss-Seidel method with an iteration of the backward Gauss-
Seidel method. Precisely, the k-th iteration of the symmetric Gauss-Seidel
method is

    (D − E)x(k+1/2) = Fx(k) + b,  (D − F)x(k+1) = Ex(k+1/2) + b.

Eliminating x(k+1/2), the following scheme is obtained:

    x(k+1) = BSGS x(k) + bSGS,
    BSGS = (D − F)⁻¹E(D − E)⁻¹F,    (4.21)
    bSGS = (D − F)⁻¹[E(D − E)⁻¹ + I]b.

The preconditioning matrix associated with (4.21) is

    PSGS = (D − E)D⁻¹(D − F).
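For symmetric A one has F = E^T, so PSGS = (D − E)D⁻¹(D − E)^T is symmetric, and it is positive definite whenever the diagonal of A is positive; a quick check (an illustrative sketch on an arbitrary SPD tridiagonal matrix):

```python
import numpy as np

A = np.array([[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]])
D = np.diag(np.diag(A))
E = -np.tril(A, -1)
F = -np.triu(A, 1)
P_sgs = (D - E) @ np.linalg.inv(D) @ (D - F)
assert np.allclose(F, E.T)                    # A symmetric implies F = E^T
assert np.allclose(P_sgs, P_sgs.T)            # P_SGS is symmetric
assert np.all(np.linalg.eigvalsh(P_sgs) > 0)  # and positive definite
```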

The following result can be proved (see [Hac94]).

Property 4.5 If A is a symmetric positive definite matrix, the symmet-
ric Gauss-Seidel method is convergent, and, moreover, BSGS is symmetric
positive definite.

In a similar manner, defining the backward SOR method

    (D − ωF)x(k+1) = [ωE + (1 − ω)D] x(k) + ωb,

and combining it with a step of the SOR method, the following symmetric SOR
method, or SSOR, is obtained:

    x(k+1) = Bs(ω)x(k) + bω,

where

    Bs(ω) = (D − ωF)⁻¹(ωE + (1 − ω)D)(D − ωE)⁻¹(ωF + (1 − ω)D),
    bω = ω(2 − ω)(D − ωF)⁻¹D(D − ωE)⁻¹b.

The preconditioning matrix of this scheme is

    PSSOR(ω) = ( (1/ω)D − E ) (ω/(2 − ω)) D⁻¹ ( (1/ω)D − F ).    (4.22)

If A is symmetric and positive definite, the SSOR method is convergent if
0 < ω < 2 (see [Hac94] for the proof). Typically, the SSOR method with an
optimal choice of the relaxation parameter converges more slowly than the
corresponding SOR method. However, the value of ρ(Bs(ω)) is less sensitive
to the choice of ω around the optimal value (in this respect, see the behavior
of the spectral radii of the two iteration matrices in Figure 4.1). For this
reason, the optimal value of ω that is chosen in the case of the SSOR method
is usually the same as that used for the SOR method (for further details, we
refer to [You71]).

FIGURE 4.1. Spectral radius of the iteration matrix of SOR and SSOR methods,
as a function of the relaxation parameter ω for the matrix tridiag10(−1, 2, −1)

4.2.7 Implementation Issues
We provide the programs implementing the Jacobi and Gauss-Seidel meth-
ods in their point form and with relaxation.
In Program 15 the JOR method is implemented (the Jacobi method is
obtained as a special case setting omega = 1). The stopping test monitors
the Euclidean norm of the residual at each iteration, normalized to the
value of the initial residual.
Notice that each component x(i) of the solution vector can be computed
independently; this method can thus be easily parallelized.

Program 15 - JOR : JOR method

function [x, iter] = jor(a, b, x0, nmax, toll, omega)
n = size(a, 1);
iter = 0; r = b - a * x0; r0 = norm(r); err = norm(r); x = x0;
while err > toll & iter < nmax
  iter = iter + 1;
  xold = x;
  for i = 1:n
    s = 0;
    for j = 1:i-1, s = s + a(i,j) * xold(j); end
    for j = i+1:n, s = s + a(i,j) * xold(j); end
    x(i) = omega * (b(i) - s) / a(i,i) + (1 - omega) * xold(i);
  end
  r = b - a * x; err = norm(r) / r0;
end

Program 16 implements the SOR method. Taking omega = 1 yields the
Gauss-Seidel method.
Unlike the Jacobi method, this scheme is fully sequential. However, it can
be efficiently implemented without storing the solution of the previous step,
with a saving of memory storage.

Program 16 - SOR : SOR method

function [x, iter] = sor(a, b, x0, nmax, toll, omega)
n = size(a, 1);
iter = 0; r = b - a * x0; r0 = norm(r); err = norm(r); x = x0; xold = x0;
while err > toll & iter < nmax
  iter = iter + 1;
  for i = 1:n
    s = 0;
    for j = 1:i-1, s = s + a(i,j) * x(j); end
    for j = i+1:n, s = s + a(i,j) * xold(j); end
    x(i) = omega * (b(i) - s) / a(i,i) + (1 - omega) * xold(i);
  end
  xold = x; r = b - a * x; err = norm(r) / r0;
end

4.3 Stationary and Nonstationary Iterative Methods
Denote by

    RP = I − P⁻¹A

the iteration matrix associated with (4.7). Proceeding as in the case of
relaxation methods, (4.7) can be generalized by introducing a relaxation (or
acceleration) parameter α. This leads to the following stationary Richard-
son method:

    x(k+1) = x(k) + αP⁻¹r(k),  k ≥ 0.    (4.23)

More generally, allowing α to depend on the iteration index, the nonsta-
tionary Richardson method or semi-iterative method is given by

    x(k+1) = x(k) + αk P⁻¹r(k),  k ≥ 0.    (4.24)

The iteration matrix at the k-th step for these methods (depending on k) is

    R(αk) = I − αk P⁻¹A,

with αk = α in the stationary case. If P = I, the methods will be called
nonpreconditioned. The Jacobi and Gauss-Seidel methods can be regarded
as stationary Richardson methods with α = 1, P = D and P = D − E,
respectively.
We can rewrite (4.24) (and, thus, also (4.23)) in a form of greater inter-
est for computation. Letting z(k) = P⁻¹r(k) (the so-called preconditioned
residual), we get x(k+1) = x(k) + αk z(k) and r(k+1) = b − Ax(k+1) =
r(k) − αk Az(k). To summarize, a nonstationary Richardson method requires
at each (k + 1)-th step the following operations:

    solve the linear system Pz(k) = r(k);
    compute the acceleration parameter αk;
    update the solution x(k+1) = x(k) + αk z(k);
    update the residual r(k+1) = r(k) − αk Az(k).
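These four steps translate directly into code (an illustrative NumPy sketch; taking P as the diagonal of A and keeping αk fixed are arbitrary choices made for the demonstration, the latter giving the stationary method (4.23)):

```python
import numpy as np

A = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])
b = np.array([5.0, 5.0, 3.0])
P = np.diag(np.diag(A))            # preconditioner (here: the diagonal of A)
x = np.zeros(3)
r = b - A @ x
for k in range(200):
    z = np.linalg.solve(P, r)      # solve P z^(k) = r^(k)
    alpha = 0.8                    # fixed acceleration parameter
    x = x + alpha * z              # update the solution
    r = r - alpha * (A @ z)        # update the residual
assert np.allclose(A @ x, b, atol=1e-8)
```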

4.3.1 Convergence Analysis of the Richardson Method
Let us first consider the stationary Richardson methods for which αk = α
for k ≥ 0. The following convergence result holds.

Theorem 4.8 For any nonsingular matrix P, the stationary Richardson
method (4.23) is convergent iff

    2 Re λi / (α |λi|²) > 1,  ∀i = 1, …, n,    (4.26)

where λi ∈ C are the eigenvalues of P⁻¹A.
Proof. Let us apply Theorem 4.1 to the iteration matrix Rα = I − αP⁻¹A. The
condition |1 − αλi| < 1 for i = 1, …, n yields the inequality

    (1 − αRe λi)² + α²(Im λi)² < 1,

from which (4.26) immediately follows. □

Let us notice that, if the sign of the real parts of the eigenvalues of P⁻¹A
is not constant, the stationary Richardson method cannot converge.

More specific results can be obtained provided that suitable assumptions
are made on the spectrum of P⁻¹A.

Theorem 4.9 Assume that P is a nonsingular matrix and that P⁻¹A has
positive real eigenvalues, ordered in such a way that λ1 ≥ λ2 ≥ … ≥
λn > 0. Then, the stationary Richardson method (4.23) is convergent iff
0 < α < 2/λ1. Moreover, letting

    αopt = 2 / (λ1 + λn),    (4.27)
138 4. Iterative Methods for Solving Linear Systems

the spectral radius of the iteration matrix R± is minimum if ± = ±opt , with

»1 ’ » n
ρopt = min [ρ(R± )] = . (4.28)
»1 + »n

Proof. The eigenvalues of R_α are given by λ_i(R_α) = 1 − αλ_i, so that (4.23) is
convergent iff |λ_i(R_α)| < 1 for i = 1, ..., n, that is, if 0 < α < 2/λ_1. It follows
(see Figure 4.2) that ρ(R_α) is minimum when 1 − αλ_n = αλ_1 − 1, that is, for
α = 2/(λ_1 + λ_n), which furnishes the desired value α_opt. By substitution, the
desired value of ρ_opt is obtained.
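The statements (4.27) and (4.28) can be checked numerically; the Python/NumPy sketch below (an illustrative example with P = I, so that P^{-1}A = A, and an arbitrary small SPD test matrix) verifies that ρ(R_α) at α_opt equals ρ_opt and is not improved by nearby values of α:

```python
import numpy as np

# Small SPD test matrix: its eigenvalues are real and positive,
# as required by Theorem 4.9 (here P = I).
A = np.array([[3., 1., 0.], [1., 3., 1.], [0., 1., 3.]])
lam = np.sort(np.linalg.eigvalsh(A))[::-1]   # lambda_1 >= ... >= lambda_n
l1, ln = lam[0], lam[-1]

alpha_opt = 2.0 / (l1 + ln)                  # (4.27)
rho_opt = (l1 - ln) / (l1 + ln)              # (4.28)

def rho(alpha):
    # spectral radius of R_alpha = I - alpha * A
    return np.max(np.abs(1.0 - alpha * lam))

assert abs(rho(alpha_opt) - rho_opt) < 1e-12
assert rho(alpha_opt) <= min(rho(0.9 * alpha_opt), rho(1.1 * alpha_opt))
```

For this matrix the eigenvalues are 3 ± √2 and 3, so α_opt = 1/3 and ρ_opt = √2/3.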

FIGURE 4.2. Spectral radius of R_α as a function of the eigenvalues of P^{-1}A:
the curves |1 − αλ_1|, |1 − αλ_k| and |1 − αλ_n| are plotted versus α, with the
minimum attained at α_opt, between 1/λ_1 and 1/λ_n

If P^{-1}A is symmetric positive definite, it can be shown that the convergence
of the Richardson method is monotone with respect to both the norms ||·||_2 and ||·||_A.
In such a case, using (4.28), we can also relate ρ_opt to K_2(P^{-1}A) as follows

ρ_opt = (K_2(P^{-1}A) − 1) / (K_2(P^{-1}A) + 1),   α_opt = 2 ||A^{-1}P||_2 / (K_2(P^{-1}A) + 1).   (4.29)

The choice of a suitable preconditioner P is, therefore, of paramount im-
portance for improving the convergence of a Richardson method. Of course,
such a choice should also account for the need to keep the computational
effort as low as possible. In Section 4.3.2, some preconditioners in common
use in practice will be described.

Corollary 4.1 Let A be a symmetric positive definite matrix. Then, the
nonpreconditioned stationary Richardson method is convergent and

||e^(k+1)||_A ≤ ρ(R_α) ||e^(k)||_A,  k ≥ 0.   (4.30)
4.3 Stationary and Nonstationary Iterative Methods 139

The same result holds for the preconditioned Richardson method, provided
that the matrices P, A and P’1 A are symmetric positive de¬nite.
Proof. The convergence is a consequence of Theorem 4.8. Moreover, we notice
that

||e^(k+1)||_A = ||R_α e^(k)||_A = ||A^{1/2} R_α e^(k)||_2 ≤ ||A^{1/2} R_α A^{-1/2}||_2 ||A^{1/2} e^(k)||_2.

The matrix R_α is symmetric and is similar to A^{1/2} R_α A^{-1/2}, so that
||A^{1/2} R_α A^{-1/2}||_2 = ρ(R_α).
The result (4.30) follows by noting that ||A^{1/2} e^(k)||_2 = ||e^(k)||_A. A similar proof
can be carried out in the preconditioned case, provided we replace A with P^{-1}A.

Finally, the inequality (4.30) holds even if only P and A are symmetric
positive definite (for the proof, see [QV94], Chapter 2).

4.3.2 Preconditioning Matrices
All the methods introduced in the previous sections can be cast in the form
(4.2), so that they can be regarded as methods for solving the system

(I − B)x = f = P^{-1}b.

On the other hand, since B = P^{-1}N, system (3.2) can be equivalently refor-
mulated as

P^{-1}Ax = P^{-1}b.   (4.31)

The latter is the preconditioned system, P being the preconditioning matrix
or left preconditioner. Right and centered preconditioners can be introduced
as well, if system (3.2) is transformed, respectively, as

AP^{-1}y = b,  y = Px,

and, letting P = P_L P_R,

P_L^{-1} A P_R^{-1} y = P_L^{-1} b,  y = P_R x.

There are point preconditioners or block preconditioners, depending on
whether they are applied to the single entries of A or to the blocks of
a partition of A. The iterative methods considered so far correspond to
fixed-point iterations on a left-preconditioned system. As stressed by (4.25),
computing the inverse of P is not mandatory; actually, the role of P is to
"precondition" the residual r^(k) through the solution of the additional
system Pz^(k) = r^(k).

Since the preconditioner acts on the spectral radius of the iteration ma-
trix, it would be useful to pick, for a given linear system, an optimal
preconditioner, i.e., a preconditioner which is able to make the number of
iterations required for convergence independent of the size of the system.
Notice that the choice P = A is optimal but, trivially, "inefficient"; some
alternatives of greater computational interest will be examined below.
There is a lack of general theoretical results that allow us to devise optimal
preconditioners. However, an established "rule of thumb" is that P is a
good preconditioner for A if P^{-1}A is close to being a normal matrix and if
its eigenvalues are clustered within a sufficiently small region of the com-
plex field. The choice of a preconditioner must also be guided by practical
considerations, notably its computational cost and its memory requirements.
Preconditioners can be divided into two main categories: algebraic and
functional preconditioners, the difference being that the algebraic precon-
ditioners are independent of the problem that originated the system to
be solved, and are actually constructed via purely algebraic procedures, while the
functional preconditioners take advantage of the knowledge of the problem
and are constructed as a function of it. In addition to the preconditioners
already introduced in Section 4.2.6, we give a description of other algebraic
preconditioners in common use.
1. Diagonal preconditioners: choosing P as the diagonal of A is generally
effective if A is symmetric positive definite. A usual choice in the non-
symmetric case is to set

p_ii = ( Σ_{j=1}^{n} a_ij^2 )^{1/2}.

Block diagonal preconditioners can be constructed in a similar man-
ner. We remark that devising an optimal diagonal preconditioner is
far from being trivial, as previously noticed in Section 3.12.1 when
dealing with the scaling of a matrix.
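As an illustration of the symmetric positive definite case, the following Python/NumPy sketch (the matrices are arbitrary choices made for this example) shows the diagonal preconditioner P = diag(A) reducing the spectral condition number of a badly scaled SPD matrix:

```python
import numpy as np

# Effect of the diagonal (Jacobi) preconditioner P = diag(A)
# on a badly scaled SPD matrix (illustrative example).
S = np.array([[2., 1.], [1., 2.]])
D = np.diag([1., 100.])
A = D @ S @ D                      # SPD, badly scaled

def cond2(M):
    # spectral condition number of a symmetric positive definite M
    lam = np.linalg.eigvalsh(M)    # ascending order
    return lam[-1] / lam[0]

P_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(A)))
# P^{-1}A is similar to the symmetric matrix P^{-1/2} A P^{-1/2}
K_prec = cond2(P_inv_sqrt @ A @ P_inv_sqrt)
K_orig = cond2(A)
assert K_prec < K_orig
```

Here the similarity transformation P^{-1/2} A P^{-1/2} is used so that the condition number of the (nonsymmetric) matrix P^{-1}A can be computed from a symmetric eigenvalue problem.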
2. Incomplete LU factorization (ILU for short) and Incomplete Cholesky
factorization (IC for short).
An incomplete factorization of A is a process that computes P =
L_in U_in, where L_in is a lower triangular matrix and U_in is an upper
triangular matrix. These matrices are approximations of the exact
factors L, U of the LU factorization of A and are chosen in such a
way that the residual matrix R = A − L_in U_in satisfies some prescribed
requirements, such as having zero entries in specified locations.
For a given matrix M, the L-part (U-part) of M will mean henceforth
the lower (upper) triangular part of M. Moreover, we assume that the
factorization process can be carried out without resorting to pivoting.

The basic approach to incomplete factorization consists of requiring
the approximate factors L_in and U_in to have the same sparsity pat-
tern as the L-part and U-part of A, respectively. A general algorithm
for constructing an incomplete factorization is to perform Gaussian elim-
ination as follows: at each step k, compute m_ik = a_ik^(k) / a_kk^(k) only if
a_ik ≠ 0, for i = k + 1, ..., n. Then, for j = k + 1, ..., n, update
a_ij^(k+1) only if a_ij ≠ 0. This algorithm is implemented in Program 17,
where the matrices L_in and U_in are progressively overwritten onto
the L-part and U-part of A.

Program 17 - basicILU : Incomplete LU factorization

function [a] = basicILU(a)
% Incomplete LU factorization ILU(0): the factors L_in and U_in
% overwrite the L-part and U-part of the input matrix a
[n,n] = size(a);
for k=1:n-1
  for i=k+1:n
    if a(i,k) ~= 0
      a(i,k) = a(i,k) / a(k,k);
      for j=k+1:n
        if a(i,j) ~= 0
          a(i,j) = a(i,j) - a(i,k)*a(k,j);
        end
      end
    end
  end
end

We notice that having L_in and U_in with the same patterns as the
L-part and U-part of A, respectively, does not necessarily imply that R
has the same sparsity pattern as A, but it guarantees that r_ij = 0 if
a_ij ≠ 0, as is shown in Figure 4.3.

The resulting incomplete factorization is known as ILU(0), where "0"
means that no fill-in has been introduced in the factorization process.
An alternative strategy might be to fix the structure of L_in and U_in
irrespective of that of A, in such a way that some computational
criteria are satisfied (for example, that the incomplete factors have
the simplest possible structure).
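For illustration, a hypothetical Python/NumPy transcription of Program 17's ILU(0) (dense loops, pattern taken from the original A; the 4×4 test matrix is an arbitrary choice) lets one check the property stated above, namely that R = A − L_in U_in vanishes on the pattern of A while fill-in positions may carry nonzero residuals:

```python
import numpy as np

def ilu0(A):
    """ILU(0) sketch mirroring Program 17: the factors keep the
    sparsity pattern of the L- and U-parts of A (no fill-in)."""
    a = A.astype(float).copy()
    n = a.shape[0]
    for k in range(n - 1):
        for i in range(k + 1, n):
            if A[i, k] != 0:
                a[i, k] /= a[k, k]
                for j in range(k + 1, n):
                    if A[i, j] != 0:
                        a[i, j] -= a[i, k] * a[k, j]
    L = np.tril(a, -1) + np.eye(n)
    U = np.triu(a)
    return L, U

# Periodic tridiagonal matrix: the exact LU factorization would
# produce fill-in, so here R is nonzero only outside the pattern of A
A = np.array([[4., 1., 0., 1.],
              [1., 4., 1., 0.],
              [0., 1., 4., 1.],
              [1., 0., 1., 4.]])
L, U = ilu0(A)
R = A - L @ U
assert np.allclose(R[A != 0], 0.0)   # r_ij = 0 wherever a_ij != 0
```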
The accuracy of the ILU(0) factorization can obviously be improved
by allowing some fill-in to arise, and thus, by accepting nonzero entries
of the factors in positions where A has elements equal to zero. To this
purpose, it is convenient to introduce a function, which we call fill-
in level, that is associated with each entry of A and that is being
modified during the factorization process. If the fill-in level of an












FIGURE 4.3. The sparsity pattern of the original matrix A is represented by the
squares, while the pattern of R = A − L_in U_in, computed by Program 17, is drawn
by the bullets

element is greater than an admissible value p ∈ N, the corresponding
entry in U_in or L_in is set equal to zero.
Let us explain how this procedure works, assuming that the matri-
ces L_in and U_in are progressively overwritten onto A (as happens in
Program 4). The fill-in level of an entry a_ij is denoted by lev_ij,
where the dependence on k is understood, and it should provide a
reasonable estimate of the size of the entry during the factorization
process. Actually, we are assuming that if lev_ij = q then |a_ij| ≃ δ^q
with δ ∈ (0, 1), so that q is larger when |a_ij| is smaller.
At the starting step of the procedure, the level of the nonzero entries
of A and of the diagonal entries is set equal to 0, while the level of
the null entries is set equal to infinity. For any row i = 2, ..., n, the
following operations are performed: if lev_ik ≤ p, k = 1, ..., i − 1, the
entry m_ik of L_in and the entries a_ij of U_in, j = i + 1, ..., n, are
updated. Moreover, if a_ij ≠ 0 the value lev_ij is updated as being
the minimum between the current value of lev_ij and lev_ik + lev_kj + 1.
The reason for this choice is that

|a_ij^(k+1)| = |a_ij^(k) − m_ik a_kj^(k)| ≃ |δ^lev_ij − δ^(lev_ik + lev_kj + 1)|,

so that one can assume that the size of |a_ij^(k+1)| is the
maximum between δ^lev_ij and δ^(lev_ik + lev_kj + 1).
The above factorization process is called ILU(p) and turns out to be
extremely efficient (with p small) provided that it is coupled with a
suitable matrix reordering (see Section 3.9).
Program 18 implements the ILU(p) factorization; it returns in out-
put the approximate matrices L_in and U_in (overwritten onto the input
matrix a), with the diagonal entries of L_in equal to 1, and the ma-
trix lev containing the fill-in level of each entry at the end of the
factorization.

Program 18 - ilup : ILU(p) factorization

function [a,lev] = ilup(a,p)
[n,n] = size(a);
for i=1:n, for j=1:n
  if (a(i,j) ~= 0) | (i==j), lev(i,j) = 0;
  else, lev(i,j) = Inf; end
end, end
for i=2:n
  for k=1:i-1
    if lev(i,k) <= p
      a(i,k) = a(i,k) / a(k,k);
      for j=k+1:n
        a(i,j) = a(i,j) - a(i,k)*a(k,j);
        if a(i,j) ~= 0
          lev(i,j) = min(lev(i,j), lev(i,k)+lev(k,j)+1);
        end
      end
    end
  end
  for j=1:n, if lev(i,j) > p, a(i,j) = 0; end, end
end

Example 4.4 Consider the matrix A ∈ R^{46×46} associated with the finite
difference approximation of the Laplace operator Δ· = ∂^2·/∂x^2 + ∂^2·/∂y^2 (see
Section 12.6). This matrix can be generated with the following MATLAB
commands: G=numgrid('B',10); A=delsq(G) and corresponds to the dis-
cretization of the differential operator on a domain having the shape of the
exterior of a butterfly and included in the square [−1, 1]^2 (see Section 12.6).
The number of nonzero entries of A is 174. Figure 4.4 shows the pattern of
matrix A (drawn by the bullets) and the entries added to the pattern by
the ILU(1) and ILU(2) factorizations due to fill-in (denoted by the squares
and the triangles, respectively). Notice that these entries are all contained
within the envelope of A since no pivoting has been performed.

The ILU(p) process can be carried out without knowing the actual
values of the entries of A, working only on their fill-in levels.
Therefore, we can distinguish between a symbolic factorization (the
generation of the levels) and an actual factorization (the computation
of the entries of ILU(p) starting from the information contained in











FIGURE 4.4. Pattern of the matrix A in Example 4.4 (bullets); entries added by
the ILU(1) and ILU(2) factorizations (squares and triangles, respectively)

the level function). The scheme is thus particularly effective when
several linear systems must be solved, with matrices having the same
structure but different entries.
On the other hand, for certain classes of matrices, the fill-in level
does not always provide a sound indication of the actual size attained
by the entries. In such cases, it is better to monitor the size of the
entries of R, discarding at each step the entries that are too small.
For instance, one can drop the entries a_ij^(k+1) such that

|a_ij^(k+1)| ≤ c |a_ii^(k+1) a_jj^(k+1)|^{1/2},  i, j = 1, ..., n,

with 0 < c < 1 (see [Axe94]).

In the strategies considered so far, the entries of the matrix that are
dropped can no longer be recovered in the incomplete factoriza-
tion process. Some remedies exist for this drawback: for instance, at
the end of each k-th step of the factorization, one can sum, row by
row, the discarded entries to the diagonal entries of U_in. By doing
so, an incomplete factorization known as MILU (Modified ILU) is
obtained, which enjoys the property of being exact with respect to
constant vectors, i.e., such that R1 = 0, where 1 = (1, ..., 1)^T (see [Axe94] for other
formulations). In practice, this simple trick provides, for a wide
class of matrices, a better preconditioner than that obtained with the ILU
method. In the case of symmetric positive definite matrices one can
resort to the Modified Incomplete Cholesky Factorization (MICh).
We conclude by mentioning the ILUT factorization, which collects the
features of ILU(p) and MILU. This factorization can also include partial
pivoting by columns with a slight increase of the computational cost.
For an efficient implementation of incomplete factorizations, we
refer to the MATLAB function luinc in the toolbox sparfun.

The existence of the ILU factorization is not guaranteed for all non-
singular matrices (see [Elm86] for an example) and the process stops
if zero pivotal entries arise. Existence theorems can be proved if A is
an M-matrix [MdV77] or diagonally dominant [Man80]. It is worth
noting that sometimes the ILU factorization turns out to be more
stable than the complete LU factorization [GM83].
3. Polynomial preconditioners: the preconditioning matrix is defined as

P^{-1} = p(A),

where p is a polynomial in A, usually of low degree.
A remarkable example is given by Neumann polynomial precondi-
tioners. Letting A = D − C, we have A = (I − CD^{-1})D, from which

A^{-1} = D^{-1}(I − CD^{-1})^{-1} = D^{-1}(I + CD^{-1} + (CD^{-1})^2 + ...).

A preconditioner can then be obtained by truncating the series above
at a certain power p. This method is actually effective only if ρ(CD^{-1})
< 1, which is a necessary condition for the series to converge.
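The truncation idea can be illustrated in Python/NumPy (the test matrix is an arbitrary diagonally dominant choice; for it D = 4I, so ρ(CD^{-1}) < 1 holds):

```python
import numpy as np

A = np.array([[4., -1., 0.], [-1., 4., -1.], [0., -1., 4.]])
D = np.diag(np.diag(A))
C = D - A
N = C @ np.linalg.inv(D)                 # N = C D^{-1}
assert np.max(np.abs(np.linalg.eigvals(N))) < 1.0   # rho(C D^{-1}) < 1

def neumann_inv(p):
    """P^{-1} = D^{-1}(I + N + ... + N^p): truncated Neumann series."""
    S, T = np.eye(3), np.eye(3)
    for _ in range(p):
        T = T @ N
        S = S + T
    return np.linalg.inv(D) @ S

# Increasing the truncation power p drives P^{-1}A closer to the identity
errs = [np.linalg.norm(np.eye(3) - neumann_inv(p) @ A) for p in (0, 1, 3)]
assert errs[0] > errs[1] > errs[2]
```

Since A = (I − N)D, one gets I − P^{-1}A = D^{-1} N^{p+1} D, so the error decays like ρ(CD^{-1})^{p+1}.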
4. Least-squares preconditioners: A^{-1} is approximated by a least-squares
polynomial p_s(A) (see Section 3.13). Since the aim is to make the ma-
trix I − P^{-1}A as close as possible to the null matrix, the least-
squares approximant p_s(A) is chosen in such a way that the function
φ(x) = 1 − p_s(x)x is minimized. This preconditioning technique works
effectively only if A is symmetric and positive definite.
For further results on preconditioners, see [dV89] and [Axe94].

Example 4.5 Consider the matrix A ∈ R^{324×324} associated with the finite differ-
ence approximation of the Laplace operator on the square [−1, 1]^2. This matrix
can be generated with the following MATLAB commands: G=numgrid('N',20);
A=delsq(G). The condition number of the matrix is K_2(A) = 211.3. In Table
4.1 we show the values of K_2(P^{-1}A) computed using the ILU(p) and Neumann
preconditioners, with p = 0, 1, 2, 3. In the last case D is the diagonal part of A. •

Remark 4.2 Let A and P be real symmetric matrices of order n, with P
positive definite. The eigenvalues of the preconditioned matrix P^{-1}A are
solutions of the algebraic equation

Ax = λPx,   (4.32)

p    ILU(p)   Neumann
0    22.3     211.3
1    12       36.91
2    8.6      48.55
3    5.6      18.7

TABLE 4.1. Spectral condition numbers of the preconditioned matrix A of Ex-
ample 4.5 as a function of p

where x is an eigenvector associated with the eigenvalue λ. Equation (4.32)
is an example of a generalized eigenvalue problem (see Section 5.9 for a thor-
ough discussion) and the eigenvalue λ can be computed through the fol-
lowing generalized Rayleigh quotient

λ = (Ax, x) / (Px, x).

Applying the Courant-Fischer Theorem (see Section 5.11) yields

λ_min(A) / λ_max(P) ≤ λ ≤ λ_max(A) / λ_min(P).   (4.33)

Relation (4.33) provides a lower and an upper bound for the eigenvalues of the
preconditioned matrix as a function of the extremal eigenvalues of A and
P, and therefore it can be profitably used to estimate the condition number
of P^{-1}A.
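The bound (4.33) can be checked numerically; this Python/NumPy sketch uses arbitrary random SPD matrices for illustration:

```python
import numpy as np

# Numerical check of the bound (4.33) on the eigenvalues of P^{-1}A
# for SPD matrices A and P (illustrative random example).
rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = M @ M.T + 5 * np.eye(5)         # SPD
N = rng.standard_normal((5, 5))
P = N @ N.T + 5 * np.eye(5)         # SPD

lamA = np.linalg.eigvalsh(A)        # ascending order
lamP = np.linalg.eigvalsh(P)
# eigenvalues of P^{-1}A: real and positive, since P^{-1}A is
# similar to the SPD matrix P^{-1/2} A P^{-1/2}
lam = np.linalg.eigvals(np.linalg.solve(P, A)).real

lower = lamA[0] / lamP[-1]          # lambda_min(A) / lambda_max(P)
upper = lamA[-1] / lamP[0]          # lambda_max(A) / lambda_min(P)
assert lower <= lam.min() + 1e-10 and lam.max() <= upper + 1e-10
```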

4.3.3 The Gradient Method
The expression of the optimal parameter that has been provided in Theo-
rem 4.9 is of limited usefulness in practical computations, since it requires
the knowledge of the extremal eigenvalues of the matrix P^{-1}A. In the spe-
cial case of symmetric and positive definite matrices, however, the optimal
acceleration parameter can be dynamically computed at each step k, as follows.
We first notice that, for such matrices, solving system (3.2) is equivalent
to finding the minimizer x ∈ R^n of the quadratic form

Φ(y) = (1/2) y^T A y − y^T b,

which is called the energy of system (3.2). Indeed, the gradient of Φ is given by

∇Φ(y) = (1/2)(A^T + A)y − b = Ay − b.   (4.34)

As a consequence, if ∇Φ(x) = 0 then x is a solution of the original system.
Conversely, if x is a solution, then

Φ(y) = Φ(x + (y − x)) = Φ(x) + (1/2)(y − x)^T A(y − x),  ∀ y ∈ R^n

and thus, Φ(y) > Φ(x) if y ≠ x, i.e. x is a minimizer of the functional Φ.
Notice that the previous relation is equivalent to

(1/2) ||y − x||_A^2 = Φ(y) − Φ(x),   (4.35)

where ||·||_A is the A-norm or energy norm, defined in (1.28).
The problem is thus to determine the minimizer x of Φ starting from a
point x^(0) ∈ R^n and, consequently, to select suitable directions along which
to move in order to get as close as possible to the solution x. The optimal direction,
joining the starting point x^(0) to the solution point x, is obviously un-
known a priori. Therefore, we must take a step from x^(0) along another
direction d^(0), and then fix along this latter a new point x^(1) from which
to iterate the process until convergence.
Thus, at the generic step k, x^(k+1) is computed as

x^(k+1) = x^(k) + α_k d^(k),   (4.36)

where α_k is the value which fixes the length of the step along d^(k). The most
natural idea is to take as descent direction the one of maximum slope, −∇Φ(x^(k)),
which yields the gradient method or steepest descent method.
On the other hand, due to (4.34), ∇Φ(x^(k)) = Ax^(k) − b = −r^(k), so that
the direction of the gradient of Φ is opposite to that of the residual, which can
be immediately computed from the current iterate. This shows that the
gradient method, as well as the Richardson method, moves at each step k
along the direction d^(k) = r^(k).
To compute the parameter α_k, let us write explicitly Φ(x^(k+1)) as a func-
tion of the parameter α

Φ(x^(k+1)) = (1/2)(x^(k) + αr^(k))^T A (x^(k) + αr^(k)) − (x^(k) + αr^(k))^T b.

Differentiating with respect to α and setting the derivative equal to zero yields the
desired value of α_k

α_k = (r^(k)^T r^(k)) / (r^(k)^T A r^(k)),   (4.37)

which depends only on the residual at the k-th step. For this reason, the
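The resulting steepest descent iteration, with d^(k) = r^(k) and α_k given by (4.37), can be sketched in Python/NumPy (illustrative, nonpreconditioned; the test system is an arbitrary SPD example):

```python
import numpy as np

def gradient_method(A, b, x0, tol=1e-10, maxit=1000):
    """Steepest descent for SPD A: d^(k) = r^(k) and
    alpha_k = (r, r) / (r, A r) as in (4.37)."""
    x = x0.astype(float).copy()
    r = b - A @ x
    for k in range(maxit):
        Ar = A @ r
        alpha = (r @ r) / (r @ Ar)   # optimal step length along r
        x = x + alpha * r
        r = r - alpha * Ar
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
    return x, k + 1

A = np.array([[4., 1.], [1., 3.]])
b = np.array([1., 2.])
x, nit = gradient_method(A, b, np.zeros(2))
```

Note that only one matrix-vector product per iteration is needed, since Ar^(k) serves both to compute α_k and to update the residual.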

