FIGURE 4.11. Absolute error (in solid line) versus the error estimated by (4.73)
(dashed line). The number of iterations is indicated on the x-axis

4.6.2 A Stopping Test Based on the Residual
A different stopping criterion consists of continuing the iteration until
‖r^(k)‖ ≤ ε, ε being a fixed tolerance. Note that

   ‖x − x^(k)‖ = ‖A^{-1}(b − Ax^(k))‖ = ‖A^{-1} r^(k)‖ ≤ ‖A^{-1}‖ ε.

Considering instead a normalized residual, i.e. stopping the iteration as
soon as ‖r^(k)‖/‖b‖ ≤ ε, we obtain the following control on the relative
error

   ‖x − x^(k)‖/‖x‖ ≤ ‖A^{-1} r^(k)‖/‖x‖ ≤ K(A) ‖r^(k)‖/‖b‖ ≤ ε K(A).
In the case of preconditioned methods, the residual is replaced by the
preconditioned residual, so that the previous criterion becomes

   ‖P^{-1} r^(k)‖ / ‖P^{-1} r^(0)‖ ≤ ε,

where P is the preconditioning matrix.
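As an illustration of this stopping test, the following MATLAB sketch (an illustrative example, not taken from the text: the matrix, right-hand side and tolerance are arbitrary choices) runs a Jacobi iteration, stops on the normalized residual, and compares the true relative error with the bound K(A) ε discussed above.

% Sketch: residual-based stopping test for a stationary (Jacobi) iteration.
n = 50; A = full(gallery('tridiag',n,-1,4,-1)); b = ones(n,1);
xex = A\b;                          % reference solution
eps_tol = 1e-8; x = zeros(n,1);
D = diag(diag(A)); r = b - A*x; k = 0;
while (norm(r)/norm(b) > eps_tol & k < 1000)
  x = x + D\r;                      % Jacobi update
  r = b - A*x; k = k + 1;
end
relerr = norm(xex-x)/norm(xex);     % true relative error
bound  = cond(A)*norm(r)/norm(b);   % K(A)*||r||/||b||, the control derived above
fprintf('iterations %d, rel. error %e, bound %e\n', k, relerr, bound);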



4.7 Applications
In this section we consider two examples arising in electrical network anal-
ysis and structural mechanics which lead to the solution of large sparse
linear systems.


4.7.1 Analysis of an Electric Network
We consider a purely resistive electric network (shown in Figure 4.12, left)
which consists of a connection of n stages S (Figure 4.12, right) through
the series resistances R. The circuit is completed by the driving current
generator I0 and the load resistance RL. As an example, a purely resistive
network is a model of a signal attenuator for low-frequency applications
where capacitive and inductive effects can be neglected. The connecting
points between the electrical components will be referred to henceforth as
nodes and are progressively labeled as drawn in the figure. For n ≥ 1, the
total number of nodes is 4n. Each node is associated with a value of the
electric potential V_i, i = 0, ..., 4n−1, which are the unknowns of the problem.

The nodal analysis method is employed to solve the problem. Precisely,
the Kirchhoff current law is written at any node of the network, leading
to the linear system ỸṼ = Ĩ, where Ṽ ∈ R^{N+1} is the vector of nodal
potentials, Ĩ ∈ R^{N+1} is the load vector and the entries of the matrix

FIGURE 4.12. Resistive electric network (left) and resistive stage S (right)

Ỹ ∈ R^{(N+1)×(N+1)}, for i, j = 0, ..., 4n−1, are given by

   Ỹ_ij = Σ_{j∈k(i)} G_ij,   for i = j,
   Ỹ_ij = −G_ij,             for i ≠ j,

where k(i) is the index set of the neighboring nodes of node i and G_ij =
1/R_ij is the admittance between node i and node j, provided R_ij denotes
the resistance between the two nodes i and j. Since the potential is defined
up to an additive constant, we arbitrarily set V_0 = 0 (ground potential). As
a consequence, the number of independent nodes for potential difference
computations is N = 4n − 1 and the linear system to be solved becomes
YV = I, where Y ∈ R^{N×N}, V ∈ R^N and I ∈ R^N are obtained by eliminating
the first row and column in Ỹ and the first entry in Ṽ and Ĩ, respectively.
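The assembly of Ỹ from a list of resistive branches follows directly from the definition above. The MATLAB sketch below is only an illustration on a made-up branch list (it does not reproduce the network of Figure 4.12): nodes are numbered from 0 and shifted by one to comply with MATLAB indexing, and the grounded system YV = I is obtained by deleting the first row and column.

% Sketch: assembly of the nodal admittance matrix from a branch list.
% branches = [node_i node_j resistance]; a hypothetical 4-node example.
branches = [0 1 1; 1 2 1; 2 3 1; 3 0 1; 1 3 1];
I0 = 1;                                  % driving current injected at node 1
nnod = max(max(branches(:,1:2))) + 1;    % number of nodes, node 0 included
Ytil = zeros(nnod);
for k = 1:size(branches,1)
  i = branches(k,1)+1; j = branches(k,2)+1; G = 1/branches(k,3);
  Ytil(i,i) = Ytil(i,i) + G; Ytil(j,j) = Ytil(j,j) + G;
  Ytil(i,j) = Ytil(i,j) - G; Ytil(j,i) = Ytil(j,i) - G;
end
Itil = zeros(nnod,1); Itil(2) = I0;      % load vector
Y = Ytil(2:end,2:end); I = Itil(2:end);  % ground node 0: drop first row/column
V = Y\I;                                 % nodal potentials V_1,...,V_N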
The matrix Y is symmetric, diagonally dominant and positive definite. This
last property follows by noting that

   Ṽ^T Ỹ Ṽ = Σ_{i=1}^{N} G_ii V_i^2 + Σ_{i,j=1}^{N} G_ij (V_i − V_j)^2,

which is always a positive quantity, being equal to zero only if Ṽ = 0. The
sparsity pattern of Y in the case n = 3 is shown in Figure 4.13 (left) while
the spectral condition number of Y as a function of the number of blocks n
is reported in Figure 4.13 (right). Our numerical computations have been
carried out setting the resistance values equal to 1 Ω while I0 = 1 A.
In Figure 4.14 we report the convergence history of several non-preconditioned
iterative methods in the case n = 5, corresponding to a matrix size of
19 × 19. The plots show the Euclidean norms of the residual normalized to
the initial residual. The dashed curve refers to the Gauss-Seidel method, the
dash-dotted line refers to the gradient method, while the solid and circled
lines refer respectively to the conjugate gradient (CG) and SOR method
(with an optimal value of the relaxation parameter ω ≃ 1.76 computed
according to (4.19), since Y is block tridiagonal symmetric positive definite).
The SOR method converges in 109 iterations, while the CG method
converges in 10 iterations.
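A hedged sketch of how the value ω ≃ 1.76 can be obtained: assuming that (4.19) is the classical optimal-relaxation formula ω_opt = 2/(1 + sqrt(1 − ρ_J^2)), valid for consistently ordered symmetric positive definite matrices (ρ_J being the spectral radius of the Jacobi iteration matrix), the computation reads as follows; Y is assumed to be the 19 × 19 network matrix, however assembled.

% Sketch: optimal SOR parameter from the Jacobi spectral radius,
% assuming the classical formula omega_opt = 2/(1+sqrt(1-rhoJ^2)).
D  = diag(diag(Y));
BJ = eye(size(Y)) - D\Y;            % Jacobi iteration matrix
rhoJ = max(abs(eig(full(BJ))));
omega_opt = 2/(1 + sqrt(1 - rhoJ^2))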
We have also considered the solution of the system at hand by the conju-
gate gradient (CG) method using the Cholesky version of the ILU(0) and

FIGURE 4.13. Sparsity pattern of Y for n = 3 (left) and spectral condition
number of Y as a function of n (right)

FIGURE 4.14. Convergence history of several non-preconditioned iterative methods


MILU(0) preconditioners, where drop tolerances equal to ε = 10^{-2}, 10^{-3}
have been chosen for the MILU(0) preconditioner (see Section 4.3.2). Calculations
with both preconditioners have been done using the MATLAB
functions cholinc and michol. Table 4.2 shows the convergence iterations
of the method for n = 5, 10, 20, 40, 80, 160 and for the considered values of
ε. We report in the second column the number of nonzero entries in the
Cholesky factor of matrix Y, in the third column the number of iterations
for the CG method without preconditioning to converge, while the columns
ICh(0) and MICh(0) with ε = 10^{-2} and ε = 10^{-3} show the same information
for the CG method using the incomplete Cholesky and modified
incomplete Cholesky preconditioners, respectively.
The entries in the table are the number of iterations to converge and the
numbers in brackets are the nonzero entries of the L-factor of the corresponding
preconditioners. Notice the decrease of the iterations as ε decreases,
as expected. Notice also the increase of the number of iterations
as the size of the problem increases.

   n      nz    CG    ICh(0)      MICh(0) ε=10^{-2}   MICh(0) ε=10^{-3}
   5     114    10     9 (54)        6 (78)              4 (98)
  10     429    20    15 (114)       7 (173)             5 (233)
  20    1659    40    23 (234)      10 (363)             6 (503)
  40    6519    80    36 (474)      14 (743)             7 (1043)
  80   25839   160    62 (954)      21 (1503)           10 (2123)
 160  102879   320   110 (1914)     34 (3023)           14 (4283)

TABLE 4.2. Convergence iterations for the preconditioned CG method
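The experiment of Table 4.2 can be approximated in current MATLAB releases, where cholinc has been superseded by ichol (the modified variant is selected through the michol option); the sketch below rests on these assumptions and is not guaranteed to reproduce the iteration counts of the table exactly. Y and I denote the (sparse) network matrix and load vector for the chosen n.

% Sketch: CG with incomplete Cholesky preconditioners (modern MATLAB).
Ys = sparse(Y); tol = 1e-10; maxit = 1000;
[x0,f0,r0,it0] = pcg(Ys, I, tol, maxit);                   % no preconditioning
L1 = ichol(Ys);                                            % ICh(0), zero fill-in
[x1,f1,r1,it1] = pcg(Ys, I, tol, maxit, L1, L1');
opts = struct('type','ict','droptol',1e-2,'michol','on');  % modified IC with drop tolerance
L2 = ichol(Ys, opts);
[x2,f2,r2,it2] = pcg(Ys, I, tol, maxit, L2, L2');
fprintf('iterations: CG %d, ICh(0) %d, MICh %d\n', it0, it1, it2);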


4.7.2 Finite Difference Analysis of Beam Bending
Consider the beam clamped at the endpoints that is drawn in Figure 4.15
(left). The structure, of length L, is subject to a distributed load P, varying
along the free coordinate x and expressed in [Kg m^{-1}]. We assume henceforth
that the beam has a uniform rectangular section, of width r and depth
s, moment of inertia J = rs^3/12 and Young's modulus E, expressed in
[m^4] and [Kg m^{-2}], respectively.




FIGURE 4.15. Clamped beam (left); convergence histories for the preconditioned
conjugate gradient method in the solution of system (4.76) (right)

The transverse bending of the beam, under the assumption of small displacements,
is governed by the following fourth-order differential equation

   (EJ u'')''(x) = P(x),   0 < x < L,                          (4.74)

where u = u(x) denotes the vertical displacement. The following boundary
conditions (at the endpoints x = 0 and x = L)

   u(0) = u(L) = 0,   u'(0) = u'(L) = 0,                       (4.75)

model the effect of the two clampings (vanishing displacements and rotations).
To solve numerically the boundary-value problem (4.74)-(4.75), we
use the finite difference method (see Section 10.10.1 and Exercise 11 of
Chapter 12).

With this aim, let us introduce the discretization nodes x_j = jh, with
h = L/N_h and j = 0, ..., N_h, and substitute at each node x_j the fourth-order
derivative with an approximation through centered finite differences.
Letting f(x) = P(x)/(EJ), f_j = f(x_j) and denoting by η_j the (approximate)
nodal displacement of the beam at node x_j, the finite difference
discretization of (4.74)-(4.75) is

   η_{j−2} − 4η_{j−1} + 6η_j − 4η_{j+1} + η_{j+2} = h^4 f_j,   ∀j = 2, ..., N_h − 2,
                                                                (4.76)
   η_0 = η_1 = η_{N_h−1} = η_{N_h} = 0.




The null displacement boundary conditions in (4.76) that have been imposed
at the first and the last two nodes of the grid require that N_h ≥ 4.
Notice that a fourth-order scheme has been used to approximate the fourth-order
derivative, while, for the sake of simplicity, a first-order approximation
has been employed to deal with the boundary conditions (see Section
10.10.1).
The N_h − 3 discrete equations (4.76) yield a linear system of the form
Ax = b where the unknown vector x ∈ R^{N_h−3} and the load vector
b ∈ R^{N_h−3} are given respectively by x = (η_2, η_3, ..., η_{N_h−2})^T and b =
(f_2, f_3, ..., f_{N_h−2})^T, while the coefficient matrix A ∈ R^{(N_h−3)×(N_h−3)} is
pentadiagonal and symmetric, given by A = pentadiag_{N_h−3}(1, −4, 6, −4, 1).
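A minimal MATLAB sketch of the assembly of this system is given below; the data L, N_h and the load f are illustrative assumptions, and the factor h^4 appearing in (4.76) is kept on the right-hand side.

% Sketch: finite difference system (4.76) for the clamped beam.
L = 1; Nh = 100; h = L/Nh;            % illustrative data
f = @(x) ones(size(x));               % P(x)/(EJ), here a constant load
n = Nh - 3;                           % unknowns eta_2,...,eta_{Nh-2}
e = ones(n,1);
A = spdiags([e -4*e 6*e -4*e e], -2:2, n, n);   % pentadiag(1,-4,6,-4,1)
xj = h*(2:Nh-2)';                     % nodes carrying the unknowns
b = h^4*f(xj);                        % right-hand side with the h^4 factor
eta = A\b;                            % nodal displacements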
The matrix A is symmetric and positive definite. Therefore, to solve the
system Ax = b, the SSOR-preconditioned conjugate gradient method (see
(4.21)) and the Cholesky factorization method have been employed.
In the remainder of the section, the two methods are identified by the
symbols (CG) and (CH).
The convergence histories of CG are reported in Figure 4.15 (right),
where the sequences ‖r^(k)‖_2/‖b‖_2, for the values n = 10, 60, 110, are
plotted, r^(k) = b − Ax^(k) being the residual at the k-th step. The results
have been obtained using Program 20, with toll=10^{-15} and ω = 1.8 in
(4.22). The initial vector x^(0) has been set equal to the null vector.
As a comment to the graphs, it is worth noting that CG has required 7, 33
and 64 iterations to converge, respectively, with a maximum absolute error
of 5·10^{-15} with respect to the solution produced by CH. The latter has an
overall computational cost of 136, 1286 and 2436 flops respectively, to be
compared with the corresponding 3117, 149424 and 541647 flops of method
CG. As for the performance of the SSOR preconditioner, we remark that
the spectral condition number of matrix A is equal to 192, 3.8·10^5 and
4.5·10^6, respectively, while the corresponding values in the preconditioned
case are 65, 1.2·10^4 and 1.3·10^5.

4.8 Exercises
1. The spectral radius of the matrix

      B = [ a  4
            0  a ]

   is ρ(B) = a. Check that if 0 < a < 1, then ρ(B) < 1, while ‖B^m‖_2^{1/m} can
   be greater than 1.
2. Let A ∈ R^{n×n} be a strictly diagonally dominant matrix by rows. Show
that the Gauss-Seidel method for the solution of the linear system (3.2) is
convergent.
3. Check that the matrix A = tridiag(−1, α, −1), with α ∈ R, has eigenvalues
   given by

      λ_j = α − 2 cos(jθ),   j = 1, ..., n

   where θ = π/(n + 1) and the corresponding eigenvectors are

      q_j = [sin(jθ), sin(2jθ), ..., sin(njθ)]^T.

   Under which conditions on α is the matrix positive definite?
   [Solution : α ≥ 2.]
4. Consider the pentadiagonal matrix A = pentadiag_n(−1, −1, 10, −1, −1).
   Assume n = 10 and A = M + N + D, with D = diag(8, ..., 8) ∈ R^{10×10},
   M = pentadiag_10(−1, −1, 1, 0, 0) and N = M^T. To solve Ax = b, analyze
   the convergence of the following iterative methods

   (a)  (M + D)x^(k+1) = −Nx^(k) + b,

   (b)  Dx^(k+1) = −(M + N)x^(k) + b,

   (c)  (M + N)x^(k+1) = −Dx^(k) + b.

   [Solution : denoting respectively by ρ_a, ρ_b and ρ_c the spectral radii of the
   iteration matrices of the three methods, we have ρ_a = 0.1450, ρ_b = 0.5
   and ρ_c = 12.2870, which implies convergence for methods (a) and (b) and
   divergence for method (c); a numerical check is sketched at the end of this
   section.]
5. For the solution of the linear system Ax = b with

      A = [ 1  2        b = [ 3
            2  3 ],           5 ],

   consider the following iterative method

      x^(k+1) = B(θ)x^(k) + g(θ),   k ≥ 0,   with x^(0) given,

   where θ is a real parameter and

      B(θ) = (1/4) [  2θ^2 + 2θ + 1   −2θ^2 + 2θ + 1
                     −2θ^2 + 2θ + 1    2θ^2 + 2θ + 1 ],
      g(θ) = [ 1/2 − θ
               1/2 − θ ].

Check that the method is consistent ∀ θ ∈ R. Then, determine the values
of θ for which the method is convergent and compute the optimal value
of θ (i.e., the value of the parameter for which the convergence rate is
maximum).
   [Solution : the method is convergent iff −1 < θ < 1/2 and the convergence
   rate is maximum if θ = (1 − √3)/2.]
6. To solve the following block linear system

      [ A_1  B  ] [ x ]   [ b_1 ]
      [ B   A_2 ] [ y ] = [ b_2 ],

   consider the two methods

   (1)  A_1 x^(k+1) + By^(k) = b_1,    Bx^(k) + A_2 y^(k+1) = b_2;

   (2)  A_1 x^(k+1) + By^(k) = b_1,    Bx^(k+1) + A_2 y^(k+1) = b_2.

   Find sufficient conditions in order for the two schemes to be convergent for
   any choice of the initial data x^(0), y^(0).
   [Solution : method (1) is a decoupled system in the unknowns x^(k+1) and
   y^(k+1). Assuming that A_1 and A_2 are invertible, method (1) converges if
   ρ(A_1^{-1}B) < 1 and ρ(A_2^{-1}B) < 1. In the case of method (2) we have a coupled
   system to solve at each step in the unknowns x^(k+1) and y^(k+1). Solving
   formally the first equation with respect to x^(k+1) (which requires A_1 to be
   invertible) and substituting into the second one, we see that method (2) is
   convergent if ρ(A_2^{-1}BA_1^{-1}B) < 1 (again A_2 must be invertible).]
7. Consider the linear system Ax = b with

      A = [ 62 24  1  8 15          b = [ 110
            23 50  7 14 16                110
             4  6 58 20 22                110
            10 12 19 66  3                110
            11 18 25  2 54 ],             110 ].

   (1) Check if the Jacobi and Gauss-Seidel methods can be applied to solve
   the system. (2) Check if the stationary Richardson method with optimal
   parameter can be applied with P = I and P = D, where D is the diagonal
   part of A, and compute the corresponding values of α_opt and ρ_opt.
   [Solution : (1): matrix A is neither diagonally dominant nor symmetric
   positive definite, so that we must compute the spectral radii of the iteration
   matrices of the Jacobi and Gauss-Seidel methods to verify if they are
   convergent. It turns out that ρ_J = 0.9280 and ρ_GS = 0.3066, which implies
   convergence for both methods. (2): in the case P = I all the eigenvalues
   of A are positive, so that the Richardson method can be applied, yielding
   α_opt = 0.015 and ρ_opt = 0.6452. If P = D the method is still applicable
   and α_opt = 0.8510, ρ_opt = 0.6407. (A numerical check is sketched at the
   end of this section.)]
8. Consider the linear system Ax = b with

      A = [ 5  7  6  5          b = [ 23
            7 10  8  7                32
            6  8 10  9                33
            5  7  9 10 ],             31 ].

   Analyze the convergence properties of the Jacobi and Gauss-Seidel methods
   applied to the system above in their point and block forms (for a 2×2 block
   partition of A).
   [Solution : both methods are convergent, the block form being the faster
   one. Moreover, ρ^2(B_J) = ρ(B_GS).]
9. To solve the linear system Ax = b, consider the iterative method (4.6),
   with P = D + ωF and N = −βF − E, ω and β being real numbers. Check
   that the method is consistent only if β = 1 − ω. In such a case, express
   the eigenvalues of the iteration matrix as a function of ω and determine for
   which values of ω the method is convergent, as well as the value of ω_opt,
   assuming that A = tridiag_10(−1, 2, −1).
   [Hint : Take advantage of the result in Exercise 3.]
10. Let A ∈ R^{n×n} be such that A = (1+ω)P − (N+ωP), with P^{-1}N nonsingular
    and with real eigenvalues 1 > λ_1 ≥ λ_2 ≥ ... ≥ λ_n. Find the values of ω ∈ R
    for which the following iterative method

       (1 + ω)Px^(k+1) = (N + ωP)x^(k) + b,   k ≥ 0,

    converges ∀x^(0) to the solution of the linear system (3.2). Determine also
    the value of ω for which the convergence rate is maximum.
    [Solution : ω > −(1 + λ_n)/2; ω_opt = −(λ_1 + λ_n)/2.]
11. Consider the linear system Ax = b with

       A = [ 3  2         b = [  2
             2  6 ],            −8 ].

    Write the associated functional Φ(x) and give a graphical interpretation of
    the solution of the linear system. Perform some iterations of the gradient
    method, after proving convergence for it.
12. Check that in the gradient method x(k+2) is not an optimal direction with
respect to r(k) .
13. Show that the coefficients α_k and β_k in the conjugate gradient method can
    be written in the alternative form (4.45).
    [Solution: notice that Ap^(k) = (r^(k) − r^(k+1))/α_k and thus (Ap^(k))^T r^(k+1) =
    −‖r^(k+1)‖_2^2/α_k. Moreover, α_k (Ap^(k))^T p^(k) = −‖r^(k)‖_2^2.]

14. Prove the three-term recursive relation (4.46) for the residual in the conjugate
    gradient method.
    [Solution: subtract from both sides of Ap^(k) = (r^(k) − r^(k+1))/α_k the quantity
    (β_{k−1}/α_k) r^(k) and recall that Ap^(k) = Ar^(k) − β_{k−1} Ap^(k−1). Then, expressing
    the residual r^(k) as a function of r^(k−1), one immediately gets the
    desired relation.]
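As referenced in the solutions of Exercises 4 and 7, the claimed spectral radii and Richardson parameters can be checked numerically. The MATLAB sketch below is only a verification aid under the matrices written above; note that the sign of an iteration matrix does not affect its spectral radius.

% Sketch: numerical check of the solutions of Exercises 4 and 7.
rho = @(T) max(abs(eig(full(T))));
% Exercise 4
n = 10; e = ones(n,1);
M = full(spdiags([-e -e e], -2:0, n, n)); N = M'; D = 8*eye(n);
rho_a = rho((M+D)\N)                  % expected 0.1450
rho_b = rho(D\(M+N))                  % expected 0.5
rho_c = rho((M+N)\D)                  % expected 12.2870
% Exercise 7
A = [62 24 1 8 15; 23 50 7 14 16; 4 6 58 20 22; ...
     10 12 19 66 3; 11 18 25 2 54];
DA = diag(diag(A)); EA = -tril(A,-1); FA = -triu(A,1);
rho_J  = rho(eye(5) - DA\A)           % Jacobi, expected 0.9280
rho_GS = rho((DA-EA)\FA)              % Gauss-Seidel, expected 0.3066
lam = sort(real(eig(A)));             % eigenvalues are real and positive for this A
alpha_opt = 2/(min(lam)+max(lam))     % expected 0.015
rho_opt = (max(lam)-min(lam))/(max(lam)+min(lam))   % expected 0.6452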
5 Approximation of Eigenvalues and Eigenvectors




In this chapter we deal with approximations of the eigenvalues and eigenvectors
of a matrix A ∈ C^{n×n}. Two main classes of numerical methods
exist for this purpose: partial methods, which compute the extremal eigenvalues
of A (that is, those having maximum and minimum modulus), or
global methods, which approximate the whole spectrum of A.
It is worth noting that methods which are introduced to solve the matrix
eigenvalue problem are not necessarily suitable for calculating the matrix
eigenvectors. For example, the power method (a partial method, see Section
5.3) provides an approximation to a particular eigenvalue/eigenvector pair.
The QR method (a global method, see Section 5.5) instead computes the
real Schur form of A, a canonical form that displays all the eigenvalues of
A but not its eigenvectors. These eigenvectors can be computed, starting
from the real Schur form of A, with an extra amount of work, as described
in Section 5.8.2.
Finally, some ad hoc methods for dealing effectively with the special case
where A is a symmetric (n × n) matrix are considered in Section 5.10.



5.1 Geometrical Location of the Eigenvalues
Since the eigenvalues of A are the roots of the characteristic polynomial
p_A(λ) (see Section 1.7), iterative methods must be used for their approxi-
mation when n ≥ 5. Knowledge of eigenvalue location in the complex plane
can thus be helpful in accelerating the convergence of the process.

A first estimate is provided by Theorem 1.4,

   |λ| ≤ ‖A‖,   ∀λ ∈ σ(A),                                  (5.1)

for any consistent matrix norm ‖·‖. Inequality (5.1), which is often quite
rough, states that all the eigenvalues of A are contained in a circle of radius
R_A = ‖A‖ centered at the origin of the Gauss plane.
Another result is obtained by extending the Decomposition Property 1.23
to complex-valued matrices.

Theorem 5.1 If A ∈ C^{n×n}, let

   H = (A + A^H)/2   and   iS = (A − A^H)/2

be the hermitian and skew-hermitian parts of A, respectively, i being the
imaginary unit. For any λ ∈ σ(A)

   λ_min(H) ≤ Re(λ) ≤ λ_max(H),   λ_min(S) ≤ Im(λ) ≤ λ_max(S).   (5.2)

Proof. From the definition of H and S it follows that A = H + iS. Let u ∈ C^n,
‖u‖_2 = 1, be the eigenvector associated with the eigenvalue λ; the Rayleigh
quotient (introduced in Section 1.7) reads

   λ = u^H Au = u^H Hu + i u^H Su.                          (5.3)

Notice that both H and S are hermitian matrices, whilst iS is skew-hermitian.
Matrices H and S are thus unitarily similar to a real diagonal matrix (see Section
1.7), and therefore their eigenvalues are real. In such a case, (5.3) yields

   Re(λ) = u^H Hu,   Im(λ) = u^H Su,

from which (5.2) follows. □

An a priori bound for the eigenvalues of A is given by the following result.

Theorem 5.2 (of the Gershgorin circles) Let A ∈ C^{n×n}. Then

   σ(A) ⊆ S_R = ∪_{i=1}^{n} R_i,   R_i = { z ∈ C : |z − a_ii| ≤ Σ_{j=1, j≠i}^{n} |a_ij| }.   (5.4)

The sets R_i are called Gershgorin circles.
Proof. Let us decompose A as A = D + E, where D is the diagonal part of
A, whilst e_ii = 0 for i = 1, ..., n. For λ ∈ σ(A) (with λ ≠ a_ii, i = 1, ..., n),
let us introduce the matrix B_λ = A − λI = (D − λI) + E. Since B_λ is singular,
there exists a non-null vector x ∈ C^n such that B_λ x = 0. This means that
((D − λI) + E) x = 0, that is, passing to the ‖·‖_∞ norm,

   x = −(D − λI)^{-1}Ex,   ‖x‖_∞ ≤ ‖(D − λI)^{-1}E‖_∞ ‖x‖_∞,

and thus

   1 ≤ ‖(D − λI)^{-1}E‖_∞ = Σ_{j=1, j≠k}^{n} |e_kj| / |a_kk − λ| = Σ_{j=1, j≠k}^{n} |a_kj| / |a_kk − λ|   (5.5)

for a certain k, 1 ≤ k ≤ n. Inequality (5.5) implies λ ∈ R_k and thus (5.4). □

The bounds (5.4) ensure that any eigenvalue of A lies within the union
of the circles R_i. Moreover, since A and A^T share the same spectrum,
Theorem 5.2 also holds in the form

   σ(A) ⊆ S_C = ∪_{j=1}^{n} C_j,   C_j = { z ∈ C : |z − a_jj| ≤ Σ_{i=1, i≠j}^{n} |a_ij| }.   (5.6)

The circles R_i in the complex plane are called row circles, and C_j column
circles. The immediate consequence of (5.4) and (5.6) is the following.

Property 5.1 (First Gershgorin theorem) For a given matrix A ∈ C^{n×n},

   ∀λ ∈ σ(A),   λ ∈ S_R ∩ S_C.                              (5.7)

The following two location theorems can also be proved (see [Atk89], pp.
588-590 and [Hou75], pp. 66-67).

Property 5.2 (Second Gershgorin theorem) Let

   S_1 = ∪_{i=1}^{m} R_i,   S_2 = ∪_{i=m+1}^{n} R_i.

If S_1 ∩ S_2 = ∅, then S_1 contains exactly m eigenvalues of A, each one being
accounted for with its algebraic multiplicity, while the remaining eigenvalues
are contained in S_2.

Remark 5.1 Properties 5.1 and 5.2 do not exclude the possibility that
there exist circles containing no eigenvalues, as happens for the matrix in
Exercise 1.

Definition 5.1 A matrix A ∈ C^{n×n} is called reducible if there exists a
permutation matrix P such that

   PAP^T = [ B_11  B_12
             0     B_22 ],

where B_11 and B_22 are square matrices; A is irreducible if it is not reducible.

To check if a matrix is reducible, the oriented graph of the matrix can be
conveniently employed. Recall from Section 3.9 that the oriented graph of a
real matrix A is obtained by joining n points (called vertices of the graph)
P_1, ..., P_n through a line oriented from P_i to P_j if the corresponding
matrix entry a_ij ≠ 0. An oriented graph is strongly connected if for any
pair of distinct vertices P_i and P_j there exists an oriented path from P_i to
P_j. The following result holds (see [Var62] for the proof).

Property 5.3 A matrix A ∈ R^{n×n} is irreducible iff its oriented graph is
strongly connected.

Property 5.4 (Third Gershgorin theorem) Let A ∈ C^{n×n} be an irreducible
matrix. An eigenvalue λ ∈ σ(A) cannot lie on the boundary of S_R
unless it belongs to the boundary of every circle R_i, for i = 1, ..., n.

Example 5.1 Let us consider the matrix

   A = [ 10  2  3
         −1  2 −1
          0  1  3 ]

whose spectrum is (to four significant figures) σ(A) = {9.687, 2.656 ± i0.693}. The
following values of the norms of A: ‖A‖_1 = 11, ‖A‖_2 = 10.72, ‖A‖_∞ = 15 and
‖A‖_F = 11.36 can be used in the estimate (5.1). Estimate (5.2) provides instead
1.96 ≤ Re(λ(A)) ≤ 10.34, −2.34 ≤ Im(λ(A)) ≤ 2.34, while the row and column
circles are given respectively by R_1 = {z ∈ C : |z − 10| ≤ 5}, R_2 = {z ∈ C : |z − 2| ≤ 2},
R_3 = {z ∈ C : |z − 3| ≤ 1} and C_1 = {z ∈ C : |z − 10| ≤ 1}, C_2 = {z ∈ C : |z − 2| ≤ 3},
C_3 = {z ∈ C : |z − 3| ≤ 4}.
In Figure 5.1, for i = 1, 2, 3 the R_i and C_i circles and the intersection S_R ∩ S_C
(shaded areas) are drawn. In agreement with Property 5.2, we notice that an
eigenvalue is contained in C_1, which is disjoint from C_2 and C_3, while the remaining
eigenvalues, thanks to Property 5.1, lie within the set R_2 ∪ {C_3 ∩ R_1}. •
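The row and column circles of Example 5.1 are easily computed programmatically; the following sketch (an illustration, not part of the original example) lists centers and radii and compares them with the computed spectrum.

% Sketch: Gershgorin row and column circles for the matrix of Example 5.1.
A = [10 2 3; -1 2 -1; 0 1 3];
c     = diag(A);                      % centers (the same for rows and columns)
r_row = sum(abs(A),2) - abs(c);       % row radii:    sum_{j~=i} |a_ij|
r_col = sum(abs(A),1)' - abs(c);      % column radii: sum_{i~=j} |a_ij|
disp([c r_row r_col])                 % centers and radii
eig(A)                                % eigenvalues: all of them lie in S_R and S_C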




5.2 Stability and Conditioning Analysis
In this section we introduce some a priori and a posteriori estimates that
are relevant in the stability analysis of the matrix eigenvalue and eigenvec-
tor problem. The presentation follows the guidelines that have been traced
in Chapter 2.


5.2.1 A priori Estimates
Assume that A ∈ C^{n×n} is a diagonalizable matrix and denote by X =
(x_1, ..., x_n) ∈ C^{n×n} the matrix of its right eigenvectors, where x_k ∈ C^n


FIGURE 5.1. Row and column circles for matrix A in Example 5.1

for k = 1, ..., n, such that D = X^{-1}AX = diag(λ_1, ..., λ_n), λ_i being the
eigenvalues of A, i = 1, ..., n. Moreover, let E ∈ C^{n×n} be a perturbation
of A. The following theorem holds.

Theorem 5.3 (Bauer-Fike) Let µ be an eigenvalue of the matrix A + E ∈
C^{n×n}; then

   min_{λ∈σ(A)} |λ − µ| ≤ K_p(X) ‖E‖_p                       (5.8)

where ‖·‖_p is any matrix p-norm and K_p(X) = ‖X‖_p ‖X^{-1}‖_p is called the
condition number of the eigenvalue problem for matrix A.
Proof. We first notice that if µ ∈ σ(A) then (5.8) is trivially verified, since
‖X‖_p ‖X^{-1}‖_p ‖E‖_p ≥ 0. Let us thus assume henceforth that µ ∉ σ(A). From the
definition of eigenvalue it follows that the matrix (A + E − µI) is singular, which means
that, since X is invertible, the matrix X^{-1}(A + E − µI)X = D + X^{-1}EX − µI is
singular. Therefore, there exists a non-null vector x ∈ C^n such that

   ((D − µI) + X^{-1}EX) x = 0.

Since µ ∉ σ(A), the diagonal matrix (D − µI) is invertible and the previous
equation can be written in the form

   (I + (D − µI)^{-1}(X^{-1}EX)) x = 0.

Passing to the ‖·‖_p norm and proceeding as in the proof of Theorem 5.2, we get

   1 ≤ ‖(D − µI)^{-1}‖_p K_p(X) ‖E‖_p,

from which the estimate (5.8) follows, since

   ‖(D − µI)^{-1}‖_p = ( min_{λ∈σ(A)} |λ − µ| )^{-1}.   □

If A is a normal matrix, from the Schur decomposition theorem (see Section
1.8) it follows that the similarity transformation matrix X is unitary so that
K_p(X) = 1. This implies that

   ∀µ ∈ σ(A + E),   min_{λ∈σ(A)} |λ − µ| ≤ ‖E‖_p,            (5.9)

hence the eigenvalue problem is well-conditioned with respect to the absolute
error. This, however, does not prevent the matrix eigenvalue problem
from being affected by significant relative errors, especially when A has a
widely spread spectrum.

Example 5.2 Let us consider, for 1 ≤ n ≤ 10, the calculation of the eigenvalues
of the Hilbert matrix H_n ∈ R^{n×n} (see Example 3.2, Chapter 3). It is symmetric
(thus, in particular, normal) and exhibits, for n ≥ 4, a very large condition
number. Let E_n ∈ R^{n×n} be a matrix having constant entries equal to η = 10^{-3}.
We show in Table 5.1 the results of the computation of the minimum in (5.9),
taking p = 2 (that is, ‖E_n‖_2 = nη). Notice how the absolute error is decreasing,
since the eigenvalue of minimum modulus tends to zero, whilst the relative error
is increasing as the size n of the matrix increases, due to the higher sensitivity of
"small" eigenvalues with respect to rounding errors.


  n    Abs. Err.         Rel. Err.        ‖E_n‖_2     K_2(H_n)      K_2(H_n + E_n)
  1    1·10^{-3}         1·10^{-3}        1·10^{-3}   1             1
  2    1.677·10^{-4}     1.446·10^{-3}    2·10^{-3}   19.28         19.26
  4    5.080·10^{-7}     2.207·10^{-3}    4·10^{-3}   1.551·10^4    1.547·10^4
  8    1.156·10^{-12}    3.496·10^{-3}    8·10^{-3}   1.526·10^10   1.515·10^10
 10    1.355·10^{-15}    4.078·10^{-3}    1·10^{-2}   1.603·10^13   1.589·10^13

TABLE 5.1. Relative and absolute errors in the calculation of the eigenvalues of
the Hilbert matrix (using the MATLAB intrinsic function eig). "Abs. Err." and
"Rel. Err." denote respectively the absolute and relative errors (with respect to λ)



The Bauer-Fike theorem states that the matrix eigenvalue problem is well-
conditioned if A is a normal matrix. Failure to fulfil this property, however,
does not necessarily imply that A must exhibit a “strong” numerical sen-
sitivity to the computation of every one of its eigenvalues. In this respect,
the following result holds, which can be regarded as an a priori estimate of
the conditioning of the calculation of a particular eigenvalue of a matrix.

Theorem 5.4 Let A ∈ C^{n×n} be a diagonalizable matrix; let λ, x and y
be a simple eigenvalue of A and its associated right and left eigenvectors,
respectively, with ‖x‖_2 = ‖y‖_2 = 1. Moreover, for µ > 0, let A(µ) =
A + µE, with E ∈ C^{n×n} such that ‖E‖_2 = 1. Denoting by λ(µ) and x(µ) the
eigenvalue and the corresponding eigenvector of A(µ), such that λ(0) = λ
and x(0) = x,

   | ∂λ/∂µ (0) | ≤ 1 / |y^H x|.                              (5.10)

Proof. Let us first prove that y^H x ≠ 0. Setting Y = (y_1, ..., y_n) = (X^H)^{-1},
with y_k ∈ C^n for k = 1, ..., n, it follows that y_k^H A = λ_k y_k^H, i.e., the rows
of X^{-1} = Y^H are left eigenvectors of A. Then, since Y^H X = I, y_i^H x_j = δ_ij
for i, j = 1, ..., n, δ_ij being the Kronecker symbol. This result is equivalent to
saying that the eigenvectors {x} of A and the eigenvectors {y} of A^H form a
bi-orthogonal set (see (4.67)).
Let us now prove (5.10). Since the roots of the characteristic equation are
continuous functions of the coefficients of the characteristic polynomial associated
with A(µ), it follows that the eigenvalues of A(µ) are continuous functions of µ
(see, for instance, [Hen74], p. 281). Therefore, in a neighborhood of µ = 0,

   (A + µE) x(µ) = λ(µ) x(µ).

Differentiating the previous equation with respect to µ and setting µ = 0 yields

   A ∂x/∂µ (0) + Ex = ∂λ/∂µ (0) x + λ ∂x/∂µ (0),

from which, left-multiplying both sides by y^H and recalling that y^H is a left
eigenvector of A,

   ∂λ/∂µ (0) = y^H E x / (y^H x).

Using the Cauchy-Schwarz inequality gives the desired estimate (5.10). □

Notice that |y^H x| = |cos(θ_λ)|, where θ_λ is the angle between the eigenvectors
y and x (both having unit Euclidean norm). Therefore, if these two
vectors are almost orthogonal the computation of the eigenvalue λ turns
out to be ill-conditioned. The quantity

   κ(λ) = 1/|y^H x| = 1/|cos(θ_λ)|                           (5.11)

can thus be taken as the condition number of the eigenvalue λ. Obviously,
κ(λ) ≥ 1; when A is a normal matrix, since it is unitarily similar to a
diagonal matrix, the left and right eigenvectors y and x coincide, yielding
κ(λ) = 1/‖x‖_2^2 = 1.
Inequality (5.10) can be roughly interpreted as stating that perturbations
of the order of δµ in the entries of matrix A induce changes of the order of
δλ = δµ/|cos(θ_λ)| in the eigenvalue λ. If normal matrices are considered,
the calculation of λ is a well-conditioned problem; the case of a generic
nonsymmetric matrix A can be conveniently dealt with using methods based
on similarity transformations, as will be seen in later sections.
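As an illustration of (5.11), the condition number of each eigenvalue can be estimated from the right and left eigenvectors; the sketch below is illustrative and uses the columns of (X^{-1})^H as (unnormalized) left eigenvectors, consistently with the proof of Theorem 5.4.

% Sketch: condition numbers kappa(lambda) = 1/|y^H x| of the eigenvalues.
A = [10 2 3; -1 2 -1; 0 1 3];          % the matrix of Example 5.1, as a test case
[X,D] = eig(A);                        % right eigenvectors (columns of X)
Y = inv(X)';                           % columns of Y are left eigenvectors
kappa = zeros(size(A,1),1);
for j = 1:size(A,1)
  x = X(:,j)/norm(X(:,j)); y = Y(:,j)/norm(Y(:,j));
  kappa(j) = 1/abs(y'*x);              % equal to 1 for normal matrices
end
disp([diag(D) kappa])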

It is interesting to check that the conditioning of the matrix eigenvalue
problem remains unchanged if the transformation matrices are unitary. To
this end, let U ∈ C^{n×n} be a unitary matrix and let Ã = U^H AU. Also let
λ_j be an eigenvalue of A and denote by κ_j the condition number (5.11).
Moreover, let κ̃_j be the condition number of λ_j when it is regarded as an
eigenvalue of Ã. Finally, let {x_k}, {y_k} be the right and left eigenvectors of
A respectively. Clearly, {U^H x_k}, {U^H y_k} are the right and left eigenvectors
of Ã. Thus, for any j = 1, ..., n,

   κ̃_j = |y_j^H U U^H x_j|^{-1} = κ_j,

from which it follows that the stability of the computation of λ_j is not
affected by performing similarity transformations using unitary matrices.
It can also be checked that unitary transformation matrices do not change
the Euclidean length and the angles between vectors in C^n. Moreover, the
following a priori estimate holds (see [GL89], p. 317)

   fl(X^{-1}AX) = X^{-1}AX + E,   with ‖E‖_2 ≈ u K_2(X) ‖A‖_2,   (5.12)

where fl(M) is the machine representation of matrix M and u is the roundoff
unit (see Section 2.5). From (5.12) it follows that using nonunitary transformation
matrices in the eigenvalue computation can lead to an unstable
process with respect to rounding errors.
process with respect to rounding errors.
We conclude this section with a stability result for the approximation of
the eigenvector associated with a simple eigenvalue. Under the same assumptions
of Theorem 5.4, the following result holds (see for the proof,
[Atk89], Problem 6, pp. 649-650).

Property 5.5 The eigenvectors x_k and x_k(µ) of the matrices A and A(µ) =
A + µE, with ‖x_k(µ)‖_2 = ‖x_k‖_2 = 1 for k = 1, ..., n, satisfy

   ‖x_k(µ) − x_k‖_2 ≤ µ ‖E‖_2 / min_{j≠k} |λ_k − λ_j| + O(µ^2),   ∀k = 1, ..., n.

Analogous to (5.11), the quantity

   κ(x_k) = 1 / min_{j≠k} |λ_k − λ_j|

can be regarded as being the condition number of the eigenvector x_k. Computing
x_k might be an ill-conditioned operation if some eigenvalues λ_j are
"very close" to the eigenvalue λ_k associated with x_k.


5.2.2 A posteriori Estimates
The a priori estimates examined in the previous section characterize the
stability properties of the matrix eigenvalue and eigenvector problem. From

the implementation standpoint, it is also important to have at our disposal a posteriori
estimates that allow for a run-time control of the quality of the
approximation that is being constructed. Since the methods that will be
considered later are iterative processes, the results of this section can be
usefully employed to devise reliable stopping criteria for the latter.

Theorem 5.5 Let A ∈ C^{n×n} be an hermitian matrix and let (λ̂, x̂) be
the computed approximations of an eigenvalue/eigenvector pair (λ, x) of
A. Defining the residual as

   r̂ = Ax̂ − λ̂x̂,   x̂ ≠ 0,

it then follows that

   min_{λ_i∈σ(A)} |λ̂ − λ_i| ≤ ‖r̂‖_2 / ‖x̂‖_2.                (5.13)

Proof. Since A is hermitian, it admits a system of orthonormal eigenvectors {u_k}
which can be taken as a basis of C^n. In particular, x̂ = Σ_{i=1}^{n} α_i u_i with α_i = u_i^H x̂,
and thus r̂ = Σ_{i=1}^{n} α_i (λ_i − λ̂) u_i. As a consequence,

   ( ‖r̂‖_2 / ‖x̂‖_2 )^2 = Σ_{i=1}^{n} β_i (λ_i − λ̂)^2,   with β_i = |α_i|^2 / ( Σ_{j=1}^{n} |α_j|^2 ).   (5.14)

Since Σ_{i=1}^{n} β_i = 1, the inequality (5.13) immediately follows from (5.14). □

The estimate (5.13) ensures that a small absolute error corresponds to a
small relative residual in the computation of the eigenvalue of the matrix
A which is closest to λ̂.
Let us now consider the following a posteriori estimate for the eigenvector
x̂ (for the proof, see [IK66], pp. 142-143).

Property 5.6 Under the same assumptions of Theorem 5.5, suppose that
|λ_i − λ̂| ≤ ‖r̂‖_2 for i = 1, ..., m and that |λ_i − λ̂| ≥ δ > 0 for i =
m + 1, ..., n. Then

   d(x̂, U_m) ≤ ‖r̂‖_2 / δ                                    (5.15)

where d(x̂, U_m) is the Euclidean distance between x̂ and the space U_m generated
by the eigenvectors u_i, i = 1, ..., m, associated with the eigenvalues
λ_i of A.

Notice that the a posteriori estimate (5.15) ensures that a small absolute
error corresponds to a small residual in the approximation of the eigenvector
associated with the eigenvalue of A that is closest to λ̂, provided that
the eigenvalues of A are well-separated (that is, if δ is sufficiently large).
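A quick numerical illustration of (5.13): take a symmetric matrix, perturb an exact eigenpair, and compare the eigenvalue error with the relative residual. The test matrix and the size of the perturbation below are illustrative choices.

% Sketch: a posteriori estimate (5.13) for a hermitian (symmetric) matrix.
n = 20; A = full(gallery('tridiag',n,-1,2,-1));    % symmetric test matrix
[U,D] = eig(A); lam = diag(D);
xhat = U(:,3) + 1e-3*randn(n,1);                   % perturbed eigenvector
lamhat = (xhat'*A*xhat)/(xhat'*xhat);              % Rayleigh quotient
rhat = A*xhat - lamhat*xhat;
lhs = min(abs(lamhat - lam));                      % actual eigenvalue error
rhs = norm(rhat)/norm(xhat);                       % right-hand side of (5.13)
fprintf('error %e  <=  residual bound %e\n', lhs, rhs);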
In the general case of a nonhermitian matrix A, an a posteriori estimate
can be given for the eigenvalue λ̂ only when the matrix of the eigenvectors
of A is available. We have the following result (for the proof, we refer to
[IK66], p. 146).

Property 5.7 Let A ∈ C^{n×n} be a diagonalizable matrix, with matrix of
eigenvectors X = [x_1, ..., x_n]. If, for some µ > 0,

   ‖r̂‖_2 ≤ µ ‖x̂‖_2,

then

   min_{λ_i∈σ(A)} |λ̂ − λ_i| ≤ µ ‖X^{-1}‖_2 ‖X‖_2.

This estimate is of little practical use, since it requires the knowledge of all
the eigenvectors of A. Examples of a posteriori estimates that can actually
be implemented in a numerical algorithm will be provided in Sections 5.3.1
and 5.3.2.


5.3 The Power Method
The power method is very good at approximating the extremal eigenvalues
of the matrix, that is, the eigenvalues having largest and smallest modulus,
denoted by λ_1 and λ_n respectively, as well as their associated eigenvectors.
Solving such a problem is of great interest in several real-life applications
(geoseismics, machine and structural vibrations, electric network analysis,
quantum mechanics, ...) where the computation of λ_n (and its associated
eigenvector x_n) arises in the determination of the proper frequency (and
the corresponding fundamental mode) of a given physical system. We shall
come back to this point in Section 5.12.
Having approximations of λ_1 and λ_n can also be useful in the analysis of
numerical methods. For instance, if A is symmetric and positive definite,
one can compute the optimal value of the acceleration parameter of the
Richardson method and estimate its error reducing factor (see Chapter
4), as well as perform the stability analysis of discretization methods for
systems of ordinary differential equations (see Chapter 11).


5.3.1 Approximation of the Eigenvalue of Largest Modulus
Let A ∈ C^{n×n} be a diagonalizable matrix and let X ∈ C^{n×n} be the matrix of
its eigenvectors x_i, for i = 1, ..., n. Let us also suppose that the eigenvalues

of A are ordered as

   |λ_1| > |λ_2| ≥ |λ_3| ≥ ... ≥ |λ_n|,                       (5.16)

where λ_1 has algebraic multiplicity equal to 1. Under these assumptions,
λ_1 is called the dominant eigenvalue of matrix A.
Given an arbitrary initial vector q^(0) ∈ C^n of unit Euclidean norm, consider
for k = 1, 2, ... the following iteration based on the computation of powers
of matrices, commonly known as the power method:

   z^(k) = A q^(k−1),
   q^(k) = z^(k) / ‖z^(k)‖_2,                                 (5.17)
   ν^(k) = (q^(k))^H A q^(k).
Let us analyze the convergence properties of method (5.17). By induction
on k one can check that

   q^(k) = A^k q^(0) / ‖A^k q^(0)‖_2,   k ≥ 1.                (5.18)

This relation explains the role played by the powers of A in the method.
Because A is diagonalizable, its eigenvectors x_i form a basis of C^n; it is
thus possible to represent q^(0) as

   q^(0) = Σ_{i=1}^{n} α_i x_i,   α_i ∈ C,   i = 1, ..., n.   (5.19)

Moreover, since Ax_i = λ_i x_i, we have

   A^k q^(0) = α_1 λ_1^k ( x_1 + Σ_{i=2}^{n} (α_i/α_1)(λ_i/λ_1)^k x_i ),   k = 1, 2, ...   (5.20)

Since |λ_i/λ_1| < 1 for i = 2, ..., n, as k increases the vector A^k q^(0) (and
thus also q^(k), due to (5.18)) tends to assume an increasingly significant
component in the direction of the eigenvector x_1, while its components in
the other directions x_j decrease. Using (5.18) and (5.20), we get

   q^(k) = α_1 λ_1^k (x_1 + y^(k)) / ‖α_1 λ_1^k (x_1 + y^(k))‖_2 = µ_k (x_1 + y^(k)) / ‖x_1 + y^(k)‖_2,

where µ_k is the sign of α_1 λ_1^k and y^(k) denotes a vector that vanishes as
k → ∞.
As k → ∞, the vector q^(k) thus aligns itself along the direction of the eigenvector
x_1, and the following error estimate holds at each step k.

Theorem 5.6 Let A ∈ C^{n×n} be a diagonalizable matrix whose eigenvalues
satisfy (5.16). Assuming that α_1 ≠ 0, there exists a constant C > 0 such
that

   ‖q̃^(k) − x_1‖_2 ≤ C |λ_2/λ_1|^k,   k ≥ 1,                  (5.21)

where

   q̃^(k) = q^(k) ‖A^k q^(0)‖_2 / (α_1 λ_1^k) = x_1 + Σ_{i=2}^{n} (α_i/α_1)(λ_i/λ_1)^k x_i,   k = 1, 2, ...   (5.22)

Proof. Since A is diagonalizable, without losing generality, we can pick up the
nonsingular matrix X in such a way that its columns have unit Euclidean length,
that is ‖x_i‖_2 = 1 for i = 1, ..., n. From (5.20) it thus follows that

   ‖ x_1 + Σ_{i=2}^{n} (α_i/α_1)(λ_i/λ_1)^k x_i − x_1 ‖_2 = ‖ Σ_{i=2}^{n} (α_i/α_1)(λ_i/λ_1)^k x_i ‖_2
      ≤ ( Σ_{i=2}^{n} (α_i/α_1)^2 (λ_i/λ_1)^{2k} )^{1/2} ≤ |λ_2/λ_1|^k ( Σ_{i=2}^{n} (α_i/α_1)^2 )^{1/2},

that is (5.21) with C = ( Σ_{i=2}^{n} (α_i/α_1)^2 )^{1/2}. □

Estimate (5.21) expresses the convergence of the sequence q̃^(k) towards x_1.
Therefore the sequence of Rayleigh quotients

   ( (q̃^(k))^H A q̃^(k) ) / ‖q̃^(k)‖_2^2 = (q^(k))^H A q^(k) = ν^(k)

will converge to λ_1. As a consequence, lim_{k→∞} ν^(k) = λ_1, and the convergence
will be faster when the ratio |λ_2/λ_1| is smaller.
If the matrix A is real and symmetric it can be proved, always assuming
that α_1 ≠ 0, that (see [GL89], pp. 406-407)

   |λ_1 − ν^(k)| ≤ |λ_1 − λ_n| tan^2(θ_0) |λ_2/λ_1|^{2k},     (5.23)

where cos(θ_0) = |x_1^T q^(0)| ≠ 0. Inequality (5.23) outlines that the convergence
of the sequence ν^(k) to λ_1 is quadratic with respect to the ratio |λ_2/λ_1|
(we refer to Section 5.3.3 for numerical results).
We conclude the section by providing a stopping criterion for the iteration
(5.17). For this purpose, let us introduce the residual at step k

   r^(k) = A q^(k) − ν^(k) q^(k),   k ≥ 1,

and, for µ > 0, the matrix µE^(k) = −r^(k) (q^(k))^H ∈ C^{n×n} with ‖E^(k)‖_2 = 1.
Since

   µE^(k) q^(k) = −r^(k),   k ≥ 1,                            (5.24)

we obtain (A + µE^(k)) q^(k) = ν^(k) q^(k). As a result, at each step of the
power method ν^(k) is an eigenvalue of the perturbed matrix A + µE^(k).
From (5.24) and from definition (1.20) it also follows that µ = ‖r^(k)‖_2 for
k = 1, 2, .... Plugging this identity back into (5.10) and approximating the
partial derivative in (5.10) by the incremental ratio |λ_1 − ν^(k)|/µ, we get

   |λ_1 − ν^(k)| ≲ ‖r^(k)‖_2 / |cos(θ_λ)|,   k ≥ 1,           (5.25)

where θ_λ is the angle between the right and the left eigenvectors, x_1 and y_1,
associated with λ_1. Notice that, if A is an hermitian matrix, then cos(θ_λ) =
1, so that (5.25) yields an estimate which is analogous to (5.13).
In practice, in order to employ the estimate (5.25) it is necessary at each
step k to replace |cos(θ_λ)| with the modulus of the scalar product between
two approximations q^(k) and w^(k) of x_1 and y_1, computed by the power
method. The following a posteriori estimate is thus obtained

   |λ_1 − ν^(k)| ≲ ‖r^(k)‖_2 / |(w^(k))^H q^(k)|,   k ≥ 1.    (5.26)

Examples of applications of (5.26) will be provided in Section 5.3.3.


5.3.2 Inverse Iteration
In this section we look for an approximation of the eigenvalue of a matrix
A ∈ C^{n×n} which is closest to a given number µ ∈ C, where µ ∉ σ(A).
For this, the power iteration (5.17) can be applied to the matrix M_µ^{-1} =
(A − µI)^{-1}, yielding the so-called inverse iteration or inverse power method.
The number µ is called a shift.
The eigenvalues of M_µ^{-1} are ξ_i = (λ_i − µ)^{-1}; let us assume that there
exists an integer m such that

   |λ_m − µ| < |λ_i − µ|,   ∀i = 1, ..., n and i ≠ m.         (5.27)

This amounts to requiring that the eigenvalue λ_m which is closest to µ has
multiplicity equal to 1. Moreover, (5.27) shows that ξ_m is the eigenvalue of
M_µ^{-1} with largest modulus; in particular, if µ = 0, λ_m turns out to be the
eigenvalue of A with smallest modulus.
Given an arbitrary initial vector q^(0) ∈ C^n of unit Euclidean norm, for
k = 1, 2, ... the following sequence is constructed:

   (A − µI) z^(k) = q^(k−1),
   q^(k) = z^(k) / ‖z^(k)‖_2,                                 (5.28)
   σ^(k) = (q^(k))^H A q^(k).

Notice that the eigenvectors of M_µ are the same as those of A since M_µ =
X(Λ − µI_n)X^{-1}, where Λ = diag(λ_1, ..., λ_n). For this reason, the Rayleigh
quotient in (5.28) is computed directly on the matrix A (and not on M_µ^{-1}).
The main difference with respect to (5.17) is that at each step k a linear
system with coefficient matrix M_µ = A − µI must be solved. For numerical
convenience, the LU factorization of M_µ is computed once for all at k = 1,
so that at each step only two triangular systems are to be solved, with a
cost of the order of n^2 flops.
Although being more computationally expensive than the power method
(5.17), the inverse iteration has the advantage that it can converge to any
desired eigenvalue of A (namely, the one closest to the shift µ). Inverse iteration
is thus ideally suited for refining an initial estimate µ of an eigenvalue
of A, which can be obtained, for instance, by applying the localization techniques
introduced in Section 5.1. Inverse iteration can be also effectively
employed to compute the eigenvector associated with a given (approximate)
eigenvalue, as described in Section 5.8.1.

In view of the convergence analysis of the iteration (5.28) we assume that
A is diagonalizable, so that q^(0) can be represented in the form (5.19).
Proceeding in the same way as in the power method, we let

   q̃^(k) = x_m + Σ_{i=1, i≠m}^{n} (α_i/α_m)(ξ_i/ξ_m)^k x_i,

where x_i are the eigenvectors of M_µ^{-1} (and thus also of A), while α_i are as
in (5.19). As a consequence, recalling the definition of ξ_i and using (5.27),
we get

   lim_{k→∞} q̃^(k) = x_m,   lim_{k→∞} σ^(k) = λ_m.

Convergence will be faster when µ is closer to λ_m. Under the same assumptions
made for proving (5.26), the following a posteriori estimate can be
obtained for the approximation error on λ_m

   |λ_m − σ^(k)| ≲ ‖r^(k)‖_2 / |(w^(k))^H q^(k)|,   k ≥ 1,    (5.29)

where r^(k) = A q^(k) − σ^(k) q^(k) and w^(k) is the k-th iterate of the inverse
power method to approximate the left eigenvector associated with λ_m.


5.3.3 Implementation Issues
The convergence analysis of Section 5.3.1 shows that the effectiveness of
the power method strongly depends on the dominant eigenvalues being
well-separated (that is, |λ_2|/|λ_1| ≪ 1). Let us now analyze the behavior of
iteration (5.17) when two dominant eigenvalues of equal modulus exist (that
is, |λ_2| = |λ_1|). Three cases must be distinguished:

1. λ_2 = λ_1: the two dominant eigenvalues are coincident. The method
   is still convergent, since for k sufficiently large (5.20) yields

      A^k q^(0) ≃ λ_1^k (α_1 x_1 + α_2 x_2),

   which is an eigenvector of A. For k → ∞, the sequence q̃^(k) (after
   a suitable redefinition) converges to a vector lying in the subspace
   spanned by the eigenvectors x_1 and x_2, while the sequence ν^(k) still
   converges to λ_1.

2. λ_2 = −λ_1: the two dominant eigenvalues are opposite. In this case
   the eigenvalue of largest modulus can be approximated by applying the
   power method to the matrix A^2. Indeed, for i = 1, ..., n, λ_i(A^2) =
   [λ_i(A)]^2, so that λ_1^2 = λ_2^2 and the analysis falls into the previous case,
   where the matrix is now A^2 (a short sketch is given after this list).

3. λ_2 = λ̄_1: the two dominant eigenvalues are complex conjugate. Here,
   undamped oscillations arise in the sequence of vectors q^(k) and the
   power method is not convergent (see [Wil65], Chapter 9, Section 12).
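The workaround of case 2 can be illustrated with a small experiment (the matrix below is a made-up example with eigenvalues 3, −3 and 1, not taken from the text): the Rayleigh quotient produced by (5.17) applied to A oscillates, whereas the same iteration applied to A^2 converges to λ_1^2 = 9.

% Sketch: power method applied to A and to A^2 when lambda_2 = -lambda_1.
S = [1 1 0; 0 1 1; 1 0 1];
A = S*diag([3 -3 1])/S;               % eigenvalues 3, -3, 1 (illustrative example)
q = [1;1;1]/norm([1;1;1]); p = q;
for k = 1:30
  z = A*q;     q = z/norm(z); nu(k)  = q'*A*q;       % oscillates, no convergence to 3
  w = A*(A*p); p = w/norm(w); nu2(k) = p'*(A*(A*p)); % converges to 9 = lambda_1^2
end
disp([nu(end-1:end) nu2(end)])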


As for the computer implementation of (5.17), it is worth noting that normalizing
the vector q^(k) to 1 keeps away from overflow (when |λ_1| > 1) or
underflow (when |λ_1| < 1) in (5.20). We also point out that the requirement
α_1 ≠ 0 (which is a priori impossible to fulfil when no information about
the eigenvector x_1 is available) is not essential for the actual convergence
of the algorithm.
Indeed, although it can be proved that, working in exact arithmetic, the
sequence (5.17) converges to the pair (λ_2, x_2) if α_1 = 0 (see Exercise 10),
the arising of (unavoidable) rounding errors ensures that in practice the
vector q^(k) contains a non-null component also in the direction of x_1. This
allows the eigenvalue λ_1 to "show up" and the power method to quickly
converge to it.
An implementation of the power method is given in Program 26. Here
and in the following algorithm, the convergence check is based on the a
posteriori estimate (5.26).
Here and in the remainder of the chapter, the input data z0, toll and
nmax are the initial vector, the tolerance for the stopping test and the
maximum admissible number of iterations, respectively. In output, the vectors
nu1 and err contain the sequences {ν^(k)} and {‖r^(k)‖_2/|cos(θ_λ)|} (see
(5.26)), whilst x1 and niter are the approximation of the eigenvector x_1
and the number of iterations taken by the algorithm to converge, respectively.

Program 26 - powerm : Power method
function [nu1,x1,niter,err]=powerm(A,z0,toll,nmax)

q=z0/norm(z0); q2=q; err=[]; nu1=[]; res=toll+1; niter=0; z=A*q;
while (res >= toll & niter <= nmax)
  q=z/norm(z); z=A*q; lam=q'*z; x1=q;        % power iteration step and Rayleigh quotient
  z2=q2'*A; q2=z2/norm(z2); q2=q2';          % approximation of the left eigenvector
  y1=q2; costheta=abs(y1'*x1);
  if (costheta >= 5e-2),
    niter=niter+1; res=norm(z-lam*q)/costheta;   % a posteriori estimate (5.26)
    err=[err; res]; nu1=[nu1; lam];
  else
    disp(' Multiple eigenvalue '); break;
  end
end

A coding of the inverse power method is provided in Program 27. The
input parameter mu is the initial approximation of the eigenvalue. In output,
the vectors sigma and err contain the sequences {σ^(k)} and {‖r^(k)‖_2/|(w^(k))^H q^(k)|}
(see (5.29)). The LU factorization (with partial pivoting) of the matrix
M_µ is carried out using the MATLAB intrinsic function lu.

Program 27 - invpower : Inverse power method
function [sigma,x,niter,err]=invpower(A,z0,mu,toll,nmax)
n=max(size(A)); M=A-mu*eye(n); [L,U,P]=lu(M);   % LU factorization of M_mu
q=z0/norm(z0); q2=q'; err=[]; sigma=[]; res=toll+1; niter=0;
while (res >= toll & niter <= nmax)
  niter=niter+1; b=P*q; y=L\b; z=U\y;           % solve (A - mu*I) z = q
  q=z/norm(z); z=A*q; lam=q'*z;
  b=q2'; y=U'\b; w=L'\y;                        % step for the left eigenvector
  q2=(P'*w)'; q2=q2/norm(q2); costheta=abs(q2*q);
  if (costheta >= 5e-2),
    res=norm(z-lam*q)/costheta; err=[err; res]; sigma=[sigma; lam];
  else,
    disp(' Multiple eigenvalue '); break;
  end
  x=q;
end



Example 5.3 The matrix A in (5.30)

   A = [ 15 −2  2            [ −0.944  0.393  −0.088
          1 10 −3 ,      V =   −0.312  0.919   0.309        (5.30)
         −2  1  0 ]             0.112  0.013   0.947 ]

has the following eigenvalues (to five significant figures): λ_1 = 14.103, λ_2 = 10.385
and λ_3 = 0.512, while the corresponding eigenvectors are the column vectors of
matrix V.


FIGURE 5.2. Comparison between the a posteriori error estimate and the actual
absolute error for matrix A in (5.30) (left); convergence curves for the power
method applied to matrix A in (5.31) in its symmetric (S) and nonsymmetric
(NS) forms (right)

To approximate the pair (λ_1, x_1), we have run Program 26 with initial datum
z^(0) = (1, 1, 1)^T. After 71 iterations of the power method the absolute errors are
|λ_1 − ν^(71)| = 7.91·10^{-11} and ‖x_1^(71) − x_1‖_∞ = 1.42·10^{-11}.
In a second run, we have used z^(0) = x_2 + x_3 (notice that with this choice
α_1 = 0). After 215 iterations the absolute errors are |λ_1 − ν^(215)| = 4.26·10^{-14}