and ||x1^(k) − x1||∞ = 1.38 · 10^−… .

Figure 5.2 (left) shows the reliability of the a posteriori estimate (5.26). The sequences |λ1 − ν^(k)| (solid line) and the corresponding a posteriori estimates (5.26) (dashed line) are plotted as a function of the number of iterations (on the abscissae). Notice the excellent agreement between the two curves.

The symmetric matrix A in (5.31)

    A = [ 1  3  4          T = [ 8  1  6
          3  1  2 ,              3  5  7                    (5.31)
          4  2  1 ]              4  9  2 ]

has the following spectrum: λ1 = 7.047, λ2 = −3.1879 and λ3 = −0.8868 (to five significant figures).

It is interesting to compare the behaviour of the power method when computing λ1 for the symmetric matrix A and for its similar matrix M = T^−1 A T, where T is the nonsingular (and nonorthogonal) matrix in (5.31).

Running Program 26 with z^(0) = (1, 1, 1)^T, the power method converges to the eigenvalue λ1 in 18 and 30 iterations, for the matrices A and M, respectively. The sequence of absolute errors |λ1 − ν^(k)| is plotted in Figure 5.2 (right), where (S) and (NS) refer to the computations on A and M, respectively. Notice the rapid error reduction in the symmetric case, in accordance with the quadratic convergence properties of the power method (see Section 5.3.1).

We finally employ the inverse power method (5.28) to compute the eigenvalue of smallest module λ3 = 0.512 of the matrix A in (5.30). Running Program 27 with q^(0) = (1, 1, 1)^T/√3, the method converges in 9 iterations, with absolute errors |λ3 − σ^(9)| = 1.194 · 10^−12 and ||x3^(9) − x3||∞ = 4.59 · 10^−13. •

200 5. Approximation of Eigenvalues and Eigenvectors

5.4 The QR Iteration

In this section we present some iterative techniques for simultaneously approximating all the eigenvalues of a given matrix A. The basic idea consists of reducing A, by means of suitable similarity transformations, into a form for which the calculation of the eigenvalues is easier than on the starting matrix.

The problem would be satisfactorily solved if the unitary matrix U of the Schur decomposition theorem 1.5, such that T = U^H A U, T being upper triangular and with t_ii = λ_i(A) for i = 1, . . . , n, could be determined in a direct way, that is, with a finite number of operations. Unfortunately, it is a consequence of Abel's theorem that, for n ≥ 5, the matrix U cannot be computed in an elementary way (see Exercise 8). Thus, our problem can be solved only by resorting to iterative techniques.

The reference algorithm in this context is the QR iteration method, which is examined here only in the case of real matrices. (For some remarks on the extension of the algorithms to the complex case, see [GL89], Section 5.2.10 and [Dem97], Section 4.2.1.)

Let A ∈ R^(n×n); given an orthogonal matrix Q^(0) ∈ R^(n×n) and letting T^(0) = (Q^(0))^T A Q^(0), for k = 1, 2, . . . , until convergence, the QR iteration consists of:

    determine Q^(k), R^(k) such that
        Q^(k) R^(k) = T^(k−1)    (QR factorization);
    then, let                                                    (5.32)
        T^(k) = R^(k) Q^(k).

At each step k ≥ 1, the first phase of the iteration is the factorization of the matrix T^(k−1) into the product of an orthogonal matrix Q^(k) with an upper triangular matrix R^(k) (see Section 5.6.3). The second phase is a simple matrix product. Notice that

    T^(k) = R^(k) Q^(k) = (Q^(k))^T (Q^(k) R^(k)) Q^(k) = (Q^(k))^T T^(k−1) Q^(k)
          = (Q^(0) Q^(1) · · · Q^(k))^T A (Q^(0) Q^(1) · · · Q^(k)),    k ≥ 0,        (5.33)

i.e., every matrix T^(k) is orthogonally similar to A. This is particularly relevant for the stability of the method since, as shown in Section 5.2, the conditioning of the matrix eigenvalue problem for T^(k) is not worse than it is for A (see also [GL89], p. 360).
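A quick numerical check of (5.32)-(5.33) can be sketched in NumPy (the programs in this book are MATLAB; this fragment is only an illustration, and a symmetric test matrix is used so that the eigenvalues are real):

```python
import numpy as np

# One step of the QR iteration (5.32) and a check of the similarity
# property (5.33): the step is an orthogonal similarity, so the spectrum
# of T^(k) equals that of A.
rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
A = B + B.T                          # symmetric test matrix

T_prev = A.copy()                    # T^(0) = A, i.e. Q^(0) = I
Q, R = np.linalg.qr(T_prev)          # first phase: T^(k-1) = Q^(k) R^(k)
T = R @ Q                            # second phase: T^(k) = R^(k) Q^(k)

# (5.33): T^(k) = (Q^(k))^T T^(k-1) Q^(k), hence the eigenvalues agree
assert np.allclose(T, Q.T @ T_prev @ Q)
assert np.allclose(np.linalg.eigvalsh(T), np.linalg.eigvalsh(A))
```

Here np.linalg.qr plays the role of the factorization phase; any QR factorization routine would do.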

A basic implementation of the QR iteration (5.32), assuming Q^(0) = I_n, is examined in Section 5.5, while a more computationally efficient version, starting from T^(0) in upper Hessenberg form, is described in detail in Section 5.6.


If A has real eigenvalues, distinct in module, it will be seen in Section 5.5 that the limit of T^(k) is an upper triangular matrix (with the eigenvalues of A on the main diagonal). However, if A has complex eigenvalues, the limit of T^(k) cannot be an upper triangular matrix T; indeed, if it were, T would necessarily have real eigenvalues, although it is similar to A.

Failure to converge to a triangular matrix may also happen in more general situations, as addressed in Example 5.9.

For this reason, it is necessary to introduce variants of the QR iteration (5.32) based on deflation and shift techniques (see Section 5.7 and, for a more detailed discussion of the subject, [GL89], Chapter 7, [Dat95], Chapter 8 and [Dem97], Chapter 4).

These techniques allow T^(k) to converge to an upper quasi-triangular matrix, known as the real Schur decomposition of A, for which the following result holds (for the proof we refer to [GL89], pp. 341-342).

Property 5.8 Given a matrix A ∈ R^(n×n), there exists an orthogonal matrix Q ∈ R^(n×n) such that

    Q^T A Q = [ R11  R12  ...  R1m
                 0   R22  ...  R2m
                 :    :    ..   :                       (5.34)
                 0    0   ...  Rmm ],

where each block Rii is either a real number or a matrix of order 2 having complex conjugate eigenvalues, and

    Q = lim_{k→∞} Q^(0) Q^(1) · · · Q^(k),              (5.35)

Q^(k) being the orthogonal matrix generated by the k-th factorization step of the QR iteration (5.32).

The QR iteration can also be employed to compute all the eigenvectors of a given matrix. For this purpose, we describe in Section 5.8 two possible approaches, one based on the coupling between (5.32) and the inverse iteration (5.28), the other working on the real Schur form (5.34).

5.5 The Basic QR Iteration

In the basic version of the QR method, one sets Q^(0) = I_n, in such a way that T^(0) = A. At each step k ≥ 1, the QR factorization of the matrix T^(k−1) can be carried out using the modified Gram-Schmidt procedure introduced in Section 3.4.3, with a cost of the order of 2n^3 flops (for a full matrix A). The following convergence result holds (for the proof, see [GL89], Theorem 7.3.1, or [Wil65], pp. 517-519).

Property 5.9 (Convergence of QR method) Let A ∈ R^(n×n) be a matrix such that

    |λ1| > |λ2| > . . . > |λn|.

Then

    lim_{k→+∞} T^(k) = [ λ1  t12  ...  t1n
                          0   λ2  t23  t2n
                          :        ..   :              (5.36)
                          0   0   ...  λn ].

As for the convergence rate, we have

    |t^(k)_{i,i−1}| = O( |λi/λ_{i−1}|^k ),    i = 2, . . . , n,    for k → +∞.    (5.37)

Under the additional assumption that A is symmetric, the sequence {T^(k)} tends to a diagonal matrix.

If the eigenvalues of A, although distinct, are not well separated, it follows from (5.37) that the convergence of T^(k) towards a triangular matrix can be quite slow. With the aim of accelerating it, one can resort to the so-called shift technique, which will be addressed in Section 5.7.

Remark 5.2 It is always possible to reduce the matrix A into a triangular form by means of an iterative algorithm employing nonorthogonal similarity transformations. In such a case, the so-called LR iteration (also known as the Rutishauser method, [Rut58]) can be used, from which the QR method has actually been derived (see also [Fra61], [Wil65]). The LR iteration is based on the factorization of the matrix A into the product of two matrices L and R, respectively unit lower triangular and upper triangular, and on the (nonorthogonal) similarity transformation

    L^−1 A L = L^−1 (LR) L = RL.

The rare use of the LR method in practical computations is due to the loss of accuracy that can arise in the LR factorization because of the increase in module of the upper diagonal entries of R. This aspect, together with the details of the implementation of the algorithm and some comparisons with the QR method, is examined in [Wil65], Chapter 8.

5.6 The QR Method for Matrices in Hessenberg Form 203

Example 5.4 We apply the QR method to the symmetric matrix A ∈ R^(4×4) such that a_ii = 4, for i = 1, . . . , 4, and a_ij = 4 + i − j for i < j ≤ 4, whose eigenvalues are (to three significant figures) λ1 = 11.09, λ2 = 3.41, λ3 = 0.90 and λ4 = 0.59. After 20 iterations, we get

    T^(20) = [ 11.09          6.44·10^−10   −3.62·10^−15    9.49·10^−15
               6.47·10^−10    3.41           1.43·10^−11    4.60·10^−16
               1.74·10^−21    1.43·10^−11    0.90           1.16·10^−4
               2.32·10^−25    2.68·10^−15    1.16·10^−4     0.58        ].

Notice the "almost-diagonal" structure of the matrix T^(20) and, at the same time, the effect of rounding errors which slightly alter its expected symmetry. Good agreement can also be found between the under-diagonal entries and the estimate (5.37). •

A computer implementation of the basic QR iteration is given in Program 28. The QR factorization is executed using the modified Gram-Schmidt method (Program 8). The input parameter niter denotes the maximum admissible number of iterations, while the output parameters T, Q and R are the matrices T, Q and R in (5.32) after niter iterations of the QR procedure.

Program 28 - basicqr : Basic QR iteration

function [T,Q,R]=basicqr(A,niter)
T=A;
for i=1:niter
  [Q,R]=mod_grams(T);
  T=R*Q;
end
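Program 28 is MATLAB code. For experimentation outside MATLAB, a rough NumPy analogue might look as follows (a sketch only: np.linalg.qr stands in for the modified Gram-Schmidt factorization of Program 8, and the test matrix is the one of Example 5.4):

```python
import numpy as np

def basicqr(A, niter):
    # Basic QR iteration (5.32) with Q^(0) = I, as in Program 28
    T = np.asarray(A, dtype=float).copy()
    for _ in range(niter):
        Q, R = np.linalg.qr(T)    # T^(k-1) = Q^(k) R^(k)
        T = R @ Q                 # T^(k)   = R^(k) Q^(k)
    return T, Q, R

# Symmetric matrix of Example 5.4: T^(k) tends to a diagonal matrix
# (Property 5.9), with the eigenvalues appearing on the diagonal.
A = np.array([[4., 3, 2, 1], [3, 4, 3, 2], [2, 3, 4, 3], [1, 2, 3, 4]])
T, Q, R = basicqr(A, 40)
```

After 40 iterations the subdiagonal entries are negligible and the diagonal of T approximates the spectrum of A, in agreement with (5.36)-(5.37).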

5.6 The QR Method for Matrices in Hessenberg Form

The naive implementation of the QR method discussed in the previous section requires (for a full matrix) a computational effort of the order of n^3 flops per iteration. In this section we illustrate a variant of the QR iteration, known as the Hessenberg-QR iteration, with a greatly reduced computational cost. The idea consists of starting the iteration from a matrix T^(0) in upper Hessenberg form, that is, such that t^(0)_ij = 0 for i > j + 1. Indeed, it can be checked that with this choice the computation of T^(k) in (5.32) requires only an order of n^2 flops per iteration.


To achieve maximum efficiency and stability of the algorithm, suitable transformation matrices are employed. Precisely, the preliminary reduction of the matrix A into upper Hessenberg form is realized with Householder matrices, whilst the QR factorization of T^(k) is carried out using Givens matrices, instead of the modified Gram-Schmidt procedure introduced in Section 3.4.3.

We briefly describe Householder and Givens matrices in the next section, referring to Section 5.6.5 for their implementation. The algorithm and examples of the computation of the real Schur form of A starting from its upper Hessenberg form are then discussed in Section 5.6.4.

5.6.1 Householder and Givens Transformation Matrices

For any vector v ∈ R^n, let us introduce the orthogonal and symmetric matrix

    P = I − 2 v v^T / ||v||_2^2.                     (5.38)

Given a vector x ∈ R^n, the vector y = Px is the reflection of x with respect to the hyperplane π = span{v}^⊥ formed by the set of vectors that are orthogonal to v (see Figure 5.3, left). The matrix P and the vector v are called the Householder reflection matrix and the Householder vector, respectively.

FIGURE 5.3. Reflection across the hyperplane orthogonal to v (left); rotation by an angle θ in the plane (xi, xk) (right)

Householder matrices can be used to set to zero a block of components of a given vector x ∈ R^n. If, in particular, one would like to set to zero all the components of x except the m-th one, the Householder vector ought to be chosen as

    v = x ± ||x||_2 e_m,                             (5.39)


e_m being the m-th unit vector of R^n. The matrix P computed by (5.38) depends on the vector x itself, and it can be checked that

    Px = [0, 0, . . . , ±||x||_2, 0, . . . , 0]^T,    (5.40)

the nonzero entry being in the m-th position.

Example 5.5 Let x = [1, 1, 1, 1]^T and m = 3; then

    v = [ 1            P = (1/6) [  5  −1  −3  −1          Px = [  0
          1                        −1   5  −3  −1                  0
          3 ],                     −3  −3  −3  −3 ],              −2
          1                        −1  −1  −3   5                  0 ].  •

If, for some k ≥ 1, the first k components of x must remain unaltered, while the components from k + 2 on are to be set to zero, the Householder matrix P = P^(k) takes the following form

    P^(k) = [ I_k    0
               0    R_{n−k} ],    R_{n−k} = I_{n−k} − 2 w^(k) (w^(k))^T / ||w^(k)||_2^2.    (5.41)

As usual, I_k is the identity matrix of order k, while R_{n−k} is the elementary Householder matrix of order n − k associated with the reflection across the hyperplane orthogonal to the vector w^(k) ∈ R^(n−k). According to (5.39), the Householder vector is given by

    w^(k) = x^(n−k) ± ||x^(n−k)||_2 e_1^(n−k),       (5.42)

where x^(n−k) ∈ R^(n−k) is the vector formed by the last n − k components of x and e_1^(n−k) is the first unit vector of the canonical basis of R^(n−k). We notice that P^(k) is a function of x through w^(k). The criterion for fixing the sign in the definition of w^(k) will be discussed in Section 5.6.5.

The components of the transformed vector y = P^(k) x read

    y_j = x_j,                       j = 1, . . . , k,
    y_{k+1} = ±||x^(n−k)||_2,
    y_j = 0,                         j = k + 2, . . . , n.

The Householder matrices will be employed in Section 5.6.2 to carry out the reduction of a given matrix A to a matrix H^(0) in upper Hessenberg form. This is the first step for an efficient implementation of the QR iteration (5.32) with T^(0) = H^(0) (see Section 5.6).


Example 5.6 Let x = [1, 2, 3, 4, 5]^T and k = 1 (this means that we want to set to zero the components x_j, with j = 3, 4, 5). The matrix P^(1) and the transformed vector y = P^(1) x are given by

    P^(1) = [ 1   0       0       0       0            y = [ 1
              0   0.2722  0.4082  0.5443  0.6804             7.3485
              0   0.4082  0.7710 −0.3053 −0.3816 ],          0
              0   0.5443 −0.3053  0.5929 −0.5089             0
              0   0.6804 −0.3816 −0.5089  0.3639             0      ].  •
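The construction (5.41)-(5.42) can be sketched in a few lines of NumPy (an illustration, not the book's MATLAB code; the minus sign of (5.42) is chosen, as in Section 5.6.5, so that the component k+1 of y comes out positive, reproducing Example 5.6):

```python
import numpy as np

def householder_pk(x, k):
    # Build P^(k) of (5.41): identity in the first k rows/columns, an
    # elementary reflector R_{n-k} built from w^(k) of (5.42) elsewhere.
    x = np.asarray(x, dtype=float)
    n = len(x)
    xs = x[k:]                                   # x^(n-k): last n-k components
    w = xs.copy()
    w[0] -= np.linalg.norm(xs)                   # w^(k) = x^(n-k) - ||x^(n-k)||_2 e_1
    P = np.eye(n)
    P[k:, k:] -= 2.0 * np.outer(w, w) / (w @ w)  # block diag(I_k, R_{n-k})
    return P

# Example 5.6: x = (1, 2, 3, 4, 5)^T, k = 1
x = np.array([1., 2, 3, 4, 5])
P1 = householder_pk(x, 1)
y = P1 @ x      # (1, sqrt(54), 0, 0, 0), with sqrt(54) = 7.3485...
```

The first component of x is untouched, the second becomes ||x^(4)||_2 = √54, and the remaining three are set to zero, as in the example above.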

The Givens elementary matrices are orthogonal rotation matrices that allow for setting to zero, in a selective way, the entries of a vector or matrix. For a given pair of indices i and k, and a given angle θ, these matrices are defined as

    G(i, k, θ) = I_n − Y,                            (5.43)

where Y ∈ R^(n×n) is a null matrix except for the entries y_ii = y_kk = 1 − cos(θ), y_ik = −sin(θ) = −y_ki. A Givens matrix is of the form

    G(i, k, θ) = [ 1
                      ...
                          cos(θ)  · · ·   sin(θ)                  (row i)
                             :     ..       :
                         −sin(θ)  · · ·   cos(θ)                  (row k)
                                              ...
                                                  1 ],

where the cos(θ) and sin(θ) entries lie in rows and columns i and k, the remaining diagonal entries are equal to 1 and the remaining off-diagonal entries are zero.

For a given vector x ∈ R^n, the product y = (G(i, k, θ))^T x is equivalent to rotating x counterclockwise by an angle θ in the coordinate plane (x_i, x_k) (see Figure 5.3, right). After letting c = cos θ, s = sin θ, it follows that

    y_j = x_j,               j ≠ i, k,
    y_i = c x_i − s x_k,                             (5.44)
    y_k = s x_i + c x_k.


Let α_ik = sqrt(x_i^2 + x_k^2) and notice that if c and s satisfy c = x_i/α_ik, s = −x_k/α_ik (in such a case, θ = arctan(−x_k/x_i)), we get y_k = 0, y_i = α_ik and y_j = x_j for j ≠ i, k. Similarly, if c = x_k/α_ik, s = x_i/α_ik (that is, θ = arctan(x_i/x_k)), then y_i = 0, y_k = α_ik and y_j = x_j for j ≠ i, k.
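The first choice of c and s above can be checked directly (a NumPy sketch for illustration; np.hypot computes sqrt(x_i^2 + x_k^2) in an overflow-safe way):

```python
import numpy as np

def givens_cs(xi, xk):
    # c = xi/alpha, s = -xk/alpha, so that (5.44) zeroes the k-th component
    alpha = np.hypot(xi, xk)
    return xi / alpha, -xk / alpha

# Rotate in the coordinate plane (x_1, x_3) so as to zero x_3
x = np.array([3., 7., 4.])
c, s = givens_cs(x[0], x[2])
y = x.copy()
y[0] = c * x[0] - s * x[2]     # y_i = c x_i - s x_k = alpha_ik
y[2] = s * x[0] + c * x[2]     # y_k = s x_i + c x_k = 0
# y = (5, 7, 0): alpha_ik = sqrt(3^2 + 4^2) = 5
```

The untouched component x_2 = 7 carries over, as in (5.44) for j ≠ i, k.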

The Givens rotation matrices will be employed in Section 5.6.3 to carry out the QR factorization step in the algorithm (5.32), and in Section 5.10.1, where the Jacobi method for symmetric matrices is considered.

Remark 5.3 (Householder deflation for power iterations) The elementary Householder transformations can be conveniently employed to compute the first (largest or smallest) eigenvalues of a given matrix A ∈ R^(n×n). Assume that the eigenvalues of A are ordered as in (5.16) and suppose that the eigenvalue/eigenvector pair (λ1, x1) has been computed using the power method. Then the matrix A can be transformed into the following block form (see for the proof [Dat95], Theorem 8.5.4, p. 418)

    A1 = H A H = [ λ1   b^T
                    0   A2  ],

where b ∈ R^(n−1), H is the Householder matrix such that H x1 = α e1 for some α ∈ R, the matrix A2 ∈ R^((n−1)×(n−1)), and the eigenvalues of A2 are the same as those of A except for λ1. The matrix H can be computed using (5.38) with v = x1 ± ||x1||_2 e_1.

The deflation procedure consists of computing the second dominant (subdominant) eigenvalue of A by applying the power method to A2, provided that |λ2| ≠ |λ3|. Once λ2 is available, the corresponding eigenvector x2 can be computed by applying the inverse power iteration to the matrix A taking µ = λ2 (see Section 5.3.2), and proceeding in the same manner with the remaining eigenvalue/eigenvector pairs. An example of deflation will be presented in Section 5.12.2.
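The whole procedure can be sketched in NumPy (an illustration under assumptions: a small symmetric test matrix chosen here for convenience, a bare-bones power method in place of the book's Program 26):

```python
import numpy as np

def power_method(A, z, niter=200):
    # Plain power iteration; returns the Rayleigh quotient and the iterate
    for _ in range(niter):
        z = A @ z
        z /= np.linalg.norm(z)
    return z @ A @ z, z

A = np.array([[4., 1, 0], [1, 3, 1], [0, 1, 2]])   # eigenvalues 3 +/- sqrt(3), 3
lam1, x1 = power_method(A, np.ones(3))

# Deflation of Remark 5.3: H from (5.38) with v = x1 + ||x1||_2 e_1,
# so that H x1 is a multiple of e_1 and H A H is block triangular.
v = x1 + np.linalg.norm(x1) * np.eye(3)[0]
H = np.eye(3) - 2.0 * np.outer(v, v) / (v @ v)
A1 = H @ A @ H
A2 = A1[1:, 1:]                                    # spectrum of A minus lambda1
lam2, _ = power_method(A2, np.ones(2))
```

Running the power method on the deflated block A2 yields the subdominant eigenvalue λ2 = 3 of A, while λ1 = 3 + √3.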

5.6.2 Reducing a Matrix in Hessenberg Form

A given matrix A ∈ R^(n×n) can be transformed by similarity transformations into upper Hessenberg form with a cost of the order of n^3 flops. The algorithm takes n − 2 steps and the similarity transformation Q can be computed as the product of Householder matrices P^(1) · · · P^(n−2). For this reason, the reduction procedure is commonly known as the Householder method. Precisely, the k-th step consists of a similarity transformation of A through the Householder matrix P^(k), which aims at setting to zero the elements in positions k + 2, . . . , n of the k-th column of A, for k = 1, . . . , (n − 2) (see Section 5.6.1). For example, in the case n = 4 the reduction process yields

® ® ®

    [ • • • •             [ • • • •             [ • • • •
      • • • •    P^(1)      • • • •    P^(2)      • • • •
      • • • •    ---->      0 • • •    ---->      0 • • •
      • • • • ]             0 • • • ]             0 0 • • ],

having denoted by • the entries of the matrices that are a priori nonzero.

Given A^(0) = A, the method generates a sequence of matrices A^(k) that are orthogonally similar to A

    A^(k) = (P^(k))^T A^(k−1) P^(k) = (P^(k) · · · P^(1))^T A (P^(k) · · · P^(1))
          = (Q^(k))^T A Q^(k),    k ≥ 1.                    (5.45)

For any k ≥ 1 the matrix P^(k) is given by (5.41), where x is substituted by the k-th column vector of the matrix A^(k−1). From the definition (5.41) it is easy to check that the operation (P^(k))^T A^(k−1) leaves the first k rows of A^(k−1) unchanged, whilst (P^(k))^T A^(k−1) P^(k) = A^(k) does the same on the first k columns. After n − 2 steps of the Householder reduction, we obtain a matrix H = A^(n−2) in upper Hessenberg form.

Remark 5.4 (The symmetric case) If A is symmetric, the transformation (5.45) maintains this property. Indeed

    (A^(k))^T = ((Q^(k))^T A Q^(k))^T = A^(k),    ∀k ≥ 1,

so that H must be tridiagonal. Its eigenvalues can be efficiently computed using the method of Sturm sequences with a cost of the order of n flops, as will be addressed in Section 5.10.2.

A coding of the Householder reduction method is provided in Program 29. To compute the Householder vector, Program 32 is employed. In output, the two matrices H and Q, respectively in Hessenberg form and orthogonal, are such that H = Q^T A Q.

Program 29 - houshess : Hessenberg-Householder method

function [H,Q]=houshess(A)
n=max(size(A)); Q=eye(n); H=A;
for k=1:(n-2)
  [v,beta]=vhouse(H(k+1:n,k)); I=eye(k); N=zeros(k,n-k);
  m=length(v); R=eye(m)-beta*v*v'; H(k+1:n,k:n)=R*H(k+1:n,k:n);
  H(1:n,k+1:n)=H(1:n,k+1:n)*R; P=[I, N; N', R]; Q=Q*P;
end
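Program 29 is MATLAB; a self-contained NumPy sketch of the same reduction (5.45), with the reflector built directly instead of calling Program 32, might read:

```python
import numpy as np

def houshess(A):
    # Householder reduction to upper Hessenberg form: n-2 reflectors
    # zero the entries below the first subdiagonal, giving H = Q^T A Q.
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    Q = np.eye(n)
    H = A.copy()
    for k in range(n - 2):
        x = H[k + 1:, k]
        nx = np.linalg.norm(x)
        if nx == 0.0:
            continue                          # column already in Hessenberg shape
        w = x.copy()
        w[0] += (1.0 if x[0] >= 0 else -1.0) * nx   # sign chosen to avoid cancellation
        R = np.eye(n - k - 1) - 2.0 * np.outer(w, w) / (w @ w)
        H[k + 1:, k:] = R @ H[k + 1:, k:]     # premultiply by P^(k)
        H[:, k + 1:] = H[:, k + 1:] @ R       # postmultiply: similarity preserved
        Q[:, k + 1:] = Q[:, k + 1:] @ R       # accumulate Q = P^(1) ... P^(n-2)
    return H, Q

A = np.array([[4., 1, 2, 3], [1, 3, 0, 1], [2, 0, 5, 2], [3, 1, 2, 4]])
H, Q = houshess(A)
```

Since the test matrix is symmetric, H comes out tridiagonal, in agreement with Remark 5.4.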


The algorithm coded in Program 29 requires a cost of 10n^3/3 flops and is well-conditioned with respect to rounding errors. Indeed, the following estimate holds (see [Wil65], p. 351)

    H = Q^T (A + E) Q,    ||E||_F ≤ c n^2 u ||A||_F,    (5.46)

where H is the Hessenberg matrix computed by Program 29, Q is an orthogonal matrix, c is a constant, u is the roundoff unit and ||·||_F is the Frobenius norm (see (1.18)).

Example 5.7 Consider the reduction into upper Hessenberg form of the Hilbert matrix H4 ∈ R^(4×4). Since H4 is symmetric, its Hessenberg form should be a tridiagonal symmetric matrix. Program 29 yields the following results

    Q = [ 1.00   0      0       0           H = [ 1.00   0.65   0      0
          0      0.77  −0.61    0.20              0.65   0.65   0.06   0
          0      0.51   0.40   −0.76 ],           0      0.06   0.02   0.001
          0      0.38   0.69    0.61              0      0      0.001  0.0003 ].

The accuracy of the transformation procedure (5.45) can be measured by computing the ||·||_F norm of the difference between H and Q^T H4 Q. This yields ||H − Q^T H4 Q||_F = 3.38 · 10^−17, which confirms the stability estimate (5.46). •

5.6.3 QR Factorization of a Matrix in Hessenberg Form

In this section we explain how to efficiently implement the generic step of the QR iteration, starting from a matrix T^(0) = H^(0) in upper Hessenberg form.

For any k ≥ 1, the first phase consists of computing the QR factorization of H^(k−1) by means of n − 1 Givens rotations

    (Q^(k))^T H^(k−1) = (G_{n−1}^(k))^T · · · (G_1^(k))^T H^(k−1) = R^(k),    (5.47)

where, for any j = 1, . . . , n − 1, G_j^(k) = G(j, j + 1, θ_j)^(k) is, for any k ≥ 1, the j-th Givens rotation matrix (5.43), in which θ_j is chosen according to (5.44) in such a way that the entry of indices (j + 1, j) of the matrix (G_j^(k))^T · · · (G_1^(k))^T H^(k−1) is set equal to zero. The product (5.47) requires a computational cost of the order of 3n^2 flops.

The next step consists of completing the orthogonal similarity transformation

    H^(k) = R^(k) Q^(k) = R^(k) (G_1^(k) · · · G_{n−1}^(k)).    (5.48)

The orthogonal matrix Q^(k) = G_1^(k) · · · G_{n−1}^(k) is in upper Hessenberg form. Indeed, taking for instance n = 3, and recalling Section 5.6.1, we get

    Q^(k) = G_1^(k) G_2^(k) = [ • • 0     [ 1 0 0     [ • • •
                                • • 0   ·   0 • •   =   • • •
                                0 0 1 ]     0 • • ]     0 • • ].

Also (5.48) requires a cost of the order of 3n^2 operations, for an overall effort of the order of 6n^2 flops. In conclusion, performing the QR factorization with elementary Givens rotations on a starting matrix in upper Hessenberg form yields a reduction of the operation count of one order of magnitude with respect to the corresponding factorization with the modified Gram-Schmidt procedure of Section 5.5.
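One Hessenberg-QR step (5.47)-(5.48) can be sketched as follows (a NumPy illustration, not the book's MATLAB Programs 30-31; only the two rows or columns touched by each rotation are updated, which is where the O(n^2) cost comes from):

```python
import numpy as np

def hessenberg_qr_step(H):
    # Factor the upper Hessenberg H = Q R with n-1 Givens rotations (5.47),
    # then form H' = R Q (5.48), which is again upper Hessenberg.
    H = np.asarray(H, dtype=float)
    n = H.shape[0]
    Q = np.eye(n)
    R = H.copy()
    for j in range(n - 1):
        a, b = R[j, j], R[j + 1, j]
        alpha = np.hypot(a, b)
        if alpha == 0.0:
            continue
        c, s = a / alpha, -b / alpha
        G = np.array([[c, s], [-s, c]])          # rows/columns j, j+1 of G_j
        R[j:j + 2, j:] = G.T @ R[j:j + 2, j:]    # zeroes the (j+1, j) entry
        Q[:, j:j + 2] = Q[:, j:j + 2] @ G        # accumulate Q = G_1 ... G_{n-1}
    return R @ Q, Q, R

H0 = np.array([[3., 1, 2], [1, 2, 1], [0, 1, 4]])   # upper Hessenberg
H1, Q, R = hessenberg_qr_step(H0)
```

After the step, Q R reproduces H0, R is upper triangular, and the new iterate H1 = R Q keeps the upper Hessenberg structure.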

5.6.4 The Basic QR Iteration starting from Upper Hessenberg Form

A basic implementation of the QR iteration to generate the real Schur decomposition of a matrix A is given in Program 30.

This program uses Program 29 to reduce A to upper Hessenberg form; then each QR factorization step in (5.32) is carried out with Program 31, which utilizes Givens rotations. The overall efficiency of the algorithm is ensured by pre- and post-multiplying with Givens matrices as explained in Section 5.6.5, and by constructing the matrix Q^(k) = G_1^(k) · · · G_{n−1}^(k) in the function prodgiv, with a cost of n^2 − 2 flops and without explicitly forming the Givens matrices G_j^(k), for j = 1, . . . , n − 1.

As for the stability of the QR iteration with respect to rounding error propagation, it can be shown that the computed real Schur form T is orthogonally similar to a matrix "close" to A, i.e.

    T = Q^T (A + E) Q,

where Q is orthogonal and ||E||_2 ≈ u ||A||_2, u being the machine roundoff unit.

Program 30 returns in output, after niter iterations of the QR procedure, the matrices T, Q and R in (5.32).

Program 30 - hessqr : Hessenberg-QR method

function [T,Q,R]=hessqr(A,niter)
n=max(size(A));
[T,Qhess]=houshess(A);
for j=1:niter
  [Q,R,c,s]=qrgivens(T);
  T=R;
  for k=1:n-1
    T=gacol(T,c(k),s(k),1,k+1,k,k+1);
  end
end

Program 31 - givensqr : QR factorization with Givens rotations

function [Q,R,c,s]=qrgivens(H)
[m,n]=size(H);
for k=1:n-1
  [c(k),s(k)]=givcos(H(k,k),H(k+1,k));
  H=garow(H,c(k),s(k),k,k+1,k,n);
end
R=H; Q=prodgiv(c,s,n);

function Q=prodgiv(c,s,n)
n1=n-1; n2=n-2;
Q=eye(n); Q(n1,n1)=c(n1); Q(n,n)=c(n1);
Q(n1,n)=s(n1); Q(n,n1)=-s(n1);
for k=n2:-1:1
  k1=k+1; Q(k,k)=c(k); Q(k1,k)=-s(k);
  q=Q(k1,k1:n); Q(k,k1:n)=s(k)*q;
  Q(k1,k1:n)=c(k)*q;
end

Example 5.8 Consider the matrix A (already in Hessenberg form)

    A = [ 3  17  −37  18  −40
          1   0    0   0    0
          0   1    0   0    0
          0   0    1   0    0
          0   0    0   1    0 ].

To compute its eigenvalues, given by −4, ±i, 2 and 5, we apply the QR method and we compute the matrix T^(40) after 40 iterations of Program 30. Notice that the algorithm converges to the real Schur decomposition (5.34) of A, with three blocks R_ii of order 1 (i = 1, 2, 3) and with the block R44 = T^(40)(4:5, 4:5) having eigenvalues equal to ±i

    T^(40) = [ 4.9997  18.9739  −34.2570   32.8760  −28.4604
               0       −3.9997    6.7693   −6.4968    5.6216
               0        0         2        −1.4557    1.1562
               0        0         0         0.3129   −0.8709
               0        0         0         1.2607   −0.3129 ].

•

Example 5.9 Let us now employ the QR method to generate the real Schur decomposition of the matrix A below, after reducing it to upper Hessenberg form

    A = [ 17  24   1   8  15
          23   5   7  14  16
           4   6  13  20  22
          10  12  19  21   3
          11  18  25   2   9 ].

The eigenvalues of A are real and given (to four significant figures) by λ1 = 65, λ2,3 = ±21.28 and λ4,5 = ±13.13. After 40 iterations of Program 30, the computed matrix reads

    T^(40) = [ 65   0         0          0          0
                0  14.6701   14.2435     4.4848    −3.4375
                0  16.6735  −14.6701    −1.2159     2.0416
                0   0         0        −13.0293    −0.7643
                0   0         0         −3.3173    13.0293 ].

It is not upper triangular, but block upper triangular, with a diagonal block R11 = 65 and the two blocks

    R22 = [ 14.6701   14.2435          R33 = [ −13.0293   −0.7643
            16.6735  −14.6701 ],                −3.3173   13.0293 ],

having spectra given by σ(R22) = {λ2, λ3} and σ(R33) = {λ4, λ5}, respectively.

It is important to recognize that the matrix T^(40) is not the real Schur decomposition of A, but only a "cheating" version of it. In fact, in order for the QR method to converge to the real Schur decomposition of A, it is mandatory to resort to the shift techniques introduced in Section 5.7. •

5.6.5 Implementation of Transformation Matrices

In the definition (5.42) it is convenient to choose the minus sign, obtaining w^(k) = x^(n−k) − ||x^(n−k)||_2 e_1^(n−k), in such a way that the vector R_{n−k} x^(n−k) is a positive multiple of e_1^(n−k). If x_{k+1} is positive, in order to avoid numerical cancellation the computation can be rationalized as follows

    w_1^(k) = (x_{k+1}^2 − ||x^(n−k)||_2^2) / (x_{k+1} + ||x^(n−k)||_2)
            = −( x_{k+2}^2 + . . . + x_n^2 ) / (x_{k+1} + ||x^(n−k)||_2).


The construction of the Householder vector is performed by Program 32, which takes as input a vector p ∈ R^(n−k) (formerly, the vector x^(n−k)) and returns a vector q ∈ R^(n−k) (the Householder vector w^(k)), with a cost of the order of n flops.

If M ∈ R^(m×m) is the generic matrix to which the Householder matrix P (5.38) is applied (where I is the identity matrix of order m and v ∈ R^m), letting w = M^T v, then

    P M = M − β v w^T,    β = 2/||v||_2^2.            (5.49)

Therefore, performing the product PM amounts to a matrix-vector product (w = M^T v) plus an outer product of vectors (v w^T). The overall computational cost of the product PM is thus equal to 2(m^2 + m) flops. Similar considerations hold in the case where the product MP is to be computed; defining w = Mv, we get

    M P = M − β w v^T.                                (5.50)

Notice that (5.49) and (5.50) do not require the explicit construction of the matrix P. This reduces the computational cost to an order of m^2 flops, whilst executing the product PM without taking advantage of the special structure of P would increase the operation count to an order of m^3 flops.

Program 32 - vhouse : Construction of the Householder vector

function [v,beta]=vhouse(x)
n=length(x); x=x/norm(x); s=x(2:n)'*x(2:n); v=[1; x(2:n)];
if (s==0), beta=0;
else
  mu=sqrt(x(1)^2+s);
  if (x(1) <= 0), v(1)=x(1)-mu;
  else, v(1)=-s/(x(1)+mu); end
  beta=2*v(1)^2/(s+v(1)^2); v=v/v(1);
end
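For readers working outside MATLAB, a direct NumPy port of Program 32 (a sketch mirroring the book's vhouse interface, including the rationalized formula above for positive x(1)) is:

```python
import numpy as np

def vhouse(x):
    # Householder vector v and coefficient beta with P = I - beta v v^T;
    # the input is normalized first, so mu below equals 1.
    x = np.asarray(x, dtype=float)
    x = x / np.linalg.norm(x)
    s = x[1:] @ x[1:]
    v = np.concatenate(([1.0], x[1:]))
    if s == 0.0:
        return v, 0.0
    mu = np.sqrt(x[0] ** 2 + s)
    if x[0] <= 0.0:
        v[0] = x[0] - mu
    else:
        v[0] = -s / (x[0] + mu)          # rationalized form of x(1) - mu
    beta = 2.0 * v[0] ** 2 / (s + v[0] ** 2)
    v = v / v[0]
    return v, beta

x = np.array([2., 3, 4, 5])
v, beta = vhouse(x)
y = x / np.linalg.norm(x)
y = y - beta * v * (v @ y)               # P applied to x/||x||: a multiple of e_1
```

Applying P = I − β v v^T to the normalized input sends it to a multiple of e_1, as required of a Householder vector.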

Concerning the Givens rotation matrices, the computation of c and s is carried out as follows. Let i and k be two fixed indices and assume that the k-th component of a given vector x ∈ R^n must be set to zero. Letting r = sqrt(x_i^2 + x_k^2), relation (5.44) yields

    [ c  −s ] [ x_i ]   =   [ r ]
    [ s   c ] [ x_k ]       [ 0 ],                    (5.51)

hence there is no need to compute θ explicitly, nor to evaluate any trigonometric function.

Executing Program 33 to solve system (5.51) requires 5 flops, plus the evaluation of a square root. As already noticed in the case of Householder matrices, even for Givens rotations we don't have to explicitly compute the matrix G(i, k, θ) to perform its product with a given matrix M ∈ R^(m×m). For that purpose Programs 34 and 35 are used, both at the cost of 6m flops. Looking at the structure (5.43) of the matrix G(i, k, θ), it is clear that the first algorithm only modifies rows i and k of M, whilst the second one only changes columns i and k of M.

We conclude by noticing that the computation of the Householder vector v and of the Givens sine and cosine (c, s) are well-conditioned operations with respect to rounding errors (see [GL89], pp. 212-217 and the references therein).

The solution of system (5.51) is implemented in Program 33. The input parameters are the vector components x_i and x_k, whilst the output data are the Givens cosine and sine c and s.

Program 33 - givcos : Computation of Givens cosine and sine

function [c,s]=givcos(xi, xk)
if (xk==0), c=1; s=0;
else
  if abs(xk) > abs(xi)
    t=-xi/xk; s=1/sqrt(1+t^2); c=s*t;
  else
    t=-xk/xi; c=1/sqrt(1+t^2); s=c*t;
  end
end
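A one-to-one Python port of Program 33 (a sketch for illustration) makes the flop count easy to see: one division, two multiplications, one addition, one square root, and no trigonometric evaluations:

```python
import math

def givcos(xi, xk):
    # Givens cosine and sine solving (5.51); the branch on |xk| > |xi|
    # keeps the intermediate quotient t bounded by 1 in magnitude.
    if xk == 0.0:
        return 1.0, 0.0
    if abs(xk) > abs(xi):
        t = -xi / xk
        s = 1.0 / math.sqrt(1.0 + t * t)
        return s * t, s
    t = -xk / xi
    c = 1.0 / math.sqrt(1.0 + t * t)
    return c, c * t

c, s = givcos(3.0, 4.0)   # rotation zeroing the second component of (3, 4)
```

With these c and s, the second row of (5.51) gives s·x_i + c·x_k = 0, while the first produces ±r.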

Programs 34 and 35 compute G(i, k, θ)^T M and M G(i, k, θ), respectively. The input parameters c and s are the Givens cosine and sine. In Program 34, the indices i and k identify the rows of the matrix M that are affected by the update M ← G(i, k, θ)^T M, while j1 and j2 are the indices of the columns involved in the computation. Similarly, in Program 35, i and k identify the columns affected by the update M ← M G(i, k, θ), while j1 and j2 are the indices of the rows involved in the computation.

Program 34 - garow : Product G(i, k, θ)^T M

function [M]=garow(M,c,s,i,k,j1,j2)
for j=j1:j2
  t1=M(i,j);
  t2=M(k,j);
  M(i,j)=c*t1-s*t2;
  M(k,j)=s*t1+c*t2;
end

Program 35 - gacol : Product M G(i, k, θ)

function [M]=gacol(M,c,s,j1,j2,i,k)
for j=j1:j2
  t1=M(j,i);
  t2=M(j,k);
  M(j,i)=c*t1-s*t2;
  M(j,k)=s*t1+c*t2;
end

5.7 The QR Iteration with Shifting Techniques

Example 5.9 reveals that the QR iteration does not always converge to the real Schur form of a given matrix A. To make this happen, an effective approach consists of incorporating into the QR iteration (5.32) a shifting technique similar to the one introduced for the inverse iteration in Section 5.3.2. This leads to the QR method with single shift, described in Section 5.7.1, which is used to accelerate the convergence of the QR iteration when A has eigenvalues with moduli very close to each other.

In Section 5.7.2, a more sophisticated shifting technique is considered, which guarantees the convergence of the QR iteration to the (approximate) Schur form of the matrix A (see Property 5.8). The resulting method (known as the QR iteration with double shift) is the most popular version of the QR iteration (5.32) for solving the matrix eigenvalue problem, and is implemented in the MATLAB intrinsic function eig.

5.7.1 The QR Method with Single Shift

Given µ ∈ R, the shifted QR iteration is defined as follows. For k = 1, 2, . . . , until convergence:

    determine Q^(k), R^(k) such that
        Q^(k) R^(k) = T^(k−1) − µI    (QR factorization);
    then, let                                                    (5.52)
        T^(k) = R^(k) Q^(k) + µI,

where T^(0) = (Q^(0))^T A Q^(0) is in upper Hessenberg form. Since the QR factorization in (5.52) is performed on the shifted matrix T^(k−1) − µI, the scalar µ is called the shift. The sequence of matrices T^(k) generated by (5.52) is still similar to the initial matrix A, since for any k ≥ 1

    R^(k) Q^(k) + µI = (Q^(k))^T (Q^(k) R^(k)) Q^(k) + µ (Q^(k))^T Q^(k)
                     = (Q^(k))^T (Q^(k) R^(k) + µI) Q^(k) = (Q^(k))^T T^(k−1) Q^(k)
                     = (Q^(0) Q^(1) . . . Q^(k))^T A (Q^(0) Q^(1) . . . Q^(k)),    k ≥ 0.

Assume µ is fixed and that the eigenvalues of A are ordered in such a way that

    |λ1 − µ| ≥ |λ2 − µ| ≥ . . . ≥ |λn − µ|.

Then it can be shown that, for 1 < j ≤ n, the subdiagonal entry t^(k)_{j,j−1} tends to zero with a rate that is proportional to the ratio |(λj − µ)/(λ_{j−1} − µ)|^k. This extends the convergence result (5.37) to the shifted QR method (see [GL89], Sections 7.5.2 and 7.3).

The result above suggests that if µ is chosen in such a way that

|»n ’ µ| < |»i ’ µ|, i = 1, . . . , n ’ 1,

(k)

then the matrix entry tn,n’1 in the iteration (5.52) tends rapidly to zero

as k increases. (In the limit, if µ were equal to an eigenvalue of T(k) , that

(k) (k)

is of A, then tn,n’1 = 0 and tn,n = µ). In practice one takes

µ = t(k) , (5.53)

n,n

yielding the so-called QR iteration with single shift. Correspondingly, the convergence to zero of the sequence t^(k)_{n,n−1} is quadratic, in the sense that if |t^(k)_{n,n−1}| / ||T^(0)||_2 = η_k < 1 for some k ≥ 0, then |t^(k+1)_{n,n−1}| / ||T^(0)||_2 = O(η_k^2) (see [Dem97], pp. 161-163 and [GL89], pp. 354-355).

This can be profitably taken into account when programming the QR iteration with single shift, by monitoring the size of the subdiagonal entry |t^(k)_{n,n−1}|. In practice, t^(k)_{n,n−1} is set equal to zero if

    |t^(k)_{n,n−1}| ≤ ε (|t^(k)_{n−1,n−1}| + |t^(k)_{n,n}|),   k ≥ 0,      (5.54)

for a prescribed ε, in general of the order of the roundoff unit. (This convergence test is adopted in the library EISPACK.) If A is a Hessenberg matrix, when for a certain k the entry t^(k)_{n,n−1} is set to zero, t^(k)_{n,n} provides the desired approximation of λn. The QR iteration with shift can then continue on the matrix T^(k)(1 : n − 1, 1 : n − 1), and so on. This is a deflation algorithm (for another example see Remark 5.3).
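As an illustration, the shifted step (5.52)-(5.53) together with the deflation test (5.54) can be sketched in a few lines. The following Python/NumPy fragment is only a sketch (the symmetric test matrix and tolerance are chosen here for demonstration, and np.linalg.qr stands in for the Givens-based factorization of Program 31):

```python
import numpy as np

def qr_single_shift(T, eps=1e-12, itmax=200):
    """Deflation sweep of the QR iteration with single shift (5.52)-(5.53).
    T is assumed to be in upper Hessenberg form; the trailing subdiagonal
    entry is set to zero as soon as the test (5.54) is satisfied."""
    T = T.astype(float).copy()
    n = T.shape[0]
    it = 0
    for k in range(n - 1, 0, -1):      # deflate from the bottom up
        while abs(T[k, k - 1]) > eps * (abs(T[k, k]) + abs(T[k - 1, k - 1])):
            it += 1
            if it > itmax:
                return T, it
            mu = T[k, k]               # single shift (5.53)
            Q, R = np.linalg.qr(T[:k + 1, :k + 1] - mu * np.eye(k + 1))
            T[:k + 1, :k + 1] = R @ Q + mu * np.eye(k + 1)
        T[k, k - 1] = 0.0              # deflation, as in (5.54)
    return T, it

# Symmetric tridiagonal test matrix: real eigenvalues, single shift suffices.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
T, it = qr_single_shift(A)
print(np.sort(np.diag(T)))            # approximates the eigenvalues of A
```

After deflation the diagonal of T carries the approximate eigenvalues, since each deflated trailing block is decoupled from the leading one.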

Example 5.10 We consider again the matrix A of Example 5.9. Program 36, with toll equal to the roundoff unit, converges in 40 iterations to the following approximate real Schur form of A, which displays the correct eigenvalues of matrix A on its diagonal (to six significant figures)

    T^(40) = [ 65       0          0          0          0
                0     2.5888    −0.0445    −4.2959   −21.2768
                0       0      −13.1263    −4.0294   −13.079
                0       0          0        21.2768   −2.6197
                0       0          0          0        13.1263 ].


(k)

We also report in Table 5.2 the convergence rate p(k) of the sequence tn,n’1

(n = 5) computed as

(k)

|tn,n’1 |

1

k ≥ 1.

(k)

p =1+ log (k’1) ,

|tn,n’1 |

log(·k )

The results show good agreement with the expected quadratic rate.

    k    |t^(k)_{n,n−1}| / ||T^(0)||_2     p^(k)
    0         0.13865                       —
    1         1.5401 · 10^−2               2.1122
    2         1.2213 · 10^−4               2.1591
    3         1.8268 · 10^−8               1.9775
    4         8.9036 · 10^−16              1.9449

TABLE 5.2. Convergence rate of the sequence t^(k)_{n,n−1} in the QR iteration with single shift

•

The coding of the QR iteration with single shift (5.52) is given in Program 36. The code utilizes Program 29 to reduce the matrix A to upper Hessenberg form and Program 31 to perform the QR factorization step. The input parameters toll and itmax are the tolerance ε in (5.54) and the maximum admissible number of iterations, respectively. In output, the program returns the (approximate) real Schur form of A and the number of iterations needed for its computation.

Program 36 - qrshift : QR iteration with single shift

function [T,iter]=qrshift(A,toll,itmax)
n=max(size(A)); iter=0; [T,Q]=houshess(A);
for k=n:-1:2
  I=eye(k);
  while abs(T(k,k-1)) > toll*(abs(T(k,k))+abs(T(k-1,k-1)))
    iter=iter+1;
    if (iter > itmax), return, end
    mu=T(k,k); [Q,R,c,s]=qrgivens(T(1:k,1:k)-mu*I);
    T(1:k,1:k)=R*Q+mu*I;
  end
  T(k,k-1)=0;
end


5.7.2 The QR Method with Double Shift

The single-shift QR iteration (5.52) with the choice (5.53) for µ is effective if the eigenvalues of A are real, but not necessarily when complex conjugate eigenvalues are present, as happens in the following example.

Example 5.11 The matrix A ∈ R^{4×4} (reported below to five significant figures)

    A = [  1.5726   −0.6392   −1.3143    3.7696
          −1.2054   −0.0420    0.2166    0.4006
           0.0226    0.3592    0.2045   −0.1411
           1.1146   −0.1814   −3.2330    1.2648 ]

has eigenvalues {±i, 1, 2}, i being the imaginary unit. Running Program 36 with toll equal to the roundoff unit yields after 100 iterations

    T^(101) = [ 2    1.1999    0.5148    4.9004
                0   −0.0001   −0.8575    0.7182
                0    1.1662    0.0001   −0.8186
                0    0         0         1      ].

The obtained matrix is the real Schur form of A, where the 2×2 block T^(101)(2:3, 2:3) has complex conjugate eigenvalues ±i. These eigenvalues cannot be computed by the algorithm (5.52)-(5.53), since µ is real. •

The problem with this example is that working with real matrices necessarily yields a real shift, whereas a complex one would be needed. The QR iteration with double shift is set up to account for complex eigenvalues and allows for removing the 2×2 diagonal blocks of the real Schur form of A. Precisely, suppose that the QR iteration with single shift (5.52) detects at some step k a 2×2 diagonal block R^(k)_{kk} that cannot be reduced into upper triangular form. Since the iteration is converging to the real Schur form of the matrix A, the two eigenvalues of R^(k)_{kk} are complex conjugate and will be denoted by λ^(k) and λ̄^(k). The double shift strategy consists of


the following steps:

determine Q^(k), R^(k) such that
    Q^(k) R^(k) = T^(k−1) − λ^(k) I   (first QR factorization);
then, let
    T^(k) = R^(k) Q^(k) + λ^(k) I;
                                                                (5.55)
determine Q^(k+1), R^(k+1) such that
    Q^(k+1) R^(k+1) = T^(k) − λ̄^(k) I   (second QR factorization);
then, let
    T^(k+1) = R^(k+1) Q^(k+1) + λ̄^(k) I.
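In complex arithmetic the two factorizations in (5.55) can be carried out explicitly with any QR routine. The fragment below is an illustrative Python/NumPy sketch of one such double step (the test matrix is made up for demonstration; the Francis formulation instead combines the two conjugate shifts implicitly so as to stay in real arithmetic). Since each step is a similarity transformation, the spectrum is preserved:

```python
import numpy as np

def double_shift_step(T, lam):
    """One explicit double-shift step (5.55): a QR step with shift lam
    followed by one with conj(lam). The result is similar to T."""
    n = T.shape[0]
    I = np.eye(n)
    Tc = T.astype(complex)
    for shift in (lam, np.conj(lam)):
        Q, R = np.linalg.qr(Tc - shift * I)  # QR of the shifted matrix
        Tc = R @ Q + shift * I               # complete the similarity step
    return Tc

# Test matrix with eigenvalues {i, -i, 2}: a real shift cannot separate the
# complex conjugate pair, but the double shift targets it directly.
T0 = np.array([[0.0, -1.0, 0.5],
               [1.0,  0.0, 0.3],
               [0.0,  0.0, 2.0]])
T1 = double_shift_step(T0, 1j)     # shift with lam = i, then with lam = -i
print(np.linalg.eigvals(T1))
```

The eigenvalues of T1 coincide with those of T0, as both factorizations only perform unitary similarity transformations.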

Once the double shift has been carried out, the QR iteration with single shift is continued until a situation analogous to the one above is encountered. The QR iteration incorporating the double shift strategy is the most effective algorithm for computing eigenvalues and yields the approximate Schur form of a given matrix A. Its actual implementation is far more sophisticated than the outline above and is called QR iteration with Francis shift (see [Fra61], and, also, [GL89], Section 7.5 and [Dem97], Section 4.4.5). As for the case of the QR iteration with single shift, quadratic convergence can also be proven for the QR method with Francis shift. However, special matrices have recently been found for which the method fails to converge (see for an example Exercise 14 and Remark 5.13). We refer to [Bat90] and [Day96] for some analysis and remedies, although finding a shift strategy that guarantees convergence of the QR iteration for all matrices is still an open problem.

Example 5.12 Let us apply the QR iteration with double shift to the matrix A in Example 5.11. After 97 iterations of Program 37, with toll equal to the roundoff unit, we get the following (approximate) Schur form of A, which displays on its diagonal the four eigenvalues of A

    T^(97) = [ 2           1 + 2i              −2.33 + 0.86i           4.90
               0           5.02·10^−14 + i     −2.02 + 6.91·10^−14 i   0.72
               t^(97)_31   0                   −1.78·10^−14 − i       −0.82
               t^(97)_41   t^(97)_42            0                      1    ],

where t^(97)_31 = 2.06·10^−17 + 7.15·10^−49 i, t^(97)_41 = −5.59·10^−17 and t^(97)_42 = −4.26·10^−18, respectively. •

Example 5.13 Consider the pseudo-spectral differentiation matrix (10.73) of order 5. This matrix is singular, with a unique eigenvalue λ = 0 of algebraic


multiplicity equal to 5 (see [CHQZ88], p. 44). In this case the QR method with double shift provides an inaccurate approximation of the spectrum of the matrix. Indeed, using Program 37, with toll=eps, the method converges after 59 iterations to an upper triangular matrix with diagonal entries given by 0.0020, 0.0006 ± 0.0019i and −0.0017 ± 0.0012i, respectively. Using the MATLAB intrinsic function eig yields instead the eigenvalues −0.0024, −0.0007 ± 0.0023i and 0.0019 ± 0.0014i. •

A basic implementation of the QR iteration with double shift is provided in Program 37. The input/output parameters are the same as those of Program 36. The output matrix T is the approximate Schur form of matrix A.

Program 37 - qr2shift : QR iteration with double shift

function [T,iter]=qr2shift(A,toll,itmax)
n=max(size(A)); iter=0; [T,Q]=houshess(A);
for k=n:-1:2
  I=eye(k);
  while abs(T(k,k-1)) > toll*(abs(T(k,k))+abs(T(k-1,k-1)))
    iter=iter+1; if (iter > itmax), return, end
    mu=T(k,k); [Q,R,c,s]=qrgivens(T(1:k,1:k)-mu*I);
    T(1:k,1:k)=R*Q+mu*I;
    if (k > 2)
      Tdiag2=abs(T(k-1,k-1))+abs(T(k-2,k-2));
      if abs(T(k-1,k-2)) <= toll*Tdiag2
        [lambda]=eig(T(k-1:k,k-1:k));
        [Q,R,c,s]=qrgivens(T(1:k,1:k)-lambda(1)*I);
        T(1:k,1:k)=R*Q+lambda(1)*I;
        [Q,R,c,s]=qrgivens(T(1:k,1:k)-lambda(2)*I);
        T(1:k,1:k)=R*Q+lambda(2)*I;
      end
    end
  end
  T(k,k-1)=0;
end
I=eye(2);
while (abs(T(2,1)) > toll*(abs(T(2,2))+abs(T(1,1)))) & (iter <= itmax)
  iter=iter+1; mu=T(2,2);
  [Q,R,c,s]=qrgivens(T(1:2,1:2)-mu*I); T(1:2,1:2)=R*Q+mu*I;
end


5.8 Computing the Eigenvectors and the SVD of a Matrix

The power and inverse iterations described in Section 5.3.2 can be used to compute a selected number of eigenvalue/eigenvector pairs. If all the eigenvalues and eigenvectors of a matrix are needed, the QR iteration can be profitably employed to compute the eigenvectors as shown in Sections 5.8.1 and 5.8.2. In Section 5.8.3 we deal with the computation of the singular value decomposition (SVD) of a given matrix.

5.8.1 The Hessenberg Inverse Iteration

For any approximate eigenvalue λ computed by the QR iteration as described in Section 5.7.2, the inverse iteration (5.28) can be applied to the matrix H = Q^T A Q in Hessenberg form, yielding an approximate eigenvector q. Then, the eigenvector x associated with λ is computed as x = Qq. Clearly, one can take advantage of the structure of the Hessenberg matrix for an efficient solution of the linear system at each step of (5.28). Typically, only one iteration is required to produce an adequate approximation of the desired eigenvector x (see [GL89], Section 7.6.1 and [PW79] for more details).
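A minimal sketch of the procedure in Python/NumPy (illustrative only: the Hessenberg matrix H and the rough eigenvalue estimate below are made up for the example, and the shifted system is solved by a general routine, whereas an actual implementation would exploit the Hessenberg structure and solve it in O(n^2) flops):

```python
import numpy as np

def inverse_iteration(H, lam, steps=3, eps=1e-10):
    """Inverse iteration (5.28) on H for the approximate eigenvalue lam;
    eps perturbs the shift so that H - lam*I stays safely nonsingular."""
    n = H.shape[0]
    q = np.ones(n) / np.sqrt(n)
    M = H - (lam + eps) * np.eye(n)   # shifted matrix: factor once, reuse
    for _ in range(steps):
        z = np.linalg.solve(M, q)     # O(n^2) when the structure is exploited
        q = z / np.linalg.norm(z)
    return q

# Upper Hessenberg test matrix; lam is a crude estimate of its eigenvalue 5.
H = np.array([[4.0, 1.0, 2.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
q = inverse_iteration(H, lam=5.1)
print(abs(q @ H @ q - 5.0))           # Rayleigh quotient close to 5
```

Even with the crude shift 5.1, a few steps suffice here because the error is damped by the ratio |5.1 − 5| / |5.1 − 2| at each iteration.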

5.8.2 Computing the Eigenvectors from the Schur Form of a Matrix

Suppose that the (approximate) Schur form Q^H A Q = T of a given matrix A ∈ R^{n×n} has been computed by the QR iteration with double shift, Q being a unitary matrix and T being upper triangular. Then, if Ax = λx, we have Q^H A Q Q^H x = λ Q^H x, i.e., letting y = Q^H x, Ty = λy holds. Therefore y is an eigenvector of T, so that to compute the eigenvectors of A we can work directly on the Schur form T.

Assume for simplicity that λ = t_kk ∈ C is a simple eigenvalue of A. Then the upper triangular matrix T can be decomposed as

    T = [ T11   v    T13
           0    λ    w^T
           0    0    T33 ],

where T11 ∈ C^{(k−1)×(k−1)} and T33 ∈ C^{(n−k)×(n−k)} are upper triangular matrices, v ∈ C^{k−1}, w ∈ C^{n−k} and λ ∉ σ(T11) ∪ σ(T33).


Thus, letting y = (y^T_{k−1}, y, y^T_{n−k})^T, with y_{k−1} ∈ C^{k−1}, y ∈ C and y_{n−k} ∈ C^{n−k}, the matrix eigenvector problem (T − λI)y = 0 can be written as

    (T11 − λI_{k−1}) y_{k−1} + v y + T13 y_{n−k} = 0
                                     w^T y_{n−k} = 0            (5.56)
                         (T33 − λI_{n−k}) y_{n−k} = 0.

Since λ is simple, both matrices T11 − λI_{k−1} and T33 − λI_{n−k} are nonsingular, so that the third equation in (5.56) yields y_{n−k} = 0 and the first equation becomes

    (T11 − λI_{k−1}) y_{k−1} = −v y.

Setting arbitrarily y = 1 and solving the triangular system above for y_{k−1} yields (formally)

    y = [ −(T11 − λI_{k−1})^{−1} v
           1
           0 ].

The desired eigenvector x can then be computed as x = Qy.

An efficient implementation of the above procedure is carried out in the intrinsic MATLAB function eig. Invoking this function with the format [V,D] = eig(A) yields the matrix V, whose columns are the right eigenvectors of A, and the diagonal matrix D, which contains its eigenvalues. Further details can be found in the strevc subroutine of the LAPACK library, while for the computation of eigenvectors in the case where A is symmetric, we refer to [GL89], Chapter 8 and [Dem97], Section 5.3.
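The partitioned back-substitution above can be sketched directly (Python/NumPy, illustrative: here T is a small upper triangular matrix written down by hand, so the factor Q of the Schur form is omitted and x = Qy reduces to x = y):

```python
import numpy as np

def eigvec_from_schur(T, k):
    """Eigenvector of an upper triangular T for the simple eigenvalue
    lam = T[k,k], via (5.56): y = (-(T11 - lam*I)^{-1} v, 1, 0)."""
    n = T.shape[0]
    lam = T[k, k]
    y = np.zeros(n, dtype=T.dtype)
    y[k] = 1.0                                # arbitrary normalization y = 1
    if k > 0:
        T11 = T[:k, :k]                       # leading triangular block
        v = T[:k, k]                          # column above the eigenvalue
        y[:k] = np.linalg.solve(T11 - lam * np.eye(k), -v)
    return y

# Upper triangular test matrix with simple (distinct) eigenvalues.
T = np.array([[3.0, 1.0, 2.0],
              [0.0, 1.0, 4.0],
              [0.0, 0.0, 5.0]])
for k in range(3):
    y = eigvec_from_schur(T, k)
    print(np.linalg.norm(T @ y - T[k, k] * y))   # each residual ~ 0
```

For k = 2, for instance, the solve yields y = (1.5, 1, 1), and indeed Ty = 5y.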

5.8.3 Approximate Computation of the SVD of a Matrix

In this section we describe the Golub-Kahan-Reinsch algorithm for the computation of the SVD of a matrix A ∈ R^{m×n} with m ≥ n (see [GL89], Section 5.4). The method consists of two phases, a direct one and an iterative one.

In the first phase, A is transformed into an upper trapezoidal matrix of the form

    U^T A V = [ B
                0 ],                                            (5.57)

where U and V are two orthogonal matrices and B ∈ R^{n×n} is upper bidiagonal. The matrices U and V are generated using n + m − 3 Householder matrices U1, . . . , U_{m−1}, V1, . . . , V_{n−2}, as follows.


The algorithm initially generates U1 in such a way that the matrix A^(1) = U1 A has a^(1)_{i1} = 0 if i > 1. Then, V1 is determined so that A^(2) = A^(1) V1 has a^(2)_{1j} = 0 for j > 2, preserving at the same time the null entries of the previous step. The procedure is repeated starting from A^(2), taking U2 such that A^(3) = U2 A^(2) has a^(3)_{i2} = 0 for i > 2, and V2 in such a way that A^(4) = A^(3) V2 has a^(4)_{2j} = 0 for j > 3, yet preserving the null entries already generated. For example, in the case m = 5, n = 4, the first two steps of the reduction process yield

    A^(1) = U1 A = [ • • • •              A^(2) = A^(1) V1 = [ • • 0 0
                     0 • • •                                   0 • • •
                     0 • • •      −→                           0 • • •
                     0 • • •                                   0 • • •
                     0 • • • ]                                 0 • • • ],

having denoted by • the entries of the matrices that in principle are different from zero. After at most m − 1 steps, we find (5.57), with

    U = U1 U2 · · · U_{m−1},   V = V1 V2 · · · V_{n−2}.
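A compact sketch of this first phase in Python/NumPy (illustrative: the Householder reflections are applied in place and the orthogonal factors U and V are not accumulated; the random test matrix is for demonstration only):

```python
import numpy as np

def house(x):
    """Householder vector v such that (I - 2*v*v^T/(v^T v)) x is a
    multiple of the first unit vector e1."""
    v = x.astype(float).copy()
    nrm = np.linalg.norm(x)
    v[0] += np.copysign(nrm, v[0]) if nrm > 0 else 1.0
    return v

def bidiagonalize(A):
    """Golub-Kahan phase one: left reflections U_k zero column k below the
    diagonal, right reflections V_k zero row k beyond the superdiagonal."""
    B = A.astype(float).copy()
    m, n = B.shape
    for k in range(n):
        if k < m - 1:                       # U_k: annihilate B[k+1:, k]
            v = house(B[k:, k])
            B[k:, :] -= 2.0 * np.outer(v, v @ B[k:, :]) / (v @ v)
        if k < n - 2:                       # V_k: annihilate B[k, k+2:]
            v = house(B[k, k + 1:])
            B[:, k + 1:] -= 2.0 * np.outer(B[:, k + 1:] @ v, v) / (v @ v)
    return B

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 4))
B = bidiagonalize(A)        # upper bidiagonal, same singular values as A
print(np.round(B, 4))
```

Because only orthogonal transformations are applied, the singular values of the bidiagonal matrix B coincide with those of A.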

In the second phase, the matrix B so obtained is reduced to a diagonal matrix Σ using the QR iteration. Precisely, a sequence of upper bidiagonal matrices B^(k) is constructed such that, as k → ∞, their off-diagonal entries tend to zero quadratically and the diagonal entries tend to the singular values σi of A. In the limit, the process generates two orthogonal matrices W and Z such that

    W^T B Z = Σ = diag(σ1, . . . , σn).

The SVD of A is then given by

    Ũ^T A Ṽ = [ Σ
                0 ],

with Ũ = U diag(W, I_{m−n}) and Ṽ = VZ.

The computational cost of this procedure is 2m^2 n + 4mn^2 + (9/2)n^3 flops, which reduces to 2mn^2 − (2/3)n^3 flops if only the singular values are computed. In this case, recalling what was stated in Section 3.13 about A^T A, the method described in the present section is preferable to directly computing the eigenvalues of A^T A and then taking their square roots.

As for the stability of this procedure, it can be shown that the computed σi turn out to be the singular values of the matrix A + δA, with

    ||δA||_2 ≤ Cmn u ||A||_2,


Cmn being a constant dependent on n, m and the roundoff unit u. For other approaches to the computation of the SVD of a matrix, see [Dat95] and [GL89].

5.9 The Generalized Eigenvalue Problem

Let A, B ∈ C^{n×n} be two given matrices; for any z ∈ C, we call A − zB a matrix pencil and denote it by (A,B). The set σ(A,B) of the eigenvalues of (A,B) is defined as

    σ(A,B) = {µ ∈ C : det(A − µB) = 0}.

The generalized matrix eigenvalue problem can be formulated as: find λ ∈ σ(A,B) and a nonnull vector x ∈ C^n such that

    Ax = λBx.                                                   (5.58)

The pair (λ, x) satisfying (5.58) is an eigenvalue/eigenvector pair of the pencil (A,B). Note that by setting B = I_n in (5.58) we recover the standard matrix eigenvalue problem considered thus far.

Problems like (5.58) arise frequently in engineering applications, e.g., in the study of vibrations of structures (buildings, aircraft and bridges) or in the mode analysis of waveguides (see [Inm94] and [Bos93]). Another example is the computation of the extremal eigenvalues of a preconditioned matrix P^{−1}A (in which case B = P in (5.58)) when solving a linear system with an iterative method (see Remark 4.2).

Let us introduce some definitions. We say that the pencil (A,B) is regular if det(A − zB) is not identically zero; otherwise the pencil is singular. When (A,B) is regular, p(z) = det(A − zB) is the characteristic polynomial of the pencil; denoting by k the degree of p, the eigenvalues of (A,B) are defined as:

1. the roots of p(z) = 0, if k = n;
2. ∞, if k < n (with multiplicity equal to n − k).

Example 5.14 (Taken from [Par80], [Saa92] and [GL89])

    A = [ −1  0      B = [ 0  1      ⇒   p(z) = z^2 + 1,   σ(A,B) = {±i}
           0  1 ],         1  0 ],

    A = [ −1  0      B = [ 0  0      ⇒   p(z) = z,         σ(A,B) = {0, ∞}
           0  0 ],         0  1 ],

    A = [  1  2      B = [ 1  0      ⇒   p(z) = 0,         σ(A,B) = C.
           0  0 ],         0  0 ],

The first pair of matrices shows that symmetric pencils, unlike symmetric matrices, may exhibit complex conjugate eigenvalues. The second pair is a regular pencil displaying an eigenvalue equal to infinity, while the third pair is an example of a singular pencil. •


5.9.1 Computing the Generalized Real Schur Form

The de¬nitions and examples above imply that the pencil (A,B) has n ¬nite

eigenvalues i¬ B is nonsingular.

In such a case, a possible approach to the solution of problem (5.58) is to transform it into the equivalent eigenvalue problem Cx = λx, where the matrix C is the solution of the system BC = A, and then apply the QR iteration to C. For actually computing the matrix C, one can use Gauss elimination with pivoting or the techniques shown in Section 3.6. This procedure can yield inaccurate results if B is ill-conditioned, since computing C is affected by rounding errors of the order of u ||A||_2 ||B^{−1}||_2 (see [GL89], p. 376).
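A sketch of this approach in Python/NumPy (illustrative: the 2×2 pencil is made up, and B is deliberately well-conditioned; for ill-conditioned B one would instead use a QZ-based routine such as scipy.linalg.eig(A, B)):

```python
import numpy as np

# Pencil (A, B) with B nonsingular: reduce Ax = lam*Bx to Cx = lam*x.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
B = np.array([[1.0, 0.0],
              [0.0, 2.0]])
C = np.linalg.solve(B, A)          # solves BC = A (Gauss elimination with pivoting)
lam, X = np.linalg.eig(C)          # stands in for the QR iteration applied to C
for j in range(lam.size):
    r = A @ X[:, j] - lam[j] * (B @ X[:, j])
    print(np.linalg.norm(r))       # each generalized residual ~ 0
```

Each computed pair (lam[j], X[:, j]) satisfies the generalized problem (5.58) up to rounding, which is adequate here precisely because B is well-conditioned.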

A more attractive approach is based on the following result, which generalizes the Schur decomposition theorem 1.5 to the case of regular pencils (for a proof, see [Dat95], p. 497).

Property 5.10 (Generalized Schur decomposition) Let (A,B) be a regular pencil. Then, there exist two unitary matrices U and Z such that U^H A Z = T, U^H B Z = S, where T and S are upper triangular. For i = 1, . . . , n the eigenvalues of (A,B) are given by