
and ‖x1 − x̃1(215)‖_∞ = 1.38 · 10^{−…}.
Figure 5.2 (left) shows the reliability of the a posteriori estimate (5.26). The
sequences |λ1 − ν(k)| (solid line) and the corresponding a posteriori estimates
(5.26) (dashed line) are plotted as a function of the number of iterations (in
abscissae). Notice the excellent agreement between the two curves.

The symmetric matrix A in (5.31)

        ⎡ 1  3  4 ⎤          ⎡ 8  1  6 ⎤
    A = ⎢ 3  1  2 ⎥ ,    T = ⎢ 3  5  7 ⎥                     (5.31)
        ⎣ 4  2  1 ⎦          ⎣ 4  9  2 ⎦

has the following spectrum: λ1 = 7.047, λ2 = −3.1879 and λ3 = −0.8868 (to five
significant figures).
It is interesting to compare the behaviour of the power method when computing
λ1 for the symmetric matrix A and for its similar matrix M = T^{−1}AT, where T
is the nonsingular (and nonorthogonal) matrix in (5.31).
Running Program 26 with z(0) = (1, 1, 1)^T, the power method converges to the
eigenvalue λ1 in 18 and 30 iterations, for matrices A and M, respectively. The
sequence of absolute errors |λ1 − ν(k)| is plotted in Figure 5.2 (right) where (S)
and (NS) refer to the computations on A and M, respectively. Notice the rapid
error reduction in the symmetric case, according to the quadratic convergence
properties of the power method (see Section 5.3.1).
We finally employ the inverse power method (5.28) to compute the eigenvalue
of smallest modulus λ3 = 0.512 of matrix A in (5.30). Running Program 27 with
q(0) = (1, 1, 1)^T/√3, the method converges in 9 iterations, with absolute errors
|λ3 − σ(9)| = 1.194 · 10^{−12} and ‖x3 − x̃3(9)‖_∞ = 4.59 · 10^{−13}.


5.4 The QR Iteration
In this section we present some iterative techniques for simultaneously ap-
proximating all the eigenvalues of a given matrix A. The basic idea consists
of reducing A, by means of suitable similarity transformations, into a form
for which the calculation of the eigenvalues is easier than on the starting
matrix.
The problem would be satisfactorily solved if the unitary matrix U of the
Schur decomposition theorem 1.5, such that T = U^H AU, T being upper
triangular and with tii = λi(A) for i = 1, . . . , n, could be determined in a
direct way, that is, with a finite number of operations. Unfortunately, it is
a consequence of Abel's theorem that, for n ≥ 5, the matrix U cannot be
computed in an elementary way (see Exercise 8). Thus, our problem can
be solved only by resorting to iterative techniques.
The reference algorithm in this context is the QR iteration method, which is
examined here only in the case of real matrices. (For some remarks on the
extension of the algorithms to the complex case, see [GL89], Section 5.2.10
and [Dem97], Section 4.2.1).

Let A ∈ R^{n×n}; given an orthogonal matrix Q(0) ∈ R^{n×n} and letting
T(0) = (Q(0))^T A Q(0), for k = 1, 2, . . . , until convergence, the QR iteration
consists of:

    determine Q(k), R(k) such that
        Q(k) R(k) = T(k−1)    (QR factorization);            (5.32)
    then, let
        T(k) = R(k) Q(k).

At each step k ≥ 1, the first phase of the iteration is the factorization of
the matrix T(k−1) into the product of an orthogonal matrix Q(k) with an
upper triangular matrix R(k) (see Section 5.6.3). The second phase is a
simple matrix product. Notice that

    T(k) = R(k) Q(k) = (Q(k))^T (Q(k) R(k)) Q(k) = (Q(k))^T T(k−1) Q(k)
         = (Q(0) Q(1) · · · Q(k))^T A (Q(0) Q(1) · · · Q(k)),    k ≥ 0,      (5.33)

i.e., every matrix T(k) is orthogonally similar to A. This is particularly
relevant for the stability of the method, since, as shown in Section 5.2, the
conditioning of the matrix eigenvalue problem for T(k) is not worse than it
is for A (see also [GL89], p. 360).
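As a quick sanity check of (5.33), the following MATLAB fragment (a sketch, using the built-in qr in place of the programs of this chapter and an arbitrary test matrix) performs one QR step and verifies that the spectrum is preserved:

% One step of (5.32) with Q(0) = I; qr and the test matrix are
% illustrative choices, not part of the programs of this chapter.
A = [4 1 0; 1 3 1; 0 1 2];
[Q,R] = qr(A);                              % T(0) = A
T1 = R*Q;                                   % T(1) = R(1)*Q(1)
disp(norm(sort(eig(T1))-sort(eig(A))))      % ~ 0: T(1) is similar to A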
A basic implementation of the QR iteration (5.32), assuming Q(0) = In,
is examined in Section 5.5, while a more computationally efficient version,
starting from T(0) in upper Hessenberg form, is described in detail in Section 5.6.



If A has real eigenvalues, distinct in modulus, it will be seen in Section 5.5
that the limit of T(k) is an upper triangular matrix (with the eigenvalues of
A on the main diagonal). However, if A has complex eigenvalues the limit
of T(k) cannot be an upper triangular matrix T. Indeed, if it were, T would
necessarily have real eigenvalues, although it is similar to A.
Failure to converge to a triangular matrix may also happen in more
general situations, as addressed in Example 5.9.
For this, it is necessary to introduce variants of the QR iteration (5.32),
based on deflation and shift techniques (see Section 5.7 and, for a more
detailed discussion of the subject, [GL89], Chapter 7, [Dat95], Chapter 8
and [Dem97], Chapter 4).
These techniques allow for T(k) to converge to an upper quasi-triangular
matrix, known as the real Schur decomposition of A, for which the following
result holds (for the proof we refer to [GL89], pp. 341-342).

Property 5.8 Given a matrix A ∈ R^{n×n}, there exists an orthogonal matrix
Q ∈ R^{n×n} such that

              ⎡ R11  R12  . . .  R1m ⎤
              ⎢  0   R22  . . .  R2m ⎥
    Q^T AQ =  ⎢  .    .    ..     .  ⎥ ,                     (5.34)
              ⎣  0    0   . . .  Rmm ⎦

where each block Rii is either a real number or a matrix of order 2 having
complex conjugate eigenvalues, and

    Q = lim_{k→∞} Q(0) Q(1) · · · Q(k),                      (5.35)

Q(k) being the orthogonal matrix generated by the k-th factorization step of
the QR iteration (5.32).


The QR iteration can also be employed to compute all the eigenvectors
of a given matrix. For this purpose, we describe in Section 5.8 two possible
approaches, one based on the coupling between (5.32) and the inverse
iteration (5.28), the other working on the real Schur form (5.34).



5.5 The Basic QR Iteration
In the basic version of the QR method, one sets Q(0) = In in such a way
that T(0) = A. At each step k ≥ 1 the QR factorization of the matrix T(k−1)
can be carried out using the modified Gram-Schmidt procedure introduced
in Section 3.4.3, with a cost of the order of 2n^3 flops (for a full matrix A).
The following convergence result holds (for the proof, see [GL89], Theorem
7.3.1, or [Wil65], pp. 517-519).

Property 5.9 (Convergence of QR method) Let A ∈ R^{n×n} be a matrix
such that

    |λ1| > |λ2| > . . . > |λn|.

Then

                   ⎡ λ1  t12  . . .  t1n ⎤
                   ⎢  0   λ2  t23    ...  ⎥
    lim   T(k)  =  ⎢  .        ..     .   ⎥ .                (5.36)
   k→+∞            ⎣  0    0  . . .   λn  ⎦

As for the convergence rate, we have

    |t(k)_{i,i−1}| = O( |λi/λ_{i−1}|^k ),   i = 2, . . . , n,   for k → +∞.   (5.37)

Under the additional assumption that A is symmetric, the sequence {T(k)}
tends to a diagonal matrix.
If the eigenvalues of A, although being distinct, are not well-separated, it
follows from (5.37) that the convergence of T(k) towards a triangular matrix
can be quite slow. With the aim of accelerating it, one can resort to the
so-called shift technique, which will be addressed in Section 5.7.
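The estimate (5.37) can also be observed experimentally; a minimal sketch, using the built-in qr and a symmetric test matrix chosen here only because its eigenvalues are distinct:

% After k steps the subdiagonal entries of T(k) decay like
% |lambda_i/lambda_{i-1}|^k; for a symmetric A, T(k) becomes diagonal.
T = gallery('lehmer',4);
for k = 1:30
  [Q,R] = qr(T); T = R*Q;
end
disp(diag(T)')             % approximations of the eigenvalues of A
disp(norm(tril(T,-1)))     % small: the subdiagonal part has died out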

Remark 5.2 It is always possible to reduce the matrix A into a triangular
form by means of an iterative algorithm employing nonorthogonal similarity
transformations. In such a case, the so-called LR iteration (known also as
Rutishauser method, [Rut58]) can be used, from which the QR method has
actually been derived (see also [Fra61], [Wil65]). The LR iteration is based
on the factorization of the matrix A into the product of two matrices L
and R, respectively unit lower triangular and upper triangular, and on the
(nonorthogonal) similarity transformation

    L^{−1}AL = L^{−1}(LR)L = RL.

The rare use of the LR method in practical computations is due to the loss
of accuracy that can arise in the LR factorization because of the increase
in modulus of the upper diagonal entries of R. This aspect, together with
the details of the implementation of the algorithm and some comparisons
with the QR method, is examined in [Wil65], Chapter 8.
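A minimal sketch of one LR step in MATLAB follows; it assumes that A admits an LU factorization without pivoting (MATLAB's lu applies partial pivoting, but no row interchange is triggered by the diagonally dominant test matrix used here):

% One LR step: A = LR (L unit lower triangular, R upper triangular),
% then A1 = RL = L^{-1} A L is similar to A.
A = [4 1 0; 1 5 2; 0 2 6];                  % hypothetical test matrix
[L,R] = lu(A);                              % no pivoting occurs for this A
A1 = R*L;
disp(norm(sort(eig(A1))-sort(eig(A))))      % ~ 0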

Example 5.4 We apply the QR method to the symmetric matrix A ∈ R^{4×4} such
that aii = 4, for i = 1, . . . , 4, and aij = 4 + i − j for i < j ≤ 4, whose eigenvalues
are (to three significant figures) λ1 = 11.09, λ2 = 3.41, λ3 = 0.90 and λ4 = 0.59.
After 20 iterations, we get

            ⎡ 11.09           6.44 · 10^{−10}  −3.62 · 10^{−15}   9.49 · 10^{−15} ⎤
            ⎢ 6.47 · 10^{−10} 3.41              1.43 · 10^{−11}   4.60 · 10^{−16} ⎥
    T(20) = ⎢ 1.74 · 10^{−21} 1.43 · 10^{−11}   0.90              1.16 · 10^{−4}  ⎥ .
            ⎣ 2.32 · 10^{−25} 2.68 · 10^{−15}   1.16 · 10^{−4}    0.58            ⎦

Notice the "almost-diagonal" structure of the matrix T(20) and, at the same
time, the effect of rounding errors which slightly alter its expected symmetry.
Good agreement can also be found between the under-diagonal entries and the
estimate (5.37). •

A computer implementation of the basic QR iteration is given in Program
28. The QR factorization is executed using the modified Gram-Schmidt
method (Program 8). The input parameter niter denotes the maximum
admissible number of iterations, while the output parameters T, Q and R
are the matrices T, Q and R in (5.32) after niter iterations of the QR
procedure.

Program 28 - basicqr : Basic QR iteration

function [T,Q,R]=basicqr(A,niter)
% BASICQR  Basic QR iteration: niter steps of (5.32) with Q(0)=I.
T=A;
for i=1:niter
  [Q,R]=mod_grams(T);   % QR factorization via modified Gram-Schmidt (Program 8)
  T=R*Q;
end




5.6 The QR Method for Matrices in Hessenberg
Form
The naive implementation of the QR method discussed in the previous
section requires (for a full matrix) a computational effort of the order of
n^3 flops per iteration. In this section we illustrate a variant for the QR
iteration, known as Hessenberg-QR iteration, with a greatly reduced computational
cost. The idea consists of starting the iteration from a matrix
T(0) in upper Hessenberg form, that is, t(0)_{ij} = 0 for i > j + 1. Indeed, it can
be checked that with this choice the computation of T(k) in (5.32) requires
only an order of n^2 flops per iteration.

To achieve maximum efficiency and stability of the algorithm, suitable
transformation matrices are employed. Precisely, the preliminary reduction
of matrix A into upper Hessenberg form is realized with Householder
matrices, whilst the QR factorization of T(k) is carried out using Givens
matrices, instead of the modified Gram-Schmidt procedure introduced in
Section 3.4.3.
We briefly describe Householder and Givens matrices in the next section,
referring to Section 5.6.5 for their implementation. The algorithm and examples
of computations of the real Schur form of A starting from its upper
Hessenberg form are then discussed in Section 5.6.4.


5.6.1 Householder and Givens Transformation Matrices
For any vector v ∈ R^n, let us introduce the orthogonal and symmetric
matrix

    P = I − 2vv^T/‖v‖_2^2.                                   (5.38)

Given a vector x ∈ R^n, the vector y = Px is the reflection of x with respect
to the hyperplane π = span{v}⊥ formed by the set of the vectors that are
orthogonal to v (see Figure 5.3, left). Matrix P and the vector v are called
the Householder reflection matrix and the Householder vector, respectively.


FIGURE 5.3. Reflection across the hyperplane orthogonal to v (left); rotation by
an angle θ in the plane (xi, xk) (right)


Householder matrices can be used to set to zero a block of components of
a given vector x ∈ R^n. If, in particular, one would like to set to zero all the
components of x, except the m-th one, the Householder vector ought to be
chosen as

    v = x ± ‖x‖_2 e_m,                                       (5.39)

e_m being the m-th unit vector of R^n. The matrix P computed by (5.38)
depends on the vector x itself, and it can be checked that

    Px = [0, 0, . . . , ±‖x‖_2, 0, . . . , 0]^T,             (5.40)

where the nonzero entry ±‖x‖_2 occupies the m-th position.


Example 5.5 Let x = [1, 1, 1, 1]^T and m = 3; then

                               ⎡  5  −1  −3  −1 ⎤
        1                  1   ⎢ −1   5  −3  −1 ⎥
    v = - [1, 1, 3, 1]^T,  P = - ⎢ −3  −3  −3  −3 ⎥ ,  Px = [0, 0, −2, 0]^T.
        3                  6   ⎣ −1  −1  −3   5 ⎦

If, for some k ≥ 1, the first k components of x must remain unaltered,
while the components from k + 2 on are to be set to zero, the Householder
matrix P = P(k) takes the following form

           ⎡ Ik   0    ⎤                          w(k)(w(k))^T
    P(k) = ⎣ 0   Rn−k  ⎦ ,   Rn−k = In−k − 2 ─────────────── .      (5.41)
                                              ‖w(k)‖_2^2

As usual, Ik is the identity matrix of order k, while Rn−k is the elementary
Householder matrix of order n − k associated with the reflection across the
hyperplane orthogonal to the vector w(k) ∈ R^{n−k}. According to (5.39), the
Householder vector is given by

    w(k) = x(n−k) ± ‖x(n−k)‖_2 e1(n−k),                      (5.42)

where x(n−k) ∈ R^{n−k} is the vector formed by the last n − k components
of x and e1(n−k) is the first unit vector of the canonical basis of R^{n−k}. We
notice that P(k) is a function of x through w(k). The criterion for fixing
the sign in the definition of w(k) will be discussed in Section 5.6.5.
The components of the transformed vector y = P(k)x read

    yj = xj,                  j = 1, . . . , k,
    yj = 0,                   j = k + 2, . . . , n,
    yk+1 = ±‖x(n−k)‖_2.

The Householder matrices will be employed in Section 5.6.2 to carry out the
reduction of a given matrix A to a matrix H(0) in upper Hessenberg form.
This is the first step for an efficient implementation of the QR iteration
(5.32) with T(0) = H(0) (see Section 5.6).

Example 5.6 Let x = [1, 2, 3, 4, 5]^T and k = 1 (this means that we want to set to
zero the components xj, with j = 3, 4, 5). The matrix P(1) and the transformed
vector y = P(1)x are given by

           ⎡ 1   0        0        0        0       ⎤        ⎡ 1      ⎤
           ⎢ 0   0.2722   0.4082   0.5443   0.6804  ⎥        ⎢ 7.3485 ⎥
    P(1) = ⎢ 0   0.4082   0.7710  −0.3053  −0.3816  ⎥ ,  y = ⎢ 0      ⎥ .
           ⎢ 0   0.5443  −0.3053   0.5929  −0.5089  ⎥        ⎢ 0      ⎥
           ⎣ 0   0.6804  −0.3816  −0.5089   0.3639  ⎦        ⎣ 0      ⎦

The Givens elementary matrices are orthogonal rotation matrices that allow
for setting to zero in a selective way the entries of a vector or matrix.
For a given pair of indices i and k, and a given angle θ, these matrices are
defined as

    G(i, k, θ) = In − Y,                                     (5.43)

where Y ∈ R^{n×n} is a null matrix except for the following entries: yii =
ykk = 1 − cos(θ), yik = −sin(θ) = −yki. A Givens matrix is of the form

                              i              k

                 ⎡ 1                                    ⎤
                 ⎢    ...                               ⎥
                 ⎢        cos(θ)   · · ·   sin(θ)       ⎥   i
    G(i, k, θ) = ⎢                  ...                 ⎥
                 ⎢       −sin(θ)   · · ·   cos(θ)       ⎥   k
                 ⎢                           ...        ⎥
                 ⎣                                   1  ⎦

For a given vector x ∈ R^n, the product y = (G(i, k, θ))^T x is equivalent to
rotating x counterclockwise by an angle θ in the coordinate plane (xi, xk)
(see Figure 5.3, right). After letting c = cos θ, s = sin θ, it follows that

    yj = xj,            j ≠ i, k,
    yi = c xi − s xk,                                        (5.44)
    yk = s xi + c xk.

Let αik = √(xi^2 + xk^2) and notice that if c and s satisfy c = xi/αik, s =
−xk/αik (in such a case, θ = arctan(−xk/xi)), we get yk = 0, yi = αik
and yj = xj for j ≠ i, k. Similarly, if c = xk/αik, s = xi/αik (that is,
θ = arctan(xi/xk)), then yi = 0, yk = αik and yj = xj for j ≠ i, k.
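In MATLAB the two relations above take one line each; a sketch for i = 1, k = 2 (x is chosen so that the numbers come out exactly):

% Annihilating the k-th component of x with a Givens rotation, cf. (5.44).
x = [3; 4]; alpha = norm(x);        % alpha_ik = 5
c = x(1)/alpha; s = -x(2)/alpha;    % first choice of c and s
G = [c s; -s c];                    % the (i,k) 2x2 block of G(i,k,theta)
disp((G'*x)')                       % [5 0]: y_i = alpha_ik, y_k = 0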
The Givens rotation matrices will be employed in Section 5.6.3 to carry
out the QR factorization step in the algorithm (5.32) and in Section 5.10.1
where the Jacobi method for symmetric matrices is considered.


Remark 5.3 (Householder deflation for power iterations) The elementary
Householder transformations can be conveniently employed to compute
the first (largest or smallest) eigenvalues of a given matrix A ∈ R^{n×n}.
Assume that the eigenvalues of A are ordered as in (5.16) and suppose
that the eigenvalue/eigenvector pair (λ1, x1) has been computed using the
power method. Then the matrix A can be transformed into the following
block form (see for the proof [Dat95], Theorem 8.5.4, p. 418)

                ⎡ λ1   b^T ⎤
    A1 = HAH =  ⎣  0   A2  ⎦

where b ∈ R^{n−1}, H is the Householder matrix such that Hx1 = αe1 for
some α ∈ R, the matrix A2 ∈ R^{(n−1)×(n−1)} and the eigenvalues of A2 are
the same as those of A except for λ1. The matrix H can be computed using
(5.38) with v = x1 ± ‖x1‖_2 e1.
The deflation procedure consists of computing the second dominant (subdominant)
eigenvalue of A by applying the power method to A2, provided
that |λ2| ≠ |λ3|. Once λ2 is available, the corresponding eigenvector x2
can be computed by applying the inverse power iteration to the matrix A
taking µ = λ2 (see Section 5.3.2) and proceeding in the same manner with
the remaining eigenvalue/eigenvector pairs. An example of deflation will
be presented in Section 5.12.2.
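A compact sketch of the deflation step follows; for brevity the dominant pair (λ1, x1) is taken from the built-in eig instead of the power method, and the test matrix is a hypothetical choice:

% Householder deflation: A1 = H*A*H = [lambda1 b'; 0 A2].
A = [4 1 2; 1 3 0; 2 0 5];
[X,D] = eig(A); [~,j] = max(abs(diag(D)));
x1 = X(:,j);                                % dominant (unit) eigenvector
v = x1 + sign(x1(1))*norm(x1)*eye(3,1);     % Householder vector, cf. (5.38)
H = eye(3) - 2*(v*v')/(v'*v);
A1 = H*A*H;
disp(A1(2:3,1)')                            % ~ 0: first column deflated
disp(sort(eig(A1(2:3,2:3)))')               % eigenvalues of A except lambda1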



5.6.2 Reducing a Matrix in Hessenberg Form
A given matrix A ∈ R^{n×n} can be transformed by similarity transformations
into upper Hessenberg form with a cost of the order of n^3 flops. The
algorithm takes n − 2 steps and the similarity transformation Q can be
computed as the product of Householder matrices P(1) · · · P(n−2). For this,
the reduction procedure is commonly known as the Householder method.
Precisely, the k-th step consists of a similarity transformation of A through
the Householder matrix P(k) which aims at setting to zero the elements in
positions k + 2, . . . , n of the k-th column of A, for k = 1, . . . , (n − 2) (see

Section 5.6.1). For example, in the case n = 4 the reduction process yields

    ⎡ • • • • ⎤        ⎡ • • • • ⎤        ⎡ • • • • ⎤
    ⎢ • • • • ⎥  P(1)  ⎢ • • • • ⎥  P(2)  ⎢ • • • • ⎥
    ⎢ • • • • ⎥  −→    ⎢ 0 • • • ⎥  −→    ⎢ 0 • • • ⎥ ,
    ⎣ • • • • ⎦        ⎣ 0 • • • ⎦        ⎣ 0 0 • • ⎦

having denoted by • the entries of the matrices that are a priori non zero.
Given A(0) = A, the method generates a sequence of matrices A(k) that
are orthogonally similar to A

    A(k) = (P(k))^T A(k−1) P(k) = (P(k) · · · P(1))^T A (P(k) · · · P(1))
         = (Q(k))^T A Q(k),    k ≥ 1.                        (5.45)

For any k ≥ 1 the matrix P(k) is given by (5.41), where x is substituted
by the k-th column vector in matrix A(k−1). From the definition (5.41) it
is easy to check that the operation (P(k))^T A(k−1) leaves the first k rows of
A(k−1) unchanged, whilst (P(k))^T A(k−1) P(k) = A(k) does the same on the
first k columns. After n − 2 steps of the Householder reduction, we obtain
a matrix H = A(n−2) in upper Hessenberg form.

Remark 5.4 (The symmetric case) If A is symmetric, the transformation
(5.45) maintains such a property. Indeed

    (A(k))^T = ((Q(k))^T A Q(k))^T = A(k),    ∀k ≥ 1,

so that H must be tridiagonal. Its eigenvalues can be efficiently computed
using the method of Sturm sequences with a cost of the order of n flops, as
will be addressed in Section 5.10.2.

A coding of the Householder reduction method is provided in Program
29. To compute the Householder vector, Program 32 is employed. In output,
the two matrices H and Q, respectively in Hessenberg form and orthogonal,
are such that H = Q^T AQ.

Program 29 - houshess : Hessenberg-Householder method
function [H,Q]=houshess(A)
% HOUSHESS  Householder reduction of A to upper Hessenberg form: H = Q'*A*Q.
n=max(size(A)); Q=eye(n); H=A;
for k=1:(n-2)
  [v,beta]=vhouse(H(k+1:n,k)); I=eye(k); N=zeros(k,n-k);
  m=length(v); R=eye(m)-beta*v*v'; H(k+1:n,k:n)=R*H(k+1:n,k:n);
  H(1:n,k+1:n)=H(1:n,k+1:n)*R; P=[I, N; N', R]; Q=Q*P;
end

The algorithm coded in Program 29 requires a cost of 10n^3/3 flops and
is well-conditioned with respect to rounding errors. Indeed, the following
estimate holds (see [Wil65], p. 351)

    H = Q^T (A + E) Q,    ‖E‖_F ≤ c n^2 u ‖A‖_F,             (5.46)

where H is the Hessenberg matrix computed by Program 29, Q is an orthogonal
matrix, c is a constant, u is the roundoff unit and ‖·‖_F is the
Frobenius norm (see (1.18)).

Example 5.7 Consider the reduction in upper Hessenberg form of the Hilbert
matrix H4 ∈ R^{4×4}. Since H4 is symmetric, its Hessenberg form should be a
tridiagonal symmetric matrix. Program 29 yields the following results

        ⎡ 1.00  0     0      0     ⎤        ⎡ 1.00  0.65  0      0      ⎤
        ⎢ 0     0.77 −0.61   0.20  ⎥        ⎢ 0.65  0.65  0.06   0      ⎥
    Q = ⎢ 0     0.51  0.40  −0.76  ⎥ ,  H = ⎢ 0     0.06  0.02   0.001  ⎥ .
        ⎣ 0     0.38  0.69   0.61  ⎦        ⎣ 0     0     0.001  0.0003 ⎦

The accuracy of the transformation procedure (5.45) can be measured by computing
the ‖·‖_F norm of the difference between H and Q^T H4 Q. This yields
‖H − Q^T H4 Q‖_F = 3.38 · 10^{−17}, which confirms the stability estimate (5.46). •


5.6.3 QR Factorization of a Matrix in Hessenberg Form
In this section we explain how to efficiently implement the generic step of
the QR iteration, starting from a matrix T(0) = H(0) in upper Hessenberg
form.
For any k ≥ 1, the first phase consists of computing the QR factorization
of H(k−1) by means of n − 1 Givens rotations

    (Q(k))^T H(k−1) = (G(k)_{n−1})^T · · · (G(k)_1)^T H(k−1) = R(k),      (5.47)

where, for any j = 1, . . . , n − 1, G(k)_j = G(j, j + 1, θj)(k) is, for any k ≥ 1,
the j-th Givens rotation matrix (5.43) in which θj is chosen according
to (5.44) in such a way that the entry of indices (j + 1, j) of the matrix
(G(k)_j)^T · · · (G(k)_1)^T H(k−1) is set equal to zero. The product (5.47) requires
a computational cost of the order of 3n^2 flops.
The next step consists of completing the orthogonal similarity transformation

    H(k) = R(k) Q(k) = R(k) (G(k)_1 · · · G(k)_{n−1}).       (5.48)

The orthogonal matrix Q(k) = G(k)_1 · · · G(k)_{n−1} is in upper Hessenberg
form. Indeed, taking for instance n = 3, and recalling Section 5.6.1, we get

                            ⎡ • • 0 ⎤ ⎡ 1 0 0 ⎤   ⎡ • • • ⎤
    Q(k) = G(k)_1 G(k)_2 =  ⎢ • • 0 ⎥ ⎢ 0 • • ⎥ = ⎢ • • • ⎥ .
                            ⎣ 0 0 1 ⎦ ⎣ 0 • • ⎦   ⎣ 0 • • ⎦

Also (5.48) requires a cost of the order of 3n^2 operations, for an overall effort
of the order of 6n^2 flops. In conclusion, performing the QR factorization
with elementary Givens rotations on a starting matrix in upper Hessenberg
form yields a reduction of the operation count of one order of magnitude
with respect to the corresponding factorization with the modified Gram-Schmidt
procedure of Section 5.5.


5.6.4 The Basic QR Iteration starting from Upper Hessenberg
Form
A basic implementation of the QR iteration to generate the real Schur
decomposition of a matrix A is given in Program 30.
This program uses Program 29 to reduce A to upper Hessenberg form;
then each QR factorization step in (5.32) is carried out with Program 31,
which utilizes Givens rotations. The overall efficiency of the algorithm is
ensured by pre- and post-multiplying with Givens matrices as explained in
Section 5.6.5, and by constructing the matrix Q(k) = G(k)_1 · · · G(k)_{n−1} in the
function prodgiv, with a cost of n^2 − 2 flops and without explicitly forming
the Givens matrices G(k)_j, for j = 1, . . . , n − 1.
As for the stability of the QR iteration with respect to rounding error
propagation, it can be shown that the computed real Schur form T is
orthogonally similar to a matrix "close" to A, i.e.

    T = Q^T (A + E) Q,

where Q is orthogonal and ‖E‖_2 ≈ u‖A‖_2, u being the machine roundoff
unit.

Program 30 returns in output, after niter iterations of the QR proce-
dure, the matrices T, Q and R in (5.32).

Program 30 - hessqr : Hessenberg-QR method
function [T,Q,R]=hessqr(A,niter)
% HESSQR  Hessenberg-QR method: QR iteration (5.32) on the Hessenberg form of A.
n=max(size(A));
[T,Qhess]=houshess(A);
for j=1:niter
  [Q,R,c,s]=qrgivens(T);
  T=R;
  for k=1:n-1
    T=gacol(T,c(k),s(k),1,k+1,k,k+1);
  end
end


Program 31 - givensqr : QR factorization with Givens rotations
function [Q,R,c,s]=qrgivens(H)
% QRGIVENS  QR factorization of an upper Hessenberg matrix by Givens rotations.
[m,n]=size(H);
for k=1:n-1
  [c(k),s(k)]=givcos(H(k,k),H(k+1,k));
  H=garow(H,c(k),s(k),k,k+1,k,n);
end
R=H; Q=prodgiv(c,s,n);

function Q=prodgiv(c,s,n)
% PRODGIV  Accumulates the product of the Givens rotations without forming them.
n1=n-1; n2=n-2;
Q=eye(n); Q(n1,n1)=c(n1); Q(n,n)=c(n1);
Q(n1,n)=s(n1); Q(n,n1)=-s(n1);
for k=n2:-1:1
  k1=k+1; Q(k,k)=c(k); Q(k1,k)=-s(k);
  q=Q(k1,k1:n); Q(k,k1:n)=s(k)*q;
  Q(k1,k1:n)=c(k)*q;
end


Example 5.8 Consider the matrix A (already in Hessenberg form)

        ⎡ 3  17  −37  18  −40 ⎤
        ⎢ 1   0    0   0    0 ⎥
    A = ⎢ 0   1    0   0    0 ⎥ .
        ⎢ 0   0    1   0    0 ⎥
        ⎣ 0   0    0   1    0 ⎦

To compute its eigenvalues, given by −4, ±i, 2 and 5, we apply the QR method
and we compute the matrix T(40) after 40 iterations of Program 30. Notice that
the algorithm converges to the real Schur decomposition of A (5.34), with three
blocks Rii of order 1 (i = 1, 2, 3) and with the block R44 = T(40)(4 : 5, 4 : 5)
having eigenvalues equal to ±i

            ⎡ 4.9997  18.9739  −34.2570  32.8760  −28.4604 ⎤
            ⎢ 0       −3.9997    6.7693  −6.4968    5.6216 ⎥
    T(40) = ⎢ 0        0         2       −1.4557    1.1562 ⎥ .
            ⎢ 0        0         0        0.3129   −0.8709 ⎥
            ⎣ 0        0         0        1.2607   −0.3129 ⎦



Example 5.9 Let us now employ the QR method to generate the real Schur
decomposition of the matrix A below, after reducing it to upper Hessenberg form

        ⎡ 17  24   1   8  15 ⎤
        ⎢ 23   5   7  14  16 ⎥
    A = ⎢  4   6  13  20  22 ⎥ .
        ⎢ 10  12  19  21   3 ⎥
        ⎣ 11  18  25   2   9 ⎦

The eigenvalues of A are real and given (to four significant figures) by λ1 =
65, λ2,3 = ±21.28 and λ4,5 = ±13.13. After 40 iterations of Program 30, the
computed matrix reads

            ⎡ 65   0         0         0         0       ⎤
            ⎢ 0   14.6701   14.2435    4.4848   −3.4375  ⎥
    T(40) = ⎢ 0   16.6735  −14.6701   −1.2159    2.0416  ⎥ .
            ⎢ 0    0         0       −13.0293   −0.7643  ⎥
            ⎣ 0    0         0        −3.3173   13.0293  ⎦

It is not upper triangular, but block upper triangular, with a diagonal block
R11 = 65 and the two blocks

          ⎡ 14.6701   14.2435 ⎤          ⎡ −13.0293  −0.7643 ⎤
    R22 = ⎣ 16.6735  −14.6701 ⎦ ,  R33 = ⎣  −3.3173  13.0293 ⎦ ,

having spectrums given by σ(R22) = λ2,3 and σ(R33) = λ4,5, respectively.
It is important to recognize that matrix T(40) is not the real Schur decomposition
of A, but only a "cheating" version of it. In fact, in order for the QR method
to converge to the real Schur decomposition of A, it is mandatory to resort to
the shift techniques introduced in Section 5.7. •


5.6.5 Implementation of Transformation Matrices
In the definition (5.42) it is convenient to choose the minus sign, obtaining
w(k) = x(n−k) − ‖x(n−k)‖_2 e1(n−k), in such a way that the vector Rn−k x(n−k)
is a positive multiple of e1(n−k). If xk+1 is positive, in order to avoid numerical
cancellations, the computation can be rationalized as follows

              x^2_{k+1} − ‖x(n−k)‖_2^2          Σ_{j=k+2}^{n} x_j^2
    w(k)_1 = ───────────────────────── = − ───────────────────────── .
              xk+1 + ‖x(n−k)‖_2             xk+1 + ‖x(n−k)‖_2
5.6 The QR Method for Matrices in Hessenberg Form 213

The construction of the Householder vector is performed by Program 32,
which takes as input a vector p ∈ R^{n−k} (namely, the vector x(n−k)) and
returns a vector q ∈ R^{n−k} (the Householder vector w(k)), with a cost of
the order of n flops.
If M ∈ R^{m×m} is the generic matrix to which the Householder matrix P
(5.38) is applied (where I is the identity matrix of order m and v ∈ R^m),
letting w = M^T v, then

    PM = M − βvw^T,    β = 2/‖v‖_2^2.                        (5.49)

Therefore, performing the product PM amounts to a matrix-vector product
(w = M^T v) plus an outer product of vectors (vw^T). The overall
computational cost of the product PM is thus equal to 2(m^2 + m) flops.
Similar considerations hold in the case where the product MP is to be
computed; defining w = Mv, we get

    MP = M − βwv^T.                                          (5.50)

Notice that (5.49) and (5.50) do not require the explicit construction of
the matrix P. This reduces the computational cost to an order of m^2 flops,
whilst executing the product PM without taking advantage of the special
structure of P would increase the operation count to an order of m^3 flops.
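The gain is easy to check numerically; a sketch with an arbitrary M and v:

% Applying P as in (5.49) without forming P explicitly.
M = magic(4); v = [1; 2; 0; 1];
beta = 2/(v'*v);
w = M'*v;
PM = M - beta*v*w';                 % cost ~ 2(m^2+m) flops
P = eye(4) - beta*(v*v');           % explicit P, for comparison only
disp(norm(PM - P*M))                % ~ 0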

Program 32 - vhouse : Construction of the Householder vector
function [v,beta]=vhouse(x)
% VHOUSE  Construction of the Householder vector (see Section 5.6.5).
n=length(x); x=x/norm(x); s=x(2:n)'*x(2:n); v=[1; x(2:n)];
if (s==0), beta=0;
else
  mu=sqrt(x(1)^2+s);
  if (x(1) <= 0), v(1)=x(1)-mu;
  else, v(1)=-s/(x(1)+mu); end
  beta=2*v(1)^2/(s+v(1)^2); v=v/v(1);
end


Concerning the Givens rotation matrices, the computation of c and s is
carried out as follows. Let i and k be two fixed indices and assume that
the k-th component of a given vector x ∈ R^n must be set to zero. Letting
r = √(xi^2 + xk^2), relation (5.44) yields

    ⎡ c  −s ⎤ ⎡ xi ⎤   ⎡ r ⎤
    ⎣ s   c ⎦ ⎣ xk ⎦ = ⎣ 0 ⎦                                 (5.51)

hence there is no need of explicitly computing θ, nor of evaluating any
trigonometric function.
Executing Program 33 to solve system (5.51) requires 5 flops, plus the
evaluation of a square root. As already noticed in the case of Householder
matrices, even for Givens rotations we do not have to explicitly compute the
matrix G(i, k, θ) to perform its product with a given matrix M ∈ R^{m×m}.
For that purpose Programs 34 and 35 are used, both at the cost of 6m
flops. Looking at the structure (5.43) of matrix G(i, k, θ), it is clear that
the first algorithm only modifies rows i and k of M, whilst the second one
only changes columns i and k of M.
We conclude by noticing that the computation of the Householder vector
v and of the Givens sine and cosine (c, s) are well-conditioned operations
with respect to rounding errors (see [GL89], pp. 212-217 and the references
therein).
The solution of system (5.51) is implemented in Program 33. The input
parameters are the vector components xi and xk, whilst the output data
are the Givens cosine and sine c and s.

Program 33 - givcos : Computation of Givens cosine and sine
function [c,s]=givcos(xi,xk)
% GIVCOS  Givens cosine and sine solving (5.51), without trigonometric calls.
if (xk==0), c=1; s=0;
else
  if abs(xk) > abs(xi)
    t=-xi/xk; s=1/sqrt(1+t^2); c=s*t;
  else
    t=-xk/xi; c=1/sqrt(1+t^2); s=c*t;
  end
end
Programs 34 and 35 compute G(i, k, θ)^T M and MG(i, k, θ), respectively.
The input parameters c and s are the Givens cosine and sine. In Program
34, the indices i and k identify the rows of the matrix M that are
affected by the update M ← G(i, k, θ)^T M, while j1 and j2 are the indices
of the columns involved in the computation. Similarly, in Program 35, i
and k identify the columns affected by the update M ← MG(i, k, θ), while
j1 and j2 are the indices of the rows involved in the computation.

Program 34 - garow : Product G(i, k, θ)T M
function [M]=garow(M,c,s,i,k,j1,j2)
% GAROW  Computes the product G(i,k,theta)'*M on columns j1:j2.
for j=j1:j2
  t1=M(i,j); t2=M(k,j);
  M(i,j)=c*t1-s*t2;
  M(k,j)=s*t1+c*t2;
end


Program 35 - gacol : Product MG(i, k, θ)
function [M]=gacol(M,c,s,j1,j2,i,k)
% GACOL  Computes the product M*G(i,k,theta) on rows j1:j2.
for j=j1:j2
  t1=M(j,i); t2=M(j,k);
  M(j,i)=c*t1-s*t2;
  M(j,k)=s*t1+c*t2;
end




5.7 The QR Iteration with Shifting Techniques
Example 5.9 reveals that the QR iteration does not always converge to the
real Schur form of a given matrix A. To make this happen, an effective
approach consists of incorporating in the QR iteration (5.32) a shifting
technique similar to that introduced for inverse iteration in Section 5.3.2.
This leads to the QR method with single shift described in Section 5.7.1,
which is used to accelerate the convergence of the QR iteration when A has
eigenvalues with moduli very close to each other.
In Section 5.7.2, a more sophisticated shifting technique is considered,
which guarantees the convergence of the QR iteration to the (approximate)
Schur form of matrix A (see Property 5.8). The resulting method (known as
QR iteration with double shift) is the most popular version of the QR iteration
(5.32) for solving the matrix eigenvalue problem, and is implemented
in the MATLAB intrinsic function eig.


5.7.1 The QR Method with Single Shift
Given µ ∈ R, the shifted QR iteration is defined as follows. For k = 1, 2, . . . ,
until convergence:

    determine Q(k), R(k) such that
        Q(k) R(k) = T(k−1) − µI    (QR factorization);       (5.52)
    then, let
        T(k) = R(k) Q(k) + µI,

where T(0) = (Q(0))^T A Q(0) is in upper Hessenberg form. Since the QR
factorization in (5.52) is performed on the shifted matrix T(k−1) − µI, the
scalar µ is called shift. The sequence of matrices T(k) generated by (5.52)
is still similar to the initial matrix A, since for any k ≥ 1

    R(k) Q(k) + µI = (Q(k))^T (Q(k) R(k)) Q(k) + µ(Q(k))^T Q(k)
                   = (Q(k))^T (Q(k) R(k) + µI) Q(k) = (Q(k))^T T(k−1) Q(k)
                   = (Q(0) Q(1) · · · Q(k))^T A (Q(0) Q(1) · · · Q(k)),    k ≥ 0.

Assume µ is ¬xed and that the eigenvalues of A are ordered in such a way
that
|»1 ’ µ| ≥ |»2 ’ µ| ≥ . . . ≥ |»n ’ µ|.
(k)
Then it can be shown that, for 1 < j ¤ n, the subdiagonal entry tj,j’1
tends to zero with a rate that is proportional to the ratio
|(»j ’ µ)/(»j’1 ’ µ)|k .
This extends the convergence result (5.37) to the shifted QR method (see
[GL89], Sections 7.5.2 and 7.3).
The result above suggests that if µ is chosen in such a way that
|»n ’ µ| < |»i ’ µ|, i = 1, . . . , n ’ 1,
(k)
then the matrix entry tn,n’1 in the iteration (5.52) tends rapidly to zero
as k increases. (In the limit, if µ were equal to an eigenvalue of T(k) , that
(k) (k)
is of A, then tn,n’1 = 0 and tn,n = µ). In practice one takes
µ = t(k) , (5.53)
n,n

yielding the so called QR iteration with single shift. Correspondingly, the
(k)
convergence to zero of the sequence tn,n’1 is quadratic in the sense that
(k) (k+1)
if |tn,n’1 |/ T(0) 2 = ·k < 1, for some k ≥ 0, then |tn,n’1 |/ T(0) 2 = O(·k )
2

(see [Dem97], pp. 161-163 and [GL89], pp. 354-355).
This can be profitably taken into account when programming the QR
iteration with single shift by monitoring the size of the subdiagonal entry
|t(k)_{n,n−1}|. In practice, t(k)_{n,n−1} is set equal to zero if

    |t(k)_{n,n−1}| ≤ ε (|t(k)_{n−1,n−1}| + |t(k)_{n,n}|),    k ≥ 0,      (5.54)

for a prescribed tolerance ε, in general of the order of the roundoff unit.
(This convergence test is adopted in the library EISPACK). If A is a Hessenberg
matrix, when for a certain k the entry t(k)_{n,n−1} is set to zero, t(k)_{n,n} provides
the desired approximation of λn. Then the QR iteration with shift can continue
on the matrix T(k)(1 : n − 1, 1 : n − 1), and so on. This is a deflation algorithm
(for another example see Remark 5.3).

Example 5.10 We consider again the matrix A as in Example 5.9. Program 36,
with toll equal to the roundoff unit, converges in 14 iterations to the following
approximate real Schur form of A, which displays the correct eigenvalues of matrix
A on its diagonal (to six significant figures)

            ⎡ 65    0          0          0         0       ⎤
            ⎢ 0   −21.2768     2.5888    −0.0445   −4.2959  ⎥
    T(40) = ⎢ 0     0        −13.1263    −4.0294  −13.079   ⎥ .
            ⎢ 0     0          0         21.2768   −2.6197  ⎥
            ⎣ 0     0          0          0        13.1263  ⎦

We also report in Table 5.2 the convergence rate p(k) of the sequence t(k)_{n,n−1}
(n = 5), computed as

                      1          |t(k)_{n,n−1}|
    p(k) = 1 + ─────────── log ─────────────────── ,    k ≥ 1.
                 log(η_k)       |t(k−1)_{n,n−1}|

The results show good agreement with the expected quadratic rate.

    k    |t(k)_{n,n−1}|/‖T(0)‖_2    p(k)
    0    0.13865
    1    1.5401 · 10^{−2}           2.1122
    2    1.2213 · 10^{−4}           2.1591
    3    1.8268 · 10^{−8}           1.9775
    4    8.9036 · 10^{−16}          1.9449

TABLE 5.2. Convergence rate of the sequence t(k)_{n,n−1} in the QR iteration with
single shift




The coding of the QR iteration with single shift (5.52) is given in Program
36. The code utilizes Program 29 to reduce the matrix A to upper
Hessenberg form and Program 31 to perform the QR factorization step.
The input parameters toll and itmax are the tolerance ε in (5.54) and
the maximum admissible number of iterations, respectively. In output, the
program returns the (approximate) real Schur form of A and the number
of iterations needed for its computation.

Program 36 - qrshift : QR iteration with single shift

function [T,iter]=qrshift(A,toll,itmax)
% QRSHIFT  QR iteration with single shift (5.52)-(5.53) and deflation (5.54).
n=max(size(A)); iter=0; [T,Q]=houshess(A);
for k=n:-1:2
  I=eye(k);
  while abs(T(k,k-1)) > toll*(abs(T(k,k))+abs(T(k-1,k-1)))
    iter=iter+1;
    if (iter > itmax), return, end
    mu=T(k,k); [Q,R,c,s]=qrgivens(T(1:k,1:k)-mu*I);
    T(1:k,1:k)=R*Q+mu*I;
  end
  T(k,k-1)=0;
end

5.7.2 The QR Method with Double Shift
The single-shift QR iteration (5.52) with the choice (5.53) for µ is effective
if the eigenvalues of A are real, but not necessarily when complex conjugate
eigenvalues are present, as happens in the following example.




Example 5.11 The matrix A ∈ R^{4×4} (reported below to five significant figures)

        ⎡  1.5726  −0.6392   3.7696  −1.3143 ⎤
        ⎢  0.2166  −0.0420   0.4006  −1.2054 ⎥
    A = ⎢  0.0226   0.3592   0.2045  −0.1411 ⎥
        ⎣ −0.1814   1.1146  −3.2330   1.2648 ⎦

has eigenvalues {±i, 1, 2}, i being the imaginary unit. Running Program 36 with
toll equal to the roundoff unit yields after 100 iterations

             ⎡ 2   1.1999   0.5148   4.9004 ⎤
             ⎢ 0  −0.0001  −0.8575   0.7182 ⎥
    T(101) = ⎢ 0   1.1662   0.0001  −0.8186 ⎥ .
             ⎣ 0   0        0        1      ⎦

The obtained matrix is the real Schur form of A, where the 2×2 block T(101)(2:3,
2:3) has complex conjugate eigenvalues ±i. These eigenvalues cannot be computed
by the algorithm (5.52)-(5.53) since µ is real. •




The problem with this example is that working with real matrices necessarily
yields a real shift, whereas a complex one would be needed. The QR
iteration with double shift is set up to account for complex eigenvalues and
allows for removing the 2×2 diagonal blocks of the real Schur form of A.
Precisely, suppose that the QR iteration with single shift (5.52) detects
at some step k a 2×2 diagonal block R(k)_{kk} that cannot be reduced into
upper triangular form. Since the iteration is converging to the real Schur
form of the matrix A, the two eigenvalues of R(k)_{kk} are complex conjugate
and will be denoted by λ(k) and λ̄(k). The double shift strategy consists of
the following steps:

    determine Q(k), R(k) such that
        Q(k) R(k) = T(k−1) − λ(k) I    (first QR factorization);
    then, let
        T(k) = R(k) Q(k) + λ(k) I;
                                                             (5.55)
    determine Q(k+1), R(k+1) such that
        Q(k+1) R(k+1) = T(k) − λ̄(k) I    (second QR factorization);
    then, let
        T(k+1) = R(k+1) Q(k+1) + λ̄(k) I.

Once the double shift has been carried out, the QR iteration with single shift
is continued until a situation analogous to the one above is encountered.

The QR iteration incorporating the double shift strategy is the most effective
algorithm for computing eigenvalues and yields the approximate Schur
form of a given matrix A. Its actual implementation is far more sophisticated
than the outline above and is called QR iteration with Francis shift
(see [Fra61], and, also, [GL89], Section 7.5 and [Dem97], Section 4.4.5). As
for the case of the QR iteration with single shift, quadratic convergence can
also be proven for the QR method with Francis shift. However, special matrices
have recently been found for which the method fails to converge (see
for an example Exercise 14 and Remark 5.13). We refer for some analysis
and remedies to [Bat90], [Day96], although the finding of a shift strategy
that guarantees convergence of the QR iteration for all matrices is still an
open problem.

Example 5.12 Let us apply the QR iteration with double shift to the matrix
A in Example 5.11. After 97 iterations of Program 37, with toll equal to the
roundoff unit, we get the following (approximate) Schur form of A, which displays
on its diagonal the four eigenvalues of A

            ⎡ 2          1 + 2i               −2.33 + 0.86i           4.90                       ⎤
            ⎢ 0          5.02·10^{−14} + i     0.72                  −2.02 + 6.91·10^{−14} i    ⎥
    T(97) = ⎢ t(97)_31   0                    −1.78·10^{−14} − i     −0.82                      ⎥
            ⎣ t(97)_41   t(97)_42              0                      1                          ⎦

where t(97)_31 = 2.06 · 10^{−17} + 7.15 · 10^{−49} i, t(97)_41 = −5.59 · 10^{−17} and t(97)_42 =
−4.26 · 10^{−18}, respectively. •

Example 5.13 Consider the pseudo-spectral differentiation matrix (10.73) of
order 5. This matrix is singular, with a unique eigenvalue λ = 0 of algebraic
multiplicity equal to 5 (see [CHQZ88], p. 44). In this case the QR method with
double shift provides an inaccurate approximation of the spectrum of the matrix.
Indeed, using Program 37, with toll=eps, the method converges after 59
iterations to an upper triangular matrix with diagonal entries given by 0.0020,
0.0006 ± 0.0019i and −0.0017 ± 0.0012i, respectively. Using the MATLAB intrinsic
function eig yields instead the eigenvalues −0.0024, −0.0007 ± 0.0023i and
0.0019 ± 0.0014i. •




A basic implementation of the QR iteration with double shift is provided
in Program 37. The input/output parameters are the same as those of
Program 36. The output matrix T is the approximate Schur form of matrix
A.

Program 37 - qr2shift : QR iteration with double shift



function [T,iter]=qr2shift(A,toll,itmax)
% QR2SHIFT  QR iteration with double shift (5.55).
n=max(size(A)); iter=0; [T,Q]=houshess(A);
for k=n:-1:2
  I=eye(k);
  while abs(T(k,k-1)) > toll*(abs(T(k,k))+abs(T(k-1,k-1)))
    iter=iter+1; if (iter > itmax), return, end
    mu=T(k,k); [Q,R,c,s]=qrgivens(T(1:k,1:k)-mu*I);
    T(1:k,1:k)=R*Q+mu*I;
    if (k > 2)
      Tdiag2=abs(T(k-1,k-1))+abs(T(k-2,k-2));
      if abs(T(k-1,k-2)) <= toll*Tdiag2
        lambda=eig(T(k-1:k,k-1:k));   % complex conjugate shifts
        [Q,R,c,s]=qrgivens(T(1:k,1:k)-lambda(1)*I);
        T(1:k,1:k)=R*Q+lambda(1)*I;
        [Q,R,c,s]=qrgivens(T(1:k,1:k)-lambda(2)*I);
        T(1:k,1:k)=R*Q+lambda(2)*I;
      end
    end
  end
  T(k,k-1)=0;
end
I=eye(2);
while (abs(T(2,1)) > toll*(abs(T(2,2))+abs(T(1,1)))) & (iter <= itmax)
  iter=iter+1; mu=T(2,2);
  [Q,R,c,s]=qrgivens(T(1:2,1:2)-mu*I); T(1:2,1:2)=R*Q+mu*I;
end

5.8 Computing the Eigenvectors and the SVD of a
Matrix
The power and inverse iterations described in Section 5.3.2 can be used
to compute a selected number of eigenvalue/eigenvector pairs. If all the
eigenvalues and eigenvectors of a matrix are needed, the QR iteration can be
profitably employed to compute the eigenvectors, as shown in Sections 5.8.1
and 5.8.2. In Section 5.8.3 we deal with the computation of the singular
value decomposition (SVD) of a given matrix.



5.8.1 The Hessenberg Inverse Iteration
For any approximate eigenvalue λ computed by the QR iteration as described
in Section 5.7.2, the inverse iteration (5.28) can be applied to the
matrix H = Q^T AQ in Hessenberg form, yielding an approximate eigenvector
q. Then, the eigenvector x associated with λ is computed as x = Qq.
Clearly, one can take advantage of the structure of the Hessenberg matrix
for an efficient solution of the linear system at each step of (5.28). Typically,
only one iteration is required to produce an adequate approximation
of the desired eigenvector x (see [GL89], Section 7.6.1 and [PW79] for more
details).
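A sketch of the whole procedure, using the built-in hess and an eigenvalue estimate obtained here from eig (as a stand-in for the output of the QR iteration):

% Hessenberg inverse iteration: one step on H = P'*A*P, then x = P*q.
A = [4 1 2; 1 3 0; 2 0 5];              % hypothetical test matrix
[P,H] = hess(A);                        % A = P*H*P'
mu = max(eig(A)) + 1e-3;                % approximate eigenvalue (shifted)
q = ones(3,1)/sqrt(3);
q = (H - mu*eye(3))\q; q = q/norm(q);   % one step of (5.28)
x = P*q; lambda = x'*A*x;               % eigenvector of A, Rayleigh quotient
disp(norm(A*x - lambda*x))              % small residual after one step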



5.8.2 Computing the Eigenvectors from the Schur Form of a
Matrix
Suppose that the (approximate) Schur form Q^H AQ = T of a given matrix
A ∈ R^{n×n} has been computed by the QR iteration with double shift, Q
being a unitary matrix and T being upper triangular.
Then, if Ax = λx, we have Q^H AQ Q^H x = λ Q^H x, i.e., letting y = Q^H x,
Ty = λy holds. Therefore y is an eigenvector of T, so that to compute the
eigenvectors of A we can work directly on the Schur form T.
Assume for simplicity that λ = tkk ∈ C is a simple eigenvalue of A. Then
the upper triangular matrix T can be decomposed as

        ⎡ T11   v    T13 ⎤
    T = ⎢  0    λ    w^T ⎥ ,
        ⎣  0    0    T33 ⎦

where T11 ∈ C^{(k−1)×(k−1)} and T33 ∈ C^{(n−k)×(n−k)} are upper triangular
matrices, v ∈ C^{k−1}, w ∈ C^{n−k} and λ ∉ σ(T11) ∪ σ(T33).

Thus, letting y = (y_{k−1}^T, ỹ, y_{n−k}^T)^T, with y_{k−1} ∈ C^{k−1}, ỹ ∈ C and y_{n−k} ∈
C^{n−k}, the matrix eigenvector problem (T − λI)y = 0 can be written as

    (T11 − λI_{k−1}) y_{k−1} + v ỹ + T13 y_{n−k} = 0,
    w^T y_{n−k} = 0,                                         (5.56)
    (T33 − λI_{n−k}) y_{n−k} = 0.

Since λ is simple, both matrices T11 − λI_{k−1} and T33 − λI_{n−k} are nonsingular,
so that the third equation in (5.56) yields y_{n−k} = 0 and the first
equation becomes

    (T11 − λI_{k−1}) y_{k−1} = −v ỹ.

Setting arbitrarily ỹ = 1 and solving the triangular system above for y_{k−1}
yields (formally)

        ⎛ −(T11 − λI_{k−1})^{−1} v ⎞
    y = ⎜            1             ⎟ .
        ⎝            0             ⎠

The desired eigenvector x can then be computed as x = Qy.
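In MATLAB the procedure can be sketched as follows, with the Schur form supplied by the built-in schur (a stand-in for the output of the QR iteration with double shift); the eigenvalue tkk is assumed simple:

% Eigenvector of A from its (complex) Schur form T = Q^H*A*Q, cf. (5.56).
A = magic(4); n = 4;
[Q,T] = schur(A,'complex');             % T upper triangular, Q unitary
k = 2; lambda = T(k,k);
y = zeros(n,1); y(k) = 1;
y(1:k-1) = -(T(1:k-1,1:k-1)-lambda*eye(k-1))\T(1:k-1,k);
x = Q*y;                                % eigenvector of A
disp(norm(A*x - lambda*x))              % ~ 0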
An efficient implementation of the above procedure is carried out in the
intrinsic MATLAB function eig. Invoking this function with the format [V,
D] = eig(A) yields the matrix V, whose columns are the right eigenvectors
of A, and the diagonal matrix D, which contains its eigenvalues. Further details
can be found in the strevc subroutine in the LAPACK library, while for the
computation of eigenvectors in the case where A is symmetric, we refer to
[GL89], Chapter 8 and [Dem97], Section 5.3.


5.8.3 Approximate Computation of the SVD of a Matrix
In this section we describe the Golub-Kahan-Reinsch algorithm for the
computation of the SVD of a matrix A ∈ R^{m×n} with m ≥ n (see [GL89],
Section 5.4). The method consists of two phases, a direct one and an iterative
one.
In the first phase, A is transformed into an upper trapezoidal matrix of
the form

              ⎡ B ⎤
    U^T AV =  ⎣ 0 ⎦ ,                                        (5.57)

where U and V are two orthogonal matrices and B ∈ R^{n×n} is upper bidiagonal.
The matrices U and V are generated using n + m − 3 Householder
matrices U1, . . . , U_{m−1}, V1, . . . , V_{n−2}, as follows.

The algorithm initially generates U1 in such a way that the matrix A(1) =
U1 A has a(1)_{i1} = 0 if i > 1. Then, V1 is determined so that A(2) = A(1) V1
has a(2)_{1j} = 0 for j > 2, preserving at the same time the null entries of the
previous step. The procedure is repeated starting from A(2), and taking U2
such that A(3) = U2 A(2) has a(3)_{i2} = 0 for i > 2 and V2 in such a way that
A(4) = A(3) V2 has a(4)_{2j} = 0 for j > 3, yet preserving the null entries already
generated. For example, in the case m = 5, n = 4 the first two steps of the
reduction process yield

                   ⎡ • • • • ⎤                        ⎡ • • 0 0 ⎤
                   ⎢ 0 • • • ⎥                        ⎢ 0 • • • ⎥
    A(1) = U1 A =  ⎢ 0 • • • ⎥  −→  A(2) = A(1) V1 =  ⎢ 0 • • • ⎥ ,
                   ⎢ 0 • • • ⎥                        ⎢ 0 • • • ⎥
                   ⎣ 0 • • • ⎦                        ⎣ 0 • • • ⎦

having denoted by • the entries of the matrices that in principle are different
than zero. After at most m − 1 steps, we find (5.57) with

    U = U1 U2 · · · U_{m−1},    V = V1 V2 · · · V_{n−2}.


In the second phase, the obtained matrix B is reduced into a diagonal
matrix Σ using the QR iteration. Precisely, a sequence of upper bidiagonal
matrices B(k) is constructed such that, as k → ∞, their off-diagonal
entries tend to zero quadratically and the diagonal entries tend to the
singular values σi of A. In the limit, the process generates two orthogonal
matrices W and Z such that

    W^T BZ = Σ = diag(σ1, . . . , σn).

The SVD of A is then given by

               ⎡ Σ ⎤
    Ũ^T A Ṽ =  ⎣ 0 ⎦ ,

with Ũ = U diag(W, I_{m−n}) and Ṽ = VZ.
The computational cost of this procedure is 2m^2 n + 4mn^2 + (9/2)n^3 flops,
which reduces to 2mn^2 − (2/3)n^3 flops if only the singular values are computed.
In this case, recalling what was stated in Section 3.13 about A^T A, the
method described in the present section is preferable to computing directly
the eigenvalues of A^T A and then taking their square roots.
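The loss of accuracy incurred by passing through A^T A can be seen on an ill-conditioned test matrix (hilb(8) is an arbitrary choice):

% Singular values: svd(A) vs. square roots of the eigenvalues of A'*A.
A = hilb(8);
s1 = svd(A);
s2 = sqrt(abs(sort(eig(A'*A),'descend')));   % squares below roundoff are lost
disp([s1 s2])    % the smallest entries of the second column are inaccurate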
As for the stability of this procedure, it can be shown that the computed
σi turn out to be the singular values of the matrix A + δA with

    ‖δA‖_2 ≤ C_{mn} u ‖A‖_2,

C_{mn} being a constant dependent on n, m and the roundoff unit u. For other
approaches to the computation of the SVD of a matrix, see [Dat95] and
[GL89].


5.9 The Generalized Eigenvalue Problem
Let A, B ∈ C^{n×n} be two given matrices; for any z ∈ C, we call A − zB a
matrix pencil and denote it by (A,B). The set σ(A,B) of the eigenvalues of
(A,B) is defined as

    σ(A,B) = {µ ∈ C : det(A − µB) = 0}.

The generalized matrix eigenvalue problem can be formulated as: find λ ∈
σ(A,B) and a nonnull vector x ∈ C^n such that

    Ax = λBx.                                                (5.58)

The pair (λ, x) satisfying (5.58) is an eigenvalue/eigenvector pair of the
pencil (A,B). Note that by setting B = In in (5.58) we recover the standard
matrix eigenvalue problem considered thus far.
Problems like (5.58) arise frequently in engineering applications, e.g.,
in the study of vibrations of structures (buildings, aircrafts and bridges)
or in the mode analysis for waveguides (see [Inm94] and [Bos93]). Another
example is the computation of the extremal eigenvalues of a preconditioned
matrix P^{−1}A (in which case B = P in (5.58)) when solving a linear system
with an iterative method (see Remark 4.2).
Let us introduce some definitions. We say that the pencil (A,B) is regular
if det(A − zB) is not identically zero; otherwise the pencil is singular. When
(A,B) is regular, p(z) = det(A − zB) is the characteristic polynomial of the
pencil; denoting by k the degree of p, the eigenvalues of (A,B) are defined
as:
1. the roots of p(z) = 0, if k = n;
2. ∞, if k < n (with multiplicity equal to n − k).

Example 5.14 (Taken from [Par80], [Saa92] and [GL89])

        ⎡ −1  0 ⎤        ⎡ 0  1 ⎤
    A = ⎣  0  1 ⎦ ,  B = ⎣ 1  0 ⎦   ⇒  p(z) = z^2 + 1  ⇒  σ(A,B) = {±i}

        ⎡ −1  0 ⎤        ⎡ 0  0 ⎤
    A = ⎣  0  0 ⎦ ,  B = ⎣ 0  1 ⎦   ⇒  p(z) = z        ⇒  σ(A,B) = {0, ∞}

        ⎡ 1  2 ⎤         ⎡ 1  0 ⎤
    A = ⎣ 0  0 ⎦ ,  B =  ⎣ 0  0 ⎦   ⇒  p(z) = 0        ⇒  σ(A,B) = C.

The first pair of matrices shows that symmetric pencils, unlike symmetric matrices,
may exhibit complex conjugate eigenvalues. The second pair is a regular
pencil displaying an eigenvalue equal to infinity, while the third pair is an example
of a singular pencil.
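The first two pencils can be checked directly in MATLAB, whose function eig(A,B) solves (5.58) by the QZ method and returns infinite eigenvalues as Inf; a sketch:

% Generalized eigenvalues of the first two pencils of Example 5.14.
A1 = [-1 0; 0 1]; B1 = [0 1; 1 0];
disp(eig(A1,B1).')      % +i and -i: a symmetric pencil, complex eigenvalues
A2 = [-1 0; 0 0]; B2 = [0 0; 0 1];
disp(eig(A2,B2).')      % 0 and an infinite eigenvalue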

5.9.1 Computing the Generalized Real Schur Form
The definitions and examples above imply that the pencil (A,B) has n finite
eigenvalues iff B is nonsingular.
In such a case, a possible approach to the solution of problem (5.58) is
to transform it into the equivalent eigenvalue problem Cx = λx, where the
matrix C is the solution of the system BC = A, and then apply the QR iteration
to C. For actually computing the matrix C, one can use Gauss elimination
with pivoting or the techniques shown in Section 3.6. This procedure can
yield inaccurate results if B is ill-conditioned, since computing C is affected
by rounding errors of the order of u‖A‖_2 ‖B^{−1}‖_2 (see [GL89], p. 376).
A more attractive approach is based on the following result, which generalizes
the Schur decomposition theorem 1.5 to the case of regular pencils
(for a proof, see [Dat95], p. 497).

Property 5.10 (Generalized Schur decomposition) Let (A,B) be a
regular pencil. Then, there exist two unitary matrices U and Z such that
U^H AZ = T, U^H BZ = S, where T and S are upper triangular. For i =
1, . . . , n the eigenvalues of (A,B) are given by