Numerical Mathematics




Alfio Quarteroni
Riccardo Sacco
Fausto Saleri



Springer
Texts in Applied Mathematics 37



Editors
J.E. Marsden
L. Sirovich
M. Golubitsky
W. Jäger

Advisors
G. Iooss
P. Holmes
D. Barkley
M. Dellnitz
P. Newton




Springer
New York
Berlin
Heidelberg
Barcelona
Hong Kong
London
Milan
Paris
Singapore
Tokyo
Alfio Quarteroni    Riccardo Sacco
Fausto Saleri




Numerical Mathematics


With 134 Illustrations




Alfio Quarteroni
Department of Mathematics
Ecole Polytechnique Fédérale de Lausanne
CH-1015 Lausanne
Switzerland
alfio.quarteroni@epfl.ch

Riccardo Sacco
Dipartimento di Matematica
Politecnico di Milano
Piazza Leonardo da Vinci 32
20133 Milan
Italy
ricsac@mate.polimi.it

Fausto Saleri
Dipartimento di Matematica "F. Enriques"
Università degli Studi di Milano
Via Saldini 50
20133 Milan
Italy
fausto.saleri@unimi.it
Series Editors
J.E. Marsden
Control and Dynamical Systems, 107-81
California Institute of Technology
Pasadena, CA 91125
USA

L. Sirovich
Division of Applied Mathematics
Brown University
Providence, RI 02912
USA

M. Golubitsky
Department of Mathematics
University of Houston
Houston, TX 77204-3476
USA

W. Jäger
Department of Applied Mathematics
Universität Heidelberg
Im Neuenheimer Feld 294
69120 Heidelberg
Germany
Mathematics Subject Classification (1991): 15-01, 34-01, 35-01, 65-01

Library of Congress Cataloging-in-Publication Data
Quarteroni, Alfio.
Numerical mathematics/Alfio Quarteroni, Riccardo Sacco, Fausto Saleri.
p. cm. (Texts in applied mathematics; 37)
Includes bibliographical references and index.
ISBN 0-387-98959-5 (alk. paper)
1. Numerical analysis. I. Sacco, Riccardo. II. Saleri, Fausto. III. Title. IV. Series.
QA297.Q83 2000
519.4 dc21    99-059414




© 2000 Springer-Verlag New York, Inc.
All rights reserved. This work may not be translated or copied in whole or in part without
the written permission of the publisher (Springer-Verlag New York, Inc., 175 Fifth Avenue,
New York, NY 10010, USA), except for brief excerpts in connection with reviews or scholarly
analysis. Use in connection with any form of information storage and retrieval, electronic
adaptation, computer software, or by similar or dissimilar methodology now known or hereafter
developed is forbidden.
The use of general descriptive names, trade names, trademarks, etc., in this publication, even
if the former are not especially identified, is not to be taken as a sign that such names, as
understood by the Trade Marks and Merchandise Marks Act, may accordingly be used freely
by anyone.




ISBN 0-387-98959-5    Springer-Verlag New York Berlin Heidelberg    SPIN 10747955
Preface




Numerical mathematics is the branch of mathematics that proposes, de-
velops, analyzes and applies methods from scientific computing to several
fields including analysis, linear algebra, geometry, approximation theory,
functional equations, optimization and differential equations. Other disci-
plines such as physics, the natural and biological sciences, engineering, and
economics and the financial sciences frequently give rise to problems that
need scientific computing for their solution.
As such, numerical mathematics lies at the crossroads of several disciplines
of great relevance in modern applied sciences, and can become a crucial tool
for their qualitative and quantitative analysis. This role is also emphasized
by the continual development of computers and algorithms, which make it
possible nowadays, using scientific computing, to tackle problems of such
a large size that real-life phenomena can be simulated, providing accurate
responses at affordable computational cost.
The corresponding spread of numerical software represents an enrichment
for the scientific community. However, the user has to make the correct
choice of the method (or the algorithm) which best suits the problem at
hand. As a matter of fact, no black-box methods or algorithms exist that
can effectively and accurately solve all kinds of problems.
One of the purposes of this book is to provide the mathematical foun-
dations of numerical methods, to analyze their basic theoretical proper-
ties (stability, accuracy, computational complexity), and to demonstrate their
performance on examples and counterexamples which outline their pros

and cons. This is done using the MATLAB¹ software environment. This
choice satisfies the two fundamental needs of user-friendliness and wide-
spread diffusion, making it available on virtually every computer.
Every chapter is supplied with examples, exercises and applications of
the discussed theory to the solution of real-life problems. The reader is
thus in the ideal position to acquire the theoretical knowledge that is
required to make the right choice among the numerical methodologies and
to make use of the related computer programs.
This book is primarily addressed to undergraduate students, with partic-
ular focus on degree courses in Engineering, Mathematics, Physics and
Computer Science. The attention paid to applications and the related
development of software makes it valuable also for graduate stu-
dents, researchers and users of scientific computing in the most widespread
professional fields.
The content of the volume is organized into four parts and 13 chapters.
Part I comprises two chapters in which we review basic linear algebra and
introduce the general concepts of consistency, stability and convergence of
a numerical method as well as the basic elements of computer arithmetic.
Part II is on numerical linear algebra, and is devoted to the solution of
linear systems (Chapters 3 and 4) and to the computation of eigenvalues
and eigenvectors (Chapter 5).
We continue with Part III, where we address several issues concerning functions
and their approximation. Specifically, we are interested in the solution of
nonlinear equations (Chapter 6), the solution of nonlinear systems and opti-
mization problems (Chapter 7), polynomial approximation (Chapter 8) and
numerical integration (Chapter 9).
Part IV, which is the most demanding in terms of mathematical background, is
concerned with approximation, integration and transforms based on orthog-
onal polynomials (Chapter 10), the solution of initial value problems (Chap-
ter 11), boundary value problems (Chapter 12) and initial-boundary value
problems for parabolic and hyperbolic equations (Chapter 13).
Part I provides the indispensable background. Each of the remaining
Parts has a size and a content that make it well suited for a semester
course.
A guideline index to the use of the numerous MATLAB Programs de-
veloped in the book is reported at the end of the volume. These programs
are also available at the web site address:
http://www1.mate.polimi.it/~calnum/programs.html
For the reader's convenience, each code is accompanied by a brief description of
its input/output parameters.
We express our thanks to the staff at Springer-Verlag New York for their
expert guidance and assistance with editorial aspects, as well as to Dr.

¹ MATLAB is a registered trademark of The MathWorks, Inc.

Martin Peters from Springer-Verlag Heidelberg and Dr. Francesca Bonadei
from Springer-Italia for their advice and friendly collaboration throughout
this project.
We gratefully thank Professors L. Gastaldi and A. Valli for their useful
comments on Chapters 12 and 13.
We also wish to express our gratitude to our families for their forbearance
and understanding, and dedicate this book to them.

Lausanne, Switzerland      Alfio Quarteroni
Milan, Italy               Riccardo Sacco
Milan, Italy               Fausto Saleri
January 2000
Contents




Series Preface v

Preface vii

PART I: Getting Started

1. Foundations of Matrix Analysis 1
1.1 Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 Operations with Matrices . . . . . . . . . . . . . . . . . . . 5
1.3.1 Inverse of a Matrix . . . . . . . . . . . . . . . . . . 6
1.3.2 Matrices and Linear Mappings . . . . . . . . . . . 7
1.3.3 Operations with Block-Partitioned Matrices . . . . 7
1.4 Trace and Determinant of a Matrix . . . . . . . . . . . . . 8
1.5 Rank and Kernel of a Matrix . . . . . . . . . . . . . . . . 9
1.6 Special Matrices . . . . . . . . . . . . . . . . . . . . . . . . 10
1.6.1 Block Diagonal Matrices . . . . . . . . . . . . . . . 10
1.6.2 Trapezoidal and Triangular Matrices . . . . . . . . 11
1.6.3 Banded Matrices . . . . . . . . . . . . . . . . . . . 11
1.7 Eigenvalues and Eigenvectors . . . . . . . . . . . . . . . . 12
1.8 Similarity Transformations . . . . . . . . . . . . . . . . . . 14
1.9 The Singular Value Decomposition (SVD) . . . . . . . . . 16
1.10 Scalar Product and Norms in Vector Spaces . . . . . . . . 17
1.11 Matrix Norms . . . . . . . . . . . . . . . . . . . . . . . . . 21

1.11.1 Relation Between Norms and the
Spectral Radius of a Matrix . . . . . . . . . . . . . 25
1.11.2 Sequences and Series of Matrices . . . . . . . . . . 26
1.12 Positive Definite, Diagonally Dominant and M-Matrices . 27
1.13 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

2. Principles of Numerical Mathematics 33
2.1 Well-Posedness and Condition Number of a Problem . . . 33
2.2 Stability of Numerical Methods . . . . . . . . . . . . . . . 37
2.2.1 Relations Between Stability and Convergence . . . 40
2.3 A priori and a posteriori Analysis . . . . . . . . . . . . . . 41
2.4 Sources of Error in Computational Models . . . . . . . . . 43
2.5 Machine Representation of Numbers . . . . . . . . . . . . 45
2.5.1 The Positional System . . . . . . . . . . . . . . . . 45
2.5.2 The Floating-Point Number System . . . . . . . . 46
2.5.3 Distribution of Floating-Point Numbers . . . . . . 49
2.5.4 IEC/IEEE Arithmetic . . . . . . . . . . . . . . . . 49
2.5.5 Rounding of a Real Number in Its
Machine Representation . . . . . . . . . . . . ... 50
2.5.6 Machine Floating-Point Operations . . . . . . ... 52
2.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . ... 54

PART II: Numerical Linear Algebra

3. Direct Methods for the Solution of Linear Systems 57
3.1 Stability Analysis of Linear Systems . . . . . . . . . . . . 58
3.1.1 The Condition Number of a Matrix . . . . . . . . 58
3.1.2 Forward a priori Analysis . . . . . . . . . . . . . . 60
3.1.3 Backward a priori Analysis . . . . . . . . . . . . . 63
3.1.4 A posteriori Analysis . . . . . . . . . . . . . . . . . 64
3.2 Solution of Triangular Systems . . . . . . . . . . . . . . . 65
3.2.1 Implementation of Substitution Methods . . . . . 65
3.2.2 Rounding Error Analysis . . . . . . . . . . . . . . 67
3.2.3 Inverse of a Triangular Matrix . . . . . . . . . . . 67
3.3 The Gaussian Elimination Method (GEM) and
LU Factorization . . . . . . . . . . . . . . . . . . . . . . . 68
3.3.1 GEM as a Factorization Method . . . . . . . . . . 72
3.3.2 The Effect of Rounding Errors . . . . . . . . . . 76
3.3.3 Implementation of LU Factorization . . . . . . . . 77
3.3.4 Compact Forms of Factorization . . . . . . . . . . 78
3.4 Other Types of Factorization . . . . . . . . . . . . . . . . . 79
3.4.1 LDM^T Factorization . . . . . . . . . . . . . . . . 79
3.4.2 Symmetric and Positive Definite Matrices:
The Cholesky Factorization . . . . . . . . . . ... 80
3.4.3 Rectangular Matrices: The QR Factorization ... 82

3.5 Pivoting . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
3.6 Computing the Inverse of a Matrix . . . . . . . . . . . . . 89
3.7 Banded Systems . . . . . . . . . . . . . . . . . . . . . . . . 90
3.7.1 Tridiagonal Matrices . . . . . . . . . . . . . . . . . 91
3.7.2 Implementation Issues . . . . . . . . . . . . . . . . 92
3.8 Block Systems . . . . . . . . . . . . . . . . . . . . . . . . . 93
3.8.1 Block LU Factorization . . . . . . . . . . . . . . . 94
3.8.2 Inverse of a Block-Partitioned Matrix . . . . . . . 95
3.8.3 Block Tridiagonal Systems . . . . . . . . . . . . . . 95
3.9 Sparse Matrices . . . . . . . . . . . . . . . . . . . . . . . . 97
3.9.1 The Cuthill-McKee Algorithm . . . . . . . . . . . 98
3.9.2 Decomposition into Substructures . . . . . . . . . 100
3.9.3 Nested Dissection . . . . . . . . . . . . . . . . . . . 103
3.10 Accuracy of the Solution Achieved Using GEM . . . . . . 103
3.11 An Approximate Computation of K(A) . . . . . . . . . . . 106
3.12 Improving the Accuracy of GEM . . . . . . . . . . . . . . 109
3.12.1 Scaling . . . . . . . . . . . . . . . . . . . . . . . . 110
3.12.2 Iterative Refinement . . . . . . . . . . . . . . . . 111
3.13 Undetermined Systems . . . . . . . . . . . . . . . . . . . . 112
3.14 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . 115
3.14.1 Nodal Analysis of a Structured Frame . . . . . . . 115
3.14.2 Regularization of a Triangular Grid . . . . . . . . 118
3.15 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121

4. Iterative Methods for Solving Linear Systems 123
4.1 On the Convergence of Iterative Methods . . . . . . . . . . 123
4.2 Linear Iterative Methods . . . . . . . . . . . . . . . . . . . 126
4.2.1 Jacobi, Gauss-Seidel and Relaxation Methods . . . 127
4.2.2 Convergence Results for Jacobi and
Gauss-Seidel Methods . . . . . . . . . . . . . . . . 129
4.2.3 Convergence Results for the Relaxation Method . 131
4.2.4 A priori Forward Analysis . . . . . . . . . . . . . . 132
4.2.5 Block Matrices . . . . . . . . . . . . . . . . . . . . 133
4.2.6 Symmetric Form of the Gauss-Seidel and
SOR Methods . . . . . . . . . . . . . . . . . . . . . 133
4.2.7 Implementation Issues . . . . . . . . . . . . . . . . 135
4.3 Stationary and Nonstationary Iterative Methods . . . . . . 136
4.3.1 Convergence Analysis of the Richardson Method . 137
4.3.2 Preconditioning Matrices . . . . . . . . . . . . . . 139
4.3.3 The Gradient Method . . . . . . . . . . . . . . . . 146
4.3.4 The Conjugate Gradient Method . . . . . . . . . . 150
4.3.5 The Preconditioned Conjugate Gradient Method . 156
4.3.6 The Alternating-Direction Method . . . . . . . . . 158
4.4 Methods Based on Krylov Subspace Iterations . . . . . . . 159
4.4.1 The Arnoldi Method for Linear Systems . . . . . . 162

4.4.2 The GMRES Method . . . . . . . . . . . . . . . . 165
4.4.3 The Lanczos Method for Symmetric Systems . . . 167
4.5 The Lanczos Method for Unsymmetric Systems . . . . . . 168
4.6 Stopping Criteria . . . . . . . . . . . . . . . . . . . . . . . 171
4.6.1 A Stopping Test Based on the Increment . . . . . 172
4.6.2 A Stopping Test Based on the Residual . . . . . . 174
4.7 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . 174
4.7.1 Analysis of an Electric Network . . . . . . . . . . . 174
4.7.2 Finite Difference Analysis of Beam Bending . . . . 177
4.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179

5. Approximation of Eigenvalues and Eigenvectors 183
5.1 Geometrical Location of the Eigenvalues . . . . . . . . . . 183
5.2 Stability and Conditioning Analysis . . . . . . . . . . . . . 186
5.2.1 A priori Estimates . . . . . . . . . . . . . . . . . . 186
5.2.2 A posteriori Estimates . . . . . . . . . . . . . . . . 190
5.3 The Power Method . . . . . . . . . . . . . . . . . . . . . . 192
5.3.1 Approximation of the Eigenvalue of
Largest Module . . . . . . . . . . . . . . . . . . . . 192
5.3.2 Inverse Iteration . . . . . . . . . . . . . . . . . . . 195
5.3.3 Implementation Issues . . . . . . . . . . . . . . . . 196
5.4 The QR Iteration . . . . . . . . . . . . . . . . . . . . . . . 200
5.5 The Basic QR Iteration . . . . . . . . . . . . . . . . . . . . 201
5.6 The QR Method for Matrices in Hessenberg Form . . . . . 203
5.6.1 Householder and Givens Transformation Matrices 204
5.6.2 Reducing a Matrix in Hessenberg Form . . . . . . 207
5.6.3 QR Factorization of a Matrix in Hessenberg Form 209
5.6.4 The Basic QR Iteration Starting from
Upper Hessenberg Form . . . . . . . . . . . . . . . 210
5.6.5 Implementation of Transformation Matrices . . . . 212
5.7 The QR Iteration with Shifting Techniques . . . . . . . . . 215
5.7.1 The QR Method with Single Shift . . . . . . . . . 215
5.7.2 The QR Method with Double Shift . . . . . . . . . 218
5.8 Computing the Eigenvectors and the SVD of a Matrix . . 221
5.8.1 The Hessenberg Inverse Iteration . . . . . . . . . . 221
5.8.2 Computing the Eigenvectors from the
Schur Form of a Matrix . . . . . . . . . . . . . . . 221
5.8.3 Approximate Computation of the SVD of a Matrix 222
5.9 The Generalized Eigenvalue Problem . . . . . . . . . . . . 224
5.9.1 Computing the Generalized Real Schur Form . . . 225
5.9.2 Generalized Real Schur Form of
Symmetric-Definite Pencils . . . . . . . . . . . . . 226
5.10 Methods for Eigenvalues of Symmetric Matrices . . . . . . 227
5.10.1 The Jacobi Method . . . . . . . . . . . . . . . . . 227
5.10.2 The Method of Sturm Sequences . . . . . . . . . . 230

5.11 The Lanczos Method . . . . . . . . . . . . . . . . . . . . . 233
5.12 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . 235
5.12.1 Analysis of the Buckling of a Beam . . . . . . . . . 236
5.12.2 Free Dynamic Vibration of a Bridge . . . . . . . . 238
5.13 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240

PART III: Around Functions and Functionals

6. Rootfinding for Nonlinear Equations 245
6.1 Conditioning of a Nonlinear Equation . . . . . . . . . . . . 246
6.2 A Geometric Approach to Rootfinding . . . . . . . . . . . 248
6.2.1 The Bisection Method . . . . . . . . . . . . . . . . 248
6.2.2 The Methods of Chord, Secant and Regula Falsi
and Newton's Method . . . . . . . . . . . . . . . . 251
6.2.3 The Dekker-Brent Method . . . . . . . . . . . . . 256
6.3 Fixed-Point Iterations for Nonlinear Equations . . . . . . . 257
6.3.1 Convergence Results for
Some Fixed-Point Methods . . . . . . . . . . . . . 260
6.4 Zeros of Algebraic Equations . . . . . . . . . . . . . . . . . 261
6.4.1 The Horner Method and Deflation . . . . . . . . . 262
6.4.2 The Newton-Horner Method . . . . . . . . . . . . 263
6.4.3 The Muller Method . . . . . . . . . . . . . . . . . 267
6.5 Stopping Criteria . . . . . . . . . . . . . . . . . . . . . . . 269
6.6 Post-Processing Techniques for Iterative Methods . . . . . 272
6.6.1 Aitken's Acceleration . . . . . . . . . . . . . . . 272
6.6.2 Techniques for Multiple Roots . . . . . . . . . . . 275
6.7 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . 276
6.7.1 Analysis of the State Equation for a Real Gas . . 276
6.7.2 Analysis of a Nonlinear Electrical Circuit . . . . . 277
6.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279

7. Nonlinear Systems and Numerical Optimization 281
7.1 Solution of Systems of Nonlinear Equations . . . . . . . . 282
7.1.1 Newton's Method and Its Variants . . . . . . . . . 283
7.1.2 Modified Newton's Methods . . . . . . . . . . . . . 284
7.1.3 Quasi-Newton Methods . . . . . . . . . . . . . . . 288
7.1.4 Secant-Like Methods . . . . . . . . . . . . . . . . . 288
7.1.5 Fixed-Point Methods . . . . . . . . . . . . . . . . . 290
7.2 Unconstrained Optimization . . . . . . . . . . . . . . . . . 294
7.2.1 Direct Search Methods . . . . . . . . . . . . . . . . 295
7.2.2 Descent Methods . . . . . . . . . . . . . . . . . . . 300
7.2.3 Line Search Techniques . . . . . . . . . . . . . . . 302
7.2.4 Descent Methods for Quadratic Functions . . . . . 304
7.2.5 Newton-Like Methods for Function Minimization . 307
7.2.6 Quasi-Newton Methods . . . . . . . . . . . . . . . 308

7.2.7 Secant-Like Methods . . . . . . . . . . . . . . . . . 309
7.3 Constrained Optimization . . . . . . . . . . . . . . . . . . 311
7.3.1 Kuhn-Tucker Necessary Conditions for
Nonlinear Programming . . . . . . . . . . . . . . . 313
7.3.2 The Penalty Method . . . . . . . . . . . . . . . . . 315
7.3.3 The Method of Lagrange Multipliers . . . . . . . . 317
7.4 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . 319
7.4.1 Solution of a Nonlinear System Arising from
Semiconductor Device Simulation . . . . . . . . . . 320
7.4.2 Nonlinear Regularization of a Discretization Grid . 323
7.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325

8. Polynomial Interpolation 327
8.1 Polynomial Interpolation . . . . . . . . . . . . . . . . . . . 328
8.1.1 The Interpolation Error . . . . . . . . . . . . . . . 329
8.1.2 Drawbacks of Polynomial Interpolation on Equally
Spaced Nodes and Runge's Counterexample . . . . 330
8.1.3 Stability of Polynomial Interpolation . . . . . . . . 332
8.2 Newton Form of the Interpolating Polynomial . . . . . . . 333
8.2.1 Some Properties of Newton Divided Differences . . 335
8.2.2 The Interpolation Error Using Divided Differences 337
8.3 Piecewise Lagrange Interpolation . . . . . . . . . . . . . . 338
8.4 Hermite-Birkhoff Interpolation . . . . . . . . . . . . . . 341
8.5 Extension to the Two-Dimensional Case . . . . . . . . . . 343
8.5.1 Polynomial Interpolation . . . . . . . . . . . . . . 343
8.5.2 Piecewise Polynomial Interpolation . . . . . . . . . 344
8.6 Approximation by Splines . . . . . . . . . . . . . . . . . . 348
8.6.1 Interpolatory Cubic Splines . . . . . . . . . . . . . 349
8.6.2 B-Splines . . . . . . . . . . . . . . . . . . . . . . . 353
8.7 Splines in Parametric Form . . . . . . . . . . . . . . . . . 357
8.7.1 Bézier Curves and Parametric B-Splines . . . . . . 359
8.8 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . 362
8.8.1 Finite Element Analysis of a Clamped Beam . . . 363
8.8.2 Geometric Reconstruction Based on
Computer Tomographies . . . . . . . . . . . . . . . 366
8.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 368

9. Numerical Integration 371
9.1 Quadrature Formulae . . . . . . . . . . . . . . . . . . . . . 371
9.2 Interpolatory Quadratures . . . . . . . . . . . . . . . . . . 373
9.2.1 The Midpoint or Rectangle Formula . . . . . . . . 373
9.2.2 The Trapezoidal Formula . . . . . . . . . . . . . . 375
9.2.3 The Cavalieri-Simpson Formula . . . . . . . . . . . 377
9.3 Newton-Cotes Formulae . . . . . . . . . . . . . . . . . . . 378
9.4 Composite Newton-Cotes Formulae . . . . . . . . . . . . . 383

9.5 Hermite Quadrature Formulae . . . . . . . . . . . . . . . . 386
9.6 Richardson Extrapolation . . . . . . . . . . . . . . . . . . 387
9.6.1 Romberg Integration . . . . . . . . . . . . . . . . . 389
9.7 Automatic Integration . . . . . . . . . . . . . . . . . . . . 391
9.7.1 Non Adaptive Integration Algorithms . . . . . . . 392
9.7.2 Adaptive Integration Algorithms . . . . . . . . . . 394
9.8 Singular Integrals . . . . . . . . . . . . . . . . . . . . . . . 398
9.8.1 Integrals of Functions with Finite
Jump Discontinuities . . . . . . . . . . . . . . . . . 398
9.8.2 Integrals of Infinite Functions . . . . . . . . . . . . 398
9.8.3 Integrals over Unbounded Intervals . . . . . . . . . 401
9.9 Multidimensional Numerical Integration . . . . . . . . . . 402
9.9.1 The Method of Reduction Formula . . . . . . . . . 403
9.9.2 Two-Dimensional Composite Quadratures . . . . . 404
9.9.3 Monte Carlo Methods for
Numerical Integration . . . . . . . . . . . . . . . . 407
9.10 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . 408
9.10.1 Computation of an Ellipsoid Surface . . . . . . . . 408
9.10.2 Computation of the Wind Action on a
Sailboat Mast . . . . . . . . . . . . . . . . . . . . . 410
9.11 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 412

PART IV: Transforms, Differentiation
and Problem Discretization

10. Orthogonal Polynomials in Approximation Theory 415
10.1 Approximation of Functions by Generalized Fourier Series 415
10.1.1 The Chebyshev Polynomials . . . . . . . . . . . . . 417
10.1.2 The Legendre Polynomials . . . . . . . . . . . . . 419
10.2 Gaussian Integration and Interpolation . . . . . . . . . . . 419
10.3 Chebyshev Integration and Interpolation . . . . . . . . . . 424
10.4 Legendre Integration and Interpolation . . . . . . . . . . . 426
10.5 Gaussian Integration over Unbounded Intervals . . . . . . 428
10.6 Programs for the Implementation of Gaussian Quadratures 429
10.7 Approximation of a Function in the Least-Squares Sense . 431
10.7.1 Discrete Least-Squares Approximation . . . . . . . 431
10.8 The Polynomial of Best Approximation . . . . . . . . . . . 433
10.9 Fourier Trigonometric Polynomials . . . . . . . . . . . . . 435
10.9.1 The Gibbs Phenomenon . . . . . . . . . . . . . . . 439
10.9.2 The Fast Fourier Transform . . . . . . . . . . . . . 440
10.10 Approximation of Function Derivatives . . . . . . . . . . . 442
10.10.1 Classical Finite Difference Methods . . . . . . . . . 442
10.10.2 Compact Finite Differences . . . . . . . . . . . . . 444
10.10.3 Pseudo-Spectral Derivative . . . . . . . . . . . . . 448
10.11 Transforms and Their Applications . . . . . . . . . . . . . 450

10.11.1 The Fourier Transform . . . . . . . . . . . . . . . . 450
10.11.2 (Physical) Linear Systems and Fourier Transform . 453
10.11.3 The Laplace Transform . . . . . . . . . . . . . . . 455
10.11.4 The Z-Transform . . . . . . . . . . . . . . . . . . . 457
10.12 The Wavelet Transform . . . . . . . . . . . . . . . . . . . . 458
10.12.1 The Continuous Wavelet Transform . . . . . . . . 458
10.12.2 Discrete and Orthonormal Wavelets . . . . . . . . 461
10.13 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . 463
10.13.1 Numerical Computation of Blackbody Radiation . 463
10.13.2 Numerical Solution of Schrödinger Equation . . . . 464
10.14 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 467

11. Numerical Solution of Ordinary Differential Equations 469
11.1 The Cauchy Problem . . . . . . . . . . . . . . . . . . . . . 469
11.2 One-Step Numerical Methods . . . . . . . . . . . . . . . . 472
11.3 Analysis of One-Step Methods . . . . . . . . . . . . . . . . 473
11.3.1 The Zero-Stability . . . . . . . . . . . . . . . . . . 475
11.3.2 Convergence Analysis . . . . . . . . . . . . . . . . 477
11.3.3 The Absolute Stability . . . . . . . . . . . . . . . . 479
11.4 Difference Equations . . . . . . . . . . . . . . . . . . . . . 482
11.5 Multistep Methods . . . . . . . . . . . . . . . . . . . . . . 487
11.5.1 Adams Methods . . . . . . . . . . . . . . . . . . . 490
11.5.2 BDF Methods . . . . . . . . . . . . . . . . . . . . 492
11.6 Analysis of Multistep Methods . . . . . . . . . . . . . . . . 492
11.6.1 Consistency . . . . . . . . . . . . . . . . . . . . . . 493
11.6.2 The Root Conditions . . . . . . . . . . . . . . . . . 494
11.6.3 Stability and Convergence Analysis for
Multistep Methods . . . . . . . . . . . . . . . . . . 495
11.6.4 Absolute Stability of Multistep Methods . . . . . . 499
11.7 Predictor-Corrector Methods . . . . . . . . . . . . . . . . . 502
11.8 Runge-Kutta Methods . . . . . . . . . . . . . . . . . . . . 508
11.8.1 Derivation of an Explicit RK Method . . . . . . . 511
11.8.2 Stepsize Adaptivity for RK Methods . . . . . . . . 512
11.8.3 Implicit RK Methods . . . . . . . . . . . . . . . . 514
11.8.4 Regions of Absolute Stability for RK Methods . . 516
11.9 Systems of ODEs . . . . . . . . . . . . . . . . . . . . . . . 517
11.10 Stiff Problems . . . . . . . . . . . . . . . . . . . . . . . . . 519
11.11 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . 521
11.11.1 Analysis of the Motion of a Frictionless Pendulum 522
11.11.2 Compliance of Arterial Walls . . . . . . . . . . . . 523
11.12 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 527

12. Two-Point Boundary Value Problems 531
12.1 A Model Problem . . . . . . . . . . . . . . . . . . . . . . . 531
12.2 Finite Difference Approximation . . . . . . . . . . . . . . . 533

12.2.1 Stability Analysis by the Energy Method . . . . . 534
12.2.2 Convergence Analysis . . . . . . . . . . . . . . . . 538
12.2.3 Finite Differences for Two-Point Boundary
Value Problems with Variable Coefficients . . . . . 540
12.3 The Spectral Collocation Method . . . . . . . . . . . . . . 542
12.4 The Galerkin Method . . . . . . . . . . . . . . . . . . . . . 544
12.4.1 Integral Formulation of Boundary-Value Problems 544
12.4.2 A Quick Introduction to Distributions . . . . . . . 546
12.4.3 Formulation and Properties of the
Galerkin Method . . . . . . . . . . . . . . . . . . . 547
12.4.4 Analysis of the Galerkin Method . . . . . . . . . . 548
12.4.5 The Finite Element Method . . . . . . . . . . . . . 550
12.4.6 Implementation Issues . . . . . . . . . . . . . . . . 556
12.4.7 Spectral Methods . . . . . . . . . . . . . . . . . . . 559
12.5 Advection-Diffusion Equations . . . . . . . . . . . . . . . . 560
12.5.1 Galerkin Finite Element Approximation . . . . . . 561
12.5.2 The Relationship Between Finite Elements and
Finite Differences; the Numerical Viscosity . . . . 563
12.5.3 Stabilized Finite Element Methods . . . . . . . . . 567
12.6 A Quick Glance to the Two-Dimensional Case . . . . . . . 572
12.7 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . 575
12.7.1 Lubrication of a Slider . . . . . . . . . . . . . . . . 575
12.7.2 Vertical Distribution of Spore
Concentration over Wide Regions . . . . . . . . . . 576
12.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 578

13. Parabolic and Hyperbolic Initial Boundary
Value Problems 581
13.1 The Heat Equation . . . . . . . . . . . . . . . . . . . . . . 581
13.2 Finite Difference Approximation of the Heat Equation . . 584
13.3 Finite Element Approximation of the Heat Equation . . . 586
13.3.1 Stability Analysis of the θ-Method . . . . . . . . . 588
13.4 Space-Time Finite Element Methods for the
Heat Equation . . . . . . . . . . . . . . . . . . . . . . . . . 593
13.5 Hyperbolic Equations: A Scalar Transport Problem . . . . 597
13.6 Systems of Linear Hyperbolic Equations . . . . . . . . . . 599
13.6.1 The Wave Equation . . . . . . . . . . . . . . . . . 601
13.7 The Finite Difference Method for Hyperbolic Equations . . 602
13.7.1 Discretization of the Scalar Equation . . . . . . . . 602
13.8 Analysis of Finite Difference Methods . . . . . . . . . . . . 605
13.8.1 Consistency . . . . . . . . . . . . . . . . . . . . . . 605
13.8.2 Stability . . . . . . . . . . . . . . . . . . . . . . . . 605
13.8.3 The CFL Condition . . . . . . . . . . . . . . . . . 606
13.8.4 Von Neumann Stability Analysis . . . . . . . . . . 608
13.9 Dissipation and Dispersion . . . . . . . . . . . . . . . . . . 611

13.9.1 Equivalent Equations . . . . . . . . . . . . . . . . 614
13.10 Finite Element Approximation of Hyperbolic Equations . . 618
13.10.1 Space Discretization with Continuous and
Discontinuous Finite Elements . . . . . . . . . . . 618
13.10.2 Time Discretization . . . . . . . . . . . . . . . . . 620
13.11 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . 623
13.11.1 Heat Conduction in a Bar . . . . . . . . . . . . . . 623
13.11.2 A Hyperbolic Model for Blood Flow
Interaction with Arterial Walls . . . . . . . . . . . 623
13.12 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 625

References 627

Index of MATLAB Programs 643

Index 647
1
Foundations of Matrix Analysis




In this chapter we recall the basic elements of linear algebra which will be
employed in the remainder of the text. For most of the proofs as well as
for the details, the reader is referred to [Bra75], [Nob69], [Hal58]. Further
results on eigenvalues can be found in [Hou75] and [Wil65].




1.1 Vector Spaces
Definition 1.1 A vector space over the numeric field K (K = R or K = C)
is a nonempty set V , whose elements are called vectors and in which two
operations are defined, called addition and scalar multiplication, that enjoy
the following properties:

1. addition is commutative and associative;

2. there exists an element 0 ∈ V (the zero vector or null vector) such
that v + 0 = v for each v ∈ V ;

3. 0 · v = 0, 1 · v = v, where 0 and 1 are respectively the zero and the
unity of K;

4. for each element v ∈ V there exists its opposite, −v, in V such that
v + (−v) = 0;

5. the following distributive properties hold

∀α ∈ K, ∀v, w ∈ V,   α(v + w) = αv + αw,

∀α, β ∈ K, ∀v ∈ V,   (α + β)v = αv + βv;

6. the following associative property holds

∀α, β ∈ K, ∀v ∈ V,   (αβ)v = α(βv).




Example 1.1 Remarkable instances of vector spaces are:
- V = R^n (respectively V = C^n): the set of the n-tuples of real (respectively
complex) numbers, n ≥ 1;
- V = Pn: the set of polynomials pn(x) = Σ_{k=0}^{n} ak x^k with real (or complex)
coefficients ak having degree less than or equal to n, n ≥ 0;
- V = C^p([a, b]): the set of real (or complex)-valued functions which are con-
tinuous on [a, b] up to their p-th derivative, 0 ≤ p < ∞.

Definition 1.2 We say that a nonempty part W of V is a vector subspace
of V iff W is a vector space over K.

Example 1.2 The vector space Pn is a vector subspace of C^∞(R), which is the
space of infinitely continuously differentiable functions on the real line. A trivial
subspace of any vector space is the one containing only the zero vector.

In particular, the set W of the linear combinations of a system of p vectors
of V , {v1, . . . , vp}, is a vector subspace of V , called the generated subspace
or span of the vector system, and is denoted by

W = span {v1, . . . , vp}
  = {v = α1 v1 + . . . + αp vp with αi ∈ K, i = 1, . . . , p} .

The system {v1, . . . , vp} is called a system of generators for W .
If W1, . . . , Wm are vector subspaces of V , then the set

S = {w : w = v1 + . . . + vm with vi ∈ Wi, i = 1, . . . , m}

is also a vector subspace of V . We say that S is the direct sum of the
subspaces Wi if any element s ∈ S admits a unique representation of the
form s = v1 + . . . + vm with vi ∈ Wi and i = 1, . . . , m. In such a case, we
shall write S = W1 ⊕ . . . ⊕ Wm.

Definition 1.3 A system of vectors {v1, . . . , vm} of a vector space V is
called linearly independent if the relation

α1 v1 + α2 v2 + . . . + αm vm = 0

with α1, α2, . . . , αm ∈ K implies that α1 = α2 = . . . = αm = 0. Otherwise,
the system will be called linearly dependent.

We call a basis of V any system of linearly independent generators of V .
If {u1, . . . , un} is a basis of V , the expression v = v1 u1 + . . . + vn un is
called the decomposition of v with respect to the basis and the scalars
v1, . . . , vn ∈ K are the components of v with respect to the given basis.
Moreover, the following property holds.

Property 1.1 Let V be a vector space which admits a basis of n vectors.
Then every system of linearly independent vectors of V has at most n el-
ements and any other basis of V has n elements. The number n is called
the dimension of V and we write dim(V ) = n.
If, instead, for any n there always exist n linearly independent vectors of
V , the vector space is called infinite dimensional.

Example 1.3 For any integer p the space C^p([a, b]) is infinite dimensional. The
spaces R^n and C^n have dimension equal to n. The usual basis for R^n is the set of
unit vectors {e1, . . . , en} where (ei)j = δij for i, j = 1, . . . , n, where δij denotes
the Kronecker symbol, equal to 0 if i ≠ j and 1 if i = j. This choice is of course
not the only one that is possible (see Exercise 2).



1.2 Matrices
Let m and n be two positive integers. We call a matrix having m rows and
n columns, or a matrix m × n, or a matrix (m, n), with elements in K, a
set of mn scalars aij ∈ K, with i = 1, . . . , m and j = 1, . . . , n, represented
in the following rectangular array

        [ a11  a12  ...  a1n ]
        [ a21  a22  ...  a2n ]
    A = [  .    .          .  ]                                    (1.1)
        [ am1  am2  ...  amn ]

When K = R or K = C we shall respectively write A ∈ R^{m×n} or A ∈
C^{m×n}, to explicitly outline the numerical fields which the elements of A
belong to. Capital letters will be used to denote the matrices, while the
lower case letters corresponding to those upper case letters will denote the
matrix entries.

We shall abbreviate (1.1) as A = (aij) with i = 1, . . . , m and j = 1, . . . , n.
The index i is called the row index, while j is the column index. The set
(ai1, ai2, . . . , ain) is called the i-th row of A; likewise, (a1j, a2j, . . . , amj)
is the j-th column of A.
If n = m the matrix is called square, of order n, and the set of
the entries (a11, a22, . . . , ann) is called its main diagonal.
A matrix having one row or one column is called a row vector or column
vector respectively. Unless otherwise specified, we shall always assume that
a vector is a column vector. In the case n = m = 1, the matrix will simply
denote a scalar of K.
Sometimes it turns out to be useful to distinguish within a matrix the set
made up by specified rows and columns. This prompts us to introduce the
following definition.

Definition 1.4 Let A be a matrix m × n. Let 1 ≤ i1 < i2 < . . . < ik ≤ m
and 1 ≤ j1 < j2 < . . . < jl ≤ n be two sets of contiguous indexes. The matrix
S(k × l) of entries s_pq = a_{i_p j_q} with p = 1, . . . , k, q = 1, . . . , l is called a
submatrix of A. If k = l and ir = jr for r = 1, . . . , k, S is called a principal
submatrix of A.

Definition 1.5 A matrix A(m × n) is called block partitioned or said to
be partitioned into submatrices if

        [ A11  A12  ...  A1l ]
        [ A21  A22  ...  A2l ]
    A = [  .    .          .  ] ,
        [ Ak1  Ak2  ...  Akl ]

where Aij are submatrices of A.
Among the possible partitions of A, we recall in particular the partition by
columns

A = (a1, a2, . . . , an),

ai being the i-th column vector of A. In a similar way the partition by rows
of A can be defined. To fix the notations, if A is a matrix m × n, we shall
denote by

A(i1 : i2, j1 : j2) = (aij)    i1 ≤ i ≤ i2,  j1 ≤ j ≤ j2

the submatrix of A of size (i2 − i1 + 1) × (j2 − j1 + 1) that lies between the
rows i1 and i2 and the columns j1 and j2. Likewise, if v is a vector of size
n, we shall denote by v(i1 : i2) the vector of size i2 − i1 + 1 made up by
the i1-th to the i2-th components of v.
These notations are convenient in view of programming the algorithms
that will be presented throughout the volume in the MATLAB language.
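Indeed, the notation above maps directly onto MATLAB colon indexing; a
minimal sketch (the sample matrix built with magic is of course arbitrary):

    A = magic(5);          % an arbitrary 5 x 5 sample matrix
    S = A(2:4, 1:3);       % the submatrix A(2:4, 1:3), of size 3 x 3
    v = (1:10)';           % a column vector of size 10
    w = v(3:7);            % the subvector v(3:7), of size 5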

1.3 Operations with Matrices
Let A = (aij) and B = (bij) be two matrices m × n over K. We say that
A is equal to B if aij = bij for i = 1, . . . , m, j = 1, . . . , n. Moreover, we
define the following operations:

- matrix sum: the matrix sum is the matrix A + B = (aij + bij). The neutral
element in a matrix sum is the null matrix, still denoted by 0 and
made up only by null entries;

- matrix multiplication by a scalar: the multiplication of A by λ ∈ K is a
matrix λA = (λaij);

- matrix product: the product of two matrices A and B of sizes (m, p)
and (p, n) respectively, is a matrix C(m, n) whose entries are
cij = Σ_{k=1}^{p} aik bkj, for i = 1, . . . , m, j = 1, . . . , n.

The matrix product is associative and distributive with respect to the ma-
trix sum, but it is not in general commutative. The square matrices for
which the property AB = BA holds will be called commutative.
In the case of square matrices, the neutral element in the matrix product
is a square matrix of order n called the unit matrix of order n or, more
frequently, the identity matrix, given by In = (δij). The identity matrix
is, by definition, the only matrix n × n such that AIn = In A = A for all
square matrices A. In the following we shall omit the subscript n unless it
is strictly necessary. The identity matrix is a special instance of a diagonal
matrix of order n, that is, a square matrix of the type D = (dii δij). We will
use in the following the notation D = diag(d11, d22, . . . , dnn).
Finally, if A is a square matrix of order n and p is an integer, we define A^p
as the product of A with itself iterated p times. We let A^0 = I.
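As a simple illustration of these operations in MATLAB (the matrices below
are arbitrary examples, not taken from the text):

    A = [1 2; 3 4];   B = [0 1; 1 0];
    C = A + B;                 % matrix sum
    D = 2*A;                   % multiplication by a scalar
    P = A*B;   Q = B*A;        % the two products differ in general
    isequal(P, Q)              % returns logical 0: the product is not commutative
    I = eye(2);                % identity matrix of order 2
    isequal(A*I, A)            % returns logical 1: A*I = I*A = A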

Let us now address the so-called elementary row operations that can be
performed on a matrix (a short MATLAB sketch follows the list). They
consist of:

- multiplying the i-th row of a matrix by a scalar α; this operation is
equivalent to pre-multiplying A by the matrix D = diag(1, . . . , 1, α,
1, . . . , 1), where α occupies the i-th position;

- exchanging the i-th and j-th rows of a matrix; this can be done by pre-
multiplying A by the matrix P^{(i,j)} of elements

                   | 1   if r = s = 1, . . . , i − 1, i + 1, . . . , j − 1, j + 1, . . . , n,
  p_rs^{(i,j)} =   | 1   if r = j, s = i or r = i, s = j,                       (1.2)
                   | 0   otherwise,

where Ir denotes the identity matrix of order r = j − i − 1 if j >
i (henceforth, matrices with size equal to zero will correspond to
the empty set). Matrices like (1.2) are called elementary permutation
matrices. The product of elementary permutation matrices is called
a permutation matrix, and it performs the row exchanges associated
with each elementary permutation matrix. In practice, a permutation
matrix is a reordering by rows of the identity matrix;

- adding α times the j-th row of a matrix to its i-th row. This operation
can also be performed by pre-multiplying A by the matrix I + N_α^{(i,j)},
where N_α^{(i,j)} is a matrix having null entries except the one in position
i, j whose value is α.
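A minimal MATLAB sketch of the three operations, acting on an arbitrary
3 × 3 example:

    A = [1 2 3; 4 5 6; 7 8 9];
    D = diag([1 3 1]);                        % multiplies the 2nd row by alpha = 3
    P = eye(3);  P([1 3], :) = P([3 1], :);   % elementary permutation: rows 1 and 3
    N = zeros(3);  N(2,1) = -4;               % N_alpha^(2,1) with alpha = -4
    D*A                                       % row 2 scaled by 3
    P*A                                       % rows 1 and 3 exchanged
    (eye(3) + N)*A                            % -4 times row 1 added to row 2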


1.3.1 Inverse of a Matrix
Definition 1.6 A square matrix A of order n is called invertible (or regular
or nonsingular) if there exists a square matrix B of order n such that
AB = BA = I. B is called the inverse matrix of A and is denoted by A^{-1}.
A matrix which is not invertible is called singular.

If A is invertible its inverse is also invertible, with (A^{-1})^{-1} = A. Moreover,
if A and B are two invertible matrices of order n, their product AB is also
invertible, with (AB)^{-1} = B^{-1} A^{-1}. The following property holds.

Property 1.2 A square matrix is invertible iff its column vectors are lin-
early independent.

Definition 1.7 We call the transpose of a matrix A ∈ R^{m×n} the matrix
n × m, denoted by A^T, that is obtained by exchanging the rows of A with
the columns of A.
Clearly, (A^T)^T = A, (A + B)^T = A^T + B^T, (AB)^T = B^T A^T and (αA)^T =
αA^T ∀α ∈ R. If A is invertible, then also (A^T)^{-1} = (A^{-1})^T = A^{-T}.
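These identities are easy to verify numerically; a small sketch (the matrix is
arbitrary, and the residuals vanish only up to roundoff):

    A = [4 1; 2 3];
    B = inv(A);                 % the inverse of A
    norm(A*B - eye(2))          % close to zero: A*inv(A) = I up to roundoff
    norm(inv(A') - inv(A)')     % close to zero: (A^T)^{-1} = (A^{-1})^T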

Definition 1.8 Let A ∈ C^{m×n}; the matrix B = A^H ∈ C^{n×m} is called the
conjugate transpose (or adjoint) of A if bij = āji, where āji is the complex
conjugate of aji.

In analogy with the case of the real matrices, it turns out that (A + B)^H =
A^H + B^H, (AB)^H = B^H A^H and (αA)^H = ᾱA^H ∀α ∈ C.

Definition 1.9 A matrix A ∈ R^{n×n} is called symmetric if A = A^T, while
it is antisymmetric if A = −A^T. Finally, it is called orthogonal if A^T A =
AA^T = I, that is A^{-1} = A^T.

Permutation matrices are orthogonal and the same is true for their prod-
ucts.

Definition 1.10 A matrix A ∈ C^{n×n} is called hermitian or self-adjoint if
A^T = Ā, that is, if A^H = A, while it is called unitary if A^H A = AA^H = I.
Finally, if AA^H = A^H A, A is called normal.

As a consequence, a unitary matrix is one such that A^{-1} = A^H.
Of course, a unitary matrix is also normal, but it is not in general her-
mitian. For instance, the matrix of Example 1.4 is unitary, although
not symmetric (if s ≠ 0). We finally notice that the diagonal entries of a
hermitian matrix must necessarily be real (see also Exercise 5).


1.3.2 Matrices and Linear Mappings
Definition 1.11 A linear map from C^n into C^m is a function f : C^n → C^m
such that f(αx + βy) = αf(x) + βf(y), ∀α, β ∈ K and ∀x, y ∈ C^n.

The following result links matrices and linear maps.

Property 1.3 Let f : C^n → C^m be a linear map. Then, there exists a
unique matrix Af ∈ C^{m×n} such that

f(x) = Af x    ∀x ∈ C^n.                                           (1.3)

Conversely, if Af ∈ C^{m×n} then the function defined in (1.3) is a linear
map from C^n into C^m.

Example 1.4 An important example of a linear map is the counterclockwise
rotation by an angle θ in the plane (x1, x2). The matrix associated with such a
map is given by

    G(θ) = [  c   s ]
           [ −s   c ] ,       c = cos(θ),  s = sin(θ),

and it is called a rotation matrix.


1.3.3 Operations with Block-Partitioned Matrices
All the operations that have been previously introduced can be extended
to the case of a block-partitioned matrix A, provided that the size of each
single block is such that any single matrix operation is well-defined.
Indeed, the following result can be shown (see, e.g., [Ste73]).

Property 1.4 Let A and B be the block matrices

        [ A11  ...  A1l ]             [ B11  ...  B1n ]
    A = [  .          .  ] ,      B = [  .          .  ]
        [ Ak1  ...  Akl ]             [ Bm1  ...  Bmn ]

where Aij and Bij are matrices (ki × lj) and (mi × nj). Then we have

1.
           [ λA11  ...  λA1l ]                      [ A11^T  ...  Ak1^T ]
      λA = [   .          .  ] ,   λ ∈ C;     A^T = [   .           .   ] ;
           [ λAk1  ...  λAkl ]                      [ A1l^T  ...  Akl^T ]

2. if k = m, l = n, mi = ki and nj = lj, then

            [ A11 + B11  ...  A1l + B1l ]
    A + B = [     .               .     ] ;
            [ Ak1 + Bk1  ...  Akl + Bkl ]

3. if l = m, li = mi and ki = ni, then, letting Cij = Σ_{s=1}^{m} Ais Bsj,

         [ C11  ...  C1n ]
    AB = [  .          .  ]
         [ Ck1  ...  Ckn ]


1.4 Trace and Determinant of a Matrix
Let us consider a square matrix A of order n. The trace of a matrix is the
sum of the diagonal entries of A, that is tr(A) = Σ_{i=1}^{n} aii.
We call the determinant of A the scalar defined through the following for-
mula

det(A) = Σ_{π∈P} sign(π) a1π1 a2π2 . . . anπn,

where P = {π = (π1, . . . , πn)^T} is the set of the n! vectors that are ob-
tained by permuting the index vector i = (1, . . . , n)^T, and sign(π) is equal to
1 (respectively, −1) if an even (respectively, odd) number of exchanges is
needed to obtain π from i.
The following properties hold

det(A) = det(A^T),   det(AB) = det(A)det(B),   det(A^{-1}) = 1/det(A),

det(A^H) is the complex conjugate of det(A),   det(αA) = α^n det(A), ∀α ∈ K.

Moreover, if two rows or columns of a matrix coincide, the determinant
vanishes, while exchanging two rows (or two columns) produces a change
of sign in the determinant. Of course, the determinant of a diagonal matrix
is the product of the diagonal entries.
Denoting by Aij the matrix of order n − 1 obtained from A by elimi-
nating the i-th row and the j-th column, we call the complementary minor
associated with the entry aij the determinant of the matrix Aij. We call
the k-th principal (dominating) minor of A, dk, the determinant of the
principal submatrix of order k, Ak = A(1 : k, 1 : k). If we denote by
Δij = (−1)^{i+j} det(Aij) the cofactor of the entry aij, the actual computa-
tion of the determinant of A can be performed using the following recursive
relation

          | a11                       if n = 1,
 det(A) = |                                                         (1.4)
          | Σ_{j=1}^{n} Δij aij,      for n > 1,

which is known as the Laplace rule. If A is a square invertible matrix of
order n, then

A^{-1} = (1/det(A)) C,

where C is the matrix having entries Δji, i, j = 1, . . . , n.
As a consequence, a square matrix is invertible iff its determinant is non-
vanishing. In the case of nonsingular diagonal matrices the inverse is still
a diagonal matrix having entries given by the reciprocals of the diagonal
entries of the matrix.
Every orthogonal matrix is invertible, its inverse is given by A^T, and moreover
det(A) = ±1.
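As an illustration of (1.4), here is a small recursive MATLAB function (a
didactic sketch only: its cost grows like n!, so in practice the determinant is
computed through a factorization, cf. Chapter 3; the file name laplacedet.m
is ours):

    function d = laplacedet(A)
    % LAPLACEDET  Determinant of a square matrix via the Laplace rule (1.4),
    % expanding along the first row.
    n = size(A, 1);
    if n == 1
        d = A(1,1);
    else
        d = 0;
        for j = 1:n
            A1j = A(2:n, [1:j-1, j+1:n]);        % eliminate row 1 and column j
            d = d + (-1)^(1+j) * A(1,j) * laplacedet(A1j);
        end
    end
    end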


1.5 Rank and Kernel of a Matrix
Let A be a rectangular matrix m × n. We call a determinant of order
q (with q ≥ 1) extracted from matrix A the determinant of any square
matrix of order q obtained from A by eliminating m − q rows and n − q
columns.

Definition 1.12 The rank of A (denoted by rank(A)) is the maximum
order of the nonvanishing determinants extracted from A. A matrix has
complete or full rank if rank(A) = min(m, n).

Notice that the rank of A represents the maximum number of linearly
independent column vectors of A, that is, the dimension of the range of A,
defined as

range(A) = {y ∈ R^m : y = Ax for x ∈ R^n} .                        (1.5)

Rigorously speaking, one should distinguish between the column rank of A
and the row rank of A, the latter being the maximum number of linearly
independent row vectors of A. Nevertheless, it can be shown that the row
rank and column rank do actually coincide.
The kernel of A is defined as the subspace

ker(A) = {x ∈ R^n : Ax = 0} .

The following relations hold:

1. rank(A) = rank(A^T)    (if A ∈ C^{m×n}, rank(A) = rank(A^H));
2. rank(A) + dim(ker(A)) = n.

In general, dim(ker(A)) ≠ dim(ker(A^T)). If A is a nonsingular square ma-
trix, then rank(A) = n and dim(ker(A)) = 0.

Example 1.5 Let

    A = [ 1   1   0 ]
        [ 1  −1   1 ] .

Then, rank(A) = 2, dim(ker(A)) = 1 and dim(ker(A^T)) = 0.
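The quantities of Example 1.5 can be checked directly in MATLAB; a minimal
sketch:

    A = [1 1 0; 1 -1 1];
    rank(A)               % 2
    size(null(A), 2)      % dim(ker(A)) = 1, consistent with rank(A) + dim(ker(A)) = 3
    size(null(A'), 2)     % dim(ker(A^T)) = 0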

We finally notice that for a matrix A ∈ C^{n×n} the following properties are
equivalent:
1. A is nonsingular;
2. det(A) ≠ 0;
3. ker(A) = {0};
4. rank(A) = n;
5. A has linearly independent rows and columns.


1.6 Special Matrices
1.6.1 Block Diagonal Matrices
These are matrices of the form D = diag(D1 , . . . , Dn ), where Di are square
matrices with i = 1, . . . , n. Clearly, each single diagonal block can be of
different size. We shall say that a block diagonal matrix has size n if n
is the number of its diagonal blocks. The determinant of a block diagonal
matrix is given by the product of the determinants of the single diagonal
blocks.
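In MATLAB a block diagonal matrix can be assembled with blkdiag; a small
example with arbitrary blocks:

    D1 = [1 2; 3 4];  D2 = 5;  D3 = eye(3);
    D = blkdiag(D1, D2, D3);              % block diagonal matrix with three blocks
    det(D) - det(D1)*det(D2)*det(D3)      % close to zero: det is the product of the
                                          % determinants of the diagonal blocks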

1.6.2 Trapezoidal and Triangular Matrices
A matrix A(m × n) is called upper trapezoidal if aij = 0 for i > j, while it
is lower trapezoidal if aij = 0 for i < j. The name is due to the fact that,
in the case of upper trapezoidal matrices, with m < n, the nonzero entries
of the matrix form a trapezoid.
A triangular matrix is a square trapezoidal matrix of order n of the form

        [ l11   0   ...   0  ]            [ u11  u12  ...  u1n ]
    L = [ l21  l22  ...   0  ]    or  U = [  0   u22  ...  u2n ] .
        [  .    .         .  ]            [  .    .         .  ]
        [ ln1  ln2  ...  lnn ]            [  0    0   ...  unn ]

The matrix L is called lower triangular while U is upper triangular.
Let us recall some algebraic properties of triangular matrices that are easy
to check.

- The determinant of a triangular matrix is the product of the diagonal
entries;

- the inverse of a lower (respectively, upper) triangular matrix is still lower
(respectively, upper) triangular;

- the product of two lower triangular (respectively, upper trapezoidal) ma-
trices is still lower triangular (respectively, upper trapezoidal);

- if we call unit triangular matrix a triangular matrix that has diagonal
entries equal to 1, then the product of lower (respectively, upper) unit
triangular matrices is still lower (respectively, upper) unit triangular.


1.6.3 Banded Matrices
The matrices introduced in the previous section are a special instance of
banded matrices. Indeed, we say that a matrix A ∈ R^{m×n} (or in C^{m×n})
has lower band p if aij = 0 when i > j + p and upper band q if aij = 0
when j > i + q. Diagonal matrices are banded matrices for which p = q = 0,
while trapezoidal matrices have p = m − 1, q = 0 (lower trapezoidal), p = 0,
q = n − 1 (upper trapezoidal).
Other banded matrices of relevant interest are the tridiagonal matrices,
for which p = q = 1, and the upper bidiagonal (p = 0, q = 1) or lower bidiag-
onal (p = 1, q = 0) matrices. In the following, tridiag_n(b, d, c) will denote the
tridiagonal matrix of size n having respectively on the lower and upper principal
diagonals the vectors b = (b1, . . . , b_{n−1})^T and c = (c1, . . . , c_{n−1})^T, and on
the principal diagonal the vector d = (d1, . . . , dn)^T. If bi = β, di = δ and
ci = γ, with β, δ and γ given constants, the matrix will be denoted by
tridiag_n(β, δ, γ).
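A matrix of the form tridiag_n(β, δ, γ) can be built in MATLAB as follows (a
sketch with the arbitrary choice n = 5, β = γ = −1, δ = 2):

    n = 5;  beta = -1;  delta = 2;  gamma = -1;
    T = diag(delta*ones(n,1)) + diag(beta*ones(n-1,1), -1) ...
        + diag(gamma*ones(n-1,1), 1);
    % For large n a sparse storage is preferable:
    e  = ones(n,1);
    Ts = spdiags([beta*e, delta*e, gamma*e], -1:1, n, n);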

We also mention the so-called lower Hessenberg matrices (p = m − 1,
q = 1) and upper Hessenberg matrices (p = 1, q = n − 1), which have the
following structure

        [ h11  h12              ]            [ h11  h12   ...      h1n ]
        [ h21  h22  h23         ]            [ h21  h22   ...      h2n ]
    H = [  .         .    .     ]    or  H = [       .     .        .  ] .
        [  .              h_{m-1,n} ]        [              .       .  ]
        [ hm1  ...   ...  hmn   ]            [  0   ...  h_{m,n-1} hmn ]

Matrices of similar shape can obviously be set up in the block-like format.



1.7 Eigenvalues and Eigenvectors
Let A be a square matrix of order n with real or complex entries; the number
λ ∈ C is called an eigenvalue of A if there exists a nonnull vector x ∈ C^n
such that Ax = λx. The vector x is the eigenvector associated with the
eigenvalue λ and the set of the eigenvalues of A is called the spectrum of A,
denoted by σ(A). We say that x and y are respectively a right eigenvector
and a left eigenvector of A, associated with the eigenvalue λ, if

Ax = λx,    y^H A = λy^H.

The eigenvalue λ corresponding to the eigenvector x can be determined by
computing the Rayleigh quotient λ = x^H Ax/(x^H x). The number λ is a
solution of the characteristic equation

pA(λ) = det(A − λI) = 0,

where pA(λ) is the characteristic polynomial. Since this latter is a polyno-
mial of degree n with respect to λ, there certainly exist n eigenvalues of A,
not necessarily distinct. The following properties can be proved

det(A) = Π_{i=1}^{n} λi,     tr(A) = Σ_{i=1}^{n} λi,               (1.6)

and since det(A^T − λI) = det((A − λI)^T) = det(A − λI) one concludes that
σ(A) = σ(A^T) and, in an analogous way, that σ(A^H) = σ(Ā).
From the first relation in (1.6) it can be concluded that a matrix is
singular iff it has at least one null eigenvalue, since pA(0) = det(A) =
Π_{i=1}^{n} λi.
Secondly, if A has real entries, pA(λ) turns out to be a real-coefficient
polynomial so that complex eigenvalues of A shall necessarily occur in com-
plex conjugate pairs.

Finally, due to the Cayley-Hamilton Theorem, if pA(λ) is the charac-
teristic polynomial of A, then pA(A) = 0, where pA(A) denotes a matrix
polynomial (for the proof see, e.g., [Axe94], p. 51).
The maximum module of the eigenvalues of A is called the spectral radius
of A and is denoted by

ρ(A) = max_{λ∈σ(A)} |λ|.                                           (1.7)

Characterizing the eigenvalues of a matrix as the roots of a polynomial
implies in particular that λ is an eigenvalue of A ∈ C^{n×n} iff its complex
conjugate is an eigenvalue of A^H. An immediate consequence is that
ρ(A) = ρ(A^H). Moreover,
∀A ∈ C^{n×n}, ∀α ∈ C, ρ(αA) = |α|ρ(A), and ρ(A^k) = [ρ(A)]^k ∀k ∈ N.
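In MATLAB the spectrum, the relations (1.6) and the spectral radius (1.7)
can be checked on any example; a minimal sketch (the matrix is arbitrary,
and the residuals are zero only up to roundoff):

    A = [2 1 0; 1 3 1; 0 1 4];
    lambda = eig(A);                  % the eigenvalues of A
    abs(prod(lambda) - det(A))        % close to zero: det(A) is the product of the lambda's
    abs(sum(lambda) - trace(A))       % close to zero: tr(A) is the sum of the lambda's
    rho = max(abs(lambda))            % the spectral radius rho(A)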

Finally, assume that A is a block triangular matrix

        [ A11  A12  ...  A1k ]
        [  0   A22  ...  A2k ]
    A = [  .          .   .  ] .
        [  0   ...   0   Akk ]

As pA(λ) = pA11(λ) pA22(λ) · · · pAkk(λ), the spectrum of A is given by the
union of the spectra of each single diagonal block. As a consequence, if A
is triangular, the eigenvalues of A are its diagonal entries.

For each eigenvalue λ of a matrix A the set of the eigenvectors associated
with λ, together with the null vector, identifies a subspace of C^n which is
called the eigenspace associated with λ and corresponds by definition to
ker(A − λI). The dimension of the eigenspace is

dim [ker(A − λI)] = n − rank(A − λI),

and is called the geometric multiplicity of the eigenvalue λ. It can never be
greater than the algebraic multiplicity of λ, which is the multiplicity of
λ as a root of the characteristic polynomial. Eigenvalues having geometric
multiplicity strictly less than the algebraic one are called defective. A matrix
having at least one defective eigenvalue is called defective.
The eigenspace associated with an eigenvalue of a matrix A is invariant
with respect to A in the sense of the following definition.


Definition 1.13 A subspace S in C^n is called invariant with respect to a
square matrix A if AS ⊂ S, where AS is the transformed of S through A.

1.8 Similarity Transformations
Definition 1.14 Let C be a square nonsingular matrix having the same
order as the matrix A. We say that the matrices A and C^{-1}AC are similar,
and the transformation from A to C^{-1}AC is called a similarity transfor-
mation. Moreover, we say that the two matrices are unitarily similar if C
is unitary.

Two similar matrices share the same spectrum and the same characteris-
tic polynomial. Indeed, it is easy to check that if (λ, x) is an eigenvalue-
eigenvector pair of A, (λ, C^{-1}x) is the same for the matrix C^{-1}AC since
(C^{-1}AC)C^{-1}x = C^{-1}Ax = λC^{-1}x.
We notice in particular that the product matrices AB and BA, with A ∈
C^{n×m} and B ∈ C^{m×n}, are not similar but satisfy the following property
(see [Hac94], p. 18, Theorem 2.4.6)

σ(AB)\{0} = σ(BA)\{0},

that is, AB and BA share the same spectrum apart from null eigenvalues,
so that ρ(AB) = ρ(BA).
The use of similarity transformations aims at reducing the complexity
of the problem of evaluating the eigenvalues of a matrix. Indeed, if a given
matrix could be transformed into a similar matrix in diagonal or triangular
form, the computation of the eigenvalues would be immediate. The main
result in this direction is the following theorem (for the proof, see [Dem97],
Theorem 4.2).

Property 1.5 (Schur decomposition) Given A ∈ C^{n×n}, there exists U
unitary such that

                         [ λ1  b12  ...  b1n ]
                         [  0   λ2  ...  b2n ]
    U^{-1}AU = U^H AU =  [  .          .   . ] = T,
                         [  0   ...   0   λn ]

where λi are the eigenvalues of A.

It thus turns out that every matrix A is unitarily similar to an upper
triangular matrix. The matrices T and U are not necessarily unique [Hac94].
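A (complex) Schur decomposition can be computed in MATLAB with the
built-in function schur; a minimal sketch on an arbitrary test matrix:

    A = [4 1 2; 0 3 5; 1 0 2];          % an arbitrary test matrix
    [U, T] = schur(A, 'complex');       % U unitary, T upper triangular
    norm(U'*A*U - T)                    % close to zero
    diag(T)                             % the eigenvalues of A, on the diagonal of T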
The Schur decomposition theorem gives rise to several important results;
among them, we recall:
1. every hermitian matrix is unitarily similar to a diagonal real ma-
trix, that is, when A is hermitian every Schur decomposition of A is
diagonal. In such an event, since

U^{-1}AU = Λ = diag(λ1, . . . , λn),

it turns out that AU = UΛ, that is, Aui = λi ui for i = 1, . . . , n, so
that the column vectors of U are the eigenvectors of A. Moreover,
since the eigenvectors are pairwise orthogonal, it turns out that
a hermitian matrix has a system of orthonormal eigenvectors that
generates the whole space C^n. Finally, it can be shown that a matrix
A of order n is similar to a diagonal matrix D iff the eigenvectors of
A form a basis for C^n [Axe94];

2. a matrix A ∈ C^{n×n} is normal iff it is unitarily similar to a diagonal
matrix. As a consequence, a normal matrix A ∈ C^{n×n} admits the
following spectral decomposition: A = UΛU^H = Σ_{i=1}^{n} λi ui ui^H, with
U unitary and Λ diagonal [SS90];

3. let A and B be two normal and commutative matrices; then, the
generic eigenvalue µi of A + B is given by the sum λi + ξi, where
λi and ξi are the eigenvalues of A and B associated with the same
eigenvector.

There are, of course, nonsymmetric matrices that are similar to diagonal
matrices, but these are not unitarily similar (see, e.g., Exercise 7).

The Schur decomposition can be improved as follows (for the proof see,
e.g., [Str80], [God66]).

Property 1.6 (Canonical Jordan Form) Let A be any square matrix.
Then, there exists a nonsingular matrix X which transforms A into a block
diagonal matrix J such that

X^{-1}AX = J = diag (Jk1(λ1), Jk2(λ2), . . . , Jkl(λl)) ,

which is called the canonical Jordan form, λj being the eigenvalues of A and
Jk(λ) ∈ C^{k×k} a Jordan block of the form J1(λ) = λ if k = 1 and

            [ λ  1  0  ...  0 ]
            [ 0  λ  1       . ]
    Jk(λ) = [ .     .  .    . ] ,   for k > 1.
            [ .        .  1 0 ]
            [ .           λ 1 ]
            [ 0  ...  ... 0 λ ]

If an eigenvalue is defective, the size of the corresponding Jordan block
is greater than one. Therefore, the canonical Jordan form tells us that a
matrix can be diagonalized by a similarity transformation iff it is nonde-
fective. For this reason, the nondefective matrices are called diagonalizable.
In particular, normal matrices are diagonalizable.

Partitioning X by columns, X = (x1, . . . , xn), it can be seen that the
ki vectors associated with the Jordan block Jki(λi) satisfy the following
recursive relation

Ax_l = λi x_l,                 l = Σ_{j=1}^{i−1} mj + 1,
                                                                    (1.8)
Ax_j = λi x_j + x_{j−1},       j = l + 1, . . . , l − 1 + ki,  if ki ≠ 1.

The vectors xi are called principal vectors or generalized eigenvectors of A.

Example 1.6 Let us consider the following matrix

        [  7/4   3/4  −1/4  −1/4  −1/4   1/4 ]
        [   0     2     0     0     0     0  ]
        [ −1/2  −1/2   5/2   1/2  −1/2   1/2 ]
    A = [ −1/2  −1/2  −1/2   5/2   1/2   1/2 ] .
        [ −1/4  −1/4  −1/4  −1/4  11/4   1/4 ]
        [ −3/2  −1/2  −1/2   1/2   1/2   7/2 ]

The Jordan canonical form of A and its associated matrix X are given by

        [ 2 1 0 0 0 0 ]            [ 1 0 0 0 0 1 ]
        [ 0 2 0 0 0 0 ]            [ 0 1 0 0 0 1 ]
    J = [ 0 0 3 1 0 0 ] ,      X = [ 0 0 1 0 0 1 ] .
        [ 0 0 0 3 1 0 ]            [ 0 0 0 1 0 1 ]
        [ 0 0 0 0 3 0 ]            [ 0 0 0 0 1 1 ]
        [ 0 0 0 0 0 2 ]            [ 1 1 1 1 1 1 ]

Notice that two different Jordan blocks are related to the same eigenvalue (λ =
2). It is easy to check property (1.8). Consider, for example, the Jordan block
associated with the eigenvalue λ2 = 3; we have

Ax3 = [0 0 3 0 0 3]^T = 3 [0 0 1 0 0 1]^T = λ2 x3,
Ax4 = [0 0 1 3 0 4]^T = 3 [0 0 0 1 0 1]^T + [0 0 1 0 0 1]^T = λ2 x4 + x3,
Ax5 = [0 0 0 1 3 4]^T = 3 [0 0 0 0 1 1]^T + [0 0 0 1 0 1]^T = λ2 x5 + x4.
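For small matrices with exactly representable entries, the Jordan form can be
computed symbolically in MATLAB; the following sketch assumes the Symbolic
Math Toolbox is available and uses the matrix of Example 1.6:

    A = [ 7/4  3/4 -1/4 -1/4 -1/4  1/4;
           0    2    0    0    0    0 ;
         -1/2 -1/2  5/2  1/2 -1/2  1/2;
         -1/2 -1/2 -1/2  5/2  1/2  1/2;
         -1/4 -1/4 -1/4 -1/4 11/4  1/4;
         -3/2 -1/2 -1/2  1/2  1/2  7/2];
    [X, J] = jordan(sym(A));   % X\A*X = J in exact arithmetic; the columns of X
                               % may differ from Example 1.6, since the Jordan
                               % basis is not unique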




1.9 The Singular Value Decomposition (SVD)
Any matrix can be reduced to diagonal form by suitable pre- and post-
multiplication by unitary matrices. Precisely, the following result holds.

Property 1.7 Let A ∈ C^{m×n}. There exist two unitary matrices U ∈ C^{m×m}
and V ∈ C^{n×n} such that

U^H AV = Σ = diag(σ1, . . . , σp) ∈ C^{m×n}   with p = min(m, n)     (1.9)

and σ1 ≥ . . . ≥ σp ≥ 0. Formula (1.9) is called the Singular Value Decompo-
sition (SVD) of A and the numbers σi (or σi(A)) are called the singular
values of A.



If A is a real-valued matrix, U and V will also be real-valued and in (1.9)
U^T must be written instead of U^H. The following characterization of the
singular values holds

σi(A) = √(λi(A^H A)),    i = 1, . . . , n.                          (1.10)

Indeed, from (1.9) it follows that A = UΣV^H, A^H = VΣU^H so that, U and
V being unitary, A^H A = VΣ^2 V^H, that is, λi(A^H A) = λi(Σ^2) = (σi(A))^2.
Since AA^H and A^H A are hermitian matrices, the columns of U, called the
left singular vectors of A, turn out to be the eigenvectors of AA^H (see
Section 1.8) and, therefore, they are not uniquely defined. The same holds
for the columns of V, which are the right singular vectors of A.
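The decomposition (1.9) and the relation (1.10) can be verified with the
built-in function svd; a small sketch on an arbitrary rectangular matrix:

    A = [1 2; 3 4; 5 6];                % an arbitrary 3 x 2 matrix
    [U, S, V] = svd(A);                 % U (3x3), V (2x2) orthogonal, S diagonal 3x2
    norm(U'*A*V - S)                    % close to zero, cf. (1.9)
    sigma = svd(A)                      % the singular values, in decreasing order
    sqrt(sort(eig(A'*A), 'descend'))    % the same values, cf. (1.10), up to roundoff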

Relation (1.10) implies that if A ∈ C^{n×n} is hermitian with eigenvalues given
by λ1, λ2, . . . , λn, then the singular values of A coincide with the modules
of the eigenvalues of A. Indeed, because AA^H = A^2, σi = √(λi^2) = |λi| for
i = 1, . . . , n.