
Alfio Quarteroni

Riccardo Sacco

Fausto Saleri

Springer

Texts in Applied Mathematics 37

Editors

J.E. Marsden

L. Sirovich

M. Golubitsky

W. Jäger

Advisors

G. Iooss

P. Holmes

D. Barkley

M. Dellnitz

P. Newton

Springer
New York Berlin Heidelberg Barcelona Hong Kong
London Milan Paris Singapore Tokyo

Alfio Quarteroni
Riccardo Sacco

Fausto Saleri

Numerical Mathematics

With 134 Illustrations


Alfio Quarteroni
Department of Mathematics
École Polytechnique Fédérale de Lausanne
CH-1015 Lausanne
Switzerland
alfio.quarteroni@epfl.ch

Riccardo Sacco
Dipartimento di Matematica
Politecnico di Milano
Piazza Leonardo da Vinci 32
20133 Milan
Italy
ricsac@mate.polimi.it

Fausto Saleri
Dipartimento di Matematica "F. Enriques"
Università degli Studi di Milano
Via Saldini 50
20133 Milan
Italy
fausto.saleri@unimi.it

Series Editors

J.E. Marsden
Control and Dynamical Systems, 107-81
California Institute of Technology
Pasadena, CA 91125
USA

L. Sirovich
Division of Applied Mathematics
Brown University
Providence, RI 02912
USA

M. Golubitsky
Department of Mathematics
University of Houston
Houston, TX 77204-3476
USA

W. Jäger
Department of Applied Mathematics
Universität Heidelberg
Im Neuenheimer Feld 294
69120 Heidelberg
Germany

Mathematics Subject Classification (1991): 15-01, 34-01, 35-01, 65-01

Library of Congress Cataloging-in-Publication Data

Quarteroni, Alfio.

Numerical mathematics/Alfio Quarteroni, Riccardo Sacco, Fausto Saleri.

p. cm. -- (Texts in applied mathematics; 37)

Includes bibliographical references and index.

ISBN 0-387-98959-5 (alk. paper)

1. Numerical analysis. I. Sacco, Riccardo. II. Saleri, Fausto. III. Title. IV. Series.
QA297.Q83 2000
519.4-dc21 99-059414

© 2000 Springer-Verlag New York, Inc.

All rights reserved. This work may not be translated or copied in whole or in part without

the written permission of the publisher (Springer-Verlag New York, Inc., 175 Fifth Avenue,

New York, NY 10010, USA), except for brief excerpts in connection with reviews or scholarly

analysis. Use in connection with any form of information storage and retrieval, electronic

adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.

The use of general descriptive names, trade names, trademarks, etc., in this publication, even

if the former are not especially identified, is not to be taken as a sign that such names, as

understood by the Trade Marks and Merchandise Marks Act, may accordingly be used freely

by anyone.

ISBN 0-387-98959-5 Springer-Verlag New York Berlin Heidelberg SPIN 10747955

Preface

Numerical mathematics is the branch of mathematics that proposes, develops, analyzes and applies methods from scientific computing to several fields including analysis, linear algebra, geometry, approximation theory, functional equations, optimization and differential equations. Other disciplines such as physics, the natural and biological sciences, engineering, and economics and the financial sciences frequently give rise to problems that need scientific computing for their solutions.

As such, numerical mathematics is the crossroad of several disciplines of great relevance in modern applied sciences, and can become a crucial tool for their qualitative and quantitative analysis. This role is also emphasized by the continual development of computers and algorithms, which make it possible nowadays, using scientific computing, to tackle problems of such a large size that real-life phenomena can be simulated providing accurate responses at affordable computational cost.

The corresponding spread of numerical software represents an enrichment for the scientific community. However, the user has to make the correct choice of the method (or the algorithm) which best suits the problem at hand. As a matter of fact, no black-box methods or algorithms exist that can effectively and accurately solve all kinds of problems.

One of the purposes of this book is to provide the mathematical foundations of numerical methods, to analyze their basic theoretical properties (stability, accuracy, computational complexity), and to demonstrate their performance on examples and counterexamples which outline their pros and cons. This is done using the MATLAB software environment (MATLAB is a registered trademark of The MathWorks, Inc.). This choice satisfies the two fundamental needs of user-friendliness and widespread diffusion, making it available on virtually every computer.

Every chapter is supplied with examples, exercises and applications of

the discussed theory to the solution of real-life problems. The reader is

thus in the ideal condition for acquiring the theoretical knowledge that is

required to make the right choice among the numerical methodologies and

make use of the related computer programs.

This book is primarily addressed to undergraduate students, with particular focus on the degree courses in Engineering, Mathematics, Physics and Computer Science. The attention which is paid to the applications and the related development of software makes it valuable also for graduate students, researchers and users of scientific computing in the most widespread professional fields.

The content of the volume is organized into four parts and 13 chapters.

Part I comprises two chapters in which we review basic linear algebra and

introduce the general concepts of consistency, stability and convergence of

a numerical method as well as the basic elements of computer arithmetic.

Part II is on numerical linear algebra, and is devoted to the solution of linear systems (Chapters 3 and 4) and to the computation of eigenvalues and eigenvectors (Chapter 5).

We continue with Part III, where we face several issues about functions and their approximation. Specifically, we are interested in the solution of nonlinear equations (Chapter 6), the solution of nonlinear systems and optimization problems (Chapter 7), polynomial approximation (Chapter 8) and numerical integration (Chapter 9).

Part IV, which is the most demanding in terms of mathematical background, is concerned with approximation, integration and transforms based on orthogonal polynomials (Chapter 10), the solution of initial value problems (Chapter 11), boundary value problems (Chapter 12) and initial-boundary value problems for parabolic and hyperbolic equations (Chapter 13).

Part I provides the indispensable background. Each of the remaining

Parts has a size and a content that make it well suited for a semester

course.

A guideline index to the use of the numerous MATLAB Programs developed in the book is reported at the end of the volume. These programs are also available at the web site address:

http://www1.mate.polimi.it/~calnum/programs.html

For the reader's ease, any code is accompanied by a brief description of its input/output parameters.

We express our thanks to the staff at Springer-Verlag New York for their expert guidance and assistance with editorial aspects, as well as to Dr. Martin Peters from Springer-Verlag Heidelberg and Dr. Francesca Bonadei from Springer-Italia for their advice and friendly collaboration all along this project.

We gratefully thank Professors L. Gastaldi and A. Valli for their useful

comments on Chapters 12 and 13.

We also wish to express our gratitude to our families for their forbearance

and understanding, and dedicate this book to them.

Lausanne, Switzerland Alfio Quarteroni

Milan, Italy Riccardo Sacco

Milan, Italy Fausto Saleri

January 2000

Contents

Series Preface v

Preface vii

PART I: Getting Started

1. Foundations of Matrix Analysis 1

1.1 Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.2 Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

1.3 Operations with Matrices . . . . . . . . . . . . . . . . . . . 5

1.3.1 Inverse of a Matrix . . . . . . . . . . . . . . . . . . 6

1.3.2 Matrices and Linear Mappings . . . . . . . . . . . 7

1.3.3 Operations with Block-Partitioned Matrices . . . . 7

1.4 Trace and Determinant of a Matrix . . . . . . . . . . . . . 8

1.5 Rank and Kernel of a Matrix . . . . . . . . . . . . . . . . 9

1.6 Special Matrices . . . . . . . . . . . . . . . . . . . . . . . . 10

1.6.1 Block Diagonal Matrices . . . . . . . . . . . . . . . 10

1.6.2 Trapezoidal and Triangular Matrices . . . . . . . . 11

1.6.3 Banded Matrices . . . . . . . . . . . . . . . . . . . 11

1.7 Eigenvalues and Eigenvectors . . . . . . . . . . . . . . . . 12

1.8 Similarity Transformations . . . . . . . . . . . . . . . . . . 14

1.9 The Singular Value Decomposition (SVD) . . . . . . . . . 16

1.10 Scalar Product and Norms in Vector Spaces . . . . . . . . 17

1.11 Matrix Norms . . . . . . . . . . . . . . . . . . . . . . . . . 21


1.11.1 Relation Between Norms and the

Spectral Radius of a Matrix . . . . . . . . . . . . . 25

1.11.2 Sequences and Series of Matrices . . . . . . . . . . 26

1.12 Positive Definite, Diagonally Dominant and M-Matrices . 27

1.13 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

2. Principles of Numerical Mathematics 33

2.1 Well-Posedness and Condition Number of a Problem . . . 33

2.2 Stability of Numerical Methods . . . . . . . . . . . . . . . 37

2.2.1 Relations Between Stability and Convergence . . . 40

2.3 A priori and a posteriori Analysis . . . . . . . . . . . . . . 41

2.4 Sources of Error in Computational Models . . . . . . . . . 43

2.5 Machine Representation of Numbers . . . . . . . . . . . . 45

2.5.1 The Positional System . . . . . . . . . . . . . . . . 45

2.5.2 The Floating-Point Number System . . . . . . . . 46

2.5.3 Distribution of Floating-Point Numbers . . . . . . 49

2.5.4 IEC/IEEE Arithmetic . . . . . . . . . . . . . . . . 49

2.5.5 Rounding of a Real Number in Its

Machine Representation . . . . . . . . . . . . ... 50

2.5.6 Machine Floating-Point Operations . . . . . . ... 52

2.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . ... 54

PART II: Numerical Linear Algebra

3. Direct Methods for the Solution of Linear Systems 57

3.1 Stability Analysis of Linear Systems . . . . . . . . . . . . 58

3.1.1 The Condition Number of a Matrix . . . . . . . . 58

3.1.2 Forward a priori Analysis . . . . . . . . . . . . . . 60

3.1.3 Backward a priori Analysis . . . . . . . . . . . . . 63

3.1.4 A posteriori Analysis . . . . . . . . . . . . . . . . . 64

3.2 Solution of Triangular Systems . . . . . . . . . . . . . . . 65

3.2.1 Implementation of Substitution Methods . . . . . 65

3.2.2 Rounding Error Analysis . . . . . . . . . . . . . . 67

3.2.3 Inverse of a Triangular Matrix . . . . . . . . . . . 67

3.3 The Gaussian Elimination Method (GEM) and

LU Factorization . . . . . . . . . . . . . . . . . . . . . . . 68

3.3.1 GEM as a Factorization Method . . . . . . . . . . 72

3.3.2 The Effect of Rounding Errors . . . . . . . . . . . 76

3.3.3 Implementation of LU Factorization . . . . . . . . 77

3.3.4 Compact Forms of Factorization . . . . . . . . . . 78

3.4 Other Types of Factorization . . . . . . . . . . . . . . . . . 79

3.4.1 LDMT Factorization . . . . . . . . . . . . . . . . . 79

3.4.2 Symmetric and Positive Definite Matrices:

The Cholesky Factorization . . . . . . . . . . ... 80

3.4.3 Rectangular Matrices: The QR Factorization ... 82


3.5 Pivoting . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

3.6 Computing the Inverse of a Matrix . . . . . . . . . . . . . 89

3.7 Banded Systems . . . . . . . . . . . . . . . . . . . . . . . . 90

3.7.1 Tridiagonal Matrices . . . . . . . . . . . . . . . . . 91

3.7.2 Implementation Issues . . . . . . . . . . . . . . . . 92

3.8 Block Systems . . . . . . . . . . . . . . . . . . . . . . . . . 93

3.8.1 Block LU Factorization . . . . . . . . . . . . . . . 94

3.8.2 Inverse of a Block-Partitioned Matrix . . . . . . . 95

3.8.3 Block Tridiagonal Systems . . . . . . . . . . . . . . 95

3.9 Sparse Matrices . . . . . . . . . . . . . . . . . . . . . . . . 97

3.9.1 The Cuthill-McKee Algorithm . . . . . . . . . . . 98

3.9.2 Decomposition into Substructures . . . . . . . . . 100

3.9.3 Nested Dissection . . . . . . . . . . . . . . . . . . . 103

3.10 Accuracy of the Solution Achieved Using GEM . . . . . . 103

3.11 An Approximate Computation of K(A) . . . . . . . . . . . 106

3.12 Improving the Accuracy of GEM . . . . . . . . . . . . . . 109

3.12.1 Scaling . . . . . . . . . . . . . . . . . . . . . . . . 110

3.12.2 Iterative Refinement . . . . . . . . . . . . . . . 111

3.13 Undetermined Systems . . . . . . . . . . . . . . . . . . . . 112

3.14 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . 115

3.14.1 Nodal Analysis of a Structured Frame . . . . . . . 115

3.14.2 Regularization of a Triangular Grid . . . . . . . . 118

3.15 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121

4. Iterative Methods for Solving Linear Systems 123

4.1 On the Convergence of Iterative Methods . . . . . . . . . . 123

4.2 Linear Iterative Methods . . . . . . . . . . . . . . . . . . . 126

4.2.1 Jacobi, Gauss-Seidel and Relaxation Methods . . . 127

4.2.2 Convergence Results for Jacobi and

Gauss-Seidel Methods . . . . . . . . . . . . . . . . 129

4.2.3 Convergence Results for the Relaxation Method . 131

4.2.4 A priori Forward Analysis . . . . . . . . . . . . . . 132

4.2.5 Block Matrices . . . . . . . . . . . . . . . . . . . . 133

4.2.6 Symmetric Form of the Gauss-Seidel and

SOR Methods . . . . . . . . . . . . . . . . . . . . . 133

4.2.7 Implementation Issues . . . . . . . . . . . . . . . . 135

4.3 Stationary and Nonstationary Iterative Methods . . . . . . 136

4.3.1 Convergence Analysis of the Richardson Method . 137

4.3.2 Preconditioning Matrices . . . . . . . . . . . . . . 139

4.3.3 The Gradient Method . . . . . . . . . . . . . . . . 146

4.3.4 The Conjugate Gradient Method . . . . . . . . . . 150

4.3.5 The Preconditioned Conjugate Gradient Method . 156

4.3.6 The Alternating-Direction Method . . . . . . . . . 158

4.4 Methods Based on Krylov Subspace Iterations . . . . . . . 159

4.4.1 The Arnoldi Method for Linear Systems . . . . . . 162


4.4.2 The GMRES Method . . . . . . . . . . . . . . . . 165

4.4.3 The Lanczos Method for Symmetric Systems . . . 167

4.5 The Lanczos Method for Unsymmetric Systems . . . . . . 168

4.6 Stopping Criteria . . . . . . . . . . . . . . . . . . . . . . . 171

4.6.1 A Stopping Test Based on the Increment . . . . . 172

4.6.2 A Stopping Test Based on the Residual . . . . . . 174

4.7 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . 174

4.7.1 Analysis of an Electric Network . . . . . . . . . . . 174

4.7.2 Finite Difference Analysis of Beam Bending . . . . 177

4.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179

5. Approximation of Eigenvalues and Eigenvectors 183

5.1 Geometrical Location of the Eigenvalues . . . . . . . . . . 183

5.2 Stability and Conditioning Analysis . . . . . . . . . . . . . 186

5.2.1 A priori Estimates . . . . . . . . . . . . . . . . . . 186

5.2.2 A posteriori Estimates . . . . . . . . . . . . . . . . 190

5.3 The Power Method . . . . . . . . . . . . . . . . . . . . . . 192

5.3.1 Approximation of the Eigenvalue of

Largest Module . . . . . . . . . . . . . . . . . . . . 192

5.3.2 Inverse Iteration . . . . . . . . . . . . . . . . . . . 195

5.3.3 Implementation Issues . . . . . . . . . . . . . . . . 196

5.4 The QR Iteration . . . . . . . . . . . . . . . . . . . . . . . 200

5.5 The Basic QR Iteration . . . . . . . . . . . . . . . . . . . . 201

5.6 The QR Method for Matrices in Hessenberg Form . . . . . 203

5.6.1 Householder and Givens Transformation Matrices 204

5.6.2 Reducing a Matrix in Hessenberg Form . . . . . . 207

5.6.3 QR Factorization of a Matrix in Hessenberg Form 209

5.6.4 The Basic QR Iteration Starting from

Upper Hessenberg Form . . . . . . . . . . . . . . . 210

5.6.5 Implementation of Transformation Matrices . . . . 212

5.7 The QR Iteration with Shifting Techniques . . . . . . . . . 215

5.7.1 The QR Method with Single Shift . . . . . . . . . 215

5.7.2 The QR Method with Double Shift . . . . . . . . . 218

5.8 Computing the Eigenvectors and the SVD of a Matrix . . 221

5.8.1 The Hessenberg Inverse Iteration . . . . . . . . . . 221

5.8.2 Computing the Eigenvectors from the

Schur Form of a Matrix . . . . . . . . . . . . . . . 221

5.8.3 Approximate Computation of the SVD of a Matrix 222

5.9 The Generalized Eigenvalue Problem . . . . . . . . . . . . 224

5.9.1 Computing the Generalized Real Schur Form . . . 225

5.9.2 Generalized Real Schur Form of

Symmetric-Definite Pencils . . . . . . . . . . . . . 226

5.10 Methods for Eigenvalues of Symmetric Matrices . . . . . . 227

5.10.1 The Jacobi Method . . . . . . . . . . . . . . . . . 227

5.10.2 The Method of Sturm Sequences . . . . . . . . . . 230


5.11 The Lanczos Method . . . . . . . . . . . . . . . . . . . . . 233

5.12 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . 235

5.12.1 Analysis of the Buckling of a Beam . . . . . . . . . 236

5.12.2 Free Dynamic Vibration of a Bridge . . . . . . . . 238

5.13 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240

PART III: Around Functions and Functionals

6. Rootfinding for Nonlinear Equations 245

6.1 Conditioning of a Nonlinear Equation . . . . . . . . . . . . 246

6.2 A Geometric Approach to Rootļ¬nding . . . . . . . . . . . 248

6.2.1 The Bisection Method . . . . . . . . . . . . . . . . 248

6.2.2 The Methods of Chord, Secant and Regula Falsi

and Newton's Method . . . . . . . . . . . . . . 251

6.2.3 The Dekker-Brent Method . . . . . . . . . . . . . 256

6.3 Fixed-Point Iterations for Nonlinear Equations . . . . . . . 257

6.3.1 Convergence Results for

Some Fixed-Point Methods . . . . . . . . . . . . . 260

6.4 Zeros of Algebraic Equations . . . . . . . . . . . . . . . . . 261

6.4.1 The Horner Method and Deflation . . . . . . . . . 262

6.4.2 The Newton-Horner Method . . . . . . . . . . . . 263

6.4.3 The Muller Method . . . . . . . . . . . . . . . . . 267

6.5 Stopping Criteria . . . . . . . . . . . . . . . . . . . . . . . 269

6.6 Post-Processing Techniques for Iterative Methods . . . . . 272

6.6.1 Aitken's Acceleration . . . . . . . . . . . . . . . 272

6.6.2 Techniques for Multiple Roots . . . . . . . . . . . 275

6.7 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . 276

6.7.1 Analysis of the State Equation for a Real Gas . . 276

6.7.2 Analysis of a Nonlinear Electrical Circuit . . . . . 277

6.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279

7. Nonlinear Systems and Numerical Optimization 281

7.1 Solution of Systems of Nonlinear Equations . . . . . . . . 282

7.1.1 Newton's Method and Its Variants . . . . . . . . . 283

7.1.2 Modified Newton's Methods . . . . . . . . . . . . . 284

7.1.3 Quasi-Newton Methods . . . . . . . . . . . . . . . 288

7.1.4 Secant-Like Methods . . . . . . . . . . . . . . . . . 288

7.1.5 Fixed-Point Methods . . . . . . . . . . . . . . . . . 290

7.2 Unconstrained Optimization . . . . . . . . . . . . . . . . . 294

7.2.1 Direct Search Methods . . . . . . . . . . . . . . . . 295

7.2.2 Descent Methods . . . . . . . . . . . . . . . . . . . 300

7.2.3 Line Search Techniques . . . . . . . . . . . . . . . 302

7.2.4 Descent Methods for Quadratic Functions . . . . . 304

7.2.5 Newton-Like Methods for Function Minimization . 307

7.2.6 Quasi-Newton Methods . . . . . . . . . . . . . . . 308


7.2.7 Secant-Like Methods . . . . . . . . . . . . . . . . . 309

7.3 Constrained Optimization . . . . . . . . . . . . . . . . . . 311

7.3.1 Kuhn-Tucker Necessary Conditions for

Nonlinear Programming . . . . . . . . . . . . . . . 313

7.3.2 The Penalty Method . . . . . . . . . . . . . . . . . 315

7.3.3 The Method of Lagrange Multipliers . . . . . . . . 317

7.4 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . 319

7.4.1 Solution of a Nonlinear System Arising from

Semiconductor Device Simulation . . . . . . . . . . 320

7.4.2 Nonlinear Regularization of a Discretization Grid . 323

7.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325

8. Polynomial Interpolation 327

8.1 Polynomial Interpolation . . . . . . . . . . . . . . . . . . . 328

8.1.1 The Interpolation Error . . . . . . . . . . . . . . . 329

8.1.2 Drawbacks of Polynomial Interpolation on Equally

Spaced Nodes and Runge's Counterexample . . . . 330

8.1.3 Stability of Polynomial Interpolation . . . . . . . . 332

8.2 Newton Form of the Interpolating Polynomial . . . . . . . 333

8.2.1 Some Properties of Newton Divided Differences . . 335

8.2.2 The Interpolation Error Using Divided Differences 337

8.3 Piecewise Lagrange Interpolation . . . . . . . . . . . . . . 338

8.4 Hermite-Birkhoff Interpolation . . . . . . . . . . . . . . 341

8.5 Extension to the Two-Dimensional Case . . . . . . . . . . 343

8.5.1 Polynomial Interpolation . . . . . . . . . . . . . . 343

8.5.2 Piecewise Polynomial Interpolation . . . . . . . . . 344

8.6 Approximation by Splines . . . . . . . . . . . . . . . . . . 348

8.6.1 Interpolatory Cubic Splines . . . . . . . . . . . . . 349

8.6.2 B-Splines . . . . . . . . . . . . . . . . . . . . . . . 353

8.7 Splines in Parametric Form . . . . . . . . . . . . . . . . . 357

8.7.1 Bézier Curves and Parametric B-Splines . . . . . . 359

8.8 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . 362

8.8.1 Finite Element Analysis of a Clamped Beam . . . 363

8.8.2 Geometric Reconstruction Based on

Computer Tomographies . . . . . . . . . . . . . . . 366

8.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 368

9. Numerical Integration 371

9.1 Quadrature Formulae . . . . . . . . . . . . . . . . . . . . . 371

9.2 Interpolatory Quadratures . . . . . . . . . . . . . . . . . . 373

9.2.1 The Midpoint or Rectangle Formula . . . . . . . . 373

9.2.2 The Trapezoidal Formula . . . . . . . . . . . . . . 375

9.2.3 The Cavalieri-Simpson Formula . . . . . . . . . . . 377

9.3 Newton-Cotes Formulae . . . . . . . . . . . . . . . . . . . 378

9.4 Composite Newton-Cotes Formulae . . . . . . . . . . . . . 383


9.5 Hermite Quadrature Formulae . . . . . . . . . . . . . . . . 386

9.6 Richardson Extrapolation . . . . . . . . . . . . . . . . . . 387

9.6.1 Romberg Integration . . . . . . . . . . . . . . . . . 389

9.7 Automatic Integration . . . . . . . . . . . . . . . . . . . . 391

9.7.1 Non Adaptive Integration Algorithms . . . . . . . 392

9.7.2 Adaptive Integration Algorithms . . . . . . . . . . 394

9.8 Singular Integrals . . . . . . . . . . . . . . . . . . . . . . . 398

9.8.1 Integrals of Functions with Finite

Jump Discontinuities . . . . . . . . . . . . . . . . . 398

9.8.2 Integrals of Infinite Functions . . . . . . . . . . . . 398

9.8.3 Integrals over Unbounded Intervals . . . . . . . . . 401

9.9 Multidimensional Numerical Integration . . . . . . . . . . 402

9.9.1 The Method of Reduction Formula . . . . . . . . . 403

9.9.2 Two-Dimensional Composite Quadratures . . . . . 404

9.9.3 Monte Carlo Methods for

Numerical Integration . . . . . . . . . . . . . . . . 407

9.10 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . 408

9.10.1 Computation of an Ellipsoid Surface . . . . . . . . 408

9.10.2 Computation of the Wind Action on a

Sailboat Mast . . . . . . . . . . . . . . . . . . . . . 410

9.11 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 412

PART IV: Transforms, Differentiation

and Problem Discretization

10. Orthogonal Polynomials in Approximation Theory 415

10.1 Approximation of Functions by Generalized Fourier Series 415

10.1.1 The Chebyshev Polynomials . . . . . . . . . . . . . 417

10.1.2 The Legendre Polynomials . . . . . . . . . . . . . 419

10.2 Gaussian Integration and Interpolation . . . . . . . . . . . 419

10.3 Chebyshev Integration and Interpolation . . . . . . . . . . 424

10.4 Legendre Integration and Interpolation . . . . . . . . . . . 426

10.5 Gaussian Integration over Unbounded Intervals . . . . . . 428

10.6 Programs for the Implementation of Gaussian Quadratures 429

10.7 Approximation of a Function in the Least-Squares Sense . 431

10.7.1 Discrete Least-Squares Approximation . . . . . . . 431

10.8 The Polynomial of Best Approximation . . . . . . . . . . . 433

10.9 Fourier Trigonometric Polynomials . . . . . . . . . . . . . 435

10.9.1 The Gibbs Phenomenon . . . . . . . . . . . . . . . 439

10.9.2 The Fast Fourier Transform . . . . . . . . . . . . . 440

10.10 Approximation of Function Derivatives . . . . . . . . . . . 442

10.10.1 Classical Finite Difference Methods . . . . . . . . . 442

10.10.2 Compact Finite Differences . . . . . . . . . . . . . 444

10.10.3 Pseudo-Spectral Derivative . . . . . . . . . . . . . 448

10.11 Transforms and Their Applications . . . . . . . . . . . . . 450


10.11.1 The Fourier Transform . . . . . . . . . . . . . . . . 450

10.11.2 (Physical) Linear Systems and Fourier Transform . 453

10.11.3 The Laplace Transform . . . . . . . . . . . . . . . 455

10.11.4 The Z-Transform . . . . . . . . . . . . . . . . . . . 457

10.12 The Wavelet Transform . . . . . . . . . . . . . . . . . . . . 458

10.12.1 The Continuous Wavelet Transform . . . . . . . . 458

10.12.2 Discrete and Orthonormal Wavelets . . . . . . . . 461

10.13 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . 463

10.13.1 Numerical Computation of Blackbody Radiation . 463

10.13.2 Numerical Solution of Schrödinger Equation . . . . 464

10.14 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 467

11. Numerical Solution of Ordinary Differential Equations 469

11.1 The Cauchy Problem . . . . . . . . . . . . . . . . . . . . . 469

11.2 One-Step Numerical Methods . . . . . . . . . . . . . . . . 472

11.3 Analysis of One-Step Methods . . . . . . . . . . . . . . . . 473

11.3.1 The Zero-Stability . . . . . . . . . . . . . . . . . . 475

11.3.2 Convergence Analysis . . . . . . . . . . . . . . . . 477

11.3.3 The Absolute Stability . . . . . . . . . . . . . . . . 479

11.4 Difference Equations . . . . . . . . . . . . . . . . . . . 482

11.5 Multistep Methods . . . . . . . . . . . . . . . . . . . . . . 487

11.5.1 Adams Methods . . . . . . . . . . . . . . . . . . . 490

11.5.2 BDF Methods . . . . . . . . . . . . . . . . . . . . 492

11.6 Analysis of Multistep Methods . . . . . . . . . . . . . . . . 492

11.6.1 Consistency . . . . . . . . . . . . . . . . . . . . . . 493

11.6.2 The Root Conditions . . . . . . . . . . . . . . . . . 494

11.6.3 Stability and Convergence Analysis for

Multistep Methods . . . . . . . . . . . . . . . . . . 495

11.6.4 Absolute Stability of Multistep Methods . . . . . . 499

11.7 Predictor-Corrector Methods . . . . . . . . . . . . . . . . . 502

11.8 Runge-Kutta Methods . . . . . . . . . . . . . . . . . . . . 508

11.8.1 Derivation of an Explicit RK Method . . . . . . . 511

11.8.2 Stepsize Adaptivity for RK Methods . . . . . . . . 512

11.8.3 Implicit RK Methods . . . . . . . . . . . . . . . . 514

11.8.4 Regions of Absolute Stability for RK Methods . . 516

11.9 Systems of ODEs . . . . . . . . . . . . . . . . . . . . . . . 517

11.10 Stiff Problems . . . . . . . . . . . . . . . . . . . . . . . 519

11.11 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . 521

11.11.1 Analysis of the Motion of a Frictionless Pendulum 522

11.11.2 Compliance of Arterial Walls . . . . . . . . . . . . 523

11.12 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 527

12. Two-Point Boundary Value Problems 531

12.1 A Model Problem . . . . . . . . . . . . . . . . . . . . . . . 531

12.2 Finite Difference Approximation . . . . . . . . . . . . . 533


12.2.1 Stability Analysis by the Energy Method . . . . . 534

12.2.2 Convergence Analysis . . . . . . . . . . . . . . . . 538

12.2.3 Finite Differences for Two-Point Boundary
Value Problems with Variable Coefficients . . . . . 540

12.3 The Spectral Collocation Method . . . . . . . . . . . . . . 542

12.4 The Galerkin Method . . . . . . . . . . . . . . . . . . . . . 544

12.4.1 Integral Formulation of Boundary-Value Problems 544

12.4.2 A Quick Introduction to Distributions . . . . . . . 546

12.4.3 Formulation and Properties of the

Galerkin Method . . . . . . . . . . . . . . . . . . . 547

12.4.4 Analysis of the Galerkin Method . . . . . . . . . . 548

12.4.5 The Finite Element Method . . . . . . . . . . . . . 550

12.4.6 Implementation Issues . . . . . . . . . . . . . . . . 556

12.4.7 Spectral Methods . . . . . . . . . . . . . . . . . . . 559

12.5 Advection-Diffusion Equations . . . . . . . . . . . . . . 560

12.5.1 Galerkin Finite Element Approximation . . . . . . 561

12.5.2 The Relationship Between Finite Elements and

Finite Differences; the Numerical Viscosity . . . . 563

12.5.3 Stabilized Finite Element Methods . . . . . . . . . 567

12.6 A Quick Glance to the Two-Dimensional Case . . . . . . . 572

12.7 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . 575

12.7.1 Lubrication of a Slider . . . . . . . . . . . . . . . . 575

12.7.2 Vertical Distribution of Spore

Concentration over Wide Regions . . . . . . . . . . 576

12.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 578

13. Parabolic and Hyperbolic Initial Boundary

Value Problems 581

13.1 The Heat Equation . . . . . . . . . . . . . . . . . . . . . . 581

13.2 Finite Difference Approximation of the Heat Equation . . 584

13.3 Finite Element Approximation of the Heat Equation . . . 586

13.3.1 Stability Analysis of the θ-Method . . . . . . . . . 588

13.4 Space-Time Finite Element Methods for the

Heat Equation . . . . . . . . . . . . . . . . . . . . . . . . . 593

13.5 Hyperbolic Equations: A Scalar Transport Problem . . . . 597

13.6 Systems of Linear Hyperbolic Equations . . . . . . . . . . 599

13.6.1 The Wave Equation . . . . . . . . . . . . . . . . . 601

13.7 The Finite Difference Method for Hyperbolic Equations . . 602

13.7.1 Discretization of the Scalar Equation . . . . . . . . 602

13.8 Analysis of Finite Difference Methods . . . . . . . . . . . 605

13.8.1 Consistency . . . . . . . . . . . . . . . . . . . . . . 605

13.8.2 Stability . . . . . . . . . . . . . . . . . . . . . . . . 605

13.8.3 The CFL Condition . . . . . . . . . . . . . . . . . 606

13.8.4 Von Neumann Stability Analysis . . . . . . . . . . 608

13.9 Dissipation and Dispersion . . . . . . . . . . . . . . . . . . 611


13.9.1 Equivalent Equations . . . . . . . . . . . . . . . . 614

13.10 Finite Element Approximation of Hyperbolic Equations . . 618

13.10.1 Space Discretization with Continuous and

Discontinuous Finite Elements . . . . . . . . . . . 618

13.10.2 Time Discretization . . . . . . . . . . . . . . . . . 620

13.11 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . 623

13.11.1 Heat Conduction in a Bar . . . . . . . . . . . . . . 623

13.11.2 A Hyperbolic Model for Blood Flow

Interaction with Arterial Walls . . . . . . . . . . . 623

13.12 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 625

References 627

Index of MATLAB Programs 643

Index 647

1 Foundations of Matrix Analysis

In this chapter we recall the basic elements of linear algebra which will be

employed in the remainder of the text. For most of the proofs as well as

for the details, the reader is referred to [Bra75], [Nob69], [Hal58]. Further

results on eigenvalues can be found in [Hou75] and [Wil65].

1.1 Vector Spaces

Definition 1.1 A vector space over the numeric field K (K = R or K = C) is a nonempty set V, whose elements are called vectors and in which two operations are defined, called addition and scalar multiplication, that enjoy the following properties:

1. addition is commutative and associative;

2. there exists an element 0 ∈ V (the zero vector or null vector) such that v + 0 = v for each v ∈ V;

3. 0 · v = 0, 1 · v = v, where 0 and 1 are respectively the zero and the unity of K;

4. for each element v ∈ V there exists its opposite, −v, in V such that v + (−v) = 0;

5. the following distributive properties hold:
   ∀α ∈ K, ∀v, w ∈ V: α(v + w) = αv + αw,
   ∀α, β ∈ K, ∀v ∈ V: (α + β)v = αv + βv;

6. the following associative property holds:
   ∀α, β ∈ K, ∀v ∈ V: (αβ)v = α(βv).

Example 1.1 Remarkable instances of vector spaces are:
- V = R^n (respectively V = C^n): the set of the n-tuples of real (respectively complex) numbers, n ≥ 1;
- V = P_n: the set of polynomials p_n(x) = \sum_{k=0}^{n} a_k x^k with real (or complex) coefficients a_k having degree less than or equal to n, n ≥ 0;
- V = C^p([a, b]): the set of real (or complex)-valued functions which are continuous on [a, b] up to their p-th derivative, 0 ≤ p < ∞.

Definition 1.2 We say that a nonempty part W of V is a vector subspace of V iff W is a vector space over K.

Example 1.2 The vector space P_n is a vector subspace of C^∞(R), which is the space of infinitely continuously differentiable functions on the real line. A trivial subspace of any vector space is the one containing only the zero vector.

In particular, the set W of the linear combinations of a system of p vectors of V, {v1, ..., vp}, is a vector subspace of V, called the generated subspace or span of the vector system, and is denoted by

W = span{v1, ..., vp} = {v = α1 v1 + ... + αp vp with αi ∈ K, i = 1, ..., p}.

The system {v1, ..., vp} is called a system of generators for W.

If W1, ..., Wm are vector subspaces of V, then the set

S = {w : w = v1 + ... + vm with vi ∈ Wi, i = 1, ..., m}

is also a vector subspace of V. We say that S is the direct sum of the subspaces Wi if any element s ∈ S admits a unique representation of the form s = v1 + ... + vm with vi ∈ Wi and i = 1, ..., m. In such a case, we shall write S = W1 ⊕ ... ⊕ Wm.


Definition 1.3 A system of vectors {v1, ..., vm} of a vector space V is called linearly independent if the relation

α1 v1 + α2 v2 + ... + αm vm = 0

with α1, α2, ..., αm ∈ K implies that α1 = α2 = ... = αm = 0. Otherwise, the system will be called linearly dependent.

We call a basis of V any system of linearly independent generators of V. If {u1, ..., un} is a basis of V, the expression v = v1 u1 + ... + vn un is called the decomposition of v with respect to the basis and the scalars v1, ..., vn ∈ K are the components of v with respect to the given basis.

Moreover, the following property holds.

Property 1.1 Let V be a vector space which admits a basis of n vectors. Then every system of linearly independent vectors of V has at most n elements and any other basis of V has n elements. The number n is called the dimension of V and we write dim(V) = n.

If, instead, for any n there always exist n linearly independent vectors of V, the vector space is called infinite dimensional.

Example 1.3 For any integer p the space C^p([a, b]) is infinite dimensional. The spaces R^n and C^n have dimension equal to n. The usual basis for R^n is the set of unit vectors {e1, ..., en} where (ei)_j = δij for i, j = 1, ..., n, δij denoting the Kronecker symbol, equal to 0 if i ≠ j and 1 if i = j. This choice is of course not the only one that is possible (see Exercise 2).

1.2 Matrices

Let m and n be two positive integers. We call a matrix having m rows and n columns, or a matrix m × n, or a matrix (m, n), with elements in K, a set of mn scalars aij ∈ K, with i = 1, ..., m and j = 1, ..., n, represented in the following rectangular array

A = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{bmatrix}.   (1.1)

When K = R or K = C we shall respectively write A ∈ R^{m×n} or A ∈ C^{m×n}, to explicitly outline the numerical fields which the elements of A belong to. Capital letters will be used to denote the matrices, while the lower case letters corresponding to those upper case letters will denote the matrix entries.


We shall abbreviate (1.1) as A = (aij) with i = 1, ..., m and j = 1, ..., n. The index i is called the row index, while j is the column index. The set (ai1, ai2, ..., ain) is called the i-th row of A; likewise, (a1j, a2j, ..., amj) is the j-th column of A.

If n = m the matrix is called square, or of order n, and the set of the entries (a11, a22, ..., ann) is called its main diagonal.

A matrix having one row or one column is called a row vector or column vector respectively. Unless otherwise specified, we shall always assume that a vector is a column vector. In the case n = m = 1, the matrix will simply denote a scalar of K.

Sometimes it turns out to be useful to distinguish within a matrix the set made up by specified rows and columns. This prompts us to introduce the following definition.

Definition 1.4 Let A be a matrix m × n. Let 1 ≤ i1 < i2 < ... < ik ≤ m and 1 ≤ j1 < j2 < ... < jl ≤ n be two sets of contiguous indexes. The matrix S(k × l) of entries s_{pq} = a_{i_p j_q} with p = 1, ..., k, q = 1, ..., l is called a submatrix of A. If k = l and i_r = j_r for r = 1, ..., k, S is called a principal submatrix of A.

Definition 1.5 A matrix A(m × n) is called block partitioned or said to be partitioned into submatrices if

A = \begin{bmatrix} A_{11} & A_{12} & \dots & A_{1l} \\ A_{21} & A_{22} & \dots & A_{2l} \\ \vdots & \vdots & \ddots & \vdots \\ A_{k1} & A_{k2} & \dots & A_{kl} \end{bmatrix},

where Aij are submatrices of A.

Among the possible partitions of A, we recall in particular the partition by columns

A = (a1, a2, ..., an),

ai being the i-th column vector of A. In a similar way the partition by rows of A can be defined. To fix the notations, if A is a matrix m × n, we shall denote by

A(i1 : i2, j1 : j2) = (aij),  i1 ≤ i ≤ i2, j1 ≤ j ≤ j2,

the submatrix of A of size (i2 − i1 + 1) × (j2 − j1 + 1) that lies between the rows i1 and i2 and the columns j1 and j2. Likewise, if v is a vector of size n, we shall denote by v(i1 : i2) the vector of size i2 − i1 + 1 made up by the i1-th to the i2-th components of v.

These notations are convenient in view of programming the algorithms that will be presented throughout the volume in the MATLAB language.
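Indeed, the colon notation above is precisely MATLAB's indexing syntax. As a minimal sketch (the matrix and vector below are made-up examples, not taken from the book's programs):

  A = [1 2 3 4; 5 6 7 8; 9 10 11 12];   % a 3 x 4 test matrix
  S = A(2:3, 1:2);                      % submatrix between rows 2,3 and columns 1,2
  v = (10:10:40)';                      % a column vector of size 4
  w = v(2:4);                           % components 2 through 4 of v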


1.3 Operations with Matrices

Let A = (aij) and B = (bij) be two matrices m × n over K. We say that A is equal to B if aij = bij for i = 1, ..., m, j = 1, ..., n. Moreover, we define the following operations:

- matrix sum: the matrix sum is the matrix A + B = (aij + bij). The neutral element in a matrix sum is the null matrix, still denoted by 0 and made up only by null entries;

- matrix multiplication by a scalar: the multiplication of A by λ ∈ K is a matrix λA = (λ aij);

- matrix product: the product of two matrices A and B of sizes (m, p) and (p, n) respectively, is a matrix C(m, n) whose entries are

  c_{ij} = \sum_{k=1}^{p} a_{ik} b_{kj},  for i = 1, ..., m, j = 1, ..., n.
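The entrywise product formula can be checked against MATLAB's built-in product; a minimal sketch, with arbitrary random test matrices, is the following (the discrepancy should be at roundoff level):

  m = 2; p = 3; n = 4;
  A = rand(m,p); B = rand(p,n);
  C = zeros(m,n);
  for i = 1:m
    for j = 1:n
      for k = 1:p
        C(i,j) = C(i,j) + A(i,k)*B(k,j);   % c_ij = sum over k of a_ik*b_kj
      end
    end
  end
  err = norm(C - A*B)                      % of the order of machine precision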

The matrix product is associative and distributive with respect to the matrix sum, but it is not in general commutative. The square matrices for which the property AB = BA holds will be called commutative.

In the case of square matrices, the neutral element in the matrix product is a square matrix of order n called the unit matrix of order n or, more frequently, the identity matrix, given by I_n = (δij). The identity matrix is, by definition, the only matrix n × n such that A I_n = I_n A = A for all square matrices A. In the following we shall omit the subscript n unless it is strictly necessary. The identity matrix is a special instance of a diagonal matrix of order n, that is, a square matrix of the type D = (dii δij). We will use in the following the notation D = diag(d11, d22, ..., dnn).

Finally, if A is a square matrix of order n and p is an integer, we define A^p as the product of A with itself iterated p times. We let A^0 = I.

Let us now address the so-called elementary row operations that can be performed on a matrix. They consist of:

- multiplying the i-th row of a matrix by a scalar α; this operation is equivalent to pre-multiplying A by the matrix D = diag(1, ..., 1, α, 1, ..., 1), where α occupies the i-th position;

- exchanging the i-th and j-th rows of a matrix; this can be done by pre-multiplying A by the matrix P^{(i,j)} of elements

p^{(i,j)}_{rs} = \begin{cases} 1 & \text{if } r = s = 1, \dots, i-1, i+1, \dots, j-1, j+1, \dots, n, \\ 1 & \text{if } r = j, s = i \text{ or } r = i, s = j, \\ 0 & \text{otherwise}, \end{cases}   (1.2)

where I_r denotes the identity matrix of order r = j − i − 1 if j > i (henceforth, matrices with size equal to zero will correspond to the empty set). Matrices like (1.2) are called elementary permutation matrices. The product of elementary permutation matrices is called a permutation matrix, and it performs the row exchanges associated with each elementary permutation matrix. In practice, a permutation matrix is a reordering by rows of the identity matrix;

- adding α times the j-th row of a matrix to its i-th row. This operation can also be performed by pre-multiplying A by the matrix I + N_α^{(i,j)}, where N_α^{(i,j)} is a matrix having null entries except the one in position (i, j), whose value is α.
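A quick MATLAB sketch (with an arbitrary test matrix and hypothetical choices of i, j and α) shows how the three elementary operations correspond to pre-multiplications:

  n = 4; i = 2; j = 4; alpha = 5;
  A = magic(n);                          % any test matrix
  D = eye(n); D(i,i) = alpha;            % diag(1,...,1,alpha,1,...,1)
  P = eye(n); P([i j],:) = P([j i],:);   % elementary permutation matrix P^(i,j)
  N = zeros(n); N(i,j) = alpha;          % N_alpha^(i,j)
  B1 = D*A;                              % row i multiplied by alpha
  B2 = P*A;                              % rows i and j exchanged
  B3 = (eye(n) + N)*A;                   % alpha times row j added to row i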

1.3.1 Inverse of a Matrix

Definition 1.6 A square matrix A of order n is called invertible (or regular or nonsingular) if there exists a square matrix B of order n such that AB = BA = I. B is called the inverse matrix of A and is denoted by A^{-1}. A matrix which is not invertible is called singular.

If A is invertible its inverse is also invertible, with (A^{-1})^{-1} = A. Moreover, if A and B are two invertible matrices of order n, their product AB is also invertible, with (AB)^{-1} = B^{-1} A^{-1}. The following property holds.

Property 1.2 A square matrix is invertible iff its column vectors are linearly independent.

Definition 1.7 We call the transpose of a matrix A ∈ R^{m×n} the matrix n × m, denoted by A^T, that is obtained by exchanging the rows of A with the columns of A.

Clearly, (A^T)^T = A, (A + B)^T = A^T + B^T, (AB)^T = B^T A^T and (αA)^T = α A^T ∀α ∈ R. If A is invertible, then also (A^T)^{-1} = (A^{-1})^T = A^{-T}.

Definition 1.8 Let A ∈ C^{m×n}; the matrix B = A^H ∈ C^{n×m} is called the conjugate transpose (or adjoint) of A if b_ij = ā_ji, where ā_ji is the complex conjugate of a_ji.

In analogy with the case of the real matrices, it turns out that (A + B)^H = A^H + B^H, (AB)^H = B^H A^H and (αA)^H = ᾱ A^H ∀α ∈ C.

Definition 1.9 A matrix A ∈ R^{n×n} is called symmetric if A = A^T, while it is antisymmetric if A = −A^T. Finally, it is called orthogonal if A^T A = A A^T = I, that is, A^{-1} = A^T.

Permutation matrices are orthogonal and the same is true for their products.


Definition 1.10 A matrix A ∈ C^{n×n} is called hermitian or self-adjoint if Ā^T = A, that is, if A^H = A, while it is called unitary if A^H A = A A^H = I. Finally, if A A^H = A^H A, A is called normal.

As a consequence, a unitary matrix is one such that A^{-1} = A^H. Of course, a unitary matrix is also normal, but it is not in general hermitian. For instance, the matrix of Example 1.4 is unitary, although not symmetric (if s ≠ 0). We finally notice that the diagonal entries of a hermitian matrix must necessarily be real (see also Exercise 5).

1.3.2 Matrices and Linear Mappings

Definition 1.11 A linear map from C^n into C^m is a function f : C^n → C^m such that f(αx + βy) = α f(x) + β f(y) ∀α, β ∈ K and ∀x, y ∈ C^n.

The following result links matrices and linear maps.

Property 1.3 Let f : C^n → C^m be a linear map. Then, there exists a unique matrix A_f ∈ C^{m×n} such that

f(x) = A_f x   ∀x ∈ C^n.   (1.3)

Conversely, if A_f ∈ C^{m×n} then the function defined in (1.3) is a linear map from C^n into C^m.

Example 1.4 An important example of a linear map is the counterclockwise rotation by an angle ϑ in the plane (x1, x2). The matrix associated with such a map is given by

G(\vartheta) = \begin{bmatrix} c & s \\ -s & c \end{bmatrix}, \qquad c = \cos(\vartheta), \ s = \sin(\vartheta),

and it is called a rotation matrix.
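For instance, a short MATLAB check (angle and vector chosen arbitrarily) confirms that G(ϑ) is orthogonal, and thus unitary, as observed after Definition 1.10:

  theta = pi/6;
  c = cos(theta); s = sin(theta);
  G = [c s; -s c];                % rotation matrix of Example 1.4
  y = G*[1; 0];                   % rotated vector
  norm(G'*G - eye(2))             % zero up to roundoff: G is orthogonal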

1.3.3 Operations with Block-Partitioned Matrices

All the operations that have been previously introduced can be extended to the case of a block-partitioned matrix A, provided that the size of each single block is such that any single matrix operation is well-defined. Indeed, the following result can be shown (see, e.g., [Ste73]).

Property 1.4 Let A and B be the block matrices

A = \begin{bmatrix} A_{11} & \dots & A_{1l} \\ \vdots & \ddots & \vdots \\ A_{k1} & \dots & A_{kl} \end{bmatrix}, \qquad B = \begin{bmatrix} B_{11} & \dots & B_{1n} \\ \vdots & \ddots & \vdots \\ B_{m1} & \dots & B_{mn} \end{bmatrix},

where Aij and Bij are matrices of size (ki × lj) and (mi × nj). Then we have


1. for λ ∈ C,

\lambda A = \begin{bmatrix} \lambda A_{11} & \dots & \lambda A_{1l} \\ \vdots & \ddots & \vdots \\ \lambda A_{k1} & \dots & \lambda A_{kl} \end{bmatrix}; \qquad A^T = \begin{bmatrix} A_{11}^T & \dots & A_{k1}^T \\ \vdots & \ddots & \vdots \\ A_{1l}^T & \dots & A_{kl}^T \end{bmatrix};

2. if k = m, l = n, mi = ki and nj = lj, then

A + B = \begin{bmatrix} A_{11}+B_{11} & \dots & A_{1l}+B_{1l} \\ \vdots & \ddots & \vdots \\ A_{k1}+B_{k1} & \dots & A_{kl}+B_{kl} \end{bmatrix};

3. if l = m, li = mi and ki = ni, then, letting C_{ij} = \sum_{s=1}^{m} A_{is} B_{sj},

AB = \begin{bmatrix} C_{11} & \dots & C_{1n} \\ \vdots & \ddots & \vdots \\ C_{k1} & \dots & C_{kn} \end{bmatrix}.
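A small numerical sketch of item 3 (block sizes chosen arbitrarily): multiplying by blocks gives the same result as the full product.

  A = rand(4); B = rand(4,3);            % partitioned into 2 x 2 blocks
  A11 = A(1:2,1:2); A12 = A(1:2,3:4);
  A21 = A(3:4,1:2); A22 = A(3:4,3:4);
  B11 = B(1:2,1:2); B12 = B(1:2,3);
  B21 = B(3:4,1:2); B22 = B(3:4,3);
  C = [A11*B11+A12*B21, A11*B12+A12*B22;
       A21*B11+A22*B21, A21*B12+A22*B22];
  norm(C - A*B)                          % zero up to roundoff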

1.4 Trace and Determinant of a Matrix

Let us consider a square matrix A of order n. The trace of a matrix is the sum of the diagonal entries of A, that is, tr(A) = \sum_{i=1}^{n} a_{ii}.

We call the determinant of A the scalar defined through the following formula

\det(A) = \sum_{\pi \in P} \operatorname{sign}(\pi)\, a_{1\pi_1} a_{2\pi_2} \cdots a_{n\pi_n},

where P = {π = (π1, ..., πn)^T} is the set of the n! vectors that are obtained by permuting the index vector i = (1, ..., n)^T, and sign(π) is equal to 1 (respectively, −1) if an even (respectively, odd) number of exchanges is needed to obtain π from i.

The following properties hold:

det(A) = det(A^T), det(AB) = det(A)det(B), det(A^{-1}) = 1/det(A),
det(A^H) = \overline{\det(A)}, det(αA) = α^n det(A), ∀α ∈ K.

Moreover, if two rows or columns of a matrix coincide, the determinant vanishes, while exchanging two rows (or two columns) produces a change of sign in the determinant. Of course, the determinant of a diagonal matrix is the product of the diagonal entries.

Denoting by A_ij the matrix of order n − 1 obtained from A by eliminating the i-th row and the j-th column, we call the complementary minor associated with the entry a_ij the determinant of the matrix A_ij. We call the k-th principal (dominating) minor of A, d_k, the determinant of the principal submatrix of order k, A_k = A(1 : k, 1 : k). If we denote by Δ_ij = (−1)^{i+j} det(A_ij) the cofactor of the entry a_ij, the actual computation of the determinant of A can be performed using the following recursive relation

\det(A) = \begin{cases} a_{11} & \text{if } n = 1, \\ \sum_{j=1}^{n} \Delta_{ij} a_{ij} & \text{for } n > 1, \end{cases}   (1.4)

which is known as the Laplace rule. If A is a square invertible matrix of order n, then

A^{-1} = \frac{1}{\det(A)} C,

where C is the matrix having entries Δ_ji, i, j = 1, ..., n.

As a consequence, a square matrix is invertible iff its determinant is non-vanishing. In the case of nonsingular diagonal matrices the inverse is still a diagonal matrix having entries given by the reciprocals of the diagonal entries of the matrix.

Every orthogonal matrix is invertible, its inverse is given by A^T, and moreover det(A) = ±1.
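The Laplace rule (1.4) translates directly into a recursive MATLAB function; the sketch below (the name laplacedet is ours, and the expansion is done along the first row) is meant only to illustrate the formula, since its cost grows like n! and is therefore impractical for all but very small matrices.

  function d = laplacedet(A)
  % LAPLACEDET determinant of a square matrix via the Laplace rule (1.4)
  n = size(A,1);
  if n == 1
    d = A(1,1);
  else
    d = 0;
    for j = 1:n
      M = A(2:n, [1:j-1, j+1:n]);              % delete row 1 and column j
      d = d + (-1)^(1+j)*A(1,j)*laplacedet(M); % cofactor times entry
    end
  end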

1.5 Rank and Kernel of a Matrix

Let A be a rectangular matrix m × n. We call a determinant of order q (with q ≥ 1) extracted from matrix A the determinant of any square matrix of order q obtained from A by eliminating m − q rows and n − q columns.

Definition 1.12 The rank of A (denoted by rank(A)) is the maximum order of the nonvanishing determinants extracted from A. A matrix has complete or full rank if rank(A) = min(m, n).

Notice that the rank of A represents the maximum number of linearly independent column vectors of A, that is, the dimension of the range of A, defined as

range(A) = {y ∈ R^m : y = Ax for x ∈ R^n}.   (1.5)


Rigorously speaking, one should distinguish between the column rank of A and the row rank of A, the latter being the maximum number of linearly independent row vectors of A. Nevertheless, it can be shown that the row rank and column rank do actually coincide.

The kernel of A is defined as the subspace

ker(A) = {x ∈ R^n : Ax = 0}.

The following relations hold:

1. rank(A) = rank(A^T) (if A ∈ C^{m×n}, rank(A) = rank(A^H));

2. rank(A) + dim(ker(A)) = n.

In general, dim(ker(A)) ≠ dim(ker(A^T)). If A is a nonsingular square matrix, then rank(A) = n and dim(ker(A)) = 0.

Example 1.5 Let

A = \begin{bmatrix} 1 & 1 & 0 \\ 1 & -1 & 1 \end{bmatrix}.

Then, rank(A) = 2, dim(ker(A)) = 1 and dim(ker(A^T)) = 0.
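In MATLAB the same quantities can be obtained with the built-in functions rank and null (the latter returns an orthonormal basis of the kernel):

  A = [1 1 0; 1 -1 1];
  rank(A)             % returns 2
  size(null(A), 2)    % dim(ker(A)) = 1
  size(null(A'), 2)   % dim(ker(A')) = 0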

We finally notice that for a matrix A ∈ C^{n×n} the following properties are equivalent:

1. A is nonsingular;
2. det(A) ≠ 0;
3. ker(A) = {0};
4. rank(A) = n;
5. A has linearly independent rows and columns.

1.6 Special Matrices

1.6.1 Block Diagonal Matrices

These are matrices of the form D = diag(D1, ..., Dn), where Di are square matrices with i = 1, ..., n. Clearly, each single diagonal block can be of different size. We shall say that a block diagonal matrix has size n if n is the number of its diagonal blocks. The determinant of a block diagonal matrix is given by the product of the determinants of the single diagonal blocks.


1.6.2 Trapezoidal and Triangular Matrices

A matrix A(m × n) is called upper trapezoidal if aij = 0 for i > j, while it is lower trapezoidal if aij = 0 for i < j. The name is due to the fact that, in the case of upper trapezoidal matrices, with m < n, the nonzero entries of the matrix form a trapezoid.

A triangular matrix is a square trapezoidal matrix of order n of the form

L = \begin{bmatrix} l_{11} & 0 & \dots & 0 \\ l_{21} & l_{22} & \dots & 0 \\ \vdots & \vdots & & \vdots \\ l_{n1} & l_{n2} & \dots & l_{nn} \end{bmatrix} \quad \text{or} \quad U = \begin{bmatrix} u_{11} & u_{12} & \dots & u_{1n} \\ 0 & u_{22} & \dots & u_{2n} \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \dots & u_{nn} \end{bmatrix}.

The matrix L is called lower triangular while U is upper triangular.

Let us recall some algebraic properties of triangular matrices that are easy to check.

- The determinant of a triangular matrix is the product of the diagonal entries;
- the inverse of a lower (respectively, upper) triangular matrix is still lower (respectively, upper) triangular;
- the product of two lower (respectively, upper) triangular matrices is still lower (respectively, upper) triangular;
- if we call unit triangular matrix a triangular matrix that has diagonal entries equal to 1, then the product of lower (respectively, upper) unit triangular matrices is still lower (respectively, upper) unit triangular.

1.6.3 Banded Matrices

The matrices introduced in the previous section are special instances of banded matrices. Indeed, we say that a matrix A ∈ R^{m×n} (or in C^{m×n}) has lower band p if aij = 0 when i > j + p and upper band q if aij = 0 when j > i + q. Diagonal matrices are banded matrices for which p = q = 0, while trapezoidal matrices have p = m − 1, q = 0 (lower trapezoidal) and p = 0, q = n − 1 (upper trapezoidal).

Other banded matrices of relevant interest are the tridiagonal matrices, for which p = q = 1, and the upper bidiagonal (p = 0, q = 1) or lower bidiagonal (p = 1, q = 0) ones. In the following, tridiag_n(b, d, c) will denote the tridiagonal matrix of size n having respectively on the lower and upper principal diagonals the vectors b = (b1, ..., b_{n−1})^T and c = (c1, ..., c_{n−1})^T, and on the principal diagonal the vector d = (d1, ..., dn)^T. If bi = β, di = δ and ci = γ, β, δ and γ being given constants, the matrix will be denoted by tridiag_n(β, δ, γ).
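For instance, tridiag_5(-1, 2, -1) can be assembled in MATLAB with three calls to the built-in diag (a minimal sketch; the values -1, 2, -1 are just a familiar example):

  n = 5; beta = -1; delta = 2; gamma = -1;
  T = diag(delta*ones(n,1)) ...
    + diag(beta*ones(n-1,1), -1) ...
    + diag(gamma*ones(n-1,1), 1);   % tridiag_n(beta, delta, gamma)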


We also mention the so-called lower Hessenberg matrices (p = m − 1, q = 1) and upper Hessenberg matrices (p = 1, q = n − 1), which have the following structure:

H = \begin{bmatrix} h_{11} & h_{12} & & 0 \\ h_{21} & h_{22} & \ddots & \\ \vdots & & \ddots & h_{m-1,n} \\ h_{m1} & \dots & \dots & h_{mn} \end{bmatrix} \quad \text{or} \quad H = \begin{bmatrix} h_{11} & h_{12} & \dots & h_{1n} \\ h_{21} & h_{22} & & h_{2n} \\ & \ddots & \ddots & \vdots \\ 0 & & h_{m,n-1} & h_{mn} \end{bmatrix}.

Matrices of similar shape can obviously be set up in the block-like format.

1.7 Eigenvalues and Eigenvectors

Let A be a square matrix of order n with real or complex entries; the number λ ∈ C is called an eigenvalue of A if there exists a nonnull vector x ∈ C^n such that Ax = λx. The vector x is the eigenvector associated with the eigenvalue λ and the set of the eigenvalues of A is called the spectrum of A, denoted by σ(A). We say that x and y are respectively a right eigenvector and a left eigenvector of A, associated with the eigenvalue λ, if

Ax = λx,  y^H A = λ y^H.

The eigenvalue λ corresponding to the eigenvector x can be determined by computing the Rayleigh quotient λ = x^H A x/(x^H x). The number λ is the solution of the characteristic equation

p_A(λ) = det(A − λI) = 0,

where p_A(λ) is the characteristic polynomial. Since this latter is a polynomial of degree n with respect to λ, there certainly exist n eigenvalues of A, not necessarily distinct. The following properties can be proved:

\det(A) = \prod_{i=1}^{n} \lambda_i, \qquad \operatorname{tr}(A) = \sum_{i=1}^{n} \lambda_i,   (1.6)

and since det(A^T − λI) = det((A − λI)^T) = det(A − λI) one concludes that σ(A) = σ(A^T) and, in an analogous way, that σ(A^H) = \overline{σ(A)}.

From the first relation in (1.6) it can be concluded that a matrix is singular iff it has at least one null eigenvalue, since p_A(0) = det(A) = \prod_{i=1}^{n} λ_i.

Secondly, if A has real entries, p_A(λ) turns out to be a real-coefficient polynomial so that complex eigenvalues of A shall necessarily occur in complex conjugate pairs.


Finally, due to the Cayley-Hamilton Theorem, if p_A(λ) is the characteristic polynomial of A, then p_A(A) = 0, where p_A(A) denotes a matrix polynomial (for the proof see, e.g., [Axe94], p. 51).

The maximum module of the eigenvalues of A is called the spectral radius of A and is denoted by

\rho(A) = \max_{\lambda \in \sigma(A)} |\lambda|.   (1.7)

Characterizing the eigenvalues of a matrix as the roots of a polynomial implies in particular that λ is an eigenvalue of A ∈ C^{n×n} iff λ̄ is an eigenvalue of A^H. An immediate consequence is that ρ(A) = ρ(A^H). Moreover, ∀A ∈ C^{n×n}, ∀α ∈ C, ρ(αA) = |α| ρ(A), and ρ(A^k) = [ρ(A)]^k ∀k ∈ N.
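Both relations in (1.6) and the spectral radius (1.7) are easy to check numerically in MATLAB; the matrix below is an arbitrary example:

  A = [2 1 0; 1 3 1; 0 1 4];
  lambda = eig(A);                 % the eigenvalues of A
  abs(prod(lambda) - det(A))       % first relation in (1.6), up to roundoff
  abs(sum(lambda) - trace(A))      % second relation in (1.6)
  rho = max(abs(lambda))           % spectral radius (1.7)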

Finally, assume that A is a block triangular matrix

A = \begin{bmatrix} A_{11} & A_{12} & \dots & A_{1k} \\ 0 & A_{22} & \dots & A_{2k} \\ \vdots & & \ddots & \vdots \\ 0 & \dots & 0 & A_{kk} \end{bmatrix}.

As p_A(λ) = p_{A11}(λ) p_{A22}(λ) · · · p_{Akk}(λ), the spectrum of A is given by the union of the spectra of each single diagonal block. As a consequence, if A is triangular, the eigenvalues of A are its diagonal entries.

For each eigenvalue λ of a matrix A, the set of the eigenvectors associated with λ, together with the null vector, identifies a subspace of C^n which is called the eigenspace associated with λ and corresponds by definition to ker(A − λI). The dimension of the eigenspace is

dim[ker(A − λI)] = n − rank(A − λI),

and is called the geometric multiplicity of the eigenvalue λ. It can never be greater than the algebraic multiplicity of λ, which is the multiplicity of λ as a root of the characteristic polynomial. Eigenvalues having geometric multiplicity strictly less than the algebraic one are called defective. A matrix having at least one defective eigenvalue is called defective.

The eigenspace associated with an eigenvalue of a matrix A is invariant with respect to A in the sense of the following definition.

Definition 1.13 A subspace S in C^n is called invariant with respect to a square matrix A if AS ⊂ S, where AS is the image of S through A.


1.8 Similarity Transformations

Definition 1.14 Let C be a square nonsingular matrix having the same order as the matrix A. We say that the matrices A and C^{-1}AC are similar, and the transformation from A to C^{-1}AC is called a similarity transformation. Moreover, we say that the two matrices are unitarily similar if C is unitary.

Two similar matrices share the same spectrum and the same characteristic polynomial. Indeed, it is easy to check that if (λ, x) is an eigenvalue-eigenvector pair of A, (λ, C^{-1}x) is the same for the matrix C^{-1}AC since

(C^{-1}AC) C^{-1}x = C^{-1}Ax = λ C^{-1}x.

We notice in particular that the product matrices AB and BA, with A ∈ C^{n×m} and B ∈ C^{m×n}, are not similar but satisfy the following property (see [Hac94], p. 18, Theorem 2.4.6):

σ(AB)\{0} = σ(BA)\{0},

that is, AB and BA share the same spectrum apart from null eigenvalues, so that ρ(AB) = ρ(BA).

The use of similarity transformations aims at reducing the complexity of the problem of evaluating the eigenvalues of a matrix. Indeed, if a given matrix could be transformed into a similar matrix in diagonal or triangular form, the computation of the eigenvalues would be immediate. The main result in this direction is the following theorem (for the proof, see [Dem97], Theorem 4.2).

Property 1.5 (Schur decomposition) Given A ∈ C^{n×n}, there exists U unitary such that

U^{-1} A U = U^H A U = \begin{bmatrix} \lambda_1 & b_{12} & \dots & b_{1n} \\ 0 & \lambda_2 & & b_{2n} \\ \vdots & & \ddots & \vdots \\ 0 & \dots & 0 & \lambda_n \end{bmatrix} = T,

where λi are the eigenvalues of A.

It thus turns out that every matrix A is unitarily similar to an upper triangular matrix. The matrices T and U are not necessarily unique [Hac94].
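In MATLAB a (complex) Schur decomposition is returned by the built-in schur; the test matrix below is arbitrary:

  A = [4 1 0; 0 4 1; 1 0 4];
  [U, T] = schur(A, 'complex');    % U unitary, T upper triangular
  norm(U'*A*U - T)                 % zero up to roundoff
  diag(T)                          % the eigenvalues of A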

The Schur decomposition theorem gives rise to several important results; among them, we recall:

1. every hermitian matrix is unitarily similar to a diagonal real matrix, that is, when A is hermitian every Schur decomposition of A is diagonal. In such an event, since

U^{-1}AU = Λ = diag(λ1, ..., λn),

it turns out that AU = UΛ, that is, A u_i = λi u_i for i = 1, ..., n, so that the column vectors of U are the eigenvectors of A. Moreover, since the eigenvectors are pairwise orthogonal, it turns out that a hermitian matrix has a system of orthonormal eigenvectors that generates the whole space C^n. Finally, it can be shown that a matrix A of order n is similar to a diagonal matrix D iff the eigenvectors of A form a basis for C^n [Axe94];

2. a matrix A ∈ C^{n×n} is normal iff it is unitarily similar to a diagonal matrix. As a consequence, a normal matrix A ∈ C^{n×n} admits the following spectral decomposition: A = UΛU^H = \sum_{i=1}^{n} λi u_i u_i^H, with U unitary and Λ diagonal [SS90];

3. let A and B be two normal and commutative matrices; then, the generic eigenvalue μi of A + B is given by the sum λi + ξi, where λi and ξi are the eigenvalues of A and B associated with the same eigenvector.

There are, of course, nonsymmetric matrices that are similar to diagonal matrices, but these are not unitarily similar (see, e.g., Exercise 7).

The Schur decomposition can be improved as follows (for the proof see, e.g., [Str80], [God66]).

Property 1.6 (Canonical Jordan Form) Let A be any square matrix. Then, there exists a nonsingular matrix X which transforms A into a block diagonal matrix J such that

X^{-1}AX = J = diag(J_{k1}(λ1), J_{k2}(λ2), ..., J_{kl}(λl)),

which is called canonical Jordan form, λj being the eigenvalues of A and J_k(λ) ∈ C^{k×k} a Jordan block of the form J_1(λ) = λ if k = 1 and, for k > 1,

J_k(\lambda) = \begin{bmatrix} \lambda & 1 & 0 & \dots & 0 \\ 0 & \lambda & 1 & \ddots & \vdots \\ \vdots & & \ddots & \ddots & 0 \\ \vdots & & & \lambda & 1 \\ 0 & \dots & \dots & 0 & \lambda \end{bmatrix}.

If an eigenvalue is defective, the size of the corresponding Jordan block is greater than one. Therefore, the canonical Jordan form tells us that a matrix can be diagonalized by a similarity transformation iff it is nondefective. For this reason, the nondefective matrices are called diagonalizable. In particular, normal matrices are diagonalizable.


Partitioning X by columns, X = (x1, ..., xn), it can be seen that the ki vectors associated with the Jordan block J_{ki}(λi) satisfy the following recursive relation:

A x_l = \lambda_i x_l, \qquad l = \sum_{j=1}^{i-1} k_j + 1,
A x_j = \lambda_i x_j + x_{j-1}, \qquad j = l+1, \dots, l-1+k_i, \ \text{if } k_i \neq 1.   (1.8)

The vectors xi are called principal vectors or generalized eigenvectors of A.

Example 1.6 Let us consider the following matrix

A = \begin{bmatrix} 7/4 & 3/4 & -1/4 & -1/4 & -1/4 & 1/4 \\ 0 & 2 & 0 & 0 & 0 & 0 \\ -1/2 & -1/2 & 5/2 & 1/2 & -1/2 & 1/2 \\ -1/2 & -1/2 & -1/2 & 5/2 & 1/2 & 1/2 \\ -1/4 & -1/4 & -1/4 & -1/4 & 11/4 & 1/4 \\ -3/2 & -1/2 & -1/2 & 1/2 & 1/2 & 7/2 \end{bmatrix}.

The Jordan canonical form of A and its associated matrix X are given by

J = \begin{bmatrix} 2&1&0&0&0&0 \\ 0&2&0&0&0&0 \\ 0&0&3&1&0&0 \\ 0&0&0&3&1&0 \\ 0&0&0&0&3&0 \\ 0&0&0&0&0&2 \end{bmatrix}, \qquad X = \begin{bmatrix} 1&0&0&0&0&1 \\ 0&1&0&0&0&1 \\ 0&0&1&0&0&1 \\ 0&0&0&1&0&1 \\ 0&0&0&0&1&1 \\ 1&1&1&1&1&1 \end{bmatrix}.

Notice that two different Jordan blocks are related to the same eigenvalue (λ = 2). It is easy to check property (1.8). Consider, for example, the Jordan block associated with the eigenvalue λ2 = 3; we have

A x3 = [0 0 3 0 0 3]^T = 3 [0 0 1 0 0 1]^T = λ2 x3,
A x4 = [0 0 1 3 0 4]^T = 3 [0 0 0 1 0 1]^T + [0 0 1 0 0 1]^T = λ2 x4 + x3,
A x5 = [0 0 0 1 3 4]^T = 3 [0 0 0 0 1 1]^T + [0 0 0 1 0 1]^T = λ2 x5 + x4.
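Since the example is fully explicit, the similarity X^{-1}AX = J can also be verified numerically in MATLAB (a sketch; we check AX = XJ to avoid inverting X):

  A = [ 7/4  3/4 -1/4 -1/4 -1/4  1/4;
          0    2    0    0    0    0;
       -1/2 -1/2  5/2  1/2 -1/2  1/2;
       -1/2 -1/2 -1/2  5/2  1/2  1/2;
       -1/4 -1/4 -1/4 -1/4 11/4  1/4;
       -3/2 -1/2 -1/2  1/2  1/2  7/2];
  J = diag([2 2 3 3 3 2]) + diag([1 0 1 1 0], 1);
  X = [eye(5), ones(5,1); ones(1,6)];
  norm(A*X - X*J)                  % zero up to roundoff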

1.9 The Singular Value Decomposition (SVD)

Any matrix can be reduced in diagonal form by a suitable pre- and post-multiplication by unitary matrices. Precisely, the following result holds.

Property 1.7 Let A ∈ C^{m×n}. There exist two unitary matrices U ∈ C^{m×m} and V ∈ C^{n×n} such that

U^H A V = Σ = diag(σ1, ..., σp) ∈ C^{m×n}, with p = min(m, n),   (1.9)

and σ1 ≥ ... ≥ σp ≥ 0. Formula (1.9) is called the Singular Value Decomposition (SVD) of A and the numbers σi (or σi(A)) are called singular values of A.


If A is a real-valued matrix, U and V will also be real-valued and in (1.9) U^T must be written instead of U^H. The following characterization of the singular values holds:

\sigma_i(A) = \sqrt{\lambda_i(A^H A)}, \quad i = 1, \dots, n.   (1.10)

Indeed, from (1.9) it follows that A = UΣV^H, A^H = VΣU^H so that, U and V being unitary, A^H A = VΣ^2 V^H, that is, λi(A^H A) = λi(Σ^2) = (σi(A))^2.

Since AA^H and A^H A are hermitian matrices, the columns of U, called the left singular vectors of A, turn out to be the eigenvectors of AA^H (see Section 1.8) and, therefore, they are not uniquely defined. The same holds for the columns of V, which are the right singular vectors of A.

Relation (1.10) implies that if A ∈ C^{n×n} is hermitian with eigenvalues given by λ1, λ2, ..., λn, then the singular values of A coincide with the modules of the eigenvalues of A. Indeed, because AA^H = A^2, σi = \sqrt{λi^2} = |λi| for i = 1, ..., n.
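A quick MATLAB confirmation of (1.10) and of the last remark, on a symmetric (hence hermitian) test matrix chosen for illustration:

  A = [3 1; 1 3];        % eigenvalues 2 and 4
  svd(A)                 % singular values: 4 and 2
  sqrt(eig(A'*A))        % relation (1.10), up to ordering
  abs(eig(A))            % moduli of the eigenvalues: the same values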