. 12
( 26)


In this chapter we address the numerical solution of systems of nonlinear
equations and the minimization of a function of several variables.
The ¬rst problem generalizes to the n-dimensional case the search for
the zeros of a function, which was considered in Chapter 6, and can be
formulated as follows: given F : Rn ’ Rn ,

¬nd x— ∈ Rn such that F(x— ) = 0. (7.1)

Problem (7.1) will be solved by extending to several dimensions some of
the schemes that have been proposed in Chapter 6.
The basic formulation of the second problem reads: given f : Rn ’ R,
called an objective function,

minimize f (x) in Rn , (7.2)

and is called an unconstrained optimization problem.
A typical example consists of determining the optimal allocation of n
resources, x1 , x2 , . . . , xn , in competition with each other and ruled by a
speci¬c law. Generally, such resources are not unlimited; this circumstance,
from a mathematical standpoint, amounts to requiring that the minimizer
of the objective function lies within a subset „¦ ‚ Rn , and, possibly, that
some equality or inequality constraints must be satis¬ed.
When these constraints exist the optimization problem is called con-
strained and can be formulated as follows: given the objective function f ,

minimize f (x) in „¦ ‚ Rn . (7.3)
282 7. Nonlinear Systems and Numerical Optimization

Remarkable instances of (7.3) are those in which „¦ is characterized by con-
ditions like h(x) = 0 (equality constraints) or h(x) ¤ 0 (inequality con-
straints), where h : Rn ’ Rm , with m ¤ n, is a given function, called cost
function, and the condition h(x) ¤ 0 means hi (x) ¤ 0, for i = 1, . . . , m.
If the function h is continuous and „¦ is connected, problem (7.3) is
usually referred to as a nonlinear programming problem. Notable examples
in this area are:

convex programming if f is a convex function and h has convex compo-
nents (see (7.21));

linear programming if f and h are linear;

quadratic programming if f is quadratic and h is linear.

Problems (7.1) and (7.2) are strictly related to one another. Indeed, if we
denote by Fi the components of F, then a point x— , a solution of (7.1),
is a minimizer of the function f (x) = i=1 Fi2 (x). Conversely, assuming
that f is di¬erentiable and setting the partial derivatives of f equal to
zero at a point x— at which f is minimum leads to a system of nonlinear
equations. Thus, any system of nonlinear equations can be associated with
a suitable minimization problem, and vice versa. We shall take advantage
of this observation when devising e¬cient numerical methods.

7.1 Solution of Systems of Nonlinear Equations
Before considering problem (7.1), let us set some notation which will be
used throughout the chapter.
For k ≥ 0, we denote by C k (D) the set of k-continuously di¬erentiable
functions from D to Rn , where D ⊆ Rn is a set that will be made precise
from time to time. We shall always assume that F ∈ C 1 (D), i.e., F : Rn ’
Rn is a continuously di¬erentiable function on D.
We denote also by JF (x) the Jacobian matrix associated with F and
evaluated at the point x = (x1 , . . . , xn )T of Rn , de¬ned as

(JF (x))ij = (x), i, j = 1, . . . , n.

Given any vector norm · , we shall henceforth denote the sphere of radius
R with center x— by

B(x— ; R) = {y ∈ Rn : y ’ x— < R} .
7.1 Solution of Systems of Nonlinear Equations 283

7.1.1 Newton™s Method and Its Variants
An immediate extension to the vector case of Newton™s method (6.16) for
scalar equations can be formulated as follows:

given x(0) ∈ Rn , for k = 0, 1, . . . , until convergence:

JF (x(k) )δx(k) = ’F(x(k) );
x(k+1) = x(k) + δx(k) .

Thus, at each step k the solution of a linear system with matrix JF (x(k) )
is required.

Example 7.1 Consider the nonlinear system
 ex2 +x2 ’ 1 = 0,
1 2

 ex2 ’x2 ’ 1
1 2 = 0,
2 2
which admits the unique solution x— = 0. In this case, F(x) = (ex1 +x2 ’
2 2
1, ex1 ’x2 ’ 1). Running Program 57, leads to convergence in 15 iterations to the
pair (0.61 · 10’5 , 0.61 · 10’5 )T , starting from the initial datum x(0) = (0.1, 0.1)T ,
thus demonstrating a fairly rapid convergence rate. The results, however, dra-
matically change as the choice of the initial guess is varied. For instance, picking
up x(0) = (10, 10)T , 220 iterations are needed to obtain a solution comparable to
the previous one, while, starting from x(0) = (20, 20)T , Newton™s method fails to


The previous example points out the high sensitivity of Newton™s method
on the choice of the initial datum x(0) , as con¬rmed by the following local
convergence result.

Theorem 7.1 Let F : Rn ’ Rn be a C 1 function in a convex open set
D of Rn that contains x— . Suppose that J’1 (x— ) exists and that there exist
positive constants R, C and L, such that J’1 (x— ) ¤ C and

∀x, y ∈ B(x— ; R),
JF (x) ’ JF (y) ¤ L x ’ y

having denoted by the same symbol · two consistent vector and matrix
norms. Then, there exists r > 0 such that, for any x(0) ∈ B(x— ; r), the
sequence (7.4) is uniquely de¬ned and converges to x— with

x(k+1) ’ x— ¤ CL x(k) ’ x— 2 . (7.5)
Proof. Proceeding by induction on k, let us check (7.5) and, moreover, that
x(k+1) ∈ B(x— ; r), where r = min(R, 1/(2CL)). First, we prove that for any
x(0) ∈ B(x— ; r), the inverse matrix J’1 (x(0) ) exists. Indeed

J’1 (x— )[JF (x(0) ) ’ JF (x— )] ¤ J’1 (x— ) JF (x(0) ) ’ JF (x— ) ¤ CLr ¤ ,
284 7. Nonlinear Systems and Numerical Optimization

and thus, thanks to Theorem 1.5, we can conclude that J’1 (x(0) ) exists, since

J’1 (x— )
J’1 (x(0) ) ¤ ¤ 2 J’1 (x— ) ¤ 2C.
1 ’ J’1 (x— )[JF (x(0) ) ’ JF (x— )]

As a consequence, x(1) is well de¬ned and

x(1) ’ x— = x(0) ’ x— ’ J’1 (x(0) )[F(x(0) ) ’ F(x— )].

Factoring out J’1 (x(0) ) on the right hand side and passing to the norms, we get

¤ J’1 (x(0) )
x(1) ’ x— F(x— ) ’ F(x(0) ) ’ JF (x(0) )[x— ’ x(0) ]

¤ 2C x ’ x(0) 2

where the remainder of Taylor™s series of F has been used. The previous relation
proves (7.5) in the case k = 0; moreover, since x(0) ∈ B(x— ; r), we have x— ’
x(0) ¤ 1/(2CL), from which x(1) ’ x— ¤ x ’ x(0) .
This ensures that x(1) ∈ B(x— ; r).
By a similar proof, one can check that, should (7.5) be true for a certain k,
then the same inequality would follow also for k + 1 in place of k. This proves
the theorem.

Theorem 7.1 thus con¬rms that Newton™s method is quadratically conver-
gent only if x(0) is su¬ciently close to the solution x— and if the Jacobian
matrix is nonsingular. Moreover, it is worth noting that the computational
e¬ort needed to solve the linear system (7.4) can be excessively high as n
gets large. Also, JF (x(k) ) could be ill-conditioned, which makes it quite di¬-
cult to obtain an accurate solution. For these reasons, several modi¬cations
to Newton™s method have been proposed, which will be brie¬‚y considered
in the later sections, referring to the specialized literature for further details
(see [OR70], [DS83], [Erh97], [BS90] and the references therein).

7.1.2 Modi¬ed Newton™s Methods
Several modi¬cations of Newton™s method have been proposed in order
to reduce its cost when the computed solution is su¬ciently close to x— .
Further variants, that are globally convergent, will be introduced for the
solution of the minimization problem (7.2).

1. Cyclic updating of the Jacobian matrix
An e¬cient alternative to method (7.4) consists of keeping the Jacobian
matrix (more precisely, its factorization) unchanged for a certain number,
say p ≥ 2, of steps. Generally, a deterioration of convergence rate is accom-
panied by a gain in computational e¬ciency.
7.1 Solution of Systems of Nonlinear Equations 285

Program 57 implements Newton™s method in the case in which the LU
factorization of the Jacobian matrix is updated once every p steps. The pro-
grams used to solve the triangular systems have been described in Chapter
Here and in later codings in this chapter, we denote by x0 the initial
vector, by F and J the variables containing the functional expressions of F
and of its Jacobian matrix JF , respectively. The parameters toll and nmax
represent the stopping tolerance in the convergence of the iterative process
and the maximum admissible number of iterations, respectively. In output,
the vector x contains the approximation to the searched zero of F, while
nit denotes the number of iterations necessary to converge.

Program 57 - newtonxsys : Newton™s method for nonlinear systems

function [x, nit] = newtonsys(F, J, x0, toll, nmax, p)
[n,m]=size(F); nit=0; Fxn=zeros(n,1); x=x0; err=toll+1;
for i=1:n, for j=1:n, Jxn(i,j)=eval(J((i-1)*n+j,:)); end; end
[L,U,P]=lu(Jxn); step=0;
while err > toll
if step == p
step = 0;
for i=1:n;
for j=1:n; Jxn(i,j)=eval(J((i-1)*n+j,:)); end
for i=1:n, Fxn(i)=eval(F(i,:)); end
nit=nit+1; step=step+1; Fxn=-P*Fxn; y=forward col(L,Fxn);
deltax=backward col(U,y); x = x + deltax; err=norm(deltax);
if nit > nmax
disp(™ Fails to converge within maximum number of iterations ™);

2. Inexact solution of the linear systems
Another possibility consists of solving the linear system (7.4) by an iter-
ative method where the maximum number of admissible iterations is ¬xed
a priori. The resulting schemes are identi¬ed as Newton-Jacobi, Newton-
SOR or Newton-Krylov methods, according to the iterative process that is
used for the linear system (see [BS90], [Kel99]). Here, we limit ourselves to
describing the Newton-SOR method.
286 7. Nonlinear Systems and Numerical Optimization

In analogy with what was done in Section 4.2.1, let us decompose the
Jacobian matrix at step k as

JF (x(k) ) = Dk ’ Ek ’ Fk (7.6)

where Dk = D(x(k) ), ’Ek = ’E(x(k) ) and ’Fk = ’F(x(k) ), the diagonal
part and the lower and upper triangular portions of the matrix JF (x(k) ),
respectively. We suppose also that Dk is nonsingular. The SOR method for
solving the linear system in (7.4) is organized as follows: setting δx0 = 0,

δx(k) = Mk δxr’1 ’ ωk (Dk ’ ωk Ek )’1 F(x(k) ),
r = 1, 2, . . . , (7.7)

where Mk is the iteration matrix of SOR method
Mk = [Dk ’ ωk Ek ] [(1 ’ ωk )Dk + ωk Fk ] ,

and ωk is a positive relaxation parameter whose optimal value can rarely
be determined a priori. Assume that only r = m steps of the method are
(k) (k)
carried out. Recalling that δxr = xr ’ x(k) and still denoting by x(k+1)
the approximate solution computed after m steps, we ¬nd that this latter
can be written as (see Exercise 1)
x(k+1) = x(k) ’ ωk Mm’1 + · · · + I (Dk ’ ωk Ek ) F(x(k) ). (7.8)

This method is thus a composite iteration, in which at each step k, starting
from x(k) , m steps of the SOR method are carried out to solve approxi-
mately system (7.4).
The integer m, as well as ωk , can depend on the iteration index k; the
simplest choice amounts to performing, at each Newton™s step, only one
iteration of the SOR method, thus obtaining for r = 1 from (7.7) the one-
step Newton-SOR method
x(k+1) = x(k) ’ ωk (Dk ’ ωk Ek ) F(x(k) ).

In a similar way, the preconditioned Newton-Richardson method with ma-
trix Pk , if truncated at the m-th iteration, is

P’1 F(x(k) ),
x(k+1) = x(k) ’ I + Mk + . . . + Mk

where Pk is the preconditioner of JF and

Mk = P’1 Nk , Nk = Pk ’ JF (x(k) ).

For an e¬cient implementation of these techniques we refer to the MAT-
LAB software package developed in [Kel99].
7.1 Solution of Systems of Nonlinear Equations 287

3. Di¬erence approximations of the Jacobian matrix
Another possibility consists of replacing JF (x(k) ) (whose explicit compu-
tation is often very expensive) with an approximation through n-dimensional
di¬erences of the form
F(x(k) + hj ej ) ’ F(x(k) )
∀k ≥ 0,
(Jh )j = , (7.9)
where ej is the j-th vector of the canonical basis of Rn and hj > 0 are
increments to be suitably chosen at each step k of the iteration (7.4). The
following result can be shown.

Property 7.1 Let F and x— be such that the hypotheses of Theorem 7.1
are ful¬lled, where · denotes the · 1 vector norm and the corresponding
induced matrix norm. If there exist two positive constants µ and h such
that x(0) ∈ B(x— , µ) and 0 < |hj | ¤ h for j = 1, . . . , n then the sequence

de¬ned by
x(k+1) = x(k) ’ Jh F(x(k) ), (7.10)

is well de¬ned and converges linearly to x— . Moreover, if there exists a
positive constant C such that max|hj | ¤ C x(k) ’ x— or, equivalently,
there exists a positive constant c such that max|hj | ¤ c F(x(k) ) , then
the sequence (7.10) is convergent quadratically.
This result does not provide any constructive indication as to how to com-
pute the increments hj . In this regard, the following remarks can be made.
The ¬rst-order truncation error with respect to hj , which arises from the
divided di¬erence (7.10), can be reduced by reducing the sizes of hj . On
the other hand, a too small value for hj can lead to large rounding er-
rors. A trade-o¬ must therefore be made between the need of limiting the
truncation errors and ensuring a certain accuracy in the computations.
A possible choice is to take

(k) (k)
M max |xj |, Mj sign(xj ),
hj =

where Mj is a parameter that characterizes the typical size of the com-
ponent xj of the solution. Further improvements can be achieved using
higher-order divided di¬erences to approximate derivatives, like
(k) (k)
F(x(k) + hj ej ) ’ F(x(k) ’ hj ej )
∀k ≥ 0.
(Jh )j = ,
For further details on this subject, see, for instance, [BS90].
288 7. Nonlinear Systems and Numerical Optimization

7.1.3 Quasi-Newton Methods
By this term, we denote all those schemes in which globally convergent
methods are coupled with Newton-like methods that are only locally con-
vergent, but with an order greater than one.
In a quasi-Newton method, given a continuously di¬erentiable function
F : Rn ’ Rn , and an initial value x(0) ∈ Rn , at each step k one has to
accomplish the following operations:
1. compute F(x(k) );
2. choose JF (x(k) ) as being either the exact JF (x(k) ) or an approxima-
tion of it;
3. solve the linear system JF (x(k) )δx(k) = ’F(x(k) );
4. set x(k+1) = x(k) + ±k δx(k) , where ±k are suitable damping parame-
Step 4. is thus the characterizing element of this family of methods. It will
be addressed in Section 7.2.6, where a criterion for selecting the “direction”
δx(k) will be provided.

7.1.4 Secant-like Methods
These methods are constructed starting from the secant method introduced
in Section 6.2 for scalar functions. Precisely, given two vectors x(0) and x(1) ,
at the generic step k ≥ 1 we solve the linear system

Qk δx(k+1) = ’F(x(k) ) (7.11)

and we set x(k+1) = x(k) + δx(k+1) . Qk is an n — n matrix such that

Qk δx(k) = F(x(k) ) ’ F(x(k’1) ) = b(k) , k ≥ 1,

and is obtained by a formal generalization of (6.13). However, the algebraic
relation above does not su¬ce to uniquely determine Qk . For this purpose
we require Qk for k ≥ n to be a solution to the following set of n systems

Qk x(k) ’ x(k’j) = F(x(k) ) ’ F(x(k’j) ), j = 1, . . . , n. (7.12)

If the vectors x(k’j) , . . . , x(k) are linearly independent, system (7.12) allows
for calculating all the unknown coe¬cients {(Qk )lm , l, m = 1, . . . , n} of
Qk . Unfortunately, in practice the above vectors tend to become linearly
dependent and the resulting scheme is unstable, not to mention the need
for storing all the previous n iterates.
For these reasons, an alternative approach is pursued which aims at pre-
serving the information already provided by the method at step k. Precisely,
7.1 Solution of Systems of Nonlinear Equations 289

Qk is looked for in such a way that the di¬erence between the following
linear approximants to F(x(k’1) ) and F(x(k) ), respectively

F(x(k) ) + Qk (x ’ x(k) ), F(x(k’1) ) + Qk’1 (x ’ x(k’1) ),

is minimized jointly with the constraint that Qk satis¬es system (7.12).
Using (7.12) with j = 1, the di¬erence between the two approximants is
found to be

dk = (Qk ’ Qk’1 ) x ’ x(k’1) . (7.13)

Let us decompose the vector x ’ x(k’1) as

x ’ x(k’1) = ±δx(k) + s,

where ± ∈ R and sT δx(k) = 0. Therefore, (7.13) becomes

dk = ± (Qk ’ Qk’1 ) δx(k) + (Qk ’ Qk’1 ) s.

Only the second term in the relation above can be minimized since the ¬rst
one is independent of Qk , being

(Qk ’ Qk’1 )δx(k) = b(k) ’ Qk’1 δx(k) .

The problem has thus become: ¬nd the matrix Qk such that (Qk ’ Qk’1 ) s
is minimized ∀s orthogonal to δx(k) with the constraint that (7.12) holds.
It can be shown that such a matrix exists and can be recursively computed
as follows
(b(k) ’ Qk’1 δx(k) )δx(k)
Qk = Qk’1 + . (7.14)
δx(k) δx(k)
The method (7.11), with the choice (7.14) of matrix Qk is known as the
Broyden method. To initialize (7.14), we set Q0 equal to the matrix JF (x(0) )
or to any approximation of it, for instance, the one yielded by (7.9). As for
the convergence of Broyden™s method, the following result holds.

Property 7.2 If the assumptions of Theorem 7.1 are satis¬ed and there
exist two positive constants µ and γ such that

x(0) ’ x— ¤ µ, Q0 ’ JF (x— ) ¤ γ,

then the sequence of vectors x(k) generated by Broyden™s method is well
de¬ned and converges superlinearly to x— , that is

x(k) ’ x— ¤ ck x(k’1) ’ x— (7.15)

where the constants ck are such that lim ck = 0.
290 7. Nonlinear Systems and Numerical Optimization

Under further assumptions, it is also possible to prove that the sequence
Qk converges to JF (x— ), a property that does not necessarily hold for the
above method as demonstrated in Example 7.3.
There exist several variants to Broyden™s method which aim at reducing
its computational cost, but are usually less stable (see [DS83], Chapter 8).

Program 58 implements Broyden™s method (7.11)-(7.14). We have denoted
by Q the initial approximation Q0 in (7.14).

Program 58 - broyden : Broyden™s method for nonlinear systems
function [x,it]=broyden(x,Q,nmax,toll,f)
[n,m]=size(f); it=0; err=1;
fk=zeros(n,1); fk1=fk;
for i=1:n, fk(i)=eval(f(i,:)); end
while it < nmax & err > toll
s=-Q \ fk; x=s+x; err=norm(s,inf);
if err > toll
for i=1:n, fk1(i)=eval(f(i,:)); end
it=it+1; fk=fk1;

Example 7.2 Let us solve using Broyden™s method the nonlinear system of Ex-
ample 7.1. The method converges in 35 iterations to the value (0.7 · 10’8 , 0.7 ·
10’8 )T compared with the 26 iterations required by Newton™s method starting
from the same initial guess (x(0) = (0.1, 0.1)T ). The matrix Q0 has been set equal
to the Jacobian matrix evaluated at x(0) . Figure 7.1 shows the behavior of the

Euclidean norm of the error for both methods.

Example 7.3 Suppose we wish to solve using the Broyden method the nonlinear
system F(x) = (x1 +x2 ’3; x2 +x2 ’9)T = 0. This system admits the two solutions
1 2
(0, 3)T and (3, 0)T . Broyden™s method converges in 8 iterations to the solution
(0, 3)T starting from x(0) = (2, 4)T . However, the sequence of Qk , stored in the
variable Q of Program 58, does not converge to the Jacobian matrix, since
1 1 1 1
lim Q(k) = = JF [(0, 3)T ] = .
1.5 1.75 0 6

7.1.5 Fixed-point Methods
We conclude the analysis of methods for solving systems of nonlinear equa-
tions by extending to n-dimensions the ¬xed-point techniques introduced
in the scalar case. For this, we reformulate problem (7.1) as
given G : Rn ’ Rn , ¬nd x— ∈ Rn such that G(x— ) = x— (7.16)
7.1 Solution of Systems of Nonlinear Equations 291









0 5 10 15 20 25 30 35 40

FIGURE 7.1. Euclidean norm of the error for the Newton method (solid line)
and the Broyden method (dashed line) in the case of the nonlinear system of
Example 7.1

where G is related to F through the following property: if x— is a ¬xed
point of G, then F(x— ) = 0.
Analogously to what was done in Section 6.3, we introduce iterative meth-
ods for the solution of (7.16) of the form:

given x(0) ∈ Rn , for k = 0, 1, . . . until convergence, ¬nd

x(k+1) = G(x(k) ). (7.17)

In order to analyze the convergence of the ¬xed-point iteration (7.17) the
following de¬nition will be useful.

De¬nition 7.1 A mapping G : D ‚ Rn ’ Rn is contractive on a set
D0 ‚ D if there exists a constant ± < 1 such that G(x) ’ G(y) ¤
± x ’ y for all x, y in D0 where · is a suitable vector norm.
The existence and uniqueness of a ¬xed point for G is ensured by the
following theorem.

Theorem 7.2 (contraction-mapping theorem) Suppose that G : D ‚
Rn ’ Rn is contractive on a closed set D0 ‚ D and that G(x) ‚ D0 for
all x ∈ D0 . Then G has a unique ¬xed point in D0 .
Proof. Let us ¬rst prove the uniqueness of the ¬xed point. For this, assume that
there exist two distinct ¬xed points, x— , y— . Then

x— ’ y— = G(x— ) ’ G(y— ) ¤ ± x— ’ y—

from which (1 ’ ±) x— ’ y— ¤ 0. Since (1 ’ ±) > 0, it must necessarily be that
x— ’ y— = 0, i.e., x— = y— .
292 7. Nonlinear Systems and Numerical Optimization

To prove the existence we show that x(k) given by (7.17) is a Cauchy sequence.
This in turn implies that x(k) is convergent to a point x(—) ∈ D0 . Take x(0)
arbitrarily in D0 . Then, since the image of G is included in D0 , the sequence x(k)
is well de¬ned and

x(k+1) ’ x(k) = G(x(k) ) ’ G(x(k’1) ) ¤ ± x(k) ’ x(k’1) .

After p steps, p ≥ 1, we obtain
’x ¤ x(k+i) ’ x(k+i’1) ¤ ±p’1 + . . . + 1 x(k+1) ’ x(k)
(k+p) (k)
¤ x(1) ’ x(0) .

Owing to the continuity of G it follows that lim G(x(k) ) = G(x(—) ) which proves
that x is a ¬xed point for G.

The following result provides a su¬cient condition for the iteration (7.17) to
converge (for the proof see [OR70], pp. 299-301), and extends the analogous
Theorem 6.3 in the scalar case.

Property 7.3 Suppose that G : D ‚ Rn ’ Rn has a ¬xed point x— in the
interior of D and that G is continuously di¬erentiable in a neighborhood of
x— . Denote by JG the Jacobian matrix of G and assume that ρ(JG (x(—) )) <
1. Then there exists a neighborhood S of x— such that S ‚ D and, for any
x(0) ∈ S, the iterates de¬ned by (7.17) all lie in D and converge to x— .

As usual, since the spectral radius is the in¬mum of the induced matrix
norms, in order for convergence to hold it su¬ces to check that JG (x) < 1
for some matrix norm.

Example 7.4 Consider the nonlinear system
F(x) = x2 + x2 ’ 1, 2x1 + x2 ’ 1 = 0,
1 2

whose solutions are x— = (0, 1)T and x— = (4/5, ’3/5)T . To solve it, let us use
1 2
two ¬xed-point schemes, respectively de¬ned by the following iteration functions
®  ® 
1 ’ x2 1 ’ x2
G1 (x) = ° » , G2 (x) = ° ».
2 2 (7.18)
1 ’ x1 ’ 1 ’ x1
2 2

It can be checked that Gi (x— ) = x— for i = 1, 2 and that the Jacobian matrices
i i
of G1 and G2 , evaluated at x— and x— respectively, are
1 2
®  ® 
0 ’1 0 ’1
2 2
JG1 (x— ) = ° » , JG2 (x— ) = ° ».
1 2
0 0 0
7.1 Solution of Systems of Nonlinear Equations 293

The spectral radii are ρ(JG1 (x— )) = 0 and ρ(JG2 (x— )) = 2/3 0.817 < 1 so
1 2
that both methods are convergent in a suitable neighborhood of their respective
¬xed points.
Running Program 59, with a tolerance of 10’10 on the maximum absolute
di¬erence between two successive iterates, the ¬rst scheme converges to x— in 9
iterations, starting from x(0) = (’0.9, 0.9)T , while the second one converges to
x— in 115 iterations, starting from x(0) = (0.9, 0.9)T . The dramatic change in
the convergence behavior of the two methods can be explained in view of the
di¬erence between the spectral radii of the corresponding iteration matrices. •

Remark 7.1 Newton™s method can be regarded as a ¬xed-point method
with iteration function

GN (x) = x ’ J’1 (x)F(x). (7.19)

If we denote by r(k) = F(x(k) ) the residual at step k, from (7.19) it turns
out that Newton™s method can be alternatively formulated as

I ’ JGN (x(k) ) x(k+1) ’ x(k) = ’r(k) .

This equation allows us to interpret Newton™s method as a preconditioned
stationary Richardson method. This prompts introducing a parameter ±k
in order to accelerate the convergence of the iteration

I ’ JGN (x(k) ) x(k+1) ’ x(k) = ’±k r(k) .

The problem of how to select ±k will be addressed in Section 7.2.6.
An implementation of the ¬xed-point method (7.17) is provided in Pro-
gram 59. We have denoted by dim the size of the nonlinear system and
by Phi the variables containing the functional expressions of the iteration
function G. In output, the vector alpha contains the approximation of the
sought zero of F and the vector res contains the sequence of the maximum
norms of the residuals of F(x(k) ).

Program 59 - ¬xposys : Fixed-point method for nonlinear systems
function [alpha, res, nit]=¬xposys(dim, x0, nmax, toll, Phi, F)
x = x0; alpha=[x™]; res = 0;
for k=1:dim,
r=abs(eval(F(k,:))); if (r > res), res = r; end
nit = 0; residual(1)=res;
while ((nit <= nmax) & (res >= toll)),
nit = nit + 1;
for k = 1:dim, xnew(k) = eval(Phi(k,:)); end
x = xnew; res = 0; alpha=[alpha;x]; x=x™;
for k = 1:dim,
294 7. Nonlinear Systems and Numerical Optimization

r = abs(eval(F(k,:)));
if (r > res), res=r; end,

7.2 Unconstrained Optimization
We turn now to minimization problems. The point x— , the solution of (7.2),
is called a global minimizer of f , while x— is a local minimizer of f if ∃R > 0
such that

f (x— ) ¤ f (x), ∀x ∈ B(x— ; R).

Throughout this section we shall always assume that f ∈ C 1 (Rn ), and
we refer to [Lem89] for the case in which f is non di¬erentiable. We shall
denote by
‚f ‚f
∇f (x) = (x), . . . , (x) ,
‚x1 ‚xn

the gradient of f at a point x. If d is a non null vector in Rn , then the
directional derivative of f with respect to d is

f (x + ±d) ’ f (x)
(x) = lim
‚d ±

and satis¬es ‚f (x)/‚d = [∇f (x)] d. Moreover, denoting by (x, x + ±d)
the segment in Rn joining the points x and x + ±d, with ± ∈ R, Taylor™s
expansion ensures that ∃ξ ∈ (x, x + ±d) such that

f (x + ±d) ’ f (x) = ±∇f (ξ)T d. (7.20)

If f ∈ C 2 (Rn ), we shall denote by H(x) (or ∇2 f (x)) the Hessian matrix of
f evaluated at a point x, whose entries are

‚ 2 f (x)
hij (x) = , i, j = 1, . . . , n.
‚xi ‚xj

In such a case it can be shown that, if d = 0, the second-order directional
derivative exists and we have
(x) = dT H(x)d.
7.2 Unconstrained Optimization 295

For a suitable ξ ∈ (x, x + d) we also have
f (x + d) ’ f (x) = ∇f (x)T d + dT H(ξ)d.
Existence and uniqueness of solutions for (7.2) are not guaranteed in Rn .
Nevertheless, the following optimality conditions can be proved.

Property 7.4 Let x— ∈ Rn be a local minimizer of f and assume that
f ∈ C 1 (B(x— ; R)) for a suitable R > 0. Then ∇f (x— ) = 0. Moreover,
if f ∈ C 2 (B(x— ; R)) then H(x— ) is positive semide¬nite. Conversely, if
x— ∈ B(x— ; R) and H(x— ) is positive de¬nite, then x— is a local minimizer
of f in B(x— ; R).
A point x— such that ∇f (x— ) = 0, is said to be a critical point for f . This
condition is necessary for optimality to hold. However, this condition also
becomes su¬cient if f is a convex function on Rn , i.e., such that ∀x, y ∈ Rn
and for any ± ∈ [0, 1]
f [±x + (1 ’ ±)y] ¤ ±f (x) + (1 ’ ±)f (y). (7.21)
For further and more general existence results, see [Ber82].

7.2.1 Direct Search Methods
In this section we deal with direct methods for solving problem (7.2), which
only require f to be continuous. In later sections, we shall introduce the
so-called descent methods, which also involve values of the derivatives of f
and have, in general, better convergence properties.
Direct methods are employed when f is not di¬erentiable or if the com-
putation of its derivatives is a nontrivial task. They can also be used to
provide an approximate solution to employ as an initial guess for a descent
method. For further details, we refer to [Wal75] and [Wol78].

The Hooke and Jeeves Method
Assume we are searching for the minimizer of f starting from a given initial
point x(0) and requiring that the error on the residual is less than a certain
¬xed tolerance . The Hooke and Jeeves method computes a new point x(1)
using the values of f at suitable points along the orthogonal coordinate
directions around x(0) . The method consists of two steps: an exploration
step and an advancing step.
The exploration step starts by evaluating f (x(0) + h1 e1 ), where e1 is the
¬rst vector of the canonical basis of Rn and h1 is a positive real number to
be suitably chosen.
If f (x(0) + h1 e1 ) < f (x(0) ), then a success is recorded and the starting
point is moved in x(0) + h1 e1 , from which an analogous check is carried out
at point x(0) + h1 e1 + h2 e2 with h2 ∈ R+ .
296 7. Nonlinear Systems and Numerical Optimization

If, instead, f (x(0) + h1 e1 ) ≥ f (x(0) ), then a failure is recorded and a
similar check is performed at x(0) ’ h1 e1 . If a success is registered, the
method explores, as previously, the behavior of f in the direction e2 starting
from this new point, while, in case of a failure, the method passes directly
to examining direction e2 , keeping x(0) as starting point for the exploration
To achieve a certain accuracy, the step lengths hi must be selected in
such a way that the quantities

|f (x(0) ± hj ej ) ’ f (x(0) |, j = 1, . . . , n (7.22)

have comparable sizes.
The exploration step terminates as soon as all the n Cartesian directions
have been examined. Therefore, the method generates a new point, y(0) ,
after at most 2n+1 functional evaluations. Only two possibilities may arise:

1. y(0) = x(0) . In such a case, if max hi the method terminates
i=1,... ,n
and yields the approximate solution x . Otherwise, the step lengths
hi are halved and another exploration step is performed starting from
x(0) ;

max |hi | < , then the method terminates yielding
2. y(0) = x(0) . If
i=1,... ,n
y(0) as an approximate solution, otherwise the advancing step starts.
The advancing step consists of moving further from y(0) along the
direction y(0) ’ x(0) (which is the direction that recorded the maxi-
mum decrease of f during the exploration step), rather then simply
setting y(0) as a new starting point x(1) .
This new starting point is instead set equal to 2y(0) ’ x(0) . From this
point a new series of exploration moves is started. If this exploration
leads to a point y(1) such that f (y(1) ) < f (y(0) ’ x(0) ), then a new
starting point for the next exploration step has been found, otherwise
the initial guess for further explorations is set equal to y(1) = y(0) ’
x(0) .
The method is now ready to restart from the point x(1) just com-

Program 60 provides an implementation of the Hooke and Jeeves method.
The input parameters are the size n of the problem, the vector h of the
initial steps along the Cartesian directions, the variable f containing the
functional expression of f in terms of the components x(1), . . . , x(n), the
initial point x0 and the stopping tolerance toll equal to . In output, the
code returns the approximate minimizer of f , x, the value minf attained by
f at x and the number of iterations needed to compute x up to the desired
accuracy. The exploration step is performed by Program 61.
7.2 Unconstrained Optimization 297

Program 60 - hookejeeves : The method of Hooke and Jeeves (HJ)
function [x,minf,nit]=hookejeeves(n,h,f,x0,toll)
x = x0; minf = eval(f); nit = 0;
while h > toll
[y] = explore(h,n,f,x);
if y == x, h = h/2; else
x = 2*y-x; [z] = explore(h,n,f,x);
if z == x, x = y; else, x = z; end
nit = nit +1;
minf = eval(f);

Program 61 - explore : Exploration step in the HJ method
function [x]=explore(h,n,f,x0)
x = x0; f0 = eval(f);
for i=1:n
x(i) = x(i) + h(i); ¬ = eval(f);
if ¬ < f0, f0 = ¬; else
x(i) = x0(i) - h(i); ¬ = eval(f);
if ¬ < f0, f0 = ¬; else, x(i) = x0 (i); end

The Method of Nelder and Mead
This method, proposed in [NM65], employs local linear approximants of f
to generate a sequence of points x(k) , approximations of x— , starting from
simple geometrical considerations. To explain the details of the algorithm,
we begin by noticing that a plane in Rn is uniquely determined by ¬xing
n + 1 points that must not be lying on a hyperplane.
Denote such points by x(k) , for k = 0, . . . , n. They could be generated as

x(k) = x(0) + hk ek , k = 1, . . . , n (7.23)

having selected the steplengths hk ∈ R+ in such a way that the variations
(7.22) are of comparable size.
Let us now denote by x(M ) , x(m) and x(µ) those points of the set x(k)
at which f respectively attains its maximum and minimum value and the
value immediately preceding the maximum. Moreover, denote by xc the
centroid of point x(k) de¬ned as
x(k) x(j) .
298 7. Nonlinear Systems and Numerical Optimization

The method generates a sequence of approximations of x— , starting from
x(k) , by employing only three possible transformations: re¬‚ections with
respect to centroids, dilations and contractions. Let us examine the details
of the algorithm assuming that n + 1 initial points are available.
1. Determine the points x(M ) , x(m) and x(µ) .
2. Compute as an approximation of x— the point
n + 1 i=0

and check if x is su¬ciently close (in a sense to be made precise) to

x . Typically, one requires that the standard deviation of the values
f (x(0) ), . . . , f (x(n) ) from
¯ f (x(i) )
n + 1 i=0

are less than a ¬xed tolerance µ, that is
n 2
1 ¯
f (x(i) ) ’ f < µ.
n i=0

(M )
Otherwise, x(M ) is re¬‚ected with respect to xc , that is, the follow-
ing new point xr is computed
xr = (1 + ±)x(M ) ’ ±x(M ) ,

where ± ≥ 0 is a suitable re¬‚ection factor. Notice that the method
has moved along the “opposite” direction to x(M ) . This statement has
a geometrical interpretation in the case n = 2, since the points x(k)
coincide with x(M ) , x(m) and x(µ) . They thus de¬ne a plane whose
slope points from x(M ) towards x(m) and the method provides a step
along this direction.
3. If f (x(m) ) ¤ f (x(r) ) ¤ f (x(µ) ), the point x(M ) is replaced by x(r)
and the algorithm returns to step 2.
4. If f (x(r) ) < f (x(m) ) then the re¬‚ection step has produced a new
minimizer. This means that the minimizer could lie outside the set
de¬ned by the convex hull of the considered points. Therefore, this
set must be expanded by computing the new vertex
x(e) = βx(r) + (1 ’ β)x(M ) ,

where β > 1 is an expansion factor. Then, before coming back to step
2., two possibilities arise:
7.2 Unconstrained Optimization 299

4a. if f (x(e) ) < f (x(m) ) then x(M ) is replaced by x(e) ;
4b. f (x(e) ) ≥ f (x(m) ) then x(M ) is replaced by x(r) since f (x(r) ) <
f (x(m) ).
5. If f (x(r) ) > f (x(µ) ) then the minimizer probably lies within a subset
of the convex hull of points x(k) and, therefore, two di¬erent ap-
proaches can be pursued to contract this set. If f (x(r) ) < f (x(M ) ),
the contraction generates a new point of the form

x(co) = γx(r) + (1 ’ γ)x(M ) , γ ∈ (0, 1),


x(co) = γx(M ) + (1 ’ γ)x(M ) , γ ∈ (0, 1),

Finally, before returning to step 2., if f (x(co) ) < f (x(M ) ) and f (x(co) ) <
f (x(r) ), the point x(M ) is replaced by x(co) , while if f (x(co) ) ≥ f (x(M ) )
or if f (x(co) ) > f (x(r) ), then n new points x(k) are generated, with
k = 1, . . . , n, by halving the distances between the original points
and x(0) .

As far as the choice of the parameters ±, β and γ is concerned, the following
values are empirically suggested in [NM65]: ± = 1, β = 2 and γ = 1/2. The
resulting scheme is known as the Simplex method (that must not be con-
fused with a method sharing the same name used in linear programming),
since the set of the points x(k) , together with their convex combinations,
form a simplex in Rn .
The convergence rate of the method is strongly a¬ected by the orientation
of the starting simplex. To address this concern, in absence of information
about the behavior of f , the initial choice (7.23) turns out to be satisfactory
in most cases.
We ¬nally mention that the Simplex method is the basic ingredient of
the MATLAB function fmins for function minimization in n dimensions.

Example 7.5 Let us compare the performances of the Simplex method with the
Hooke and Jeeves method, in the minimization of the Rosembrock function

f (x) = 100(x2 ’ x2 )2 + (1 ’ x1 )2 . (7.24)

This function has a minimizer at (1, 1)T and represents a severe benchmark for
testing numerical methods in minimization problems. The starting point for both
methods is set equal to x(0) = (’1.2, 1)T , while the step sizes are taken equal
to h1 = 0.6 and h2 = 0.5, in such a way that (7.23) is satis¬ed. The stopping
tolerance on the residual is set equal to 10’4 . For the implementation of Simplex
method, we have used the MATLAB function fmins.
Figure 7.2 shows the iterates computed by the Hooke and Jeeves method (of
which one in every ten iterates have been reported, for the sake of clarity) and by
300 7. Nonlinear Systems and Numerical Optimization











’1.5 ’1 ’0.5 0 0.5 1 1.5

FIGURE 7.2. Convergence histories of the Hooke and Jeeves method
(crossed-line) and the Simplex method (circled-line). The level curves of the min-
imized function (7.24) are reported in dashed line

the Simplex method, superposed to the level curves of the Rosembrock function.
The graph demonstrates the di¬culty of this benchmark: actually, the function
is like a curved, narrow valley, which attains its minimum along the parabola of
equation x2 ’ x2 = 0.
The Simplex method converges in only 165 iterations, while 935 are needed for
the Hooke and Jeeves method to converge. The former scheme yields a solution
equal to (0.999987, 0.999978)T , while the latter gives the vector (0.9655, 0.9322)T .

7.2.2 Descent Methods
In this section we introduce iterative methods that are more sophisticated
than those examined in Section 7.2.1. They can be formulated as follows:

given an initial vector x(0) ∈ Rn , compute for k ≥ 0 until convergence
x(k+1) = x(k) + ±k d(k) , (7.25)
where d(k) is a suitably chosen direction and ±k is a positive parameter
(called stepsize) that measures the step along the direction d(k) . This di-
rection d(k) is a descent direction if
d(k) ∇f (x(k) ) < 0 if ∇f (x(k) ) = 0,
if ∇f (x
(k) (k)
d =0 ) = 0.
A descent method is a method like (7.25), in which the vectors d(k) are
descent directions.
Property (7.20) ensures that there exists ±k > 0, su¬ciently small, such
f (x(k) + ±k d(k) ) < f (x(k) ), (7.27)
7.2 Unconstrained Optimization 301

provided that f is continuously di¬erentiable. Actually, taking in (7.20)
ξ = x(k) + ‘±k d(k) with ‘ ∈ (0, 1), and employing the continuity of ∇f ,
we get

f (x(k) + ±k d(k) ) ’ f (x(k) ) = ±k ∇f (x(k) )T d(k) + µ, (7.28)

where µ tends to zero as ±k tends to zero. As a consequence, if ±k > 0 is
su¬ciently small, the sign of the left-side of (7.28) coincides with the sign
of ∇f (x(k) )T d(k) , so that (7.27) is satis¬ed if d(k) is a descent direction.
Di¬erent choices of d(k) correspond to di¬erent methods. In particular,
we recall the following ones:

- Newton™s method, in which

d(k) = ’H’1 (x(k) )∇f (x(k) ),

provided that H is positive de¬nite within a su¬ciently large neigh-
borhood of point x— ;
- inexact Newton™s methods, in which

d(k) = ’B’1 ∇f (x(k) ),

where Bk is a suitable approximation of H(x(k) );
- the gradient method or steepest descent method, corresponding to setting
d(k) = ’∇f (x(k) ). This method is thus an inexact Newton™s method,
in which Bk = I. It can also be regarded as a gradient-like method,
since d(k) ∇f (x(k) ) = ’ ∇f (x(k) ) 2 ;

- the conjugate gradient method, for which

d(k) = ’∇f (x(k) ) + βk d(k’1) ,

where βk is a scalar to be suitably selected in such a way that the
directions d(k) turn out to be mutually orthogonal with respect to
a suitable scalar product.

Selecting d(k) is not enough to completely identify a descent method,
since it remains an open problem how to determine ±k in such a way that
(7.27) is ful¬lled without resorting to excessively small stepsizes ±k (and,
thus, to methods with a slow convergence).
A method for computing ±k consists of solving the following minimiza-
tion problem in one dimension:

¬nd ± such that φ(±) = f (x(k) + ±d(k) ) is minimized. (7.29)

In such a case we have the following result.
302 7. Nonlinear Systems and Numerical Optimization

Theorem 7.3 Consider the descent method (7.25). If at the generic step
k, the parameter ±k is set equal to the exact solution of (7.29), then the
following orthogonality property holds

∇f (x(k+1) )T d(k) = 0.

Proof. Let ±k be a solution to (7.29). Then, the ¬rst derivative of φ, given by
‚f (k) ‚ (k) (k)
(xi + ±di ) = ∇f (x(k) + ±k d(k) )T d(k) ,
(x + ±k d(k) )
φ (±) =
‚xi ‚±

vanishes at ± = ±k . The thesis then follows, recalling the de¬nition of x(k+1) . 3

Unfortunately, except for in special cases (which are nevetherless quite
relevant, see Section 7.2.4), providing an exact solution of (7.29) is not fea-
sible, since this is a nonlinear problem. One possible strategy consists of
approximating f along the straight line x(k) + ±d(k) through an interpo-
lating polynomial and then minimizing this polynomial (see the quadratic
interpolation Powell methods and cubic interpolation Davidon methods in
Generally speaking, a process that leads to an approximate solution to
(7.29) is said to be a line search technique and is addressed in the next

7.2.3 Line Search Techniques
The methods that we are going to deal with in this section, are iterative
techniques that terminate as soon as some accuracy stopping criterion on
±k is satis¬ed. We shall assume that (7.26) holds.
Practical experience reveals that it is not necessary to solve accurately
for (7.29) in order to devise e¬cient methods, rather, it is crucial to enforce
some limitation on the step lengths (and, thus, on the admissible values for
±k ). Actually, without introducing any limitation, a reasonable request on
±k would seem be that the new iterate x(k+1) satis¬es the inequality

f (x(k+1) ) < f (x(k) ), (7.30)

where x(k) and d(k) have been ¬xed. For this purpose, the procedure based
on starting from a (su¬ciently large) value of the step length ±k and halve
this value until (7.30) is ful¬lled, can yield completely wrong results (see,

More stringent criteria than (7.30) should be adopted in the choice of
possible values for ±k . To this end, we notice that two kinds of di¬culties
arise with the above examples: a slow descent rate of the sequence and the
use of small stepsizes.
7.2 Unconstrained Optimization 303

The ¬rst di¬culty can be overcome by requiring that
0 ≥ vM (x(k+1) ) = f (x(k) ) ’ f (x(k) + ±k d(k) )
±k (7.31)
≥ ’σ∇f (x(k) )T d(k) ,

with σ ∈ (0, 1/2). This amounts to requiring that the average descent
rate vM of f along d(k) , evaluated at x(k+1) , be at least equal to a given
fraction of the initial descent rate at x(k) . To avoid the generation of too
small stepsizes, we require that the descent rate in the direction d(k) at
x(k+1) is not less than a given fraction of the descent rate at x(k)

|∇f (x(k) + ±k d(k) )T d(k) | ¤ β|∇f (x(k) )T d(k) |, (7.32)

with β ∈ (σ, 1) in such a way as to also satisfy (7.31). In computational
practice, σ ∈ [10’5 , 10’1 ] and β ∈ [10’1 , 1 ] are usual choices. Sometimes,
(7.32) is replaced by the milder condition

∇f (x(k) + ±k d(k) )T d(k) ≥ β∇f (x(k) )T d(k) (7.33)

(recall that ∇f (x(k) )T d(k) is negative, since d(k) is a descent direction).

The following property ensures that, under suitable assumptions, it is pos-
sible to ¬nd out values of ±k which satisfy (7.31)-(7.32) or (7.31)-(7.33).

Property 7.5 Assume that f (x) ≥ M for any x ∈ Rn . Then there exists
an interval I = [c, C] for the descent method, with 0 < c < C, such that
∀±k ∈ I, (7.31), (7.32) (or (7.31)-(7.33)) are satis¬ed, with σ ∈ (0, 1/2)
and β ∈ (σ, 1).
Under the constraint of ful¬lling conditions (7.31) and (7.32), several
choices for ±k are available. Among the most up-to-date strategies, we re-
call here the backtracking techniques: having ¬xed σ ∈ (0, 1/2), then start
with ±k = 1 and then keep on reducing its value by a suitable scale factor
ρ ∈ (0, 1) (backtrack step) until (7.31) is satis¬ed. This procedure is im-
plemented in Program 62, which requires as input parameters the vector x
containing x(k) , the macros f and J of the functional expressions of f and
its Jacobian, the vector d of the direction d(k) , and a value for σ (usually
of the order of 10’4 ) and the scale factor ρ. In output, the code returns
the vector x(k+1) , computed using a suitable value of ±k .

Program 62 - backtrackr : Backtraking for line search
function [xnew]= backtrackr(sigma,rho,x,f,J,d)
alphak = 1; fk = eval(f); Jfk = eval (J);
xx = x; x = x + alphak * d; fk1 = eval (f);
while fk1 > fk + sigma * alphak * Jfk™*d
304 7. Nonlinear Systems and Numerical Optimization

alphak = alphak*rho;
x = xx + alphak*d;
fk1 = eval(f);

Other commonly used strategies are those developed by Armijo and Gold-
stein (see [Arm66], [GP67]). Both use σ ∈ (0, 1/2). In the Armijo formula,
one takes ±k = β mk ±, where β ∈ (0, 1), ± > 0 and mk is the ¬rst non-
¯ ¯
negative integer such that (7.31) is satis¬ed. In the Goldstein formula, the
parameter ±k is determined in such a way that
f (x(k) + ±k d(k) ) ’ f (x(k) )
σ¤ ¤ 1 ’ σ. (7.34)
±k ∇f (x(k) )T d(k)
A procedure for computing ±k that satis¬es (7.34) is provided in [Ber82],
Chapter 1. Of course, one can even choose ±k = ± for any k, which is
clearly convenient when evaluating f is a costly task.
In any case, a good choice of the value ± is mandatory. In this respect,
one can proceed as follows. For a given value ±, the second degree poly-
nomial Π2 along the direction d is constructed, subject to the following
interpolation constraints
Π2 (x(k) ) = f (x(k) ),
Π2 (x(k) + ±d(k) ) = f (x(k) + ±d(k) ),
¯ ¯
Π2 (x(k) ) = ∇f (x(k) )T d(k) .
Next, the value ± is computed such that Π2 is minimized, then, we let
± = ±.

7.2.4 Descent Methods for Quadratic Functions
A case of remarkable interest, where the parameter ±k can be exactly com-
puted, is the problem of minimizing the quadratic function
x Ax ’ bT x,
f (x) = (7.35)
where A∈ Rn—n is a symmetric and positive de¬nite matrix and b ∈ Rn .
In such a case, as already seen in Section 4.3.3, a necessary condition for
x— to be a minimizer for f is that x— is the solution of the linear system
(3.2). Actually, it can be checked that if f is a quadratic function
∇f (x) = Ax ’ b = ’r, H(x) = A.
As a consequence, all gradient-like iterative methods developed in Section
4.3.3 for linear systems, can be extended tout-court to solve minimization
7.2 Unconstrained Optimization 305

In particular, having ¬xed a descent direction d(k) , we can determine
the optimal value of the acceleration parameter ±k that appears in (7.25),
in such a way as to ¬nd the point where the function f , restricted to the
direction d(k) , is minimized. Setting to zero the directional derivative, we
d T T
f (x(k) + ±k d(k) ) = ’d(k) r(k) + ±k d(k) Ad(k) = 0
from which the following expression for ±k is obtained
d(k) r(k)
±k = . (7.36)
d(k) Ad(k)
The error introduced by the iterative process (7.25) at the k-th step is
x(k+1) ’ x— = x(k+1) ’ x— A x(k+1) ’ x—
(k) T T
x— 2 —
’ ’x
(k) (k)
±k d(k) Ad(k) .
=x + 2±k d Ax +

On the other hand x(k) ’ x— = r(k) A’1 r(k) , so that from (7.37) it
follows that

x(k+1) ’ x— = ρk x(k) ’ x—
2 2

having denoted by ρk = 1 ’ σk , with
A’1 r(k) .
σk = (d(k) r(k) )2 / d(k) Ad(k) r(k)

Since A is symmetric and positive de¬nite, σk is always positive. Moreover,
it can be directly checked that ρk is strictly less than 1, except when d(k)
is orthogonal to r(k) , in which case ρk = 1.
The choice d(k) = r(k) , which leads to the steepest descent method, pre-
vents this last circumstance from arising. In such a case, from (7.38) we
»max ’ »min (k)
x(k+1) ’ x— x ’ x—
¤ (7.39)
»max + »min
having employed the following result.

Lemma 7.1 (Kantorovich inequality) Let A ∈ Rn—n be a symmetric
positive de¬nite matrix whose eigenvalues with largest and smallest module
are given by »max and »min , respectively. Then, ∀y ∈ Rn , y = 0,

(yT y)2 4»max »min
≥ .
(yT Ay)(yT A’1 y) (»max + »min )2
306 7. Nonlinear Systems and Numerical Optimization

It follows from (7.39) that, if A is ill-conditioned, the error reducing factor
for the steepest descent method is close to 1, yielding a slow convergence to
the minimizer x— . As done in Chapter 4, this drawback can be overcome
by introducing directions d(k) that are mutually A-conjugate, i.e.
d(k) Ad(m) = 0 if k = m.

The corresponding methods enjoy the following ¬nite termination property.

Property 7.6 A method for computing the minimizer x— of the quadratic
function (7.35) which employs A-conjugate directions terminates after at
most n steps if the acceleration parameter ±k is selected as in (7.36). More-
over, for any k, x(k+1) is the minimizer of f over the subspace generated
by the vectors x(0) , d(0) , . . . , d(k) and
r(k+1) d(m) = 0 ∀m ¤ k.

The A-conjugate directions can be determined by following the proce-
dure described in Section 4.3.4. Letting d(0) = r(0) , the conjugate gradient
method for function minimization is

d(k+1) = r(k) + βk d(k) ,
r(k+1) Ad(k) r(k+1) r(k+1)
βk = ’ = ,
d(k) Ad(k) r(k) r(k)
x(k+1) = x(k) + ±k d(k) .

It satis¬es the following error estimate
K2 (A) ’ 1
x(k) ’ x— x(0) ’ x—
¤2 A,
K2 (A) + 1

which can be improved by lowering the condition number of A, i.e., resort-
ing to the preconditioning techniques that have been dealt with in Section

Remark 7.2 (The nonquadratic case) The conjugate gradient method
can be extended to the case in which f is a non quadratic function. However,
in such an event, the acceleration parameter ±k cannot be exactly deter-
mined a priori, but requires the solution of a local minimization problem.
Moreover, the parameters βk can no longer be uniquely found. Among the
most reliable formulae, we recall the one due to Fletcher-Reeves,

∇f (x(k) ) 2
β1 = 0, βk = , for k > 1
∇f (x(k’1) ) 2
7.2 Unconstrained Optimization 307

and the one due to Polak-Ribi´re
∇f (x(k) ) (∇f (x(k) ) ’ ∇f (x(k’1) ))
β1 = 0, βk = , for k > 1.
∇f (x(k’1) ) 2

7.2.5 Newton-like Methods for Function Minimization
An alternative is provided by Newton™s method, which di¬ers from its ver-
sion for nonlinear systems in that now it is no longer applied to f , but to
its gradient.
Using the notation of Section 7.2.2, Newton™s method for function mini-
mization amounts to computing, for k = 0, 1, . . . , until convergence

d(k) = ’H’1 ∇f (x(k) ),
(k+1) (k) (k)
x =x +d ,

where x(0) ∈ Rn is a given initial vector and having set Hk = H(x(k) ). The
method can be derived by truncating Taylor™s expansion of f (x(k) ) at the
f (x(k) ) + ∇f (x(k) )T p + pT Hk p.
f (x(k) + p) (7.41)
Selecting p in (7.41) in such a way that the new vector x(k+1) = x(k) + p
satis¬es ∇f (xk+1 ) = 0, we end up with method (7.40), which thus con-
verges in one step if f is quadratic.
In the general case, a result analogous to Theorem 7.1 also holds for func-
tion minimization. Method (7.40) is therefore locally quadratically conver-
gent to the minimizer x— . However, it is not convenient to use Newton™s
method from the beginning of the computation, unless x(0) is su¬ciently
close to x— . Otherwise, indeed, Hk could not be invertible and the direc-
tions d(k) could fail to be descent directions. Moreover, if Hk is not positive
de¬nite, nothing prevents the scheme (7.40) from converging to a saddle
point or a maximizer, which are points where ∇f is equal to zero. All these
drawbacks, together with the high computational cost (recall that a linear
system with matrix Hk must be solved at each iteration), prompt suit-
ably modifying method (7.40), which leads to the so-called quasi-Newton

A ¬rst modi¬cation, which applies to the case where Hk is not posi-
tive de¬nite, yields the so-called Newton™s method with shift. The idea is
to prevent Newton™s method from converging to non-minimizers of f , by
applying the scheme to a new Hessian matrix Hk = Hk + µk In , where, as
308 7. Nonlinear Systems and Numerical Optimization

usual, In denotes the identity matrix of order n and µk is selected in such
a way that Hk is positive de¬nite. The problem is to determine the shift
µk with a reduced e¬ort. This can be done, for instance, by applying the
Gershgorin theorem to the matrix Hk (see Section 5.1). For further details
on the subject, see [DS83] and [GMW81].

7.2.6 Quasi-Newton Methods
At the generic k-th iteration, a quasi-Newton method for function mini-
mization performs the following steps:

1. compute the Hessian matrix Hk , or a suitable approximation Bk ;

2. ¬nd a descent direction d(k) (not necessarily coinciding with the di-
rection provided by Newton™s method), using Hk or Bk ;

3. compute the acceleration parameter ±k ;

4. update the solution, setting x(k+1) = x(k) + ±k d(k) , according to a
global convergence criterion.

In the particular case where d(k) = ’H’1 ∇f (x(k) ), the resulting scheme is
called the damped Newton™s method. To compute Hk or Bk , one can resort
to either Newton™s method or secant-like methods, which will be considered


. 12
( 26)