. 8
( 16)


Figure 4.5. Representation of the bisection of a tetrahedron.

Grid re¬nement may be necessary in those parts of the domain where
the weak solution of the variational equation has low regularity. The ¬gure
of the front cover (taken from [70]) shows the domain for a density-driven
¬‚ow problem, where the in¬‚ow and the out¬‚ow pass through very small,
nearly point-sized surfaces. The re¬nement is the result of a grid adaptation
strategy based on a posteriori error estimators (see Section 4.2). In time-
dependent problems, where those parts of the grid in which a re¬nement is
needed may also vary, grid coarsening is necessary to limit the expense. A
simple grid coarsening can be achieved, for example, by cancelling former
re¬nement steps in a conforming way.

4.1 For a given triangle K, the circumcentre can be computed by ¬nding
the intersection of the perpendicular bisectors associated with two edges
184 4. Grid Generation and A Posteriori Error Estimation

of K. This can be achieved by solving a linear system of equations with
respect to the coordinates of the circumcentre.

(a) Give such a system.
(b) How can the radius of the circumcircle be obtained from this solution?

4.2 Given a triangle K, denote by hi the length of edge i, i ∈ {1, 2, 3}.
Prove that the following expression equals the radius of the circumcircle
(without using the circumcentre!):

h1 h2 h3

4.3 Let K1 , K2 be two triangles sharing an edge.

(a) Show the equivalence of the following edge swap criteria:
Angle criterion: Select the diagonal of the so-formed quadrilateral
that maximizes the minimum of the six interior angles among the
two con¬gurations.
Circle criterion: Choose the diagonal of the quadrilateral for which
the open circumcircle disks to the resulting triangles do not contain
any of the remaining vertices.
(b) If ±1 , ±2 denote the two interior angles that are located opposite the
common edge of the triangles K1 and K2 , respectively, then the circle
criterion states that an edge swap is to be performed if

±1 + ±2 > π.

Prove this assertion.
(c) The criterion in (b) is numerically expensive. Show that the following
test is equivalent:

[(a1,1 ’ a3,1 )(a2,1 ’ a3,1 ) + (a1,2 ’ a3,2 )(a2,2 ’ a3,2 )]
—[(a2,1 ’ a4,1 )(a1,2 ’ a4,2 ) ’ (a1,1 ’ a4,1 )(a2,2 ’ a4,2 )]
< [(a2,1 ’ a4,1 )(a1,1 ’ a4,1 ) + (a2,2 ’ a4,2 )(a1,2 ’ a4,2 )]
—[(a2,1 ’ a3,1 )(a1,2 ’ a3,2 ) ’ (a1,1 ’ a3,1 )(a2,2 ’ a3,2 )] .

Here ai = (ai,1 , ai,2 )T , i ∈ {1, 2, 3}, denote the vertices of a triangle
ordered clockwise, and a4 = (a4,1 , a4,2 )T is the remaining vertex of
the quadrilateral, the position of which is tested in relation to the
circumcircle de¬ned by a1 , a2 , a3 .
Hint: Addition theorems for the sin function.
4.2. A Posteriori Error Estimates and Grid Adaptation 185

4.2 A Posteriori Error Estimates and Grid
In the practical application of discretization methods to partial di¬erential
equations, an important question is how much the computed approximative
solution uh deviates from the weak solution u of the given problem.
Typically, a certain norm of the error u ’ uh is taken as a measure of
this deviation. For elliptic or parabolic di¬erential equations of order two,
a common norm to quantify the error is the energy norm (respectively an
equivalent norm) or the L2 -norm. Some practically important problems
involve the approximation of the so-called derived quantities which can be
mathematically interpreted in terms of values of certain linear functionals
of the solution u. In such a case, an estimate of the corresponding error is
also of interest.
Example 4.1
ν ·∇u dσ: ¬‚ux of u through a part of the boundary “0 ‚ ‚„¦,
J(u) = “0
integral mean of u on some subdomain „¦0 ‚ „¦.
J(u) = „¦0 u dx:

In the following we will consider some estimates for a norm · of the
error u ’ uh and explain the corresponding terminology. Similar statements
remain true if u ’ uh is replaced by |J(u) ’ J(uh )|.
The error estimates given in the previous chapters are characterized by
the fact that no information about the computed solution uh is needed.
Estimates of this type are called a priori error estimates.
For example, consider a variational equation with a bilinear form that
satis¬es (for some space V such that H0 („¦) ‚ V ‚ H 1 („¦) and · := · 1 )

the assumptions (2.42), (2.43) and use numerically piecewise linear, con-
tinuous ¬nite elements. Then C´a™s lemma (Theorem 2.17) together with
the interpolation error estimate from Theorem 3.29 implies the estimate
u ’ uh ¤
u ’ Ih (u) 1 ¤ Ch , (4.1)
± ±
where the constant C depends on the weak solution u of the variational
Here C has the special form

|‚ u| dx
¯ ± 2
C=C (4.2)
„¦ |±|=2

with C > 0 independent of u. Unfortunately, the structure of the bound
(4.2) does not allow an immediate numerical application of (4.1).
But even if the constant C could be estimated and (4.1) could be used
to determine the discretization parameter h (maximum diameter of the
triangles in Th ) for a prescribed tolerance, in general this would lead to
186 4. Grid Generation and A Posteriori Error Estimation

a grid that is too ¬ne. This corresponds to an algebraic problem that is
too large. The reason is that the described approach determines a global
parameter, whereas the true error measure may have di¬erent magnitudes
in di¬erent regions of the domain „¦.
So we should aim at error estimates of type
u ’ uh ¤ D· (4.3)
D1 · ¤ u ’ u h ¤ D2 · , (4.4)
where the constants D, D1 , D2 > 0 do not depend on the discretization
parameters and
·= ·K . (4.5)

Here the quantities ·K should be computable using only the data ”
including possibly uh |K ” which are known on the particular element K.
If the bounds · (or the terms ·K , respectively) in (4.3) (respectively
(4.4)) depend on uh , i.e., they can be evaluated only if uh is known, then
they are called (local) a posteriori error estimators in the wider sense.
Often the bounds also depend on the weak solution u of the variational
equality, so in fact, they cannot be evaluated immediately. In such a case
they should be replaced by computable quantities that do not depend on u
in a direct way. So, if the bounds can be evaluated without knowing u but
using possibly uh , then they are called (local) a posteriori error estimators
in the strict sense.
Inequalities of the form (4.3) guarantee, for a given tolerance µ > 0, that
the inequality · ¤ µ implies that the error measure does not exceed µ up
to a multiplicative constant. In this sense the error estimator · is called
reliable. Now, if the computed approximative solution uh is su¬ciently
precise in the described sense, then the computation can be ¬nished. If uh
is such that · > µ, then the question of how to modify the discretization
in order to achieve the tolerance or, if the computer resources are nearly
exhausted, how to minimize the overshooting of ·, arises. That is, the
information given by the evaluation of the bounds has to be used to adapt
the discretization and then to perform a new run of the solution process.
A typical modi¬cation is to re¬ne or to coarsen the grid.
Error estimators may overestimate the real error measure signi¬cantly;
thus a grid adaptation procedure based on such an error estimate gener-
ates a grid that is too ¬ne, and consequently, the corresponding algebraic
problem is too large.
This e¬ect can be reduced or even avoided if the error estimator satis¬es
a two-sided inequality like (4.4). Then the ratio D2 /D1 is a measure of the
e¬ciency of the error estimator.
4.2. A Posteriori Error Estimates and Grid Adaptation 187

An error estimator · is called asymptotically exact if for an arbitrary con-
vergent sequence of approximations {uh } with u ’ uh ’ 0 the following
limit is valid:
’ 1.
u ’ uh

Usually, a posteriori error estimators are designed for a well-de¬ned class
of boundary or initial-boundary value problems. Within a given class of
problems, the question regarding the sensitivity of the constants D in (4.3)
or D1 , D2 in (4.4), with respect to the particular data of the problem (e.g.,
coe¬cients, inhomogeneities, geometry of the domain, grid geometry, . . . ),
arises. If this dependence of the data is not crucial, then the error estimator
is called robust within this class.

Grid Adaptation
Let us assume that the local error estimators ·K composing an e¬cient er-
ror estimator · for an approximate solution uh on some grid Th really re¬‚ect
the error on the element K and that this local error can be improved by a
re¬nement of K (e.g., following the principles of Section 4.1.5). Then the
following grid adaptation strategies can be applied until the given tolerance
µ is reached or the computer resources are exhausted.

Equidistribution strategy: The objective of the grid adaptation (re¬nement
or coarsening of elements) is to get a new grid Th new
such that the
local error estimators ·K for this new grid take one and the same
value for all elements K ∈ Th ; that is (cf. (4.5))

·K ≈ for all K ∈ Th .
new new
|Th |

Since the number of elements of the new grid enters the right-hand
side of this criterion, the strategy is an implicit method. In practical
use, it is approximated iteratively.

Cut-o¬ strategy: Given a parameter κ ∈ (0, 1), a threshold value κ· is
de¬ned. Then the elements K with ·K > κ· will be re¬ned.

Reduction strategy: Given a parameter κ ∈ (0, 1), an auxiliary toler-
ance µ· := κ· is de¬ned. Then a couple of steps following the
equidistribution strategy with the tolerance µ· are performed.

In practice, the equidistribution strategy may perform comparatively slowly
and thus may reduce the e¬ciency of the complete solution process. The
cut-o¬ method does not allow grid coarsening. It is rather sensitive to the
choice of the parameter κ. Among all three strategies, the reduction method
represents the best compromise.
188 4. Grid Generation and A Posteriori Error Estimation

Design of A Posteriori Error Estimators
In the following, three basic principles of the design of a posteriori error
estimators will be described. In order to illustrate the underlying ideas and
to avoid unnecessary technical di¬culties, a model problem will be treated:
Consider a di¬usion-reaction equation on a polygonally bounded domain
„¦ ‚ R2 with homogeneous Dirichlet boundary conditions

’∆u + ru = f in „¦ ,
u= 0 on ‚„¦ ,

where f ∈ L2 („¦) and r ∈ C(„¦) with r(x) ≥ 0 for all x ∈ „¦. The problem
is discretized using piecewise linear, continuous ¬nite element functions as
described in Section 2.2.
Setting a(u, v) := „¦ (∇u · ∇v + ruv) dx for u, v ∈ V := H0 („¦), we have

the following variational (weak) formulation:
Find u ∈ V such that a(u, v) = f, v for all v ∈ V.

The corresponding ¬nite element method reads as follows:
Find uh ∈ Vh such that a(uh , vh ) = f, vh for all vh ∈ Vh .

Residual Error Estimators
Similar to the derivation of the a priori error estimate in the proof of C´a™s
lemma (Theorem 2.17), the V -ellipticity of a (2.43) implies that

± u ’ uh ¤ a(u ’ uh , u ’ uh ) .

Without loss of generality we may suppose u ’ uh ∈ V \ {0}, hence

1 a(u ’ uh , u ’ uh ) a(u ’ uh , v)
u ’ uh ¤ ¤ sup . (4.6)
u ’ uh 1
± ± v∈V v1

We observe that the term

a(u ’ uh , v) = a(u, v) ’ a(uh , v) = f, v ’ a(uh , v) (4.7)

is the residual of the variational equation; i.e., the right-hand side of in-
equality (4.6) can be interpreted as a certain norm of the variational
In a next step, the variational residual will be split into local terms
according to the given grid, and these terms are transformed by means of
integration by parts. For arbitrary v ∈ V, from (4.7) it follows that

a(u ’ uh , v) = f v dx ’ (∇uh · ∇v + ruh v) dx

[f ’ (’∆uh + ruh )]v dx ’ ν · ∇uh v dσ
= .
K ‚K
4.2. A Posteriori Error Estimates and Grid Adaptation 189

The ¬rst factor in the integrals over the elements K is the classical
elementwise residual of the di¬erential equation:

rK (uh ) := [f ’ (’∆uh + ruh )] K

All quantities entering rK (uh ) are known. In the case considered here we
even have ’∆uh = 0 on K, hence rK (uh ) = [f ’ ruh ] K .
The integrals over the boundary of the elements K are further split into
a sum over the integrals along the element edges E ‚ ‚K:

ν · ∇uh v dσ = ν · ∇uh v dσ .
‚K E

Since v = 0 on ‚„¦, only the integrals along edges lying in „¦ contribute to
the sum. Denoting by Eh the set of all interior edges of all elements K ∈ Th
and assigning a ¬xed unit normal νE to any of those edges, we see that in
the summation of the split boundary integrals over all K ∈ Th there occur
exactly two integrals along one and the same edge E ∈ Eh . This observation
results in the relation

ν · ∇uh v dσ = [νE · ∇uh ]E v dσ ,
‚K E
K∈Th E∈Eh

where, for a piecewise continuous function w : „¦ ’ R, the term

[w]E (x) := lim w(x + νE ) ’ lim w(x ’ νE ) , x∈E,
’+0 ’+0

denotes the jump of the function w across the edge E. If w is the normal
derivative of uh in the ¬xed direction νE , i.e., w = νE · ∇uh , then its jump
does not depend on the particular orientation of νE (see Exercise 4.6).
In summary, we have the following relation:

a(u ’ uh , v) = rK (uh )v dx ’ [νE · ∇uh ]E v dσ .
K∈Th E∈Eh

Using the error equation (2.39), we obtain for an arbitrary element vh ∈ Vh
the fundamental identity

a(u ’ uh , v) = a(u ’ uh , v ’ vh )

rK (uh )(v ’ vh ) dx

’ [νE · ∇uh ]E (v ’ vh ) dσ ,

which is the starting point for the construction of further estimates.
190 4. Grid Generation and A Posteriori Error Estimation

¨¨r r
¢d  ¢f ¢d  ¢f
d¡ d
¢f ¢f
¢ ¢d ¢ ¢d
d¢f d¢f
¢¢ ¢¢
¢ ¢ Kd¢ f ¢ ¢ E ¢d f
¢d ¢  ¢ d 
d d¢
€ €
¢  e €€€  ¢  e €€€ 
¢  e ¨ ¢  e ¨
¨ ¨
¢ e¨ ¢

Figure 4.6. The triangular neighbourhoods ∆(K) (left) and ∆(E) (right).

First we see that the Cauchy“Schwarz inequality immediately implies

a(u ’ uh , v ’ vh ) ¤ v ’ vh
rK (uh ) 0,K 0,K
[νE · ∇uh ]E v ’ vh
+ 0,E .

To get this bound as small as possible, the function vh ∈ Vh is chosen such
that the element v ∈ V is approximated adequately in both spaces L2 (K)
and L2 (E). One suggestion is the use of an interpolating function according
to (2.47). However, since V ∈ C(„¦), this interpolant is not de¬ned. There-
fore other approximation procedures have to be applied. Roughly speaking,
suitable approximation principles, due to Cl´ment [52] or Scott and Zhang
[67], are based on taking certain local integral means. However, at this place
we cannot go further into these details and refer to the cited literature. In
fact, for our purposes it is important only that such approximations exist.
Their particular design is of minor interest.
We will formulate the relevant facts as a lemma. To do so, we need some
additional notation (see Figure 4.6):
triangular neighbourhood of a triangle K : ∆(K) := K :K ©K=… K ,
triangular neighbourhood of an edge E : ∆(E) := K.
K :K ©E=…
Thus ∆(K) consists of the union of the supports of those nodal basis
functions that are associated with the vertices of K, whereas ∆(E) is formed
by the union of those nodal basis functions that are associated with the
boundary points of E. Furthermore, the length of the edge E is denoted
by hE := |E|.
Lemma 4.2 Let a regular family (Th ) of triangulations of the domain „¦
be given. Then for any v ∈ V there exists an element Qh v ∈ Vh such that
for all triangles K ∈ Th and all edges E ∈ Eh the following estimates are
v ’ Qh v ¤ ChK |v|1,∆(K) ,
4.2. A Posteriori Error Estimates and Grid Adaptation 191

v ’ Qh v ¤C hE |v|1,∆(E) ,

where the constant C > 0 depends only on the family of triangulations.
Now, setting vh = Qh v in (4.8), the discrete Cauchy-Schwarz inequality

a(u ’ uh , v) ¤ C 0,K |v|1,∆(K)
hK rK (uh )

hE [νE · ∇uh ]E |v|1,∆(E)
+C 0,E
1/2 1/2

¤C |v|2
h2 rK (uh ) 2
K 0,K 1,∆(K)
K∈Th K∈Th
1/2 1/2
hE [νE · ∇uh ]E 0,E |v|2
+C .
E∈Eh E∈Eh

A detailed investigation of the two second factors shows that we can
decompose the integrals over ∆(K), ∆(E), according to

... = ... , ... = ... .
∆(K) K ∆(E) K
K ‚∆(K) K ‚∆(E)

This leads to a repeated summation of the integrals over the single elements
K. However, due to the regularity of the family of triangulations, the mul-
tiplicity of these summations is bounded independently of the particular
triangulation (see (3.93)). So we arrive at the estimates

1,∆(K) ¤ C|v|1 , |v|2
1,∆(E) ¤ C|v|1 .
2 2

K∈Th E∈Eh

Using the inequality a + b ¤ 2(a2 + b2 ) for a, b ∈ R, we get
a(u ’ uh , v)

¤ hE [νE · ∇uh ]E |v|1 .
h2 rK (uh ) 2 2
C +
K 0,K 0,E
K∈Th E∈Eh

Finally, (4.6) yields

u ’ uh ¤ D· with · 2 := 2

1 2
·K := h2 f ’ ruh hE [νE · ∇uh ]E
2 2
+ . (4.9)
K 0,K 0,E
192 4. Grid Generation and A Posteriori Error Estimation

Here we have taken into account that in the transformation of the edge
... into the double sum ...
E∈Eh K∈Th E‚‚K\‚„¦

the latter sums up every interior edge twice.
In summary, we have obtained an a posteriori error estimate of the form
(4.3). By means of re¬ned arguments it is also possible to derive lower
bounds for u ’ uh 1 . For details, we refer to the literature, for example

Error Estimation by Gradient Recovery
If we are interested in an estimate of the error u ’ uh ∈ V = H0 („¦) 1

measured in the H 1 - or energy norm · , this problem can be simpli¬ed by
means of the fact that both norms are equivalent on V to the H 1 -seminorm
|u ’ uh |1 = |∇u ’ ∇uh | dx =: ∇u ’ ∇uh
This is a simple consequence of the de¬nitions and the Poincar´ inequality
(see Theorem 2.18). That is, there exist constants C1 , C2 > 0 independent
of h such that
C1 |u ’ uh |1 ¤ u ’ uh ¤ C2 |u ’ uh |1 (4.10)
(cf. Exercise 4.8). Consequently, ∇u remains the only unknown quantity in
the error bound.
The idea of error estimation by means of gradient recovery is to replace
the unknown gradient of the weak solution u by a suitable quantity Rh uh
that is computable from the approximative solution uh at moderate ex-
pense. A popular example of such a technique is the so-called Z 2 estimate.
Here we will describe a simple version of it. Further applications can be
found in the original papers by Zienkiewicz and Zhu, e.g., [74].
Similar to the notation introduced in the preceding subsection, for a
given node a the set
∆(a) := K
K :a∈‚K

denotes the triangular neighbourhood of a (see Figure 4.7). This set co-
incides with the support of the piecewise linear, continuous basis function
associated with that node.
The gradient ∇uh of a piecewise linear, continuous ¬nite element function
uh is constant on every triangle K. This suggests that at any node a of
the triangulation Th we de¬ne the average Rh uh (a) of the values of the
gradients on those triangles sharing the vertex a:
∇uh |K |K| .
Rh uh (a) :=
4.2. A Posteriori Error Estimates and Grid Adaptation 193

¢ ¢d
¢ ¢d
¢d ¢
¢  e €€€
¢  e ¨ ¨
¢ e¨

Figure 4.7. The triangular neighbourhood ∆(a).

Interpolating the two components of these nodal values of Rh uh separately
in Vh , we get a recovery operator Rh : Vh ’ Vh — Vh .
Now a local error estimator can be de¬ned by the simple restriction of
the quantity · := Rh uh ’ ∇uh 0 onto a single element K:
· K := Rh uh ’ ∇uh .

A nice insight into the properties of this local estimator was given by
Rodr´±guez ([64], see also [35]), who compared it with the corresponding
residual estimator (4.9). Namely, neglecting in the residual estimator just
the residual part, i.e., setting
1 2
hE [νE · ∇uh ]E
˜2 and · 2 := ˜2
·K := ˜ ·K ,

then the following result is true:
Theorem 4.3 There exist two constants c1 , c2 > 0 depending only on the
family of triangulations such that
c1 · ¤ · ¤ c2 · .
˜ ˜
The motivation for the method of gradient recovery is to be seen in the
fact that Rh uh possesses special convergence properties. Namely, under
certain assumptions the recovered gradient Rh uh converges asymptoti-
cally to ∇u faster than ∇uh does. In such a case Rh uh is said to be a
superconvergent approximation to ∇u.
If superconvergence holds, the simple decomposition
∇u ’ ∇uh = Rh uh ’ ∇uh + ∇u ’ Rh uh
demonstrates that the ¬rst di¬erence on the right-hand side represents
the asymptotically dominating, computable part of the gradient error
∇u ’ ∇uh . In other words, if we could de¬ne, for the class of problems
under consideration, a superconvergent gradient recovery Rh uh that is com-
putable with moderate expense, then the quantities · K and · de¬ned above
may serve as a tool for a posteriori error estimation.
194 4. Grid Generation and A Posteriori Error Estimation

Unfortunately, such superconvergence properties are valid only under
rather restrictive assumptions (especially with respect to the grid and to the
regularity of the weak solution). Thus it is di¬cult to obtain a full math-
ematical foundation in practice. Nevertheless, gradient recovery is often
applied and yields satisfactory results in many situations.
The following example, which is due to Repin [63], shows that a recovered
gradient does not have to re¬‚ect the real behaviour of the error.
Example 4.4 Consider the following boundary value problem for d = 1
and „¦ = (0, 1):
’u = f u(0) = u(1) ’ 1 = 0 .
in „¦ ,
If f is constant, the exact solution reads u(x) = x(2 + (1 ’ x)f )/2. Suppose
we have found the function vh = x as an approximate solution. For an
arbitrary partition of „¦ into subintervals, this function is piecewise linear
and it satis¬es the boundary conditions formulated above. Now let Rh be
an arbitrary gradient recovery operator that is able to reproduce at least
constants. Since vh = 1, we have vh ’ Rh vh = 0, whereas the real error is
vh ’ u = (x ’ 1 )f.
An interpretation of this e¬ect is that the function vh does not solve the
corresponding discrete (Galerkin) equations. But this property of uh is used
for the proof of superconvergence. This property also plays an important
role in the derivation of the residual error estimates, because the error
equation is used therein.

Dual-Weighted Residual Error Estimators
The aforementioned a posteriori error estimates have two disadvantages:
On the one hand, certain global constants, which are not known in general,
are part of the bounds. Typical examples are ±’1 in (4.6) and the constants
C1 , C2 in the equivalence relation (4.10). On the other hand, we obtained

scaling factors like hK and hE simply by using a particular approximation
In the following, we will outline a method that attempts to circumvent
these drawbacks. It is especially appropriate for the estimation of errors of
functionals depending linearly on the solution.
So let J : V ’ R denote a linear, continuous functional. We are interested
in an estimate of |J(u) ’ J(uh )|. Therefore, the following auxiliary dual
problem is considered:
Find w ∈ V such that a(v, w) = J(v) for all v ∈ V.
Taking v = u ’ uh , we get immediately
J(u) ’ J(uh ) = J(u ’ uh ) = a(u ’ uh , w) .
If wh ∈ Vh is an arbitrary element, the error equation (2.39) yields
J(u) ’ J(uh ) = a(u ’ uh , w ’ wh ) .
4.2. A Posteriori Error Estimates and Grid Adaptation 195

Obviously, the right-hand side is of the same structure as in the derivation
of the estimate (4.8). Consequently, by using the same arguments it follows

|J(u) ’ J(uh )| ¤ w ’ wh
rK (uh ) 0,K 0,K

[νE · ∇uh ]E w ’ wh
+ .

In contrast to the previous approaches, here the norms of w ’ wh will not
be theoretically analyzed but numerically approximated. This can be done
by an approximation of the dual solution w. There are several (more or less
heuristic) ways to do this.

(1) Estimation of the approximation error: Here, the norms of w ’ wh
are estimated as in the case of residual error estimators. Since the
result depends on the unknown H 1 -seminorm of w, which is equiva-
lent to the L2 -norm of ∇w, the ¬nite element solution wh ∈ Vh of the
auxiliary problem is used to approximate ∇w. It is a great disadvan-
tage of this approach that again global constants enter in the ¬nal
estimate through the estimation of the approximation error. Further-
more, the discrete auxiliary problem is of similar complexity to that
of the original discrete problem.

(2) Higher-order discretizations of the auxiliary problem: The auxiliary
problem is solved numerically by using a method that is more accu-
rate than the original method to determine a solution in Vh . Then
w is replaced by that solution and wh ∈ Vh by an interpolant of
that solution. Unfortunately, since the discrete auxiliary problem is
of comparatively large dimension, this approach is rather expensive.

(3) Approximation by means of higher-order recovery: This method
works similarly to the approach described in the previous subsection;
w is replaced by an element that is recovered from the ¬nite element
solution wh ∈ Vh of the auxiliary problem. The recovered element
approximates w with higher order in both norms than wh does. This
method exhibits two problems: On the one hand, the auxiliary prob-
lem has to be solved numerically, and on the other hand, ensuring
the corresponding superconvergence properties may be di¬cult.

At the end of this section we want to mention how the method could be
used to estimate certain norms of the error. In the case where the norms
are induced by particular scalar products, there is a simple, formal way.
For example, for the L2 -norm we have
u ’ uh , u ’ uh 0
u ’ uh = .
u ’ uh 0
196 4. Grid Generation and A Posteriori Error Estimation

Keeping u and uh ¬xed, we get with the de¬nition
v, u ’ uh 0
J(v) :=
u ’ uh 0
a linear, continuous functional J : H 1 („¦) ’ R such that J(u) ’ J(uh ) =
u ’ uh 0 .
The practical di¬culty of this approach consists in the fact that to be
able to ¬nd the solution w of the auxiliary problem we have to know the
values of J, but they depend on the unknown element u ’ uh . The idea
of approximating these values immediately implies two problems: There is
additional expense, and the in¬‚uence of the approximation quality on the
accuracy of the obtained bounds has to be analyzed.

4.4 Let „¦ ‚ R2 be a bounded domain with a polygonal, Lipschitz con-
tinuous boundary and V := H0 („¦). Now consider a V -elliptic, continuous
bilinear form a and a continuous linear form b. The problem
u∈V : a(u, v) = b(v) for all v ∈ V
is discretized using piecewise linear, continuous ¬nite elements. If Ei de-
notes the support of the nodal basis functions of Vh associated with the
vertex ai , show that the abstract local error indicators
a(e, v)
·i := sup
v∈H0 (Ei )

can be estimated by means of the solutions ei ∈ H0 (Ei ) of the local

boundary value problems
ei ∈ H0 (Ei ) : a(ei , v) = b(v) ’ a(uh , v) for all v ∈ H0 (Ei )
1 1

as follows (M and ± denote the constants appearing in the continuity and
ellipticity conditions on a):
± ei ¤ ·i ¤ M ei .
If necessary, the elements of H0 (Ei ) are extended by zero to the whole
domain „¦.

4.5 A linear polynomial on some triangle is uniquely de¬ned either by
its values at the vertices or by its values at the edge midpoints. For a
¬xed triangulation of a polygonally bounded, simply connected domain
„¦ ‚ R2 , there can be de¬ned two ¬nite element spaces by identifying
common degrees of freedom of adjacent triangles.
Exercises 197

(a) Show that the dimension of the space de¬ned by the degrees of free-
dom located at the vertices is less than the dimension of the other
space (provided that the triangulation consists of more than one
(b) How can one explain this “loss of degrees of freedom”?

4.6 Denote by Th a triangulation of the domain „¦ ‚ Rd . Show that for
a function v : „¦ ’ R that is continuously di¬erentiable on each element
the jump [νE · ∇v]E of the normal derivative of v across an element edge
E does not depend on the orientation of the normal νE .

4.7 Let a regular family of triangulations (Th ) of a domain „¦ ‚ R2 be
given. Show that there exist constants C > 0 that depend only on the
family (Th ) such that
|v|2 ¤ for all v ∈ L2 („¦) ,
Cv 0
|v|2 ¤ for all v ∈ L2 („¦) .
Cv 0

4.8 Let „¦ ‚ Rd be a bounded domain. Show that there are constants
C1 , C2 > 0 such that for all v ∈ H0 („¦),

C1 |v|1 ¤ v ¤ C2 |v|1 .
Iterative Methods
for Systems of Linear Equations

We consider again the system of linear equations
Ax = b (5.1)
with nonsingular matrix A ∈ Rm,m , right-hand side b ∈ Rm , and solution
x ∈ Rm . As shown in Chapters 2 and 3, such systems of equations arise
from ¬nite element discretizations of elliptic boundary value problems. The
matrix A is the sti¬ness matrix and thus sparse, as can be seen from (2.37).
A sparse matrix is vaguely a matrix with so many vanishing elements that
using this structure in the solution of (5.1) is advantageous. Taking advan-
tage of a band or hull structure was discussed in Section 2.5. More precisely,
if (5.1) represents a ¬nite element discretization, then it is not su¬cient to
know the properties of the solution method for a ¬xed m. It is on the con-
trary necessary to study a sequence of problems with growing dimension
m, as it appears by the re¬nement of a triangulation. In the strict sense we
understand by the notion sparse matrices a sequence of matrices in which
the number of nonzero elements per row is bounded independently of the
dimension. This is the case for the sti¬ness matrices due to (2.37) if the un-
derlying sequence of triangulations is regular in the sense of De¬nition 3.28,
for example. In ¬nite element discretizations of time-dependent problems
(Chapter 7) as well as in ¬nite volume discretizations (Chapter 6) systems
of equations of equal properties arise, so that the following considerations
can be also applied there.
The described matrix structure is best applied in iterative methods that
have the operation matrix — vector as an essential module, where either
the system matrix A or a matrix of similar structure derived from it is
5. Iterative Methods for Systems of Linear Equations 199

concerned. If the matrix is sparse in the strict sense, then O(m) elementary
operations are necessary. In particular, list-oriented storage schemes can be
of use, as pointed out in Section 2.5.
The e¬ort for the approximative solution of (5.1) by an iterative method
is determined by the number of elementary operations per iteration step
and the number of iterations k that are necessary in order to reach the
desired relative error level µ > 0, i.e., to meet the demand
x(k) ’ x ¤ µ x(0) ’ x . (5.2)

Here x(k) k is the sequence of iterates for the initial value x(0) , · a ¬xed
norm in Rm , and x = A’1 b the exact solution of (5.1).
For all methods to be discussed we will have linear convergence of the
x(k) ’ x ¤ x(0) ’ x
with a contraction number with 0 < < 1, which in general depends on
the dimension m. To satisfy (5.2), k iterations are thus su¬cient, with
1 1
k≥ ln ln . (5.4)
The computational e¬ort of a method obviously depends on the size of µ,
although this will be seen as ¬xed and only the dependence on the dimen-
sion m is considered: often µ will be omitted in the corresponding Landau™s
symbols. The methods di¬er therefore by their convergence behaviour, de-
scribed by the contraction number and especially by its dependence
on m (for speci¬c classes of matrices and boundary value problems). A
method is (asymptotically) optimal if the contraction numbers are bounded
independently of m:
(m) ¤ < 1. (5.5)
In this case the total e¬ort for a sparse matrix is O(m) elementary opera-
tions, as for a matrix — vector step. Of course, for a more exact comparison,
the corresponding constants, which also re¬‚ect the e¬ort of an iteration
step, have to be exactly estimated.
While direct methods solve the system of equations (5.1) with machine
precision, provided it is solvable in a stable manner, one can freely choose
the accuracy with iterative methods. If (5.1) is generated by the discretiza-
tion of a boundary value problem, it is recommended to solve it only with
that accuracy with which (5.1) approximates the boundary value prob-
lem. Asymptotic statements hereto have, among others, been developed in
(3.89), (7.129) and give an estimation of the approximation error by Ch± ,
with constants C, ± > 0, whereby h is the mesh size of the corresponding
triangulation. Since the constants in these estimates are usually unknown,
the error level can be adapted only asymptotically in m, in order to gain
200 5. Iterative Methods for Systems of Linear Equations

an algorithmic error of equal asymptotics compared to the error of ap-
proximation. Although this contradicts the above-described point of view
of a constant error level, it does not alter anything in the comparison of
the methods: The respective e¬ort always has to be multiplied by a fac-
tor O(ln m) if in d space dimensions m ∼ h’d is valid, and the relations
between the methods compared remain the same.
Furthermore, the choice of the error level µ will be in¬‚uenced by the
quality of the initial iterate. Generally, statements about the initial iterate
are only possible for special situations: For parabolic initial boundary value
problems (Chapter 7) and a one-step time discretization it is recommended
to use the approximation of the old time level as initial iterate. In the
case of a hierarchy of space discretizations, a nested iteration is possible
(Section 5.6), where the initial iterates will naturally result.

5.1 Linear Stationary Iterative Methods
5.1.1 General Theory
We begin with the study of the following class of a¬ne-linear iteration
¦(x) := M x + N b , (5.6)
with matrices M, N ∈ Rm,m to be speci¬ed later. By means of ¦ an iter-
ation sequence x(0) , x(1) , x(2) , . . . is de¬ned through a ¬xed-point iteration

x(k+1) := ¦ x(k) , k = 0, 1, . . . , (5.7)
from an initial approximation x(0) . Methods of this kind are called linear
stationary, because of their form (5.6) with a ¬xed iteration matrix M . The
function ¦ : Rm ’ Rm is continuous, so that in case of convergence of x(k)
for k ’ ∞, for the limit x we have
x = ¦(x) = M x + N b .
In order to achieve that the ¬xed-point iteration de¬ned by (5.6) is con-
sistent with Ax = b, i.e., each solution of (5.1) is also a ¬xed point, we
must require
A’1 b = M A’1 b + N b for arbitrary b ∈ Rm ,
i.e., A’1 = M A’1 + N , and thus
I = M + NA . (5.8)
On the other hand, if N is nonsingular, which will always be the case in
the following, then (5.8) also implies that a ¬xed point of (5.6) solves the
system of equations.
5.1. Linear Stationary Iterative Methods 201

Assuming the validity of (5.8), the ¬xed-point iteration for (5.6) can also
be written as
x(k+1) = x(k) ’ N Ax(k) ’ b , (5.9)
M x(k) + N b = (I ’ N A) x(k) + N b .
If N is nonsingular, we have additionally an equivalent form given by
W x(k+1) ’ x(k) = ’ Ax(k) ’ b (5.10)
with W := N ’1 . The correction x(k+1) ’x(k) for x(k) is given by the residual

g (k) := Ax(k) ’ b
through (5.9) or (5.10), possibly by solving a system of equations. In order
to compete with the direct method, the solution of (5.10) should require
one order in m fewer elementary operations. For dense matrices no more
operations than O(m2 ) should be necessary as are already necessary for
the calculation of g (k) . The same holds for sparse matrices, for example
band matrices. On the other side the method should converge, and that as
quickly as possible.
In the form (5.6) ¦ is Lipschitz continuous for a given norm · on
R with Lipschitz constant M , where · is a norm on Rm,m that is

consistent with the vector norm (see (A3.9)).
More precisely, for a consistent iteration the error
e(k) := x(k) ’ x ,
with x = A’1 b still denoting the exact solution, even satis¬es
e(k+1) = M e(k) ,
because (5.7) and (5.8) imply
e(k+1) = x(k+1) ’ x = M x(k) + N b ’ M x ’ N Ax = M e(k) . (5.11)
The spectral radius of M , that is, the maximum of the absolute values of
the (complex) eigenvalues of M , will be denoted by (M ).
The following general convergence theorem holds:
Theorem 5.1 A ¬xed-point iteration given by (5.6) to solve Ax = b is
globally and linearly convergent if
(M ) < 1 . (5.12)
· on Rm,m induced by a norm ·
This is satis¬ed if for a matrix norm
on Rm we have
M < 1. (5.13)
202 5. Iterative Methods for Systems of Linear Equations

If the consistency condition (5.8) holds and the matrix and vector norms
applied are consistent, then the convergence is monotone in the following
e(k+1) ¤ M e(k) . (5.14)

Proof: Assuming (5.12), then for µ = (1 ’ (M )) /2 > 0 there is a norm
· S on Rm such that the induced norm · S on Rm,m satis¬es
¤ (M ) + µ < 1

(see [16, p. 34]). The function ¦ is a contraction with respect to this special
norm on Rm . Therefore, Banach™s ¬xed-point theorem (Theorem 8.4) can
be applied on X = (Rm , · S ), which ensures the global convergence of
the sequence x(k) k to a ¬xed point x of ¦.
If (5.13) holds, ¦ is a contraction even with respect to the norm ·
on Rm , and M is the Lipschitz constant. Finally relation (5.14) follows
from (5.11).

In any case, we have convergence in any norm on Rm , since they are all
equivalent. Linear convergence for (5.12) holds only in the generally not
available norm · S with M S as contraction number.
As termination criterion for the concrete iteration methods to be
introduced, often
g (k) ¤ δ g (0) (5.15)
is used with a control parameter δ > 0, abbreviated as g (k) = 0. The
connection to the desired reduction of the relative error according to (5.2)
is given by
e(k) g (k)
¤ κ(A) (0) , (5.16)
e(0) g
where the condition number κ(A) = A A’1 is to be computed with
respect to a matrix norm that is consistent with the chosen vector norm.
Relation (5.16) follows from
A’1 g (k) ¤ A’1
e(k) g (k) ,
Ae(0) ¤ A
g (0) e(0) .
Therefore, for the selection of δ in (5.15) we have to take into account the
behaviour of the condition number.
For the iteration matrix M , according to (5.8), we have
M = I ’ NA ,
or according to (5.10) with nonsingular W ,
M = I ’ W ’1 A .
5.1. Linear Stationary Iterative Methods 203

To improve the convergence, i.e. to reduce (M ) (or M ), we need
N ≈ A’1 and W ≈ A ,
which is in contradiction to the fast solvability of (5.10).

5.1.2 Classical Methods
The fast solvability of (5.10) (in O(m) operations) is ensured by choosing
W := D , (5.17)
where A = L + D + R is the unique partition of A, with a strictly lower
triangular matrix L, a strictly upper triangular matrix R, and the diagonal
matrix D:
«  « 
0 ··· ··· 0 0 a1,2 · · · a1,m
¬ a2,1 0 ··· 0 · ¬ . .. .. ·
¬ · ¬. ·
. .
. .
L := ¬ . . · , R := ¬ ·,
.. ..
. .  0 · · · 0 am’1,m 
. .
. .
am,1 · · · am,m’1 0 0 ··· ··· 0
« 
¬ 0·
¬ ·
D := ¬ ·.
0 
Assume aii = 0 for all i = 1, . . . , m, or equivalently that D is nonsingular,
which can be achieved by row and column permutation.
The choice of (5.17) is called the method of simultaneous displacements
or Jacobi™s method. In the formulation form (5.6) we have
D’1 ,
N =
I ’ N A = I ’ D’1 A = ’D’1 (L + R) .
MJ =
Therefore, the iteration can be written as
D x(k+1) ’ x(k) = ’ Ax(k) ’ b
x(k+1) = D’1 ’ Lx(k) ’ Rx(k) + b (5.19)
« 
i’1 m
aij xj + bi  for all i = 1, . . . , m .
(k+1) (k) (k)
’ aij xj ’
xi =
aii j=1 j=i+1

On the right side in the ¬rst sum it is reasonable to use the new iterate
x where it is already calculated. This leads us to the iteration
x(k+1) = D’1 ’ Lx(k+1) ’ Rx(k) + b (5.20)
204 5. Iterative Methods for Systems of Linear Equations

(D + L) x(k+1) = ’Rx(k) + b
(D + L) x(k+1) ’ x(k) = ’ Ax(k) ’ b , (5.21)
the so-called method of successive displacements or Gauss“Seidel method.
According to (5.21) we have here a consistent iteration with
W =D+L.
Since D is nonsingular, W is nonsingular. Written in the form (5.6) the
method is de¬ned by
W ’1 = (D + L)
N = ,
’1 ’1
I ’ N A = I ’ (D + L) A = ’ (D + L)
MGS = R.
In contrast to the Jacobi iteration, the Gauss“Seidel iteration depends on
the order of the equations. However, the derivation (5.20) shows that the
number of operations per iteration step is equal,
Jacobi becomes Gauss“Seidel,
if x(k+1) is stored on the same vector as x(k) .
A su¬cient convergence condition is given by the following theorem:
Theorem 5.2 Jacobi™s method and the Gauss“Seidel method converge
· ∞ if the strict row sum
globally and monotonically with respect to
|aij | < |aii | for all i = 1, . . . , m (5.22)

is satis¬ed.

Proof : The proof here is given only for the Jacobi iteration. For the other
method see, for example, [16].
The inequality (5.22) is equivalent to MJ ∞ < 1 because of MJ =
’D’1 (L + R) if · ∞ is the matrix norm that is induced by · ∞ , which
means the maximum-row-sum norm (see (A3.6)).

It can be shown that the Gauss“Seidel method converges “better” than
Jacobi™s method, as expected: Under the assumption of (5.22) for the
respective iteration matrices,
¤ MJ
MGS <1
∞ ∞

(see, for example, [16]).
5.1. Linear Stationary Iterative Methods 205

Theorem 5.3 If A is symmetric and positive de¬nite, then the Gauss“
Seidel method converges globally. The convergence is monotone in the
energy norm · A , where x A := xT Ax for x ∈ Rm .

Proof: See [16, p. 90].

If the di¬erential operator, and therefore the bilinear form, is symmet-
ric, that is, if (3.12) holds with c = 0, then Theorem 5.3 can be applied.
Concerning the applicability of Theorem 5.2, even for the Poisson equation
with Dirichlet boundary conditions (1.1), (1.2) requirements for the ¬nite
element discretization are necessary in order to satisfy at least a weaker
version of (5.22). This example then satis¬es the weak row sum criterion
only in the following sense:
|aij | ¤ |aii | for all i = 1, . . . , m ;

“<” holds for at least one i ∈ {1, . . . , m} .

In the case of the ¬nite di¬erence method (1.7) for the rectangular do-
main or the ¬nite element method from Section 2.2, which leads to the
same discretization matrix, (5.23) is satis¬ed. For a general triangulation
with linear ansatz functions, conditions for the angles of the elements must
be required (see the angle condition in Section 3.9). The condition (5.23)
is also su¬cient, if A is irreducible (see Appendix A.3).

Theorem 5.4 If A satis¬es the condition (5.23) and is irreducible, then
Jacobi™s method converges globally.

Proof: See [28, p. 111].

The qualitative statement of convergence does not say anything about
the usefulness of Jacobi™s and the Gauss“Seidel method for ¬nite element
discretizations. As an example we consider the Dirichlet problem for the
Poisson equation on a rectangular domain as in (1.5), with the ¬ve-point
stencil discretization introduced in Section 1.2. We restrict ourselves to
an equal number of nodes in both space directions for simpli¬city of the
notation. This number is denoted by n + 1, di¬erently than in Chapter 1.
Therefore, A ∈ Rm,m according to (1.14), with m = (n ’ 1)2 being the
number of interior nodes. The factor h’2 can be omitted by multiplying
the equation by h2 .
In the above example the eigenvalues and therefore the spectral radius
can be calculated explicitly. Due to D = 4I we have for Jacobi™s method
1 1
M = ’ (A ’ 4I) = I ’ A ,
4 4
206 5. Iterative Methods for Systems of Linear Equations

and therefore A and M have the same eigenvectors, namely,
ikπ jlπ
1 ¤ i, j, k, l ¤ n ’ 1 ,
z k,l = sin sin ,
ij n n
with the eigenvalues
kπ lπ
2 2 ’ cos ’ cos (5.24)
n n
for A and
1 kπ 1 lπ
cos + cos (5.25)
2 n 2 n
for M with 1 ¤ k, l ¤ n ’ 1. This can be proven directly with the help of
trigonometric identities (see, for example, [15, p. 53]). Thus we have
(n ’ 1)π π2
= cos = 1 ’ 2 + O n’4 .
(M ) = ’ cos (5.26)
n n 2n
With growing n the rate of convergence becomes worse. The e¬ort to gain
an approximative solution, which means to reduce the error level below a
given threshold µ, is proportional to the number of iterations — operations
for an iteration, as we discussed at the beginning of this chapter. Due to
(5.4) and (5.12) the number of necessary operations is calculated as follows:
ln(1/µ) 1 1
· O(m) = ln · O n2 · O(m) = ln O(m2 ) .
’ ln( (M )) µ µ
Here the well-known expansion ln(1 + x) = x + O(x2 ) is employed in the
determination of the leading term of ’1/(ln( (M )). An analogous result
with better constants holds for the Gauss“Seidel method.
In comparison to this, the elimination or the Cholesky method requires
O band-width2 · m = O(m2 )
operations; i.e., they are of the same complexity. Therefore, both methods
are of use for only moderately large m.
An iterative method has a superior complexity to that of the Cholesky
method if
(M ) = 1 ’ O(n’± ) (5.27)
with ± < 2. In the ideal case (5.5) holds; then the method needs O(m)
operations, which is asymptotically optimal.
In the following we will present a sequence of methods with increasingly
better convergence properties for systems of equations that arise from ¬nite
element discretizations.
The simplest iteration is the Richardson method, de¬ned by
M =I ’A, i.e., N = W = I . (5.28)
5.1. Linear Stationary Iterative Methods 207

For this method we have
(M ) = max {|1 ’ »max (A)|, |1 ’ »min (A)|} ,
with »max (A) and »min (A) being the largest and smallest eigenvalues of A,
respectively. Therefore, this method is convergent for special matrices only.
In the case of a nonsingular D, the Richardson method for the transformed
system of equations
D’1 Ax = D’1 b
is equivalent to Jacobi™s method.
More generally, the following can be shown: If a consistent method is
de¬ned by M, N with I = M +N A, and N nonsingular, then it is equivalent
to the Richardson method applied to
N Ax = N b . (5.29)
The Richardson method for (5.29) has the form

x(k+1) ’ x(k) = ’N N Ax(k) ’ N b

with N = I, which means the form (5.9), and vice versa.
Equation (5.29) can also be interpreted as a preconditioning of the system
of equations (5.1), with the aim to reduce the spectral condition number
κ(A) of the system matrix, since this is essential for the convergence be-
haviour. This will be further speci¬ed in the following considerations (5.33),
(5.73). As already seen in the aforementioned examples, the matrix N A will
not be constructed explicitly, since N is in general densely occupied, even
if N ’1 is sparse. The evaluation of y = N Ax therefore means solving the
auxiliary system of equations
N ’1 y = Ax .
Obviously, we have the following:
Lemma 5.5 If the matrix A is symmetric and positive de¬nite, then for
the Richardson method all eigenvalues of M are real and smaller than 1.

5.1.3 Relaxation
We continue to assume that A is symmetric and positive de¬nite. Therefore,
divergence of the procedure can be caused only by negative eigenvalues of
I ’ A less than or equal to ’1. In general, bad or nonconvergent iterative
methods can be improved in their convergence behaviour by relaxation if
they meet certain conditions.
For an iteration method, given in the form (5.6), (5.7), the corresponding
relaxation method with relaxation parameter ω > 0 is de¬ned by
ω M x(k) + N b + (1 ’ ω)x(k) ,
x(k+1) := (5.30)
208 5. Iterative Methods for Systems of Linear Equations

which means
ωM + (1 ’ ω)I ,
Mω := Nω := ωN , (5.31)
or if the condition of consistency M = I ’ N A holds,
= ω x(k) ’ N Ax(k) ’ b + (1 ’ ω)x(k)
= x(k) ’ ωN Ax(k) ’ b .
Let us assume for the procedure (5.6) that all eigenvalues of M are real.
For the smallest one »min and the largest one »max we assume
»min ¤ »max < 1 ;
this is, for example, the case for the Richardson method. Then also the
eigenvalues of Mω are real, and we conclude that
»i (Mω ) = ω»i (M ) + 1 ’ ω = 1 ’ ω 1 ’ »i (M )
if the »i (B) are the eigenvalues of B in an arbitrary ordering. Hence
(Mω ) = max |1 ’ ω (1 ’ »min (M ))| , |1 ’ ω (1 ’ »max (M ))| ,
since f (») := 1 ’ ω(1 ’ ») is a straight line for a ¬xed ω (with f (1) = 1
and f (0) = 1 ’ ω).

ρ(Mω )
- 1 + ω ( 1 ’ »min )
1 1
» max
ω1 1 - ω ( 1 ’ »max )
» min » max »
» min
1 - ω ( 1 ’ »min )
f (») for ω1 < 1 and ω2 > 1

Figure 5.1. Calculation of ω .

For the optimal ω , i.e., ω with
¯ ¯
(Mω ) = min (Mω ) ,

we therefore have, as can be seen from Figure 5.1,
1 ’ ω (1 ’ »max (M )) = ’1 + ω (1 ’ »min (M ))
¯ ¯
⇐’ ω=
¯ .
2 ’ »max (M ) ’ »min (M )
Hence ω > 0 and
(Mω ) = 1 ’ ω (1 ’ »max (M )) < 1 ;
5.1. Linear Stationary Iterative Methods 209

consequently, the method converges with optimal ω even in cases where
it would not converge for ω = 1. But keep in mind that one needs the
eigenvalues of M to determine ω .
Moreover, we have

¯ »max (M ) + »min (M ) < 0 .
If »min (M ) = ’»max (M ), that is, ω = 1, we will achieve an improvement
by relaxation:
(Mω ) < (M ) .

The case of ω < 1 is called underrelaxation, whereas in the case of ω > 1
we speak of an overerrelaxation.
In particular, for the Richardson method with the iteration matrix M =
I ’ A, due to »min (M ) = 1 ’ »max (A) and »max (M ) = 1 ’ »min (A), the
optimal ω is given by
¯ . (5.32)
»min (A) + »max (A)
»max (A) ’ »min (A) κ(A) ’ 1
(Mω ) = 1 ’ ω »min (A) =
¯ = < 1, (5.33)
»min (A) + »max (A) κ(A) + 1
with the spectral condition number of A
»max (A)
κ(A) :=
»min (A)
(see Appendix A.3).
For large κ(A) we have
κ(A) ’ 1 2
(Mω ) = ,
κ(A) + 1 κ(A)
the variable of the proportionality being κ(A). For the example of the
¬ve-point stencil discretization, due to (5.24),
n’1 π
»min (A) + »max (A) = 4 2 ’ cos π ’ cos = 8,
n n
and thus due to (5.32),
¯ .
Hence the iteration matrix Mω = I ’ 1 A is identical to the Jacobi iteration:
¯ 4
We have rediscovered Jacobi™s method.
By means of (5.33) we can estimate the contraction number, since we
know from (5.24) that
4 1 ’ cos n’1 π 1 + cos π 4n2
≈ 2.
n n
κ(A) = = (5.34)
1 ’ cos π
4 1 ’ cos π π
210 5. Iterative Methods for Systems of Linear Equations

This shows the stringency of Theorem 3.45, and again we can conclude that
≈1’ 2 .
(Mω ) = cos (5.35)
n 2n
Due to Theorem 3.45 the convergence behaviour seen for the model problem
is also valid in general for quasi-uniform triangulations.

5.1.4 SOR and Block-Iteration Methods
We assume again that A is a general nonsingular matrix. For the relaxation
of the Gauss“Seidel method we use it in the form
Dx(k+1) = ’Lx(k+1) ’ Rx(k) + b ,
instead of the resolved form (5.20).
The relaxed method is then
Dx(k+1) = ω ’ Lx(k+1) ’ Rx(k) + b + (1 ’ ω)Dx(k) (5.36)
with a relaxation parameter ω > 0. This is equivalent to
(D + ωL) x(k+1) = (’ωR + (1 ’ ω)D) x(k) + ωb . (5.37)
(D + ωL)’1 (’ωR + (1 ’ ω)D) ,
Mω :=
Nω := (D + ωL) ω.
In the application to discretizations of boundary value problems, nor-
mally we choose ω > 1, which means overrelaxation. This explains the
name of the SOR method as an abbreviation of successive overrelaxation.
The e¬ort to execute an iteration step is hardly higher than for the Gauss“
Seidel method. Although we have to add 3m operations to the evaluation
of the right side of (5.36), the forward substitution to solve the auxiliary
system of equations in (5.37) is already part of the form (5.36).
The calculation of the optimal ω here is more di¬cult, because Mω
depends nonlinearly on ω. Only for special classes of matrices can the opti-
mal ω minimizing (Mω ) be calculated explicitly in dependence on (M1 ),
the convergence rate of the (nonrelaxed) Gauss“Seidel method. Before we
sketch this, we will look at some further variants of this procedure:
The matrix Nω is nonsymmetric even for symmetric A. One gets a sym-
metric Nω if after one SOR step another one is performed in which the
indices are run through in reverse order m, m ’ 1, . . . , 2, 1, which means
that L and R are exchanged. The two half steps
1 1
= ω ’ Lx(k+ 2 ) ’ Rx(k) + b + (1 ’ ω)Dx(k) ,
Dx(k+ 2 )
1 1
= ω ’ Lx(k+ 2 ) ’ Rx(k+1) + b + (1 ’ ω)Dx(k+ 2 ) ,
5.1. Linear Stationary Iterative Methods 211

make up one step of the symmetric SOR, the SSOR method for short. A
special case is the symmetric Gauss“Seidel method for ω = 1.
We write down the procedure for symmetric A, i.e., R = LT in the form
(5.6), in which the symmetry of N becomes obvious:
’1 ’1
(1 ’ ω)D ’ ωL (D + ωL) (1 ’ ω)D ’ ωLT ,
D + ωLT
M =
D (D + ωL)’1 .
= ω(2 ’ ω) D + ωLT
N (5.38)
The e¬ort for SSOR is only slightly higher than for SOR if the vectors
already calculated in the half steps are stored and used again, as for example
Lx(k+1/2) .
Other variants of these procedures are created if the procedures are not
applied to the matrix itself but to a block partitioning
with Aij ∈ Rmi ,mj ,
A = (Aij )i,j i, j = 1, . . . , p , (5.39)
with i=1 mi = m. As an example we get the block-Jacobi method, which
is analogous to (5.19) and has the form
« 
’ + βi  for all i = 1, . . . , p .
(k+1) (k) (k)

ξi = Aij ξj Aij ξj
j=1 j=i+1
Here x = (ξ1 , . . . , ξp ) and b = (β1 , . . . , βp ) , respectively, are correspond-
(k) (k+1)
ing partitions of the vectors. By exchanging ξj with ξj in the ¬rst
sum one gets the block-Gauss“Seidel method and then in the same way
the relaxed variants. The iteration (5.40) includes p vector equations. For
each of them we have to solve a system of equations with system matrix
Aii . To get an advantage compared to the pointwise method a much lower
e¬ort should be necessary than for the solution of the total system. This
can require ” if at all possible ” a rearranging of the variables and equa-
tions. The necessary permutations will not be noted explicitly here. Such
methods are applied in ¬nite di¬erence methods or other methods with
structured grids (see Section 4.1) if an ordering of nodes is possible such
that the matrices Aii are diagonal or tridiagonal and therefore the systems
of equations are solvable with O(mi ) operations.
As an example we again discuss the ¬ve-point stencil discretization of
the Poisson equation on a square with n + 1 nodes per space dimension.
The matrix A then has the form (1.14) with l = m = n. If the nodes are
numbered rowwise and we choose one block for each line, which means
p = n ’ 1 and mi = n ’ 1 for all i = 1, . . . , p, then the matrices Aii are
tridiagonal. On the other hand, if one chooses a partition of the indices of
the nodes in subsets Si such that a node with index in Si has neighbours
only in other index sets, then for such a selection and arbitrary ordering
within the index sets the matrices Aii become diagonal. Neighbours here
denote the nodes within a di¬erence stencil or more generally, those nodes
212 5. Iterative Methods for Systems of Linear Equations

« 
4 ’1 0 ’1 0 0 0 0 0
¬’1 4 ’1 0 ’1 0 0 0 0 ·
¬ ·
¬ 0 ’1 4 0 0 ’1 0 0 0 ·
¬ ·


. 8
( 16)