
438 10. Orthogonal Polynomials in Approximation Theory

Therefore, looking at the first and last equality, we get

$$ f(x_j) = \sum_{k=0}^{N-1} f_k\, e^{ik(j-N/2)h} = \sum_{k=0}^{N-1} f_k\, W_N^{-(j-N/2)k}, \qquad j = 0, \ldots, N-1. \qquad (10.55) $$

The mapping {f(x_j)} → {f_k} described by (10.52) is called the Discrete
Fourier Transform (DFT), while the mapping (10.55) from {f_k} to {f(x_j)}
is called the inverse transform (IDFT). Both DFT and IDFT can be written
in matrix form as {f_k} = T{f(x_j)} and {f(x_j)} = C{f_k}, where T ∈
C^{N×N}, C denotes the inverse of T and

$$ T_{kj} = \frac{1}{N}\, W_N^{(k-N/2)j}, \qquad k, j = 0, \ldots, N-1, $$

$$ C_{jk} = W_N^{-(j-N/2)k}, \qquad j, k = 0, \ldots, N-1. $$

A naive implementation of the matrix-vector computation in the DFT and
IDFT would require N² operations. Using the FFT (Fast Fourier Transform)
algorithm, only O(N log₂ N) flops are needed, provided that N is a
power of 2, as will be explained in Section 10.9.2.

The function Π_N^F f ∈ S_N introduced in (10.53) is the solution of the
minimization problem ‖f − Π_N^F f‖_N ≤ ‖f − g‖_N for any g ∈ S_N, where
‖·‖_N = (·,·)_N^{1/2} is a discrete norm for S_N. In the case where f is periodic
with all its derivatives up to order s (s ≥ 1), an error estimate analogous
to that for Chebyshev and Legendre interpolation holds:

$$ \| f - \Pi_N^F f \|_{L^2(0,2\pi)} \le C N^{-s} \| f \|_s, $$

and also

$$ \max_{0 \le x \le 2\pi} | f(x) - \Pi_N^F f(x) | \le C N^{1/2-s} \| f \|_s. $$

In a similar manner, we also have

$$ |(f, v_N) - (f, v_N)_N| \le C N^{-s} \| f \|_s \, \| v_N \| $$

for any v_N ∈ S_N; in particular, setting v_N = 1, we obtain the following
error bound for the quadrature formula (10.51)

$$ \left| \int_0^{2\pi} f(x)\, dx - h \sum_{j=0}^{N-1} f(x_j) \right| \le C N^{-s} \| f \|_s $$

(see [CHQZ88], Chapter 2, for the proof).

Notice that $h \sum_{j=0}^{N-1} f(x_j)$ is nothing else than the composite trapezoidal
rule for approximating the integral $\int_0^{2\pi} f(x)\, dx$. Therefore, such a formula
turns out to be extremely accurate when dealing with periodic and smooth
integrands.

Programs 88 and 89 provide an implementation of the DFT and IDFT. The
input parameter f is a string containing the function f to be transformed,
while fc is a vector of size N containing the values f_k.

Program 88 - dft : Discrete Fourier transform

function fc = dft(N,f)
h = 2*pi/N; x = [0:h:2*pi*(1-1/N)]; fx = eval(f); wn = exp(-i*h);
for k=0:N-1
  s = 0;
  for j=0:N-1
    s = s + fx(j+1)*wn^((k-N/2)*j);
  end
  fc(k+1) = s/N;
end

Program 89 - idft : Inverse discrete Fourier transform

function fv = idft(N,fc)
h = 2*pi/N; wn = exp(-i*h);
for k=0:N-1
  s = 0;
  for j=0:N-1
    s = s + fc(j+1)*wn^(-k*(j-N/2));
  end
  fv(k+1) = s;
end
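Programs 88 and 89 can also be mimicked in a few lines of Python; the sketch below (function names are our own) reproduces the same shifted indexing k − N/2 and verifies that the IDFT inverts the DFT on a sample grid.

```python
import cmath
import math

def dft(N, f_vals):
    # Discrete Fourier coefficients with the shifted index (k - N/2),
    # mirroring Program 88: f_k = (1/N) * sum_j f(x_j) * W_N^((k-N/2)*j)
    WN = cmath.exp(-2j * cmath.pi / N)
    return [sum(f_vals[j] * WN ** ((k - N / 2) * j) for j in range(N)) / N
            for k in range(N)]

def idft(N, fc):
    # Inverse transform, mirroring Program 89: f(x_k) = sum_j f_j * W_N^(-k*(j-N/2))
    WN = cmath.exp(-2j * cmath.pi / N)
    return [sum(fc[j] * WN ** (-k * (j - N / 2)) for j in range(N))
            for k in range(N)]

if __name__ == "__main__":
    N = 8
    x = [2 * math.pi * j / N for j in range(N)]
    f_vals = [math.sin(t) + 0.5 * math.cos(2 * t) for t in x]
    f_back = idft(N, dft(N, f_vals))
    # Round trip DFT -> IDFT should reproduce the nodal values.
    assert max(abs(a - b) for a, b in zip(f_vals, f_back)) < 1e-10
```

Both loops cost O(N²) operations, which is exactly the naive matrix-vector cost that the FFT of Section 10.9.2 avoids.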

10.9.1 The Gibbs Phenomenon

Consider the discontinuous function f(x) = x/π for x ∈ [0, π] and equal to
x/π − 2 for x ∈ (π, 2π], and compute its DFT using Program 88. The
interpolate Π_N^F f is shown in Figure 10.3 (above) for N = 8, 16, 32. Notice the
spurious oscillations around the point of discontinuity of f, whose maximum
amplitude, however, tends to a finite limit. The onset of these oscillations
is known as the Gibbs phenomenon and is typical of functions with isolated
jump discontinuities; it affects the behavior of the truncated Fourier series
not only in the neighborhood of the discontinuity but also over the entire
interval, as can be clearly seen in the figure. The convergence rate of the
truncated series for functions with jump discontinuities is linear in N^{−1} at
every given non-singular point of the interval of definition of the function
(see [CHQZ88], Section 2.1.4).


FIGURE 10.3. Above: Fourier interpolate of the sawtooth function (thick solid
line) for N = 8 (dash-dotted line), 16 (dashed line) and 32 (thin solid line).
Below: the same information is plotted in the case of the Lanczos smoothing

Since the Gibbs phenomenon is related to the slow decay of the Fourier
coefficients of a discontinuous function, smoothing procedures can be
profitably employed to attenuate the higher-order Fourier coefficients. This can
be done by multiplying each coefficient f_k by a factor σ_k such that σ_k is a
decreasing function of k. An example is provided by the Lanczos smoothing

$$ \sigma_k = \frac{\sin\big(2(k - N/2)(\pi/N)\big)}{2(k - N/2)(\pi/N)}, \qquad k = 0, \ldots, N-1. \qquad (10.56) $$

The effect of applying the Lanczos smoothing to the computation of the
DFT of the above function f is represented in Figure 10.3 (below), which
shows that the oscillations have almost completely disappeared.
For a deeper analysis of this subject we refer to [CHQZ88], Chapter 2.
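As a minimal illustration, the factors (10.56) can be computed and applied to a vector of DFT coefficients; the Python sketch below (names are our own) checks that σ_k equals 1 at the central index k = N/2 and decays toward the ends of the spectrum.

```python
import math

def lanczos_sigma(N):
    # Lanczos smoothing factors (10.56): sigma_k = sin(theta)/theta,
    # with theta = 2*(k - N/2)*(pi/N); the limit value at k = N/2 is 1.
    sigmas = []
    for k in range(N):
        theta = 2 * (k - N / 2) * (math.pi / N)
        sigmas.append(1.0 if theta == 0 else math.sin(theta) / theta)
    return sigmas

def smooth(coeffs):
    # Multiply each Fourier coefficient by its damping factor.
    return [s * c for s, c in zip(lanczos_sigma(len(coeffs)), coeffs)]

if __name__ == "__main__":
    sig = lanczos_sigma(8)
    assert abs(sig[4] - 1.0) < 1e-15          # k = N/2 -> sigma = 1
    assert sig[0] < sig[2] < sig[3] < sig[4]  # factors grow toward the centre
```

Passing the output of a DFT routine through `smooth` before the inverse transform reproduces the damped interpolant of Figure 10.3 (below).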

10.9.2 The Fast Fourier Transform

As pointed out in the previous section, computing the discrete Fourier
transform (DFT) or its inverse (IDFT) as a matrix-vector product would
require N² operations. In this section we illustrate the basic steps of the
Cooley-Tukey algorithm [CT65], commonly known as the Fast Fourier
Transform (FFT). The computation of a DFT of order N is split into DFTs of
order p₀, …, p_m, where {p_i} are the prime factors of N. If N is a power of
2, the computational cost has the order of N log₂ N flops.

A recursive algorithm to compute the DFT when N is a power of 2
is described in the following. Let f = (f_i)^T, i = 0, …, N − 1, and set

$$ p(x) = \frac{1}{N} \sum_{j=0}^{N-1} f_j x^j. $$

Then, computing the DFT of the vector f amounts to
evaluating p(W_N^{k−N/2}) for k = 0, …, N − 1. Let us introduce the polynomials

$$ p_e(x) = \frac{1}{N}\left( f_0 + f_2 x + \ldots + f_{N-2}\, x^{N/2-1} \right), $$

$$ p_o(x) = \frac{1}{N}\left( f_1 + f_3 x + \ldots + f_{N-1}\, x^{N/2-1} \right). $$

Notice that

$$ p(x) = p_e(x^2) + x\, p_o(x^2), $$

from which it follows that the computation of the DFT of f can be carried
out by evaluating the polynomials p_e and p_o at the points W_N^{2(k−N/2)}, k =
0, …, N − 1. Since

$$ W_N^{2(k-N/2)} = W_N^{2k-N} = \exp\left( -i\, \frac{2\pi k}{N/2} \right) \exp(i 2\pi) = W_{N/2}^{k}, $$

it turns out that we must evaluate p_e and p_o at the principal roots of unity
of order N/2. In this manner the DFT of order N is rewritten in terms
of two DFTs of order N/2; of course, this procedure can be applied
recursively to p_e and p_o. The process terminates when the degree of the
last generated polynomials is equal to one.

In Program 90 we propose a simple implementation of the recursive FFT
algorithm. The input parameters are the vector f, containing the NN values
f_k, where NN is a power of 2.

Program 90 - fftrec : FFT algorithm in the recursive version

function [fftv]=fftrec(f,NN)
N = length(f); w = exp(-2*pi*sqrt(-1)/N);
if N == 2
  fftv = f(1)+w.^[-NN/2:NN-1-NN/2]*f(2);
else
  a1 = f(1:2:N); b1 = f(2:2:N);
  a2 = fftrec(a1,NN); b2 = fftrec(b1,NN);
  for k=-NN/2:NN-1-NN/2
    fftv(k+1+NN/2) = a2(k+1+NN/2) + b2(k+1+NN/2)*w^k;
  end
end
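The same divide-and-conquer idea can also be written in its standard (unshifted) form; the Python sketch below is not a transcription of Program 90 but a plain radix-2 Cooley-Tukey recursion (our own naming), verified against a direct O(N²) summation.

```python
import cmath

def fft(f):
    # Radix-2 Cooley-Tukey: split into even/odd samples, recurse, recombine.
    N = len(f)
    if N == 1:
        return list(f)
    even = fft(f[0::2])   # DFT of (f_0, f_2, ...): evaluates p_e
    odd = fft(f[1::2])    # DFT of (f_1, f_3, ...): evaluates p_o
    out = [0j] * N
    for k in range(N // 2):
        w = cmath.exp(-2j * cmath.pi * k / N)   # twiddle factor W_N^k
        out[k] = even[k] + w * odd[k]           # p(x) = p_e(x^2) + x p_o(x^2)
        out[k + N // 2] = even[k] - w * odd[k]
    return out

def dft_direct(f):
    # Reference O(N^2) evaluation of the same (unshifted) DFT.
    N = len(f)
    return [sum(f[j] * cmath.exp(-2j * cmath.pi * k * j / N) for j in range(N))
            for k in range(N)]

if __name__ == "__main__":
    data = [complex(j % 3, -j) for j in range(16)]
    err = max(abs(a - b) for a, b in zip(fft(data), dft_direct(data)))
    assert err < 1e-9
```

Each level of the recursion does O(N) work and there are log₂ N levels, which is where the O(N log₂ N) flop count comes from.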

Remark 10.4 An FFT procedure can also be set up when N is not a power
of 2. The simplest approach consists of adding some zero samples to the
original sequence {f_i} so as to obtain a total number of Ñ = 2^p
values. This technique, however, does not necessarily yield the correct
result. An effective alternative is therefore based on partitioning the Fourier
matrix C into subblocks of smaller size. Practical FFT implementations
can handle both strategies (see, for instance, the fft package available in
MATLAB).


10.10 Approximation of Function Derivatives

A problem which is often encountered in numerical analysis is the
approximation of the derivative of a function f(x) on a given interval [a, b].
A natural approach to it consists of introducing in [a, b] the n + 1 nodes
{x_k, k = 0, …, n}, with x₀ = a, x_n = b and x_{k+1} = x_k + h, k = 0, …, n − 1,
where h = (b − a)/n. Then, we approximate f′(x_i) using the nodal values
f(x_k) as

$$ h \sum_{k=-m}^{m} \alpha_k u_{i-k} = \sum_{k=-m'}^{m'} \beta_k f(x_{i-k}), \qquad (10.57) $$

where {α_k}, {β_k} ∈ R are the 2(m + m′ + 1) coefficients to be determined and u_k
is the desired approximation to f′(x_k).

A non-negligible issue in the choice of scheme (10.57) is computational
efficiency. In this respect, it is worth noting that, if m ≠ 0,
determining the values {u_i} requires the solution of a linear system.
The set of nodes which are involved in constructing the derivative of f at
a certain node is called a stencil. The band of the matrix associated with
system (10.57) increases as the stencil gets larger.

10.10.1 Classical Finite Di¬erence Methods

The simplest way to generate a formula like (10.57) consists of resorting to
the definition of the derivative. If f′(x_i) exists, then

$$ f'(x_i) = \lim_{h \to 0^+} \frac{f(x_i + h) - f(x_i)}{h}. \qquad (10.58) $$

Replacing the limit with the incremental ratio, with h finite, yields the
approximation

$$ u_i^{FD} = \frac{f(x_{i+1}) - f(x_i)}{h}, \qquad 0 \le i \le n-1. \qquad (10.59) $$

Relation (10.59) is a special instance of (10.57), setting m = 0, α₀ = 1,
m′ = 1, β₋₁ = 1, β₀ = −1, β₁ = 0.

The right side of (10.59) is called the forward finite difference, and the
approximation that is being used corresponds to replacing f′(x_i) with
the slope of the straight line passing through the points (x_i, f(x_i)) and
(x_{i+1}, f(x_{i+1})), as shown in Figure 10.4.
To estimate the error that is made, it suffices to expand f in Taylor
series, obtaining

$$ f(x_{i+1}) = f(x_i) + h f'(x_i) + \frac{h^2}{2} f''(\xi_i), \qquad \text{with } \xi_i \in (x_i, x_{i+1}). $$


We assume henceforth that f has the required regularity, so that

$$ f'(x_i) - u_i^{FD} = -\frac{h}{2} f''(\xi_i). \qquad (10.60) $$

FIGURE 10.4. Finite difference approximation of f′(x_i): backward (solid line),
forward (pointed line) and centred (dashed line)

Obviously, instead of (10.58) we could employ a centred incremental
ratio, obtaining the following approximation

$$ u_i^{CD} = \frac{f(x_{i+1}) - f(x_{i-1})}{2h}, \qquad 1 \le i \le n-1. \qquad (10.61) $$

Scheme (10.61) is a special instance of (10.57), setting m = 0, α₀ = 1,
m′ = 1, β₋₁ = 1/2, β₀ = 0, β₁ = −1/2.

The right side of (10.61) is called the centred finite difference and
geometrically amounts to replacing f′(x_i) with the slope of the straight line passing
through the points (x_{i−1}, f(x_{i−1})) and (x_{i+1}, f(x_{i+1})) (see Figure 10.4).
Resorting again to Taylor series, we get

$$ f'(x_i) - u_i^{CD} = -\frac{h^2}{6} f'''(\xi_i). \qquad (10.62) $$

Formula (10.61) thus provides a second-order approximation to f′(x_i) with
respect to h.

Finally, with a similar procedure, we can derive a backward finite
difference scheme, where

$$ u_i^{BD} = \frac{f(x_i) - f(x_{i-1})}{h}, \qquad 1 \le i \le n, \qquad (10.63) $$

which is affected by the following error

$$ f'(x_i) - u_i^{BD} = \frac{h}{2} f''(\xi_i). \qquad (10.64) $$

The values of the parameters in (10.57) are m = 0, α₀ = 1, m′ = 1 and
β₋₁ = 0, β₀ = 1, β₁ = −1.
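The error laws (10.60), (10.62) and (10.64) are easy to verify numerically; in the following Python sketch (the test function and tolerances are our own choices), halving h roughly halves the forward and backward errors and quarters the centred one.

```python
import math

def fd_errors(f, df, x, h):
    # Errors of the forward (10.59), backward (10.63) and centred (10.61)
    # differences at a point x, given the exact derivative df.
    fwd = (f(x + h) - f(x)) / h
    bwd = (f(x) - f(x - h)) / h
    ctr = (f(x + h) - f(x - h)) / (2 * h)
    d = df(x)
    return abs(fwd - d), abs(bwd - d), abs(ctr - d)

if __name__ == "__main__":
    f, df, x = math.exp, math.exp, 1.0
    e1 = fd_errors(f, df, x, 1e-2)
    e2 = fd_errors(f, df, x, 5e-3)
    # First-order schemes: error ratio ~ 2; centred scheme: ratio ~ 4.
    assert 1.8 < e1[0] / e2[0] < 2.2
    assert 1.8 < e1[1] / e2[1] < 2.2
    assert 3.5 < e1[2] / e2[2] < 4.5
```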

Higher-order schemes, as well as finite difference approximations of
higher-order derivatives of f, can be constructed using Taylor expansions of
higher order. A remarkable example is the approximation of f″; if f ∈
C⁴([a, b]) we easily get

$$ f''(x_i) = \frac{f(x_{i+1}) - 2f(x_i) + f(x_{i-1})}{h^2} - \frac{h^2}{24} \left[ f^{(4)}(x_i + \theta_i h) + f^{(4)}(x_i - \omega_i h) \right], \qquad 0 < \theta_i, \omega_i < 1. $$

The following centred finite difference scheme can thus be derived

$$ u_i = \frac{f(x_{i+1}) - 2f(x_i) + f(x_{i-1})}{h^2}, \qquad 1 \le i \le n-1, \qquad (10.65) $$

which is affected by the error

$$ f''(x_i) - u_i = -\frac{h^2}{24} \left[ f^{(4)}(x_i + \theta_i h) + f^{(4)}(x_i - \omega_i h) \right]. \qquad (10.66) $$

Formula (10.65) provides a second-order approximation to f″(x_i) with
respect to h.
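Scheme (10.65) can be checked in the same spirit; the sketch below (our own choice f(x) = sin x, for which f″(x) = −sin x) confirms the O(h²) behaviour predicted by (10.66).

```python
import math

def second_diff(f, x, h):
    # Centred second difference (10.65).
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

if __name__ == "__main__":
    x = 0.7
    exact = -math.sin(x)           # f'' for f = sin
    e1 = abs(second_diff(math.sin, x, 1e-2) - exact)
    e2 = abs(second_diff(math.sin, x, 5e-3) - exact)
    assert 3.5 < e1 / e2 < 4.5     # halving h quarters the error
```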

10.10.2 Compact Finite Di¬erences

More accurate approximations are provided by using the following formula
(which we call compact differences)

$$ \alpha u_{i-1} + u_i + \alpha u_{i+1} = \frac{\beta}{2h}\,(f_{i+1} - f_{i-1}) + \frac{\gamma}{4h}\,(f_{i+2} - f_{i-2}) \qquad (10.67) $$

for i = 2, …, n − 1. We have set, for brevity, f_i = f(x_i).

The coefficients α, β and γ are to be determined in such a way that the
relations (10.67) yield values u_i that approximate f′(x_i) up to the highest
order with respect to h. For this purpose, the coefficients are selected in
such a way as to minimize the consistency error (see Section 2.2)

$$ \sigma_i(h) = \alpha f_{i-1}^{(1)} + f_i^{(1)} + \alpha f_{i+1}^{(1)} - \left[ \frac{\beta}{2h}\,(f_{i+1} - f_{i-1}) + \frac{\gamma}{4h}\,(f_{i+2} - f_{i-2}) \right] \qquad (10.68) $$

which comes from "forcing" f to satisfy the numerical scheme (10.67). For
brevity, we set f_i^{(k)} = f^{(k)}(x_i), k = 1, 2, ….

Precisely, assuming that f ∈ C⁵([a, b]) and expanding it in a Taylor
series around x_i, we find

$$ f_{i\pm1} = f_i \pm h f_i^{(1)} + \frac{h^2}{2} f_i^{(2)} \pm \frac{h^3}{6} f_i^{(3)} + \frac{h^4}{24} f_i^{(4)} \pm \frac{h^5}{120} f_i^{(5)} + O(h^6), $$

$$ f_{i\pm1}^{(1)} = f_i^{(1)} \pm h f_i^{(2)} + \frac{h^2}{2} f_i^{(3)} \pm \frac{h^3}{6} f_i^{(4)} + \frac{h^4}{24} f_i^{(5)} + O(h^5). $$

Substituting into (10.68) we get

$$ \sigma_i(h) = (2\alpha + 1 - \beta - \gamma)\, f_i^{(1)} + \left( \alpha - \frac{\beta}{6} - \frac{2\gamma}{3} \right) h^2 f_i^{(3)} + \left( \frac{\alpha}{12} - \frac{\beta}{120} - \frac{2\gamma}{15} \right) h^4 f_i^{(5)} + O(h^6). $$

Second-order methods are obtained by equating to zero the coefficient of
f_i^{(1)}, i.e., if 2α + 1 = β + γ; we obtain schemes of order 4 by equating
to zero also the coefficient of f_i^{(3)}, yielding 6α = β + 4γ; and finally, methods
of order 6 are obtained by setting to zero also the coefficient of f_i^{(5)}, i.e.,
10α = β + 16γ.

The linear system formed by these last three equations has a nonsingular
matrix. Thus, there exists a unique scheme of order 6, corresponding to
the following choice of the parameters

$$ \alpha = 1/3, \qquad \beta = 14/9, \qquad \gamma = 1/9, \qquad (10.69) $$

while there exist infinitely many methods of second and fourth order.
Among these infinitely many methods, a popular scheme has coefficients α = 1/4,
β = 3/2 and γ = 0. Schemes of higher order can be generated at the
expense of further enlarging the computational stencil.

Traditional finite difference schemes correspond to setting α = 0; they
allow the approximation of the first derivative of f at a node to be
computed explicitly, in contrast with compact schemes, which in any case
require the solution of a linear system of the form Au = Bf (where the
notation has the obvious meaning).

To make the system solvable, it is necessary to provide values for the
variables u_i with i < 0 and i > n. A particularly favorable instance is that
where f is a periodic function of period b − a, in which case u_{i+n} = u_i
for any i ∈ Z. In the nonperiodic case, system (10.67) must be supplemented
by suitable relations at the nodes near the boundary of the approximation
interval. For example, the first derivative at x₀ can be computed using the
relation

$$ u_0 + \alpha u_1 = \frac{1}{h}\,(A f_0 + B f_1 + C f_2 + D f_3), $$

and requiring that

$$ A = -\frac{3 + \alpha + 2D}{2}, \qquad B = 2 + 3D, \qquad C = -\frac{1 - \alpha + 6D}{2}, $$

in order for the scheme to be at least second-order accurate (see [Lel92]
for the relations to enforce in the case of higher-order methods). Finally,
we notice that, for any given order of accuracy, compact schemes have a
stencil smaller than that of standard finite differences.
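As an illustration of the periodic case (where u_{i+n} = u_i closes the system), the fourth-order scheme α = 1/4, β = 3/2, γ = 0 can be sketched in Python; the dense solver and all names below are our own, and the scheme is applied to f(x) = sin x on a uniform periodic grid.

```python
import math

def solve(A, b):
    # Gaussian elimination with partial pivoting (dense, for small systems).
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            m = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= m * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def compact_derivative(fvals, h, alpha=0.25, beta=1.5):
    # Fourth-order compact scheme (10.67) with gamma = 0 on a periodic grid:
    # alpha*u_{i-1} + u_i + alpha*u_{i+1} = beta/(2h) * (f_{i+1} - f_{i-1}),
    # indices taken modulo n (periodicity closes the system, A u = B f).
    n = len(fvals)
    A = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    for i in range(n):
        A[i][i] = 1.0
        A[i][(i - 1) % n] += alpha
        A[i][(i + 1) % n] += alpha
        b[i] = beta / (2 * h) * (fvals[(i + 1) % n] - fvals[(i - 1) % n])
    return solve(A, b)

if __name__ == "__main__":
    n = 32
    h = 2 * math.pi / n
    x = [i * h for i in range(n)]
    u = compact_derivative([math.sin(t) for t in x], h)
    err = max(abs(u[i] - math.cos(x[i])) for i in range(n))
    assert err < 1e-4   # far below the centred-difference error at this h
```

In practice the cyclic tridiagonal system would be solved by a specialized O(n) algorithm rather than dense elimination; the dense solve here only keeps the sketch self-contained.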

Program 91 provides an implementation of the compact finite difference
schemes (10.67) for the approximation of the derivative of a given function f,
which is assumed to be periodic on the interval [a, b). The input parameters
alpha, beta and gamma contain the coefficients of the scheme, a and b are
the endpoints of the interval, f is a string containing the expression of f,
and n denotes the number of subintervals in which [a, b] is partitioned. The
output vectors u and x contain the computed approximate values u_i and
the node coordinates. Notice that setting alpha=gamma=0 and beta=1 we
recover the centred finite difference approximation (10.61).

Program 91 - compdiff : Compact difference schemes

function [u, x] = compdiff(alpha,beta,gamma,a,b,n,f)
h=(b-a)/(n+1); x=[a:h:b]; fx = eval(f);
A = eye(n+2)+alpha*diag(ones(n+1,1),1)+alpha*diag(ones(n+1,1),-1);
rhs = 0.5*beta/h*(fx(4:n+1)-fx(2:n-1))+0.25*gamma/h*(fx(5:n+2)-fx(1:n-2));
if gamma == 0
  rhs=[0.5*beta/h*(fx(3)-fx(1)), rhs, 0.5*beta/h*(fx(n+2)-fx(n))];
  A(1,1:n+2) = zeros(1,n+2);
  A(1,1) = 1; A(1,2)=alpha; A(1,n+1)=alpha;
  rhs=[0.5*beta/h*(fx(2)-fx(n+1)), rhs];
  A(n+2,1:n+2) = zeros(1,n+2);
  A(n+2,n+2) = 1; A(n+2,n+1)=alpha; A(n+2,2)=alpha;
  rhs=[rhs, 0.5*beta/h*(fx(2)-fx(n+1))];
else
  rhs=[0.5*beta/h*(fx(3)-fx(1))+0.25*gamma/h*(fx(4)-fx(n+1)), rhs];
  A(1,1:n+2) = zeros(1,n+2);
  A(1,1) = 1; A(1,2)=alpha; A(1,n+1)=alpha;
  rhs=[0.5*beta/h*(fx(2)-fx(n+1))+0.25*gamma/h*(fx(3)-fx(n)), rhs];
  rhs=[rhs,0.5*beta/h*(fx(n+2)-fx(n))+0.25*gamma/h*(fx(2)-fx(n-1))];
  A(n+2,1:n+2) = zeros(1,n+2);
  A(n+2,n+2) = 1; A(n+2,n+1)=alpha; A(n+2,2)=alpha;
  rhs=[rhs,0.5*beta/h*(fx(2)-fx(n+1))+0.25*gamma/h*(fx(3)-fx(n))];
end
u = A \ rhs';
return

Example 10.4 Let us consider the approximate evaluation of the derivative of
the function f(x) = sin(x) on the interval [0, 2π]. Figure 10.5 shows the
logarithm of the maximum nodal errors for the second-order centred finite
difference scheme (10.61) and for the fourth and sixth-order compact difference schemes
introduced above, as a function of p = log(n). •

FIGURE 10.5. Maximum nodal errors for the second-order centred finite
difference scheme (solid line) and for the fourth (dashed line) and sixth-order (dotted
line) compact difference schemes as functions of p = log(n)

Another nice feature of compact schemes is that they maximize the range
of well-resolved waves, as we are going to explain. Assume that f is a real
and periodic function on [0, 2π], that is, f(0) = f(2π). Using the same
notation as in Section 10.9, we let N be an even positive integer and set
h = 2π/N. Then replace f by its truncated Fourier series

$$ f_N^{*}(x) = \sum_{k=-N/2}^{N/2-1} f_k\, e^{ikx}. $$

Since the function f is real-valued, $f_k = \bar{f}_{-k}$ for k = 1, …, N/2 and
$f_0 = \bar{f}_0$. For the sake of convenience, introduce the normalized wave number
w_k = kh = 2πk/N and perform a scaling of the coordinates, setting s = x/h.

As a consequence, we get

$$ f_N^{*}(x(s)) = \sum_{k=-N/2}^{N/2-1} f_k\, e^{iksh} = \sum_{k=-N/2}^{N/2-1} f_k\, e^{i w_k s}. \qquad (10.70) $$

Taking the first derivative of (10.70) with respect to s yields a function
whose Fourier coefficients are f_k′ = i w_k f_k. We can thus estimate the
approximation error on (f_N^{*})′ by comparing the exact coefficients f_k′ with the
corresponding ones obtained by an approximate derivative; in particular,
by comparing the exact wave number w_k with the approximate one, say
w_{k,app}.


Let us neglect the subscript k and perform the comparison over the whole
interval [0, π) where w varies. It is clear that methods based on the
Fourier expansion have w_app = w if w ≠ π (w_app = 0 if w = π). The family
of schemes (10.67) is instead characterized by the wave number

$$ w_{app}(z) = \frac{a \sin(z) + (b/2)\sin(2z) + (c/3)\sin(3z)}{1 + 2\alpha\cos(z) + 2\beta\cos(2z)}, \qquad z \in [0, \pi) $$

(see [Lel92]). Figure 10.6 displays a comparison among the wave numbers of
several schemes, of compact and non-compact type.

The range of values for which the wave number computed by the
numerical scheme adequately approximates the exact wave number is the set of
well-resolved waves. As a consequence, if w_min is the smallest unresolved
wave, the difference 1 − w_min/π represents the fraction of waves that are
unresolved by the numerical scheme. As can be seen in Figure 10.6, the
standard finite difference schemes approximate correctly the exact wave
number only for small wave numbers.

FIGURE 10.6. Computed wave numbers for centred finite differences (10.61) (a)
and for compact schemes of fourth (b), sixth (c) and tenth (d) order, compared
with the exact wave number (the straight line). On the x axis the normalized
coordinate s is represented

10.10.3 Pseudo-Spectral Derivative

An alternative way for numerical differentiation consists of approximating
the first derivative of a function f with the exact first derivative of the
polynomial Π_n f interpolating f at the nodes {x₀, …, x_n}.
Exactly as happens for Lagrange interpolation, using equally spaced
nodes does not yield stable approximations to the first derivative of f for
large n. For this reason, we limit ourselves to considering the case where
the nodes are nonuniformly distributed according to the Gauss-Lobatto-Chebyshev
formula.


For simplicity, assume that I = [a, b] = [−1, 1] and, for n ≥ 1, take in
I the Gauss-Lobatto-Chebyshev nodes as in (10.21). Then, consider the
Lagrange interpolating polynomial Π^{GL}_{n,w} f introduced in Section 10.3. We
define the pseudo-spectral derivative of f ∈ C⁰(I) to be the derivative of
the polynomial Π^{GL}_{n,w} f:

$$ D_n f = (\Pi^{GL}_{n,w} f)' \in \mathbb{P}_{n-1}(I). $$

The error made in replacing f′ with D_n f is of exponential type; that is, it
only depends on the smoothness of the function f. More precisely, there
exists a constant C > 0 independent of n such that

$$ \| f' - D_n f \|_w \le C n^{1-m} \| f \|_{m,w}, \qquad (10.71) $$

for any m ≥ 2 such that the norm ‖f‖_{m,w}, introduced in (10.23), is finite.

Recalling (10.19) and using (10.27) yields

$$ (D_n f)(\bar{x}_i) = \sum_{j=0}^{n} f(\bar{x}_j)\, \bar{l}_j'(\bar{x}_i), \qquad i = 0, \ldots, n, \qquad (10.72) $$

so that the pseudo-spectral derivative at the interpolation nodes can be
computed knowing only the nodal values of f and of $\bar{l}_j'$. These values can
be computed once for all and stored in a matrix D ∈ R^{(n+1)×(n+1)}, D_{ij} =
$\bar{l}_j'(\bar{x}_i)$ for i, j = 0, …, n, called the pseudo-spectral differentiation matrix.
Relation (10.72) can thus be cast in matrix form as f′ = Df, letting
f = [f(x̄_i)] and f′ = [(D_n f)(x̄_i)] for i = 0, …, n.

The entries of D have the following explicit form (see [CHQZ88], p. 69)

$$ D_{lj} = \begin{cases} \dfrac{d_l}{d_j}\, \dfrac{(-1)^{l+j}}{\bar{x}_l - \bar{x}_j}, & l \ne j, \\[2mm] \dfrac{-\bar{x}_j}{2(1 - \bar{x}_j^2)}, & 1 \le l = j \le n-1, \\[2mm] -\dfrac{2n^2 + 1}{6}, & l = j = 0, \\[2mm] \dfrac{2n^2 + 1}{6}, & l = j = n, \end{cases} \qquad (10.73) $$

where the coefficients d_l have been defined in Section 10.3 (see also Example
5.13 concerning the approximation of the multiple eigenvalue λ = 0 of D).
To compute the pseudo-spectral derivative of a function f over a generic
interval [a, b], we only have to resort to the change of variables considered
in Remark 10.3.

The second-order pseudo-spectral derivative can be computed as the
product of the matrix D and the vector f′, that is, f″ = Df′, or by
directly applying the matrix D² to the vector f.
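A sketch of the matrix (10.73) in Python follows; it assumes the ascending node ordering x̄_j = −cos(πj/n) and the weights d₀ = d_n = 2, d_l = 1 otherwise (consistent with the signs above, though conventions vary between references). Applied to a polynomial of degree ≤ n, the matrix differentiates exactly up to rounding.

```python
import math

def cheb_diff_matrix(n):
    # Pseudo-spectral differentiation matrix (10.73) on the (ascending)
    # Gauss-Lobatto-Chebyshev nodes x_j = -cos(pi*j/n), with d_0 = d_n = 2
    # and d_l = 1 otherwise.
    x = [-math.cos(math.pi * j / n) for j in range(n + 1)]
    d = [2.0 if j in (0, n) else 1.0 for j in range(n + 1)]
    D = [[0.0] * (n + 1) for _ in range(n + 1)]
    for l in range(n + 1):
        for j in range(n + 1):
            if l != j:
                D[l][j] = (d[l] / d[j]) * (-1) ** (l + j) / (x[l] - x[j])
    for l in range(1, n):
        D[l][l] = -x[l] / (2 * (1 - x[l] ** 2))
    D[0][0] = -(2 * n ** 2 + 1) / 6
    D[n][n] = (2 * n ** 2 + 1) / 6
    return x, D

if __name__ == "__main__":
    n = 8
    x, D = cheb_diff_matrix(n)
    f = [t ** 3 for t in x]                       # a polynomial of degree <= n
    df = [sum(D[i][j] * f[j] for j in range(n + 1)) for i in range(n + 1)]
    err = max(abs(df[i] - 3 * x[i] ** 2) for i in range(n + 1))
    assert err < 1e-10                            # differentiation exact on P_n
```

Applying the matrix twice (i.e., D²f) gives the second-order pseudo-spectral derivative mentioned above.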


10.11 Transforms and Their Applications

In this section we provide a short introduction to the most relevant integral

transforms and discuss their basic analytical and numerical properties.

10.11.1 The Fourier Transform

Definition 10.1 Let L¹(R) denote the space of real or complex functions
defined on the real line such that

$$ \int_{-\infty}^{\infty} |f(t)|\, dt < +\infty. $$

For any f ∈ L¹(R) its Fourier transform is a complex-valued function
F = F[f] defined as

$$ F(\nu) = \int_{-\infty}^{\infty} f(t)\, e^{-i 2\pi \nu t}\, dt. $$

Should the independent variable t denote time, then ν has the
meaning of frequency. Thus, the Fourier transform is a mapping that associates
with a function of time (typically, real-valued) a complex-valued function
of frequency. The following result provides the conditions under which an
inversion formula exists that allows us to recover the function f from its
Fourier transform F (for the proof see [Rud83], p. 199).

Property 10.3 (inversion theorem) Let f be a given function in L¹(R),
F ∈ L¹(R) be its Fourier transform and g be the function defined by

$$ g(t) = \int_{-\infty}^{\infty} F(\nu)\, e^{i 2\pi \nu t}\, d\nu, \qquad t \in \mathbb{R}. \qquad (10.74) $$

Then g ∈ C⁰(R), with lim_{|x|→∞} g(x) = 0, and f(t) = g(t) almost
everywhere in R (i.e., for any t except possibly on a set of zero measure).

The integral on the right-hand side of (10.74) is to be understood in the Cauchy
principal value sense, i.e., we let

$$ \int_{-\infty}^{\infty} F(\nu)\, e^{i 2\pi \nu t}\, d\nu = \lim_{a \to \infty} \int_{-a}^{a} F(\nu)\, e^{i 2\pi \nu t}\, d\nu $$

and we call it the inverse Fourier transform, or inversion formula, of the
Fourier transform. The mapping that associates with the complex function
F the generating function f will be denoted by F⁻¹[F], i.e., F = F[f] iff
f = F⁻¹[F].
Let us briefly summarize the main properties of the Fourier transform and
its inverse.

1. F and F⁻¹ are linear operators, i.e.,

$$ \mathcal{F}[\alpha f + \beta g] = \alpha \mathcal{F}[f] + \beta \mathcal{F}[g], \qquad \forall \alpha, \beta \in \mathbb{C}, $$
$$ \mathcal{F}^{-1}[\alpha F + \beta G] = \alpha \mathcal{F}^{-1}[F] + \beta \mathcal{F}^{-1}[G], \qquad \forall \alpha, \beta \in \mathbb{C}; \qquad (10.75) $$

2. scaling: if α is any nonzero real number and f_α is the function f_α(t) =
f(αt), then

$$ \mathcal{F}[f_\alpha] = \frac{1}{|\alpha|}\, F_{1/\alpha}, $$

where F_{1/α}(ν) = F(ν/α);

3. duality: let f(t) be a given function and F(ν) be its Fourier
transform. Then the function g(t) = F(−t) has a Fourier transform given
by f(ν). Thus, once an associated function-transform pair is found,
another dual pair is automatically generated. An application of this
property is provided by the pair r(t)-F[r] in Example 10.5;

4. parity: if f(t) is a real even function, then F(ν) is real and even, while
if f(t) is a real and odd function, then F(ν) is imaginary and odd.
This property allows one to work only with nonnegative frequencies;

5. convolution and product: for any given functions f, g ∈ L¹(R), we
have

$$ \mathcal{F}[f * g] = \mathcal{F}[f]\, \mathcal{F}[g], \qquad \mathcal{F}[fg] = F * G, \qquad (10.76) $$

where the convolution integral of two functions φ and ψ is given by

$$ (\varphi * \psi)(t) = \int_{-\infty}^{\infty} \varphi(\tau)\, \psi(t - \tau)\, d\tau. \qquad (10.77) $$
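The first relation in (10.76) has a discrete counterpart that is easy to verify numerically: the DFT of a circular convolution equals the product of the DFTs. The following Python sketch (a discrete analogue, with our own helper names, not the continuous transform itself) checks this identity on two short sequences.

```python
import cmath

def dft(f):
    # Plain (unshifted) discrete Fourier transform.
    N = len(f)
    return [sum(f[j] * cmath.exp(-2j * cmath.pi * k * j / N) for j in range(N))
            for k in range(N)]

def circular_convolution(f, g):
    # (f * g)_t = sum_tau f_tau * g_{t - tau}, indices taken modulo N.
    N = len(f)
    return [sum(f[tau] * g[(t - tau) % N] for tau in range(N)) for t in range(N)]

if __name__ == "__main__":
    f = [1.0, 2.0, 0.0, -1.0]
    g = [0.5, 0.0, 1.0, 0.0]
    lhs = dft(circular_convolution(f, g))
    rhs = [a * b for a, b in zip(dft(f), dft(g))]
    assert max(abs(a - b) for a, b in zip(lhs, rhs)) < 1e-12
```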

Example 10.5 We provide two examples of the computation of the Fourier
transforms of functions that are typically encountered in signal processing.
Let us first consider the square wave (or rectangular) function r(t) defined as

$$ r(t) = \begin{cases} A, & \text{if } -\frac{T}{2} \le t \le \frac{T}{2}, \\ 0, & \text{otherwise}, \end{cases} $$

where T and A are two given positive numbers. Its Fourier transform F[r] is the
function

$$ F(\nu) = \int_{-T/2}^{T/2} A\, e^{-i 2\pi \nu t}\, dt = AT\, \frac{\sin(\pi \nu T)}{\pi \nu T}, \qquad \nu \in \mathbb{R}, $$

where AT is the area of the rectangular function.

Let us now consider the sawtooth function

$$ s(t) = \begin{cases} \dfrac{2At}{T}, & \text{if } -\frac{T}{2} \le t \le \frac{T}{2}, \\ 0, & \text{otherwise}, \end{cases} $$

whose DFT is shown in Figure 10.3 and whose Fourier transform F[s] is the
function

$$ F(\nu) = i\, \frac{AT}{\pi \nu T} \left[ \cos(\pi \nu T) - \frac{\sin(\pi \nu T)}{\pi \nu T} \right], \qquad \nu \in \mathbb{R}, $$

which is purely imaginary since s is an odd real function. Notice also that the
functions r and s have a finite support whereas their transforms have an infinite
support (see Figure 10.7). In signal theory this corresponds to saying that the
transform has an infinite bandwidth. •

FIGURE 10.7. Fourier transforms of the rectangular (left) and the sawtooth
(right) functions

Example 10.6 The Fourier transform of a sinusoidal function is of paramount
interest in signal and communication systems. To start with, consider the constant
function f(t) = A for a given A ∈ R. Since it has an infinite time duration, its
Fourier transform F[A] is the function

$$ F(\nu) = \lim_{a \to \infty} \int_{-a}^{a} A\, e^{-i 2\pi \nu t}\, dt = A \lim_{a \to \infty} \frac{\sin(2\pi \nu a)}{\pi \nu}, $$

where the integral above is again the Cauchy principal value of the corresponding
integral over (−∞, ∞). It can be proved that the limit exists and is unique in the
sense of distributions (see Section 12.4), yielding

$$ F(\nu) = A\, \delta(\nu), \qquad (10.78) $$


where δ is the so-called Dirac mass, i.e., a distribution that satisfies

$$ \int_{-\infty}^{\infty} \delta(\xi)\, \varphi(\xi)\, d\xi = \varphi(0) \qquad (10.79) $$

for any function φ continuous at the origin. From (10.78) we see that the
transform of a function with infinite time duration has a finite bandwidth.

Let us now consider the computation of the Fourier transform of the function
f(t) = A cos(2πν₀t), where ν₀ is a fixed frequency. Recalling Euler's formula

$$ \cos(\theta) = \frac{e^{i\theta} + e^{-i\theta}}{2}, \qquad \theta \in \mathbb{R}, $$

and applying (10.78) twice, we get

$$ \mathcal{F}[A \cos(2\pi \nu_0 t)] = \frac{A}{2}\, \delta(\nu - \nu_0) + \frac{A}{2}\, \delta(\nu + \nu_0), $$

which shows that the spectrum of a sinusoidal function with frequency ν₀ is
centred around ±ν₀ (notice that the transform is even and real since the same
holds for the function f(t)). •

It is worth noting that in real life there do not exist functions (i.e., signals)
with infinite duration or bandwidth. Actually, if f(t) is a function whose
value may be considered "negligible" outside some interval (t_a, t_b),
then we can assume that the effective duration of f is the length Δt =
t_b − t_a. In a similar manner, if F(ν) is the Fourier transform of f and
F(ν) may be considered "negligible" outside some interval (ν_a, ν_b), then
the effective bandwidth of f is Δν = ν_b − ν_a. Referring to Figure 10.7,
we clearly see that the effective bandwidth of the rectangular function can
be taken as (−10, 10).

10.11.2 (Physical) Linear Systems and Fourier Transform

Mathematically speaking, a physical linear system (LS) can be regarded
as a linear operator S that enjoys the linearity property (10.75). Denoting
by i(t) an admissible input function for S and by u(t) its corresponding
output function, the LS can be represented as u(t) = S(i(t)),
or S : i → u. A special category of LS are the so-called shift-invariant (or
time-invariant) linear systems (ILS), which satisfy the property

$$ S(i(t - t_0)) = u(t - t_0), \qquad \forall t_0 \in \mathbb{R}, $$

for any admissible input function i.
Let S be an ILS system and let f and g be two admissible input functions
for S, with w = S(g). An immediate consequence of linearity and
shift-invariance is that

$$ S((f * g)(t)) = (f * S(g))(t) = (f * w)(t), \qquad (10.80) $$

454 10. Orthogonal Polynomials in Approximation Theory

where ∗ is the convolution operator defined in (10.77).
Assume we take as input function the impulse function δ(t) introduced in
the previous section and denote by h(t) = S(δ(t)) the corresponding output
through S (usually referred to as the system impulse response function).
Property (10.79) implies that for any function φ, (φ ∗ δ)(t) = φ(t), so that,
recalling (10.80) and taking φ(t) = i(t), we have

$$ u(t) = S(i(t)) = S((i * \delta)(t)) = (i * S(\delta))(t) = (i * h)(t). $$

Thus, S can be completely described through its impulse response function.
Equivalently, we can pass to the frequency domain by means of the first
relation in (10.76), obtaining

$$ U(\nu) = I(\nu)\, H(\nu), \qquad (10.81) $$

where I, U and H are the Fourier transforms of i(t), u(t) and h(t),
respectively; H is the so-called system transfer function.
Relation (10.81) plays a central role in the analysis of linear time-invariant
systems, as it is simpler to deal with the system transfer function than with
the corresponding impulse response function, as demonstrated in the
following example.

Example 10.7 (ideal low-pass filter) An ideal low-pass filter is an ILS
characterized by the transfer function

$$ H(\nu) = \begin{cases} 1, & \text{if } |\nu| \le \nu_0/2, \\ 0, & \text{otherwise}. \end{cases} $$

Using the duality property, the impulse response function F⁻¹[H] is

$$ h(t) = \nu_0\, \frac{\sin(\pi \nu_0 t)}{\pi \nu_0 t}. $$

Given an input signal i(t) with Fourier transform I(ν), the corresponding output
u(t) has a spectrum given by (10.81)

$$ I(\nu) H(\nu) = \begin{cases} I(\nu), & \text{if } |\nu| \le \nu_0/2, \\ 0, & \text{otherwise}. \end{cases} $$

The effect of the filter is to cut off the input frequencies that lie outside the
window |ν| ≤ ν₀/2. •

The input/output functions i(t) and u(t) usually denote signals, and
the linear system described by H(ν) is typically a communication system.
Therefore, as pointed out at the end of Section 10.11.1, we are justified
in assuming that both i(t) and u(t) have a finite effective duration. In
particular, referring to i(t), we suppose i(t) = 0 if t ∉ [0, T₀). Then, the
computation of the Fourier transform of i(t) yields

$$ I(\nu) = \int_0^{T_0} i(t)\, e^{-i 2\pi \nu t}\, dt. $$

Letting Δt = T₀/n for n ≥ 1 and approximating the integral above by the
composite trapezoidal formula (9.14), we get

$$ \tilde{I}(\nu) = \Delta t \sum_{k=0}^{n-1} i(k \Delta t)\, e^{-i 2\pi \nu k \Delta t}. $$

It can be proved (see, e.g., [Pap62]) that Ĩ(ν)/Δt is the Fourier transform
of the so-called sampled signal

$$ i_s(t) = \sum_{k=-\infty}^{\infty} i(k \Delta t)\, \delta(t - k \Delta t), $$

where δ(t − kΔt) is the Dirac mass at kΔt. Then, using the convolution
and the duality properties of the Fourier transform, we get

$$ \tilde{I}(\nu) = \sum_{j=-\infty}^{\infty} I\!\left( \nu - \frac{j}{\Delta t} \right), \qquad (10.82) $$

which amounts to replacing I(ν) by its periodic repetition with period

1/∆t. Let J∆t = [’ 2∆t , 2∆t ]; then, it su¬ces to compute (10.82) for ν ∈

1 1

J∆t . This can be done numerically by introducing a uniform discretization

of J∆t with frequency step ν0 = 1/(m∆t) for m ≥ 1. By doing so, the

˜

computation of I(ν) requires evaluating the following m+1 discrete Fourier

transforms (DFT)

n’1

i(k∆t)e’i2πjν0 k∆t , j = ’ m , . . . , m .

˜

I(jν0 ) = ∆t 2 2

k=0

For an e¬cient computation of each DFT in the formula above it is crucial

to use the FFT algorithm described in Section 10.9.2.
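As a concrete sketch (the test signal below is an arbitrary illustrative choice), note that for m = n, i.e. \nu_0 = 1/(n \Delta t), the n values \tilde{I}(j \nu_0) are exactly \Delta t times the DFT of the samples, so a single FFT call evaluates all of them at once:

```python
import numpy as np

T0, n = 1.0, 64
dt = T0 / n
tk = dt * np.arange(n)
samples = np.exp(-tk) * np.sin(8 * np.pi * tk)  # an arbitrary signal on [0, T0)

# ~I(j*nu0) for j = 0, ..., n-1 in one shot: dt times the DFT of the samples
I_tilde = dt * np.fft.fft(samples)

# direct evaluation of the sum at one frequency, for comparison
j = 5
nu = j / (n * dt)
I_direct = dt * np.sum(samples * np.exp(-1j * 2 * np.pi * nu * tk))
```

The FFT evaluates all n sums in O(n log n) operations instead of O(n^2) for the naive summation.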

10.11.3 The Laplace Transform

The Laplace transform can be employed to solve ordinary differential equations with constant coefficients as well as partial differential equations.

Definition 10.2 Let f \in L^1_{loc}([0, \infty)), i.e., f \in L^1([0, T]) for any T > 0. Let s = \sigma + i\omega be a complex variable. The Laplace integral of f is defined as

    \int_0^{\infty} f(t) e^{-st} \, dt = \lim_{T \to \infty} \int_0^T f(t) e^{-st} \, dt.

If this integral exists for some s, it turns out to be a function of s; then, the Laplace transform L[f] of f is the function

    L(s) = \int_0^{\infty} f(t) e^{-st} \, dt.

The following relation between the Laplace and Fourier transforms holds

    L(s) = F(e^{-\sigma t} \tilde{f}(t)),

where \tilde{f}(t) = f(t) if t \ge 0, while \tilde{f}(t) = 0 if t < 0.

Example 10.8 The Laplace transform of the unit step function f(t) = 1 if t > 0, f(t) = 0 otherwise, is given by

    L(s) = \int_0^{\infty} e^{-st} \, dt = \frac{1}{s}.

We notice that the Laplace integral exists if \sigma > 0. •
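Example 10.8 can be checked numerically (a sketch, with illustrative truncation and grid parameters): for Re(s) > 0 the truncated Laplace integral of the unit step, evaluated with a composite trapezoidal rule, converges to 1/s as T grows.

```python
import numpy as np

def laplace_unit_step(s, T=60.0, n=200001):
    """Composite trapezoidal approximation of int_0^T 1 * e^{-st} dt."""
    t, h = np.linspace(0.0, T, n, retstep=True)
    f = np.exp(-s * t)
    return h * (f[0] / 2 + f[1:-1].sum() + f[-1] / 2)

s = 2.0 + 1.0j                   # sigma = Re(s) = 2 > 0: integral converges
approx = laplace_unit_step(s)
exact = 1.0 / s                  # the transform computed in Example 10.8
```

The truncation error decays like |e^{-sT}|/|s|, which is negligible here; the quadrature error is O(h^2).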

In Example 10.8 the convergence region of the Laplace integral is the half-plane {Re(s) > 0} of the complex field. This property is quite general, as stated by the following result.

Property 10.4 If the Laplace transform exists for s = \bar{s}, then it exists for all s with Re(s) > Re(\bar{s}). Moreover, let E be the set of the real parts of s such that the Laplace integral exists and denote by \lambda the infimum of E. If \lambda happens to be finite, the Laplace integral exists in the half-plane Re(s) > \lambda; if \lambda = -\infty, it exists for all s \in \mathbb{C}. \lambda is called the abscissa of convergence.

We recall that the Laplace transform enjoys properties completely analogous to those of the Fourier transform. The inverse Laplace transform is denoted formally by L^{-1} and is such that

    f(t) = L^{-1}[L(s)].


Example 10.9 Let us consider the ordinary differential equation y'(t) + a y(t) = g(t) with y(0) = y_0. Multiplying by e^{-st}, integrating between 0 and \infty and passing to the Laplace transform yields

    s Y(s) - y_0 + a Y(s) = G(s).                   (10.83)

Should G(s) be easily computable, (10.83) would furnish Y(s) and then, by applying the inverse Laplace transform, the desired solution y(t). For instance, if g(t) is the unit step function, we obtain

    y(t) = \mathcal{L}^{-1}\left[ \frac{1}{a}\left( \frac{1}{s} - \frac{1}{s+a} \right) + \frac{y_0}{s+a} \right] = \frac{1}{a}\left( 1 - e^{-at} \right) + y_0 e^{-at}. •
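A quick numerical sanity check (a sketch, with illustrative values of a and y_0) confirms that the inverse transform found in Example 10.9 indeed satisfies y' + a y = 1 (g = unit step) with y(0) = y_0; the derivative is approximated by a centered finite difference.

```python
import numpy as np

a, y0 = 3.0, 2.0
y = lambda t: (1.0 - np.exp(-a * t)) / a + y0 * np.exp(-a * t)

t = np.linspace(0.1, 5.0, 50)
h = 1e-6
dy = (y(t + h) - y(t - h)) / (2 * h)   # centered difference approximation of y'(t)
residual = dy + a * y(t) - 1.0          # should vanish, since y' + a y = g = 1
```

Both the initial condition and the differential equation are satisfied to within the finite-difference error.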

For an extensive presentation and analysis of the Laplace transform see, e.g., [Tit37]. In the next section we describe a discrete version of the Laplace transform, known as the Z-transform.

10.11.4 The Z-Transform

Definition 10.3 Let f be a given function, defined for any t \ge 0, and \Delta t > 0 be a given time step. The function

    Z(z) = \sum_{n=0}^{\infty} f(n \Delta t) z^{-n}, \qquad z \in \mathbb{C}    (10.84)

is called the Z-transform of the sequence {f(n \Delta t)} and is denoted by Z[f(n \Delta t)].

The parameter \Delta t is the sampling time step of the sequence of samples f(n \Delta t). The infinite sum (10.84) converges if

    |z| > R = \limsup_{n \to \infty} \sqrt[n]{|f(n \Delta t)|}.
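For instance (a sketch with an illustrative evaluation point), for the constant sequence f(n \Delta t) = 1 one has R = 1, and (10.84) is the geometric series \sum z^{-n} = z/(z-1), convergent for |z| > 1. A truncated sum reproduces this value:

```python
import numpy as np

def z_transform_partial(samples, z):
    """Truncated Z-transform: sum_{n=0}^{N-1} samples[n] * z^{-n}, cf. (10.84)."""
    n = np.arange(len(samples))
    return np.sum(samples * z ** (-n))

z = 1.5 + 0.5j                  # |z| > R = 1: inside the region of convergence
samples = np.ones(400)          # truncated constant sequence f(n*dt) = 1
approx = z_transform_partial(samples, z)
exact = z / (z - 1.0)           # closed form of the geometric series
```

The truncation error is of order |z|^{-400}, far below machine precision here.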

It is possible to deduce the Z-transform from the Laplace transform as follows. Denoting by f_0(t) the piecewise constant function such that f_0(t) = f(n \Delta t) for t \in (n \Delta t, (n+1) \Delta t), the Laplace transform L[f_0] of f_0 is the function

    L(s) = \int_0^{\infty} f_0(t) e^{-st} \, dt = \sum_{n=0}^{\infty} \int_{n \Delta t}^{(n+1) \Delta t} f(n \Delta t) e^{-st} \, dt
         = \sum_{n=0}^{\infty} f(n \Delta t) \frac{e^{-n s \Delta t} - e^{-(n+1) s \Delta t}}{s} = \frac{1 - e^{-s \Delta t}}{s} \sum_{n=0}^{\infty} f(n \Delta t) e^{-n s \Delta t}.

The discrete Laplace transform Z^d[f_0] of f_0 is the function

    Z^d(s) = \sum_{n=0}^{\infty} f(n \Delta t) e^{-n s \Delta t}.

Then, the Z-transform of the sequence {f(n \Delta t), n = 0, 1, \dots} coincides with the discrete Laplace transform of f_0 up to the change of variable z = e^{s \Delta t}. The Z-transform enjoys properties (linearity, scaling, convolution and product) similar to those already seen in the continuous case.

The inverse Z-transform is denoted by Z^{-1} and is defined by

    f(n \Delta t) = Z^{-1}[Z(z)].

The practical computation of Z^{-1} can be carried out by resorting to classical techniques of complex analysis (for example, using the Laurent expansion or the Cauchy residue theorem for the evaluation of contour integrals), coupled with an extensive use of tables (see, e.g., [Pou96]).

10.12 The Wavelet Transform

This technique, originally developed in the area of signal processing, has subsequently been extended to many different branches of approximation theory, including the solution of differential equations. It is based on the so-called wavelets, which are functions generated from an elementary wavelet through translations and dilations. We shall limit ourselves to a brief introduction to univariate wavelets and their transform in both the continuous and discrete cases, referring to [DL92], [Dau88] and to the references cited therein for a detailed presentation and analysis.

10.12.1 The Continuous Wavelet Transform

Any function

    h_{s,\tau}(t) = \frac{1}{\sqrt{s}} \, h\!\left( \frac{t - \tau}{s} \right), \qquad t \in \mathbb{R},    (10.85)

that is obtained from a reference function h \in L^2(\mathbb{R}) by means of translations by a translation factor \tau and dilations by a positive scaling factor s is called a wavelet. The function h is called an elementary wavelet.

Its Fourier transform, written in terms of \omega = 2\pi\nu, is

    H_{s,\tau}(\omega) = \sqrt{s} \, H(s\omega) e^{-i\omega\tau},    (10.86)

where i denotes the imaginary unit and H(\omega) is the Fourier transform of the elementary wavelet. A dilation t/s (s > 1) in the real domain therefore produces a contraction s\omega in the frequency domain, so the factor 1/s plays the role of the frequency \nu in the Fourier transform (see Section 10.11.1). In wavelet theory s is usually referred to as the scale. Formula (10.86) is known as the filter of the wavelet transform.

Definition 10.4 Given a function f \in L^2(\mathbb{R}), its continuous wavelet transform W_f = W[f] is the decomposition of f(t) on the wavelet basis {h_{s,\tau}(t)}, that is

    W_f(s, \tau) = \int_{-\infty}^{\infty} f(t) \, \overline{h_{s,\tau}(t)} \, dt,    (10.87)

where the overline denotes complex conjugation.
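A minimal numerical sketch of (10.87) (not from the book; the grid and the choice of elementary wavelet are illustrative) approximates the integral by a Riemann sum, using the Haar function of Example 10.10 below as the elementary wavelet. Taking f equal to one of the wavelets h_{2,1} itself gives W_f(2, 1) = \|h\|^2_{L^2} = 1.

```python
import numpy as np

def haar(x):
    """Haar elementary wavelet: 1 on (0, 1/2), -1 on (1/2, 1), 0 elsewhere."""
    return np.where((x > 0) & (x < 0.5), 1.0,
           np.where((x > 0.5) & (x < 1.0), -1.0, 0.0))

def wavelet(t, s, tau):
    return haar((t - tau) / s) / np.sqrt(s)      # h_{s,tau}(t), as in (10.85)

def cwt(f_vals, t, s, tau):
    dt = t[1] - t[0]                             # Riemann sum for (10.87)
    return dt * np.sum(f_vals * np.conj(wavelet(t, s, tau)))

t = np.linspace(-5.0, 5.0, 200001)
f_vals = wavelet(t, 2.0, 1.0)                    # f = h_{2,1}
W = cwt(f_vals, t, 2.0, 1.0)                     # expect ||h||^2 = 1
```

For a translation \tau whose wavelet support is disjoint from that of f, the transform vanishes.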

When t denotes the time variable, the wavelet transform of f(t) is a function of the two variables s (scale) and \tau (time shift); as such, it is a representation of f in the time-scale space and is usually referred to as the time-scale joint representation of f. The time-scale representation is the analogue of the time-frequency representation introduced in Fourier analysis. The latter representation has an intrinsic limitation: the product of the resolution in time \Delta t and the resolution in frequency \Delta\omega must satisfy the constraint (Heisenberg inequality)

    \Delta t \, \Delta\omega \ge \frac{1}{2},    (10.88)

which is the counterpart of the Heisenberg uncertainty principle in quantum mechanics. This inequality states that a signal cannot be represented as a point in the time-frequency space: we can only determine its position within a rectangle of area \Delta t \, \Delta\omega.

The wavelet transform (10.87) can be rewritten in terms of the Fourier transform F(\omega) of f as

    W_f(s, \tau) = \frac{\sqrt{s}}{2\pi} \int_{-\infty}^{\infty} F(\omega) \, \overline{H(s\omega)} \, e^{i\omega\tau} \, d\omega,

which shows that the wavelet transform is a bank of wavelet filters characterized by different scales. More precisely, if the scale is small the wavelet is concentrated in time and the wavelet transform provides a detailed description of f(t) (which is the signal); conversely, if the scale is large, the wavelet transform resolves only the large-scale details of f. Thus, the wavelet transform can be regarded as a bank of multiresolution filters.

The theoretical properties of this transform do not depend on the particular elementary wavelet that is considered. Hence, specific bases of wavelets can be derived for specific applications. Some examples of elementary wavelets are reported below.

Example 10.10 (Haar wavelets) These functions can be obtained by choosing as the elementary wavelet the Haar function, defined as

    h(x) = \begin{cases} 1 & \text{if } x \in (0, \tfrac{1}{2}), \\ -1 & \text{if } x \in (\tfrac{1}{2}, 1), \\ 0 & \text{otherwise}. \end{cases}

Its Fourier transform is the complex-valued function

    H(\omega) = \frac{4i \, e^{-i\omega/2} \sin^2(\omega/4)}{\omega},

whose modulus is symmetric with respect to the origin (see Figure 10.8). The bases obtained from this wavelet are not used in practice because of their poor localization properties in the frequency domain. •

[FIGURE 10.8. The Haar wavelet (left) and the modulus of its Fourier transform (right)]
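The closed form of the Haar transform can be checked numerically (a sketch; the grid and test frequencies are illustrative) by comparing it against a Riemann-sum approximation of H(\omega) = \int_0^1 h(x) e^{-i\omega x} dx:

```python
import numpy as np

x = np.linspace(0.0, 1.0, 400001)
dx = x[1] - x[0]
h_vals = np.where(x < 0.5, 1.0, -1.0)           # Haar function on (0, 1)

def H_numeric(omega):
    """Riemann-sum approximation of the Fourier integral of h on (0, 1)."""
    return dx * np.sum(h_vals * np.exp(-1j * omega * x))

def H_formula(omega):
    """Closed form: 4i e^{-i omega/2} sin^2(omega/4) / omega."""
    return 4j * np.exp(-1j * omega / 2) * np.sin(omega / 4) ** 2 / omega
```

The two agree to within the O(dx) quadrature error at any nonzero frequency.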

Example 10.11 (Morlet wavelets) The Morlet wavelet is defined as follows (see [MMG87]):

    h(x) = e^{i\omega_0 x} e^{-x^2/2}.

Thus, it is a complex-valued function whose real part has a real positive Fourier transform, symmetric with respect to the origin, given by

    H(\omega) = \sqrt{\pi/2} \left( e^{-(\omega - \omega_0)^2/2} + e^{-(\omega + \omega_0)^2/2} \right). •
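This Gaussian-pair formula can also be verified numerically (a sketch; the constant \sqrt{\pi/2} corresponds to the convention H(\omega) = \int \mathrm{Re}\, h(x)\, e^{-i\omega x} dx, and \omega_0 = 2.5 is an illustrative choice):

```python
import numpy as np

omega0 = 2.5
x = np.linspace(-12.0, 12.0, 240001)
dx = x[1] - x[0]
re_h = np.cos(omega0 * x) * np.exp(-x ** 2 / 2)  # real part of the Morlet wavelet

def H_numeric(omega):
    """Riemann-sum approximation of the Fourier integral of Re h."""
    return dx * np.sum(re_h * np.exp(-1j * omega * x))

def H_formula(omega):
    return np.sqrt(np.pi / 2) * (np.exp(-(omega - omega0) ** 2 / 2)
                                 + np.exp(-(omega + omega0) ** 2 / 2))
```

Because the integrand decays like a Gaussian, the equispaced quadrature converges extremely fast and the two evaluations agree essentially to machine precision.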

We point out that the presence of the dilation factor allows the wavelets to easily handle possible discontinuities or singularities in f. Indeed, using multiresolution analysis, the signal, suitably divided into frequency bandwidths, can be processed at each frequency by tuning the scale factor of the wavelets.

[FIGURE 10.9. The real part of the Morlet wavelet (left) and the real part of the corresponding Fourier transforms (right) for ω0 = 1 (solid line), ω0 = 2.5 (dashed line) and ω0 = 5 (dotted line)]

Recalling what was already pointed out in Section 10.11.1, the time localization of the wavelet gives rise to a filter with infinite bandwidth. In particular, defining the bandwidth \Delta\omega of the wavelet filter as

    \Delta\omega = \left( \int_{-\infty}^{\infty} \omega^2 |H(\omega)|^2 \, d\omega \Big/ \int_{-\infty}^{\infty} |H(\omega)|^2 \, d\omega \right)^{1/2},

the bandwidth of the wavelet filter with scale equal to s is

    \Delta\omega_s = \left( \int_{-\infty}^{\infty} \omega^2 |H(s\omega)|^2 \, d\omega \Big/ \int_{-\infty}^{\infty} |H(s\omega)|^2 \, d\omega \right)^{1/2} = \frac{1}{s} \Delta\omega.

Consequently, the quality factor Q of the wavelet filter, defined as the ratio between the frequency 1/s and the bandwidth of the filter, is independent of s, since

    Q = \frac{1/s}{\Delta\omega_s} = \frac{1}{\Delta\omega},

provided that (10.88) holds. At low frequencies, corresponding to large values of s, the wavelet filter has a small bandwidth and a large temporal width (called window), with a low resolution. Conversely, at high frequencies the filter has a large bandwidth and a small temporal window, with a high resolution. Thus, the resolution furnished by the wavelet analysis increases with the frequency of the signal. This adaptivity property makes wavelets a crucial tool in the analysis of unsteady signals, or of signals with fast transients, for which standard Fourier analysis turns out to be ineffective.
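The scaling law \Delta\omega_s = \Delta\omega / s is easy to check on a concrete filter (a sketch; the Gaussian |H(\omega)|^2 = e^{-\omega^2} is an illustrative choice for which the integrals in the definition converge):

```python
import numpy as np

w = np.linspace(-40.0, 40.0, 400001)   # frequency grid for the Riemann sums

def bandwidth(s):
    """Delta_omega_s = (int w^2 |H(sw)|^2 dw / int |H(sw)|^2 dw)^{1/2}."""
    Hs2 = np.exp(-(s * w) ** 2)        # |H(s*omega)|^2 for the Gaussian filter
    return np.sqrt(np.sum(w ** 2 * Hs2) / np.sum(Hs2))  # dw cancels in the ratio

b1, b4 = bandwidth(1.0), bandwidth(4.0)   # expect b4 = b1 / 4
```

For this filter \Delta\omega = 1/\sqrt{2}, and the computed bandwidth at scale 4 is exactly one quarter of it, so the quality factor (1/s)/\Delta\omega_s is the same at both scales.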

10.12.2 Discrete and Orthonormal Wavelets

The continuous wavelet transform maps a function of one variable into a two-dimensional representation in the time-scale domain. In many applications this description is excessively rich. Resorting to discrete wavelets is an attempt to represent a function using a finite (and small) number of parameters.

A discrete wavelet is a continuous wavelet generated by using discrete scale and translation factors. For s_0 > 1, denote by s = s_0^j the scale factors; the translation factors usually depend on the scale factors through the relation \tau = k \tau_0 s_0^j, with \tau_0 \in \mathbb{R}. The corresponding discrete wavelet is

    h_{j,k}(t) = s_0^{-j/2} \, h(s_0^{-j}(t - k \tau_0 s_0^j)) = s_0^{-j/2} \, h(s_0^{-j} t - k \tau_0).

The scale factor s_0^j corresponds to the magnification, or the resolution, of the observation, while the translation factor \tau_0 indicates the location where the observations are made. If one looks at very small details, the magnification must be large, which corresponds to a large negative index j. In this case the translation step is small and the wavelet is highly concentrated around the observation point. For large positive j, the wavelet is spread out and large translation steps are used.
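As a concrete sketch (the dyadic choice s_0 = 2, \tau_0 = 1 with the Haar elementary wavelet is the classical example; the grid is an illustrative discretization), the discrete family h_{j,k} is orthonormal in L^2, which a Riemann-sum inner product reproduces:

```python
import numpy as np

def haar(x):
    """Haar elementary wavelet: 1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    return np.where((x >= 0) & (x < 0.5), 1.0,
           np.where((x >= 0.5) & (x < 1.0), -1.0, 0.0))

t = np.linspace(0.0, 8.0, 800001)
dt = t[1] - t[0]

def h_jk(j, k):
    """Dyadic discrete wavelet h_{j,k}(t) = 2^{-j/2} h(2^{-j} t - k)."""
    return 2.0 ** (-j / 2) * haar(2.0 ** (-j) * t - k)

def inner(u, v):
    return dt * np.sum(u * v)            # Riemann-sum L^2 inner product

norm = inner(h_jk(1, 0), h_jk(1, 0))     # expect 1 (normalization)
cross_scale = inner(h_jk(1, 0), h_jk(2, 0))   # expect 0 (different scales)
cross_shift = inner(h_jk(1, 0), h_jk(1, 1))   # expect 0 (disjoint supports)
```

The factor s_0^{-j/2} in the definition is precisely what keeps every h_{j,k} at unit L^2 norm regardless of the scale index j.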