Proof. We prove the smooth case and indicate the changes for the real analytic

case. The proof will use an algorithm.

Note first that by (50.10) (by (50.12) in the real analytic case) the characteristic polynomial

$$P(A(t))(\lambda) = \det(A(t) - \lambda I) = \lambda^n - a_1(t)\lambda^{n-1} + a_2(t)\lambda^{n-2} - \dots + (-1)^n a_n(t) = \sum_{i=0}^{n} (-1)^i \operatorname{tr}(\Lambda^i A(t))\,\lambda^{n-i} \qquad (1)$$

is smoothly solvable (real analytically solvable), with smooth (real analytic) roots $\lambda_1(t), \dots, \lambda_n(t)$ on the whole parameter interval.
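The coefficients $a_i(t) = \operatorname{tr}(\Lambda^i A(t))$ in (1) can be checked numerically: $\operatorname{tr}(\Lambda^i A)$ equals the sum of the $i \times i$ principal minors of $A$, and also the $i$-th elementary symmetric function of the eigenvalues. The following sketch (the random Hermitian matrix and tolerances are illustrative choices, not from the text) compares both computations.

```python
import numpy as np
from itertools import combinations

# Check: tr(Lambda^i A) = sum of i x i principal minors of A
#                       = i-th elementary symmetric function of the eigenvalues.
rng = np.random.default_rng(0)
n = 4
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (B + B.conj().T) / 2                      # Hermitian test matrix
lam = np.linalg.eigvalsh(A)                   # real eigenvalues

def trace_wedge(A, i):
    """tr(Lambda^i A) computed as the sum of the i x i principal minors."""
    idx = range(A.shape[0])
    return sum(np.linalg.det(A[np.ix_(list(S), list(S))])
               for S in combinations(idx, i))

# np.poly(lam) returns the coefficients of prod(x - lam_j):
# coeffs[i] = (-1)^i e_i(lam) with e_i the elementary symmetric polynomial.
coeffs = np.poly(lam)
for i in range(1, n + 1):
    a_i = trace_wedge(A, i).real
    e_i = ((-1.0) ** i) * coeffs[i]
    assert abs(a_i - e_i) < 1e-8
```

This matches (1): the coefficient of $\lambda^{n-i}$ in the characteristic polynomial is, up to the alternating sign, the trace of the $i$-th exterior power.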

Case 1: distinct eigenvalues. If $A(0)$ has some distinct eigenvalues, then one can reorder them in such a way that for $i_0 = 0 < 1 \le i_1 < i_2 < \dots < i_k < n = i_{k+1}$ we have

$$\lambda_1(0) = \dots = \lambda_{i_1}(0) < \lambda_{i_1+1}(0) = \dots = \lambda_{i_2}(0) < \dots < \lambda_{i_k+1}(0) = \dots = \lambda_n(0).$$


50.14 50. Applications to perturbation theory of operators 547

For t near 0 we still have

$$\lambda_1(t), \dots, \lambda_{i_1}(t) < \lambda_{i_1+1}(t), \dots, \lambda_{i_2}(t) < \dots < \lambda_{i_k+1}(t), \dots, \lambda_n(t).$$

For $j = 1, \dots, k+1$ we consider the subspaces

$$V_t^{(j)} := \bigoplus_{i=i_{j-1}+1}^{i_j} \{v \in V : (A(t) - \lambda_i(t))\,v = 0\}.$$

Then each $V_t^{(j)}$ runs through a smooth (real analytic) vector subbundle of the trivial bundle $(-\varepsilon, \varepsilon) \times V \to (-\varepsilon, \varepsilon)$, which admits a smooth (real analytic) framing $e_{i_{j-1}+1}(t), \dots, e_{i_j}(t)$. We have $V = \bigoplus_{j=1}^{k+1} V_t^{(j)}$ for each $t$.

In order to prove this statement, note that

$$V_t^{(j)} = \ker\bigl( (A(t) - \lambda_{i_{j-1}+1}(t)) \circ \dots \circ (A(t) - \lambda_{i_j}(t)) \bigr),$$

so $V_t^{(j)}$ is the kernel of a smooth (real analytic) vector bundle homomorphism $B(t)$ of constant rank (even of constant dimension of the kernel), and thus is a smooth (real analytic) vector subbundle. This, together with a smooth (real analytic) frame field, can be shown as follows: Choose a basis of $V$, constant in $t$, such that $A(0)$ is diagonal. Then by the elimination procedure one can construct a basis for the kernel of $B(0)$. For $t$ near $0$, the elimination procedure (with the same choices) then gives a basis of the kernel of $B(t)$; the elements of this basis are smooth (real analytic) in $t$ for $t$ near $0$.

From the last result it follows that it suffices to find smooth (real analytic) eigenvectors in each subbundle $V^{(j)}$ separately, expanded in the smooth (real analytic) frame field. But in this frame field the vector subbundle looks again like a constant vector space. So feed each of these parts ($A$ restricted to $V^{(j)}$, as a matrix with respect to the frame field) into case 2 below.

Case 2: all eigenvalues at $0$ are equal. So suppose that $A(t) : V \to V$ is Hermitian with all eigenvalues at $t = 0$ equal to $\frac{a_1(0)}{n}$, see (1).

Eigenvectors of $A(t)$ are also eigenvectors of $A(t) - \frac{a_1(t)}{n} I$, so we may replace $A(t)$ by $A(t) - \frac{a_1(t)}{n} I$ and assume that for the characteristic polynomial (1) we have $a_1 = 0$, or assume without loss that $\lambda_i(0) = 0$ for all $i$, and so $A(0) = 0$.

If A(t) = 0 for all t we choose the eigenvectors constant.

Otherwise, let $A_{ij}(t) = t\,A^{(1)}_{ij}(t)$. From (1) we see that the characteristic polynomial of the Hermitian matrix $A^{(1)}(t)$ is $P_1(t)$ in the notation of (50.8); thus $m(a_i) \ge i$ for $2 \le i \le n$, which also follows from (50.5).

The eigenvalues of $A^{(1)}(t)$ are the roots of $P_1(t)$, which may be chosen in a smooth way, since they again satisfy the condition of theorem (50.10). In the real analytic case we just have to invoke (50.12). Note that eigenvectors of $A^{(1)}$ are also eigenvectors of $A$. If the eigenvalues are still all equal, we apply the same procedure


again, until they are not all equal: we arrive at this situation by the assumption of

the theorem in the smooth case, and automatically in the real analytic case. Then

we apply case 1.

This algorithm shows that one may choose the eigenvectors $x_i(t)$ of $A(t)$ in a

smooth (real analytic) way, locally in t. It remains to extend this to the whole

parameter interval.

If some eigenvalues coincide locally, then they coincide on the whole of $\mathbb{R}$, by the assumption. The corresponding eigenspaces then form a smooth (real analytic) vector bundle over $\mathbb{R}$, by case 1, since those eigenvalues which meet in isolated points are different after application of case 2.

So we get $V = \bigoplus_j W_t^{(j)}$, where the $W_t^{(j)}$ are smooth (real analytic) vector subbundles of $V \times \mathbb{R}$, whose dimension is the generic multiplicity of the corresponding smooth (real analytic) eigenvalue function. It suffices to find global orthonormal smooth (real analytic) frames for each of these; these exist since the vector bundle is smoothly (real analytically) trivial, by using parallel transport with respect to a smooth (real analytic) Hermitian connection.

50.15. Example. (See [Rellich, 1937, section 2].) That the last result cannot be improved is shown by the following example, which rotates a lot:

$$x_+(t) := \begin{pmatrix} \cos\frac1t \\ \sin\frac1t \end{pmatrix}, \qquad x_-(t) := \begin{pmatrix} -\sin\frac1t \\ \cos\frac1t \end{pmatrix}, \qquad \lambda_\pm(t) := \pm e^{-1/t^2},$$

$$A(t) := \bigl(x_+(t), x_-(t)\bigr) \begin{pmatrix} \lambda_+(t) & 0 \\ 0 & \lambda_-(t) \end{pmatrix} \bigl(x_+(t), x_-(t)\bigr)^{-1} = e^{-1/t^2} \begin{pmatrix} \cos\frac2t & \sin\frac2t \\ \sin\frac2t & -\cos\frac2t \end{pmatrix}.$$

Here $t \mapsto A(t)$ and $t \mapsto \lambda_\pm(t)$ are smooth, whereas the eigenvectors cannot be chosen continuously.
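A numerical sketch of this example (the sample values of $t$ are illustrative assumptions): the eigenvalues of $A(t)$ are the smooth functions $\pm e^{-1/t^2}$, while the unit eigenvector for $+e^{-1/t^2}$ keeps the direction $(\cos\frac1t, \sin\frac1t)$, which oscillates without limit as $t \to 0$.

```python
import numpy as np

def A(t):
    """Rellich's rotating family; A(0) := 0 extends it smoothly."""
    if t == 0.0:
        return np.zeros((2, 2))
    r = np.exp(-1.0 / t**2)
    c, s2 = np.cos(2.0 / t), np.sin(2.0 / t)
    return r * np.array([[c, s2], [s2, -c]])

# the eigenvalues are the smooth functions +- exp(-1/t^2)
for t in [0.5, 0.1, 0.05]:
    lam = np.sort(np.linalg.eigvalsh(A(t)))
    r = np.exp(-1.0 / t**2)
    assert np.allclose(lam, [-r, r], atol=1e-12)

# the unit eigenvector of the larger eigenvalue is +-(cos(1/t), sin(1/t)):
# it keeps rotating as t -> 0, so no continuous choice exists at t = 0
for t in [0.5, 0.2, 0.1]:
    _, v = np.linalg.eigh(A(t))
    v_plus = v[:, 1]                       # column for the larger eigenvalue
    x_plus = np.array([np.cos(1.0 / t), np.sin(1.0 / t)])
    assert abs(abs(v_plus @ x_plus) - 1.0) < 1e-8
```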

50.16. Theorem. Let $t \mapsto A(t)$ be a smooth curve of unbounded self-adjoint operators in a Hilbert space with common domain of definition and compact resolvent.

Then the eigenvalues of $A(t)$ may be arranged increasingly ordered in such a way that each eigenvalue is continuous, and they can be rearranged in such a way that they become $C^1$-functions.

Suppose, moreover, that no two of the continuous eigenvalues meet of infinite order at any $t \in \mathbb{R}$ if they are not equal. Then the eigenvalues and the eigenvectors can be chosen smoothly in $t$, on the whole parameter domain.

If, on the other hand, $t \mapsto A(t)$ is a real analytic curve of unbounded self-adjoint operators in a Hilbert space with common domain of definition and with compact resolvent, then the eigenvalues and the eigenvectors can be chosen real analytically in $t$, on the whole parameter domain.

The real analytic version of this theorem is due to [Rellich, 1940], see also [Kato, 1976, VII, 3.9]; the smooth version is due to [Alekseevsky, Kriegl, Losik, Michor, 1996]; the proof follows the lines of the latter paper.


That $A(t)$ is a smooth curve of unbounded operators means the following: There is a dense subspace $V$ of the Hilbert space $H$ such that $V$ is the domain of definition of each $A(t)$, and such that $A(t)^* = A(t)$ with the same domains $V$, where the adjoint operator $A(t)^*$ is defined by $\langle A(t)u, v\rangle = \langle u, A(t)^*v\rangle$ for all $v$ for which the left hand side is bounded as a functional in $u \in V \subseteq H$. Moreover, we require that $t \mapsto \langle A(t)u, v\rangle$ is smooth for each $u \in V$ and $v \in H$. This implies that $t \mapsto A(t)u$ is smooth $\mathbb{R} \to H$ for each $u \in V$ by (2.3). Similarly for the real analytic case, by (7.4).

The first part of the proof will show that $t \mapsto A(t)$ smooth implies that the resolvent $(A(t) - z)^{-1}$ is smooth in $t$ and $z$ jointly, and it is mainly this that is used later in the proof. It is well known, and in the proof we will show, that if the resolvent $(A(t) - z)^{-1}$ is compact for some $(t, z)$, then it is compact for all $t \in \mathbb{R}$ and all $z$ in the resolvent set of $A(t)$.

Proof. We shall prove the smooth case and indicate the changes for the real analytic case.

For each $t$ consider the norm $\|u\|_t^2 := \|u\|^2 + \|A(t)u\|^2$ on $V$. Since $A(t) = A(t)^*$ is closed, $(V, \|\cdot\|_t)$ is also a Hilbert space with inner product $\langle u, v\rangle_t := \langle u, v\rangle + \langle A(t)u, A(t)v\rangle$. All these norms are equivalent, since the identity $(V, \|\cdot\|_t + \|\cdot\|_s) \to (V, \|\cdot\|_t)$ is continuous and bijective, so an isomorphism by the open mapping theorem. Then $t \mapsto \langle u, v\rangle_t$ is smooth for fixed $u, v \in V$, and by the multilinear uniform boundedness principle (5.18), the mapping $t \mapsto \langle\ ,\ \rangle_t$ is smooth into the space of bounded bilinear forms; in the real analytic case we use (11.14) instead. By the exponential law (3.12) the mapping $(t, u) \mapsto \|u\|_t^2$ is smooth from $\mathbb{R} \times (V, \|\cdot\|_s) \to \mathbb{R}$ for each fixed $s$. In the real analytic case we use (11.18) instead. Thus, all Hilbert norms $\|\cdot\|_t$ are equivalent, since $\{\|u\|_t : |t| \le K,\ \|u\|_s \le 1\}$ is bounded by some $L_{K,s}$ in $\mathbb{R}$, so $\|u\|_t \le L_{K,s}\|u\|_s$ for all $|t| \le K$. Moreover, each $A(s)$ is a globally defined operator $(V, \|\cdot\|_t) \to H$ with closed graph and is thus bounded, and by using again the (multi)linear uniform boundedness principle (5.18) (or (11.14) in the real analytic case) as above, we see that $s \mapsto A(s)$ is smooth (real analytic) $\mathbb{R} \to L((V, \|\cdot\|_t), H)$.

If for some $(t, z) \in \mathbb{R} \times \mathbb{C}$ the bounded operator $A(t) - z : V \to H$ is invertible, then this is true locally, and $(t, z) \mapsto (A(t) - z)^{-1} : H \to V$ is smooth, since inversion is smooth on Banach spaces.

Since each $A(t)$ is Hermitian, the global resolvent set $\{(t, z) \in \mathbb{R} \times \mathbb{C} : (A(t) - z) : V \to H \text{ is invertible}\}$ is open, contains $\mathbb{R} \times (\mathbb{C} \setminus \mathbb{R})$, and hence is connected. Moreover, $(A(t) - z)^{-1} : H \to H$ is a compact operator for some (equivalently any) $(t, z)$ if and only if the inclusion $i : V \to H$ is compact, since $i = (A(t) - z)^{-1} \circ (A(t) - z) : V \to H \to H$.

Let us fix a parameter $s$. We choose a simple smooth curve $\gamma$ in the resolvent set of $A(s)$ for fixed $s$.

(1) Claim. For $t$ near $s$, there are $C^1$-functions $t \mapsto \lambda_i(t)$, $1 \le i \le N$, which parameterize all eigenvalues (repeated according to their multiplicity) of $A(t)$ in the interior of $\gamma$. If no two of the generically different eigenvalues meet of infinite order, they can be chosen smoothly.


By replacing $A(s)$ by $A(s) - z_0$ if necessary, we may assume that $0$ is not an eigenvalue of $A(s)$. Since the global resolvent set is open, no eigenvalue of $A(t)$ lies on $\gamma$ or equals $0$, for $t$ near $s$. Since

$$t \mapsto -\frac{1}{2\pi i} \int_\gamma (A(t) - z)^{-1}\, dz =: P(t, \gamma)$$

is a smooth curve of projections (onto the direct sum of all eigenspaces corresponding to eigenvalues in the interior of $\gamma$) with finite dimensional ranges, the ranks (i.e. the dimensions of the ranges) must be constant: it is easy to see that the (finite) rank cannot fall locally, and it cannot increase, since the distance in $L(H,H)$ of $P(t)$ to the subset of operators of rank $\le N = \operatorname{rank}(P(s))$ is continuous in $t$ and is either $0$ or $1$. So for $t$ near $s$ there are equally many eigenvalues in the interior, and we may call them $\mu_i(t)$, $1 \le i \le N$ (repeated with multiplicity). Let us denote by $e_i(t)$, $1 \le i \le N$, a corresponding system of eigenvectors of $A(t)$. Then by the residue theorem we have

$$\sum_{i=1}^N \mu_i(t)^p\, e_i(t)\, \langle e_i(t),\ \cdot\ \rangle = -\frac{1}{2\pi i} \int_\gamma z^p\, (A(t) - z)^{-1}\, dz,$$

which is smooth in $t$ near $s$, as a curve of operators in $L(H,H)$ of rank $N$, since $0$ is not an eigenvalue.
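The contour-integral definition of $P(t, \gamma)$ can be illustrated in finite dimensions. In the sketch below (the diagonal matrix, the circle $\gamma$, and the number of sample points are assumed illustrative choices) the integral is approximated by a Riemann sum, and the result is checked to be a projection whose rank counts the eigenvalues enclosed by $\gamma$.

```python
import numpy as np

# Approximate P = -1/(2*pi*i) * \int_gamma (A - z)^{-1} dz over a circle.
A = np.diag([0.9, 1.1, 3.0, 5.0])     # 0.9 and 1.1 lie inside gamma
center, radius, m = 1.0, 1.0, 400     # gamma: |z - 1| = 1, m sample points
phi = 2 * np.pi * np.arange(m) / m
z = center + radius * np.exp(1j * phi)
dz = 1j * radius * np.exp(1j * phi) * (2 * np.pi / m)   # dz along gamma

P = np.zeros((4, 4), dtype=complex)
for zj, dzj in zip(z, dz):
    P += np.linalg.inv(A - zj * np.eye(4)) * dzj
P *= -1.0 / (2j * np.pi)

assert np.allclose(P @ P, P, atol=1e-8)        # P is a projection
assert abs(np.trace(P).real - 2.0) < 1e-8      # rank = 2 eigenvalues inside
```

The periodic Riemann sum converges geometrically here, so even a modest number of contour points reproduces the projection essentially to machine precision.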

(2) Claim. Let $t \mapsto T(t) \in L(H,H)$ be a smooth curve of operators of rank $N$ in Hilbert space such that $T(0)T(0)(H) = T(0)(H)$. Then $t \mapsto \operatorname{tr}(T(t))$ is smooth (real analytic). (Note that this implies that $T$ is smooth (real analytic) into the space of operators of trace class by (2.3) or (2.14.4) (by (10.3) and (9.4) in the real analytic case), since all bounded linear functionals are of the form $A \mapsto \operatorname{tr}(AB)$ for bounded $B$, see (52.33), e.g.)

Let $F := T(0)(H)$. Then $T(t) = (T_1(t), T_2(t)) : H \to F \oplus F^\perp$, and the image of $T(t)$ is the space

$$T(t)(H) = \{(T_1(t)(x), T_2(t)(x)) : x \in H\} = \{(T_1(t)(x), T_2(t)(x)) : x \in F\} \quad\text{for } t \text{ near } 0$$
$$= \{(y, S(t)(y)) : y \in F\}, \qquad\text{where } S(t) := T_2(t) \circ (T_1(t)|_F)^{-1}.$$

Note that $S(t) : F \to F^\perp$ is smooth (real analytic) in $t$ by finite dimensional inversion for $T_1(t)|_F : F \to F$. Now

$$\begin{aligned}
\operatorname{tr}(T(t)) &= \operatorname{tr}\left( \begin{pmatrix} 1 & 0 \\ -S(t) & 1 \end{pmatrix} \begin{pmatrix} T_1(t)|_F & T_1(t)|_{F^\perp} \\ T_2(t)|_F & T_2(t)|_{F^\perp} \end{pmatrix} \begin{pmatrix} 1 & 0 \\ S(t) & 1 \end{pmatrix} \right) \\
&= \operatorname{tr}\left( \begin{pmatrix} T_1(t)|_F & T_1(t)|_{F^\perp} \\ 0 & -S(t)T_1(t)|_{F^\perp} + T_2(t)|_{F^\perp} \end{pmatrix} \begin{pmatrix} 1 & 0 \\ S(t) & 1 \end{pmatrix} \right) \\
&= \operatorname{tr}\left( \begin{pmatrix} T_1(t)|_F & T_1(t)|_{F^\perp} \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ S(t) & 1 \end{pmatrix} \right), \quad\text{since } \operatorname{rank} = N \\
&= \operatorname{tr} \begin{pmatrix} T_1(t)|_F + (T_1(t)|_{F^\perp})S(t) & T_1(t)|_{F^\perp} \\ 0 & 0 \end{pmatrix} \\
&= \operatorname{tr}\bigl( T_1(t)|_F + (T_1(t)|_{F^\perp})S(t) : F \to F \bigr),
\end{aligned}$$


which visibly is smooth (real analytic) since $F$ is finite dimensional.

From claim (2) we now may conclude that

$$\sum_{i=-n}^{m} \lambda_i(t)^p = -\operatorname{tr}\left( \frac{1}{2\pi i} \int_\gamma z^p\, (A(t) - z)^{-1}\, dz \right)$$

is smooth (real analytic) for $t$ near $s$.

Thus, the Newton polynomial mapping $s^N(\lambda_{-n}(t), \dots, \lambda_m(t))$ is smooth (real analytic), so also the elementary symmetric polynomial $\sigma^N(\lambda_{-n}(t), \dots, \lambda_m(t))$ is smooth, and thus $\{\mu_i(t) : 1 \le i \le N\}$ is the set of roots of a polynomial with smooth (real analytic) coefficients. By theorem (50.11) there is an arrangement of these roots such that they become differentiable. If no two of the generically different ones meet of infinite order, then by theorem (50.10) there is even a smooth arrangement. In the real analytic case, by theorem (50.12), the roots may be arranged in a real analytic way.

To see that in the general smooth case they are even $C^1$, note that the images of the projections $P(t, \gamma)$ of constant rank for $t$ near $s$ describe the fibers of a smooth vector bundle. The restriction of $A(t)$ to this bundle, viewed in a smooth framing, becomes a smooth curve of symmetric matrices, for which by Rellich's result (50.17) below the eigenvalues can be chosen $C^1$. This finishes the proof of claim (1).

(3) Claim. Let $t \mapsto \lambda_i(t)$ be a differentiable eigenvalue of $A(t)$, defined on some interval. Then

$$|\lambda_i(t_1) - \lambda_i(t_2)| \le (1 + |\lambda_i(t_2)|)\bigl(e^{a|t_1 - t_2|} - 1\bigr)$$

holds for a continuous positive function $a = a(t_1, t_2)$ which is independent of the choice of the eigenvalue.

For fixed $t$ near $s$ take all roots $\lambda_j$ which meet $\lambda_i$ at $t$, order them differentiably near $t$, and consider the projector $P(t, \gamma)$ onto the joint eigenspaces for only those roots (where $\gamma$ is a simple smooth curve containing, of all the eigenvalues at $t$, only $\lambda_i(t)$ in its interior). Then the image of $u \mapsto P(u, \gamma)$, for $u$ near $t$, describes a smooth finite dimensional vector subbundle of $\mathbb{R} \times H$, since its rank is constant. For each $u$ choose an orthonormal system of eigenvectors $v_j(u)$ of $A(u)$ corresponding to these $\lambda_j(u)$. They form a (not necessarily continuous) framing of this bundle. For any sequence $t_k \to t$ there is a subsequence such that each $v_j(t_k) \to w_j(t)$, where the $w_j(t)$ form again an orthonormal system of eigenvectors of $A(t)$ for the eigenspace of $\lambda_i(t)$. Now consider

$$\frac{A(t) - \lambda_i(t)}{t_k - t}\, v_i(t_k) + \frac{A(t_k) - A(t)}{t_k - t}\, v_i(t_k) - \frac{\lambda_i(t_k) - \lambda_i(t)}{t_k - t}\, v_i(t_k) = 0,$$

take the inner product of this with $w_i(t)$, note that then the first summand vanishes, and let $t_k \to t$ to obtain

$$\lambda_i'(t) = \langle A'(t) w_i(t), w_i(t) \rangle \quad\text{for an eigenvector } w_i(t) \text{ of } A(t) \text{ with eigenvalue } \lambda_i(t).$$


This implies, where $V_t = (V, \|\cdot\|_t)$,

$$\begin{aligned}
|\lambda_i'(t)| &\le \|A'(t)\|_{L(V_t,H)}\, \|w_i(t)\|_{V_t}\, \|w_i(t)\|_H \\
&= \|A'(t)\|_{L(V_t,H)} \sqrt{\|w_i(t)\|_H^2 + \|A(t)w_i(t)\|_H^2} \\
&= \|A'(t)\|_{L(V_t,H)} \sqrt{1 + \lambda_i(t)^2} \le a + a\,|\lambda_i(t)|,
\end{aligned}$$

for a constant $a$ which is valid for a compact interval of $t$'s, since $t \mapsto \|\cdot\|_t^2$ is smooth on $V$. By Gronwall's lemma (see e.g. [Dieudonné, 1960, (10.5.1.3)]) this implies claim (3).
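For readers who want the Gronwall step spelled out, here is one way to fill it in (a sketch, assuming $t_1 \ge t_2$; the auxiliary function $u$ is not from the text):

```latex
\begin{align*}
 u(t) &:= 1 + |\lambda_i(t)|, \qquad
 u'(t) \le |\lambda_i'(t)| \le a\,u(t) \quad\text{a.e.},\\
 u(t) &\le u(t_2)\,e^{a(t-t_2)} \qquad\text{(Gronwall)},\\
 |\lambda_i(t_1) - \lambda_i(t_2)|
   &\le \int_{t_2}^{t_1} |\lambda_i'(t)|\,dt
    \le \int_{t_2}^{t_1} a\,u(t_2)\,e^{a(t-t_2)}\,dt\\
   &= u(t_2)\bigl(e^{a(t_1-t_2)} - 1\bigr)
    = (1 + |\lambda_i(t_2)|)\bigl(e^{a|t_1-t_2|} - 1\bigr).
\end{align*}
```

The case $t_1 < t_2$ follows by the same computation with the roles of the endpoints exchanged.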

By the following arguments we can conclude that all eigenvalues may be numbered as $\lambda_i(t)$ for $i$ in $\mathbb{N}$ or $\mathbb{Z}$ in such a way that they are $C^1$, or $C^\infty$ under the stronger assumption, or real analytic in the real analytic case, in $t \in \mathbb{R}$. Note first that by claim (3) no eigenvalue can go off to infinity in finite time, since it may increase at most exponentially. Let us first number all eigenvalues of $A(0)$ increasingly.

We claim that for one eigenvalue (say $\lambda_0(0)$) there exists a $C^1$ (or $C^\infty$ or real analytic) extension to all of $\mathbb{R}$; namely, the set of all $t \in \mathbb{R}$ with a $C^1$ (or $C^\infty$ or real analytic) extension of $\lambda_0$ on the segment from $0$ to $t$ is open and closed. Openness follows from claim (1). If this interval does not reach infinity, from claim (3) it follows that $(t, \lambda_0(t))$ has an accumulation point $(s, x)$ at the end $s$. Clearly $x$ is an eigenvalue of $A(s)$, and by claim (1) the eigenvalues passing through $(s, x)$ can be arranged $C^1$ (or $C^\infty$ or real analytic); thus $\lambda_0(t)$ converges to $x$ and can be extended $C^1$ (or $C^\infty$ or real analytic) beyond $s$.

By the same argument we can extend iteratively all eigenvalues $C^1$ (or $C^\infty$ or real analytically) to all $t \in \mathbb{R}$: if one meets an already chosen one, the proof of (50.11) shows that we may pass through it coherently. In the smooth case look at (50.10) instead, and in the real analytic case look at the proof of (50.12).

Now we start to choose the eigenvectors smoothly, under the stronger assumption in the smooth case, and in the real analytic case. Let us consider again eigenvalues $\{\lambda_i(t) : 1 \le i \le N\}$ contained in the interior of a smooth curve $\gamma$ for $t$ in an open interval $I$. Then $V_t := P(t, \gamma)(H)$ is the fiber of a smooth (real analytic) vector bundle of dimension $N$ over $I$. We choose a smooth framing of this bundle, and use then the proof of theorem (50.14) to choose smooth (real analytic) vector subbundles whose fibers over $t$ are the eigenspaces of the eigenvalues with their generic multiplicity. By the same arguments as in (50.14) we then get global vector subbundles with fibers the eigenspaces of the eigenvalues with their generic multiplicities, and thus smooth (real analytic) eigenvectors for all eigenvalues.

50.17. Result. ([Rellich, 1969, page 43], see also [Kato, 1976, II, 6.8].) Let $A(t)$ be a $C^1$-curve of (finite dimensional) symmetric matrices. Then the eigenvalues can be chosen $C^1$ in $t$, on the whole parameter interval.

This result is best possible for the degree of continuous differentiability, as is shown by the example in [Alekseevsky, Kriegl, Losik, Michor, 1996, 7.4].


51. The Nash-Moser Inverse Function Theorem

This section treats the hard implicit function theorem of Nash and Moser following [Hamilton, 1982], in full generality and in condensed form, but with all details. The main difficulty of the proof of the hard implicit function theorem is the following: By trying to use the Newton iteration procedure for a nonlinear partial differential equation, one quickly finds out that 'loss of derivatives' occurs, and one cannot reach the situation where the Banach fixed point theorem is directly applicable. Using smoothing operators after each iteration step, one can estimate higher derivatives by lower ones and finally apply the fixed point theorem.

The core of this presentation is the following: one proves the theorem in a Fréchet space of exponentially decreasing sequences in a Banach space, where the smoothing operators take a very simple form: essentially just cutting the sequences at some index. The statement carries over to certain direct summands which respect 'bounded losses of derivatives', and one can organize these estimates into the concept of tame mappings and thus apply the result to more general situations. However, checking that the mappings, and also the inverses of their linearizations, in a certain problem are tame mappings (a priori estimates) is usually very difficult. We do not give any applications, in view of our remarks before.

51.1. Remark. Let $f : E \supseteq U \to V \subseteq E$ be a diffeomorphism. Then differentiation of $f^{-1} \circ f = \mathrm{Id}$ and $f \circ f^{-1} = \mathrm{Id}$ at $x$ and $f(x)$ yields, using the chain rule, that $f'(x)$ is invertible with inverse $(f^{-1})'(f(x))$, and hence $x \mapsto f'(x)^{-1}$ is smooth as well.

The inverse function theorem for Banach spaces assumes the invertibility of the derivative only at one point. Openness of $GL(E)$ in $L(E)$ then implies local invertibility, and smoothness of $\operatorname{inv} : GL(E) \to GL(E)$ implies the smoothness of $x \mapsto f'(x)^{-1}$.

Beyond Banach spaces we do not have openness of GL(E) in L(E) as the following

example shows.

51.2. Example. Let $E := C^\infty(\mathbb{R},\mathbb{R})$, and let $P : E \to E$ be given by $P(f)(t) := f(t) - t\, f'(t)\, f(t)$. Since multiplication with smooth functions and taking derivatives are continuous linear maps, $P$ is a polynomial of degree 2. Its derivative is given by

$$P'(f)(h)(t) = h(t) - t\, h'(t)\, f(t) - t\, f'(t)\, h(t).$$

In particular, the derivative $P'(0)$ is the identity, hence invertible. However, at the constant functions $f_n = \frac1n$ the derivative $P'(f_n)$ is not injective, since for $h_k(t) := t^k$ we get

$$P'(f_n)(h_k)(t) = t^k - t \cdot k\, t^{k-1} \cdot \tfrac1n - t \cdot 0 \cdot t^k = t^k \bigl(1 - \tfrac{k}{n}\bigr),$$

so $h_n$ lies in the kernel of $P'(f_n)$.
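The displayed derivative can be cross-checked by a central finite difference; in the sketch below the grid, the step size $s$, and the choice $k = n$ are illustrative assumptions. Since $P(f + sh)$ is quadratic in $s$, the central difference recovers the directional derivative exactly up to rounding.

```python
import numpy as np

# Central-difference check: the directional derivative of
# P(f)(t) = f(t) - t f'(t) f(t) at f_n = 1/n in direction h_k(t) = t^k
# equals t^k (1 - k/n), hence vanishes identically for k = n.
n, k = 5, 5
t = np.linspace(-1.0, 1.0, 21)
s = 1e-6                                   # finite-difference step

def P_of(val, der):
    """P applied to a function given by its values and derivatives on t."""
    return val - t * der * val

# values and derivatives of f_n + s*h_k and f_n - s*h_k on the grid
val_p = 1.0 / n + s * t**k
der_p = s * k * t**(k - 1)
val_m = 1.0 / n - s * t**k
der_m = -s * k * t**(k - 1)

fd = (P_of(val_p, der_p) - P_of(val_m, der_m)) / (2 * s)
exact = t**k * (1.0 - k / n)               # identically zero since k = n
assert np.allclose(fd, exact, atol=1e-6)
```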

Let us give an even more natural and geometric example:

51.3. Example. Let $M$ be a compact smooth manifold. For $\mathrm{Diff}(M)$ we have shown that the 1-parameter subgroup of $\mathrm{Diff}(M)$ with initial tangent vector $X \in T_{\mathrm{Id}}\mathrm{Diff}(M) = \mathfrak{X}(M)$ is given by the flow $\mathrm{Fl}^X$ of $X$, see (43.1). Thus, the exponential mapping $\mathrm{Exp} : T_{\mathrm{Id}}\mathrm{Diff}(M) \to \mathrm{Diff}(M)$ is given by $X \mapsto \mathrm{Fl}^X_1$.

The derivative $T_0\exp : T_eG = T_0(T_eG) \to T_{\exp(0)}G = T_eG$ at $0$ of the exponential mapping $\exp : \mathfrak{g} = T_eG \to G$ is given by

$$T_0\exp(X) := \tfrac{d}{dt}\big|_{t=0} \exp(tX) = \tfrac{d}{dt}\big|_{t=0} \mathrm{Fl}^{tX}(1, e) = \tfrac{d}{dt}\big|_{t=0} \mathrm{Fl}^X(t, e) = X_e.$$

Thus, $T_0\exp = \mathrm{Id}_{\mathfrak{g}}$. In finite dimensions the inverse function theorem now implies that $\exp : \mathfrak{g} \to G$ is a local diffeomorphism.

What is the corresponding situation for $G = \mathrm{Diff}(M)$? We take the simplest compact manifold (without boundary), namely $M = S^1 = \mathbb{R}/2\pi\mathbb{Z}$. Since the natural quotient mapping $p : \mathbb{R} \to \mathbb{R}/2\pi\mathbb{Z} = S^1$ is a covering map, we can lift each diffeomorphism $f : S^1 \to S^1$ to a diffeomorphism $\tilde f : \mathbb{R} \to \mathbb{R}$. This lift is uniquely determined by its initial value $\tilde f(0) \in p^{-1}(f([0]))$, and two lifts differ by an element of $p^{-1}([0]) = 2\pi\mathbb{Z}$. A smooth mapping $\tilde f : \mathbb{R} \to \mathbb{R}$ projects to a smooth mapping $f : S^1 \to S^1$ if and only if $\tilde f(t + 2\pi) \in \tilde f(t) + 2\pi\mathbb{Z}$. Since $2\pi\mathbb{Z}$ is discrete, $\tilde f(t+2\pi) - \tilde f(t)$ has to be $2\pi n$ for some $n \in \mathbb{Z}$ not depending on $t$. In order that a diffeomorphism $\tilde f : \mathbb{R} \to \mathbb{R}$ factors to a diffeomorphism $f : S^1 \to S^1$, the constant $n$ has to be $+1$ or $-1$. So we finally obtain an isomorphism $\{f \in \mathrm{Diff}(\mathbb{R}) : f(t+2\pi) - f(t) = \pm 2\pi\}/2\pi\mathbb{Z} \cong \mathrm{Diff}(S^1)$. In particular, we have diffeomorphisms $R_\theta$ given by translations with $\theta \in S^1$ (in the picture $S^1 \subseteq \mathbb{C}$ these are just the rotations by the angle $\theta$).

Claim. Let $f \in \mathrm{Diff}(S^1)$ be fixed point free and in the image of $\exp$. Then $f$ is conjugate to some translation $R_\theta$.

We have to construct a diffeomorphism $g : S^1 \to S^1$ such that $f = g^{-1} \circ R_\theta \circ g$.

Since $p : \mathbb{R} \to \mathbb{R}/2\pi\mathbb{Z} = S^1$ is a covering map, it induces an isomorphism $T_t p : \mathbb{R} \to T_{p(t)}S^1$. In the picture $S^1 \subseteq \mathbb{C}$ this isomorphism is given by $s \mapsto s\,p(t)^\perp$, where $p(t)^\perp$ is the normal vector obtained from $p(t) \in S^1$ via rotation by $\pi/2$. Thus, the vector fields on $S^1$ can be identified with the smooth functions $S^1 \to \mathbb{R}$ or, by composing with $p : \mathbb{R} \to S^1$, with the $2\pi$-periodic functions $X : \mathbb{R} \to \mathbb{R}$.

Let us first remark that the constant vector field $X^\theta \in \mathfrak{X}(S^1)$, $s \mapsto \theta$, has as flow $\mathrm{Fl}^{X^\theta} : (t, \varphi) \mapsto \varphi + t \cdot \theta$. Hence $\exp(X^\theta) = \mathrm{Fl}^{X^\theta}_1 = R_\theta$.

Let $f = \exp(X)$, and suppose $g \circ f = R_\theta \circ g$. Then $g \circ \mathrm{Fl}^X_t = \mathrm{Fl}^{X^\theta}_t \circ g$ for $t = 1$. Let us assume that this is true for all $t$. Then differentiating at $t = 0$ yields $Tg(X_x) = X^\theta_{g(x)}$ for all $x \in S^1$. If we consider $g$ as a diffeomorphism $\mathbb{R} \to \mathbb{R}$, this means that $g'(t) \cdot X(t) = \theta$ for all $t \in \mathbb{R}$. Since $f$ was assumed to be fixed point free, the vector field $X$ is nowhere vanishing; otherwise, there would be a stationary point $x \in S^1$. So the condition on $g$ is equivalent to $g(t) = g(0) + \theta \int_0^t \frac{ds}{X(s)}$. We take this as definition of $g$, where $g(0) := 0$, and where $\theta$ will be chosen such that $g$ factors to an (orientation preserving) diffeomorphism on $S^1$, i.e., $\theta \int_t^{t+2\pi} \frac{ds}{X(s)} = g(t + 2\pi) - g(t) = 2\pi$. Since $X$ is $2\pi$-periodic, this is true for $\theta = 2\pi \big/ \int_0^{2\pi} \frac{ds}{X(s)}$. Since the flow of a transformed vector field is nothing else but the transformed flow, we obtain $g(\mathrm{Fl}^X(t, x)) = \mathrm{Fl}^{X^\theta}(t, g(x))$, and hence $g \circ f = R_\theta \circ g$.
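The construction of $g$ and the normalization of $\theta$ can be checked numerically for a sample field; in the sketch below the field $X(s) = 1 + 0.3\sin s$ and all grid sizes are assumptions for illustration, not from the text.

```python
import numpy as np

X = lambda s: 1.0 + 0.3 * np.sin(s)            # sample field, X > 0 everywhere

def integral_inv_X(t, M=200_000):
    """Midpoint-rule approximation of the integral of ds / X(s) over [0, t]."""
    u = (np.arange(M) + 0.5) * (t / M)
    return np.sum((t / M) / X(u))

theta = 2 * np.pi / integral_inv_X(2 * np.pi)  # the normalization of theta
g = lambda t: theta * integral_inv_X(t)        # g(0) = 0

# g factors to a circle diffeomorphism: g(t + 2*pi) = g(t) + 2*pi
for t in [0.0, 1.0, 2.5]:
    assert abs(g(t + 2 * np.pi) - g(t) - 2 * np.pi) < 1e-6

# the defining condition g'(t) * X(t) = theta, via a central difference
for t in [0.4, 2.0]:
    h = 1e-5
    gp = (g(t + h) - g(t - h)) / (2 * h)
    assert abs(gp * X(t) - theta) < 1e-4
```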


In order to show that $\exp : \mathfrak{X}(S^1) \to \mathrm{Diff}(S^1)$ is not locally surjective, it hence suffices to find fixed point free diffeomorphisms $f$ arbitrarily close to the identity which are not conjugate to translations. For this consider the translations $R_{2\pi/n}$, and modify them inside the interval $(0, \frac{2\pi}{n})$ such that the resulting diffeomorphism $f$ satisfies $f(\frac{\pi}{n}) \notin \frac{3\pi}{n} + 2\pi\mathbb{Z}$. Then $f^n$ maps $0$ to $2\pi$, and thus the $n$-th iterate of the induced diffeomorphism on $S^1$ has $[0]$ as fixed point. If $f$ would be conjugate to a translation, the same would be true for $f^n$, hence the translation would have a fixed point and hence would have to be the identity. So $f^n$ must be the identity on $S^1$, which is impossible, since $f(\frac{\pi}{n}) \notin \frac{3\pi}{n} + 2\pi\mathbb{Z}$.

Let us find out the reason for this break-down of the inverse function theorem. For this we calculate the derivative of $\exp$ at the constant vector field $X := X^{2\pi/k}$:

$$\exp'(X)(Y)(x) = \tfrac{d}{ds}\big|_{s=0} \exp(X + sY)(x) = \tfrac{d}{ds}\big|_{s=0} \mathrm{Fl}^{X+sY}(1, x) = \int_0^1 Y\bigl(x + \tfrac{2t\pi}{k}\bigr)\, dt,$$

where we have differentiated the defining equation for $\mathrm{Fl}^{X+sY}$ to obtain

$$\begin{aligned}
\tfrac{\partial}{\partial t}\tfrac{\partial}{\partial s}\big|_{s=0} \mathrm{Fl}^{X+sY}(t,x) &= \tfrac{\partial}{\partial s}\big|_{s=0} \tfrac{\partial}{\partial t} \mathrm{Fl}^{X+sY}(t,x) \\
&= \tfrac{\partial}{\partial s}\big|_{s=0} (X + sY)\bigl(\mathrm{Fl}^{X+sY}(t,x)\bigr) \\
&= Y\bigl(\mathrm{Fl}^X(t,x)\bigr) + X'(\dots) \\
&= Y\bigl(x + t\tfrac{2\pi}{k}\bigr) + 0,
\end{aligned}$$

and the initial condition $\mathrm{Fl}^{X+sY}(0,x) = x$ gives

$$\tfrac{\partial}{\partial s}\big|_{s=0} \mathrm{Fl}^{X+sY}(t,x) = \int_0^t Y\bigl(x + \tau\,\tfrac{2\pi}{k}\bigr)\, d\tau.$$

If we take $x \mapsto \sin(kx)$ as $Y$, then $\exp'(X)(Y) = 0$, so $\exp'(X)$ is not injective, and since $X$ can be chosen arbitrarily near to $0$, we see that $\exp'$ is not locally injective.
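This kernel element is easy to confirm numerically: for $Y(x) = \sin(kx)$, the average $\int_0^1 Y(x + \frac{2\pi t}{k})\,dt = \int_0^1 \sin(kx + 2\pi t)\,dt$ integrates a sine over a full period, hence vanishes. A sketch (sample points $x$ and the midpoint grid are illustrative):

```python
import numpy as np

# exp'(X)(Y)(x) = \int_0^1 sin(k x + 2*pi*t) dt = 0 for every x.
k = 7
N = 20000
t = (np.arange(N) + 0.5) / N               # midpoint grid on [0, 1]
for x in [0.0, 0.3, 1.7]:
    integral = np.mean(np.sin(k * (x + 2 * np.pi * t / k)))
    assert abs(integral) < 1e-10
```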

So we may conclude that a necessary assumption for an inverse function theorem beyond Banach spaces is the invertibility of $f'(x)$ not only at one point $x$ but on a whole neighborhood.

For Banach spaces one then uses that $x \mapsto f'(x)^{-1}$ is continuous (or even smooth), which follows directly from the smoothness of $\operatorname{inv} : GL(E) \to GL(E)$, see (51.1). However, for Fréchet spaces the following example shows that $\operatorname{inv}$ is not even continuous (for the $c^\infty$-topology).

51.4. Example. Let $s$ be the Fréchet space of all fast falling sequences, i.e.

$$s := \bigl\{ (x_k)_k \in \mathbb{R}^{\mathbb{N}} : \|(x_k)_k\|_n := \sup\{(1+k)^n\, |x_k| : k \in \mathbb{N}\} < \infty \text{ for all } n \in \mathbb{N} \bigr\}.$$

Next we consider a curve $c : \mathbb{R} \to GL(s)$ defined by

$$c(t)\bigl((x_k)_k\bigr) := \bigl((1 - h_0(t))\,x_0, \dots, (1 - h_k(t))\,x_k, \dots\bigr),$$


where $h_k(t) := (1 - 2^{-k})\, h(kt)$ for an $h \in C^\infty(\mathbb{R},\mathbb{R})$ which will be chosen appropriately.

Then $c(t) \in GL(s)$ provided that $h(0) = 0$ and $\operatorname{supp} h$ is compact, since then the factors $1 - h_k(t)$ are equal to $1$ for almost all $k$. The inverse is given by multiplying with $1/(1 - h_k(t))$, which exists provided $h(\mathbb{R}) \subseteq [0, 1]$.

Let us show next that $\operatorname{inv} \circ c : \mathbb{R} \to GL(s) \subseteq L(s)$ is not even continuous. For this take $x \in s$ and consider

$$t \mapsto c(t)^{-1}(x) = \Bigl( \dots, \tfrac{1}{1 - h_k(t)}\, x_k, \dots \Bigr), \qquad c\bigl(\tfrac1k\bigr)^{-1}(x) = (?, \dots, ?,\ 2^k x_k,\ ?, \dots),$$

provided $h(1) = 1$. Let $x$ be defined by $x_k := 2^{-k}$; then $\bigl\| c(\tfrac1k)^{-1}(x) - c(0)^{-1}(x) \bigr\|_0 \ge 1 - 2^{-k} \not\to 0$.
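A finite truncation of this example can be run numerically; in the sketch below the particular smooth step $h$ (built from $e^{-1/t}$; the text additionally wants compact support, which the truncated computation does not need) and the truncation length are illustrative assumptions.

```python
import numpy as np

def h(t):
    """A smooth step with h(0) = 0, h(1) = 1 and values in [0, 1]."""
    t = np.asarray(t, dtype=float)
    b = lambda u: np.where(u > 0, np.exp(-1.0 / np.maximum(u, 1e-12)), 0.0)
    return b(t) / (b(t) + b(1.0 - t))

K = 40
k_idx = np.arange(K)
x = 2.0 ** (-k_idx)                        # x_k = 2^{-k} as in the text

def c_inv_x(t):
    """c(t)^{-1}(x), truncated to K coordinates."""
    hk = (1.0 - 2.0 ** (-k_idx)) * h(k_idx * t)
    return x / (1.0 - hk)

norm0 = lambda y: np.max(np.abs(y))        # the 0-th norm: sup_k |y_k|
for k in [5, 10, 20]:
    diff = c_inv_x(1.0 / k) - c_inv_x(0.0)
    # the k-th coordinate equals 2^k * 2^{-k} - 2^{-k} = 1 - 2^{-k}
    assert norm0(diff) >= 1.0 - 2.0 ** (-k) - 1e-9
```

So the difference in the $0$-th norm stays near $1$ even though $1/k \to 0$, exactly as claimed.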

It remains to show that $c : \mathbb{R} \to GL(s)$ is continuous or even smooth. Since smoothness of a curve depends only on the bounded sets, and boundedness in $GL(E) \subseteq L(E,E)$ can be tested pointwise because of the uniform boundedness theorem (5.18), it is enough to show that $\operatorname{ev}_x \circ c : \mathbb{R} \to GL(s) \to s$ is smooth. Boundedness in a locally convex space can be tested by the continuous linear functionals, so it would be enough to show that $\lambda \circ \operatorname{ev}_x \circ c : \mathbb{R} \to GL(s) \to s \to \mathbb{R}$ is smooth for all $\lambda \in s^*$. We want to use the particular functionals given by the coordinate projections $\lambda_k : (x_k)_k \mapsto x_k$. These, however, do not generate the bornology, but if $B \subseteq s$ is bounded, then so is $\bigcap_{k\in\mathbb{N}} \lambda_k^{-1}(\lambda_k(B))$. In fact, let $B$ be bounded. Then for every $n \in \mathbb{N}$ there exists a constant $C_n$ such that $(1+k)^n |x_k| \le C_n$ for all $k$ and all $x = (x_k)_k \in B$. Every $y \in \lambda_k^{-1}(\lambda_k(x))$ (i.e., $\lambda_k(y) = \lambda_k(x)$) satisfies the same inequality for the given $k$, and hence $\bigcap_{k\in\mathbb{N}} \lambda_k^{-1}(\lambda_k(B))$ is bounded as well.

Obviously, $\lambda_k \circ \operatorname{ev}_x \circ c$ is smooth with derivatives $(\lambda_k \circ \operatorname{ev}_x \circ c)^{(p)}(t) = (1 - h_k)^{(p)}(t)\, x_k$. Let $c^p(t)$ be the sequence with these coordinates. We claim that $c^p$ has values in $s$ and is (locally) bounded. So take an $n \in \mathbb{N}$ and consider

$$\|c^p(t)\|_n = \sup_k\, (1+k)^n\, \bigl|(1 - h_k)^{(p)}(t)\, x_k\bigr|.$$

We have $(1 - h_k)^{(p)}(t) = 1^{(p)} - (1 - 2^{-k})\, k^p\, h^{(p)}(kt)$, and hence this factor is bounded by $1 + k^p\, \|h^{(p)}\|_\infty$. Since $(1+k)^n (1 + \|h^{(p)}\|_\infty\, k^p)\, |x_k|$ is bounded by the assumption on $x$, we have $\sup_t \|c^p(t)\|_n < \infty$.

Now it is a general argument that if we are given locally bounded curves $c^p : \mathbb{R} \to s$ such that $\lambda_k \circ c^0$ is smooth with derivatives $(\lambda_k \circ c^0)^{(p)} = \lambda_k \circ c^p$, then $c^0$ is smooth with derivatives $c^p$.

In fact, we consider for $c = c^0$ the expression

$$\lambda_k\Bigl( \tfrac1t \Bigl( \tfrac{c(t) - c(0)}{t} - c^1(0) \Bigr) \Bigr) = \tfrac1t \Bigl( \tfrac{\lambda_k(c(t)) - \lambda_k(c(0))}{t} - \lambda_k(c^1(0)) \Bigr),$$

which by the classical mean value theorem is contained in $\{\tfrac12 \lambda_k(c^2(s)) : s \in [0, t]\}$. Thus, taking for $B$ the bounded set $\{\tfrac12 c^2(s) : s \in [0, 1]\}$, we conclude that $\tfrac1t \bigl( \tfrac{c(t) - c(0)}{t} - c^1(0) \bigr)$ is contained in the bounded set $\bigcap_{k\in\mathbb{N}} \lambda_k^{-1}(\lambda_k(B))$, and hence $\tfrac{c(t) - c(0)}{t} \to c^1(0)$. Doing the same for $c = c^k$ shows that $c^0$ is smooth with derivatives $c^k$.

From this we conclude that in order to obtain an inverse function theorem we have to assume, besides local invertibility of the derivative, also that $x \mapsto f'(x)^{-1}$ is smooth. That this is still not enough is shown by the following example:

51.5. Example. Let $E := C^\infty(\mathbb{R},\mathbb{R})$, and consider the map $\exp_* : E \to E$ given by $\exp_*(f)(t) := \exp(f(t))$. Then one can show that $\exp_*$ is smooth. Its (directional) derivative is given by

$$(\exp_*)'(f)(h)(t) = \tfrac{\partial}{\partial s}\big|_{s=0}\, e^{(f+sh)(t)} = h(t)\cdot e^{f(t)},$$

so $(\exp_*)'(f)$ is multiplication by $\exp_*(f)$. The inverse of $(\exp_*)'(f)$ is the multiplication operator with $\tfrac{1}{\exp_*(f)} = \exp_*(-f)$, and hence $f \mapsto (\exp_*)'(f)^{-1}$ is smooth as well. But the image of $\exp_*$ consists of positive functions only, whereas the curve $c : t \mapsto (s \mapsto 1 - ts)$ is a smooth curve in $E = C^\infty(\mathbb{R},\mathbb{R})$ through $\exp_*(0) = 1$, and $c(t)$ is not positive for any $t \ne 0$ (take $s := \tfrac1t$).

So we will need additional assumptions. The idea of the proof is to use that a Fréchet space is built up from Banach spaces as a projective limit, to solve the inverse function theorem for the building blocks, and to try to approximate in that way an inverse to the original function. In order to guarantee that such a process converges, we need (a priori) estimates for the seminorms, and hence we have to fix the basis of seminorms on our spaces.

51.6. Definition. A Fréchet space is called graded if it is provided with a fixed increasing basis of its continuous seminorms. A linear map $T$ between graded Fréchet spaces $(E, (p_k)_k)$ and $(F, (q_k)_k)$ is called tame of degree $d$ and base $b$ if

$$\forall n \ge b\ \ \exists C_n \in \mathbb{R}\ \ \forall x \in E : \quad q_n(Tx) \le C_n\, p_{n+d}(x).$$

Recall that $T$ is continuous if and only if

$$\forall n\ \ \exists m\ \ \exists C_n \in \mathbb{R}\ \ \forall x \in E : \quad q_n(Tx) \le C_n\, p_m(x).$$

Two gradings are called tame equivalent of degree $r$ and base $b$ if and only if the identity is tame of degree $r$ and base $b$ in both directions.

51.7. Examples. Let $M$ be a compact manifold. Then $C^\infty(M,\mathbb{R})$ is a graded Fréchet space, where we consider as $k$-th norm the supremum of all derivatives of order less or equal to $k$. In order that this definition makes sense, we can embed $M$ as a closed submanifold into some $\mathbb{R}^n$. Choosing a tubular neighborhood $p : \mathbb{R}^n \supseteq U \to M$ we obtain an extension operator $p^* : C^\infty(M,\mathbb{R}) \to C^\infty(U,\mathbb{R})$, and on the latter space the operator norms of the derivatives $f^{(k)}(x)$ for $f \in C^\infty(U,\mathbb{R})$ make sense.

Another way to give sense to the definition is to consider the vector bundle $J^k(M,\mathbb{R})$ of $k$-jets of functions $f : M \to \mathbb{R}$. Its fiber over $x \in M$ consists of all "Taylor polynomials" of functions $f \in C^\infty(M,\mathbb{R})$. We obtain an injection of $C^\infty(M,\mathbb{R})$ into the space of sections of $J^k(M,\mathbb{R})$ by associating to $f \in C^\infty(M,\mathbb{R})$ the section having at each point $x \in M$ the Taylor polynomial of $f$ at $x$. So it remains to define a norm $p_k$ on the space $C^\infty(M \leftarrow J^k(M,\mathbb{R}))$ of sections. This is just the supremum norm, if we consider some metric on the vector bundle $J^k(M,\mathbb{R}) \to M$.

Another method of choosing seminorms would be to take a finite atlas and a partition of unity subordinated to the charts and use the supremum norms of the derivatives of the chart representations.

A second example of a graded Fr´chet space, closely related to the ¬rst one, is the

e

space s(E) of fast falling sequences in a Banach space E, i.e.

:= sup{(1 + k)n xk | : k ∈ N} < ∞ for all n ∈ N}.

s(E) := {(xk )k ∈ E N : (xk )k n

A modi¬cation of this is the space Σ(E) of very fast falling sequences in a Banach

space E, i.e.

enk xk < ∞ for all n ∈ N}.

Σ(E) := {(xk )k ∈ E N : (xk )k :=

n

k∈N
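For truncated sequences these graded norms are easy to compute directly. The following sketch (ad hoc helper names, with E = ℝ so that ‖x_k‖ = |x_k|) contrasts the polynomial weights of s(E) with the exponential weights of Σ(E):

```python
import math

def s_norm(x, n):
    # n-th norm on s(E): sup over k of (1+k)^n * |x_k|
    return max((1 + k) ** n * abs(xk) for k, xk in enumerate(x))

def sigma_norm(x, n):
    # n-th norm on Sigma(E): sum over k of e^{nk} * |x_k|
    return sum(math.exp(n * k) * abs(xk) for k, xk in enumerate(x))

# A very fast falling sequence x_k = e^{-k^2} lies in both spaces.
x = [math.exp(-k * k) for k in range(30)]
for n in range(3):
    print(n, s_norm(x, n), sigma_norm(x, n))
```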

51.8. Examples.

(1) Let T: s(E) → s(E) be the multiplication operator with a polynomial p, i.e., T((x_k)_k) := (p(k) x_k)_k.

We claim that T is tame of degree d := deg(p) and base 0. For this we estimate as follows:

‖T((x_k)_k)‖_n = sup{(1+k)^n ‖p(k) x_k‖ : k ∈ ℕ}
≤ C_n sup{(1+k)^{n+d} ‖x_k‖ : k ∈ ℕ} = C_n ‖(x_k)_k‖_{n+d},

where d is the degree of p and C_n := sup{|p(k)|/(1+k)^d : k ∈ ℕ}. Note that C_n < ∞, since k → (1+k)^d does not vanish on ℕ, and the limit of the quotient for k → ∞ is the coefficient of p of degree d.

This shows that s(E) is tamely equivalent to the same space, where the seminorms are replaced by Σ_k (1+k)^n ‖x_k‖. In fact, the sums are larger than the suprema. Conversely, Σ_k (1+k)^n ‖x_k‖ ≤ Σ_k (1+k)^{−2} (1+k)^{n+2} ‖x_k‖ ≤ (Σ_k (1+k)^{−2}) ‖x‖_{n+2}, showing that the identity in the reverse direction is tame of degree 2 and base 0.
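The tame estimate in (1) can be checked numerically on truncated sequences. In this sketch (illustration only, E = ℝ) the constant C is approximated by a maximum over a large finite range of indices:

```python
# Numerical check of the tame estimate in 51.8(1): for T((x_k)_k) = (p(k) x_k)_k
# with p of degree d we verify ||Tx||_n <= C ||x||_{n+d},
# where C = sup_k |p(k)| / (1+k)^d.
import math

def s_norm(x, n):
    return max((1 + k) ** n * abs(xk) for k, xk in enumerate(x))

p = lambda k: 3 * k ** 2 + 5          # polynomial of degree d = 2
d = 2
C = max(abs(p(k)) / (1 + k) ** d for k in range(10_000))  # approximates the sup

x = [math.exp(-k) for k in range(50)]  # a fast falling sequence
Tx = [p(k) * xk for k, xk in enumerate(x)]
for n in range(5):
    assert s_norm(Tx, n) <= C * s_norm(x, n + d)
print("tame estimate verified up to n = 4")
```

Here the supremum defining C is attained at k = 0 (value 5), while the quotient tends to the leading coefficient 3 for large k, exactly as the text argues.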

(2) Let T: Σ(E) → Σ(E) be the multiplication operator with an exponential function, i.e., T((x_k)_k) := (a^k x_k)_k.

We claim that T is tame of some degree and base 0. For this we estimate as follows:

‖T((x_k)_k)‖_n = Σ_{k∈ℕ} e^{nk} ‖a^k x_k‖ = Σ_{k∈ℕ} e^{(n+log(a))k} ‖x_k‖
≤ Σ_{k∈ℕ} e^{(n+d)k} ‖x_k‖ = ‖(x_k)_k‖_{n+d},

where d is any integer greater than or equal to log(a). Note however that T is not well defined on s(E) for a > 1, and this is the reason to consider the space Σ(E).

Note furthermore that, as before, one shows that one could equally well replace the sum by the corresponding supremum in the definition of Σ(E); one only has to use that Σ_{k∈ℕ} e^{−k} = 1/(1 − 1/e) < ∞.
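The degree shift in (2) can also be observed numerically. In the following sketch (truncated sequences, E = ℝ, illustration only) we take a = e, so d = 1 ≥ log(a) works:

```python
# Multiplication by a^k (here a = e) on Sigma(E) shifts the grading by
# d = 1 >= log(a): we check ||Tx||_n <= ||x||_{n+1} on a truncated sequence.
import math

def sigma_norm(x, n):
    return sum(math.exp(n * k) * abs(xk) for k, xk in enumerate(x))

a = math.e
x = [math.exp(-3 * k) for k in range(60)]
Tx = [a ** k * xk for k, xk in enumerate(x)]
for n in range(2):
    assert sigma_norm(Tx, n) <= sigma_norm(x, n + 1) + 1e-9
print("degree-1 shift verified")
```

For this particular x the two sides are in fact equal, since log(a) = d exactly; the same operator applied to a merely polynomially falling sequence would leave s(E) entirely.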

(3) As a similar example we consider a linear differential operator D of degree d, i.e., a local operator (the value of Df at x depends only on the germ of f at x) which is locally given in the form Df = Σ_{|α|≤d} g_α · ∂^{|α|}f/∂x^α, with smooth coefficient functions g_α ∈ C^∞(M, ℝ) on a compact manifold M.

Then D: C^∞(M, ℝ) → C^∞(M, ℝ) is tame of degree d and base 0. In fact, by the product rule we can write the k-th derivative of Df as a linear combination of partial derivatives of the g_α and derivatives of order up to k + d of f.

(4) Now we give an example of a non-tame linear map. For this consider T: C^∞([0,1], ℝ) → C^∞([−1,1], ℝ) given by Tf(t) := f(t²). It was shown in the proof of (25.2) that the image of T consists exactly of the space C^∞_even([−1,1], ℝ) of even functions. Since (Tf)^{(n)}(t) = f^{(n)}(t²)(2t)^n + Σ_{0<2k≤n} c^n_k f^{(n−k)}(t²) t^{n−2k} with some c^n_k ∈ ℤ, we have that T is tame of degree 0 and base 0. But the inverse is not tame, since (Tf)^{(2n)}(0) is proportional to f^{(n)}(0); hence in order to estimate the n-th derivative of T^{−1}g we need the 2n-th derivative of g.
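The loss of derivatives in (4) is transparent for polynomials, where it can be checked exactly. A minimal sketch (polynomial model, illustration only):

```python
# Why T^{-1} loses derivatives in 51.8(4): for f(t) = sum_j a_j t^j we have
# (Tf)(t) = f(t^2) = sum_j a_j t^{2j}, hence (Tf)^{(2n)}(0) = (2n)! a_n
# = ((2n)!/n!) f^{(n)}(0), a proportionality constant growing like (2n)!/n!.
from math import factorial

a = [3.0, -1.0, 4.0, 1.5]                      # f(t) = 3 - t + 4 t^2 + 1.5 t^3
f_deriv_at_0 = lambda n: factorial(n) * a[n]   # f^{(n)}(0) = n! a_n
Tf_deriv_at_0 = lambda m: factorial(m) * a[m // 2] if m % 2 == 0 else 0.0

for n in range(4):
    ratio = factorial(2 * n) / factorial(n)
    assert Tf_deriv_at_0(2 * n) == ratio * f_deriv_at_0(n)
print("(Tf)^{(2n)}(0) = ((2n)!/n!) f^{(n)}(0) confirmed for a cubic")
```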

51.9. Definition. A graded Fréchet space F is called tame if there exists some Banach space E such that F is a tame direct summand in Σ(E), i.e., there are tame linear mappings i: F → Σ(E) and p: Σ(E) → F with p ∘ i = Id_F.

Our next aim is to show that instead of Σ(E) we can equally well use s(E). For this we consider a measure space (X, μ) and a measurable positive weight function w: X → ℝ and define

L¹_Σ(X, μ, w) := {f ∈ L¹(X, μ) : ‖f‖_n := ∫_X e^{n w(x)} |f(x)| dμ(x) < ∞ for all n}.

51.10. Proposition. Every space L¹_Σ(X, μ, w) is a tame Fréchet space.

Proof. Let X_k := {x ∈ X : k ≤ w(x) < k + 1}. Then the X_k form a countable disjoint covering of X by measurable sets. Let χ_k be the characteristic function of X_k, and let R: L¹_Σ(X, μ, w) → Σ(L¹(X, μ)) and L: Σ(L¹(X, μ)) → L¹_Σ(X, μ, w) be defined by Rf := (χ_k · f)_k and L((f_k)_k) := Σ_k χ_k · f_k. Then obviously L ∘ R = Id. The linear map R is well defined and tame of degree 0 and base 0, since

‖Rf‖_n = Σ_k e^{nk} ‖χ_k f‖_1 = Σ_k e^{nk} ∫_{X_k} |f| dμ ≤
≤ Σ_k ∫_{X_k} e^{n w(x)} |f(x)| dμ(x) = ∫_X e^{n w(x)} |f(x)| dμ(x) = ‖f‖_n.


Finally, L is a well-defined linear map, which is tame of degree 0 and base 0, since

‖L((f_k)_k)‖_n = ∫_X e^{n w(x)} |Σ_k χ_k f_k(x)| dμ(x) = Σ_k ∫_{X_k} e^{n w(x)} |f_k(x)| dμ(x)
≤ Σ_k e^{n(k+1)} ∫_{X_k} |f_k(x)| dμ(x) ≤ Σ_k e^{n(k+1)} ‖f_k‖_1
= e^n Σ_k e^{nk} ‖f_k‖_1 = e^n ‖(f_k)_k‖_n.
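As a sanity check of this splitting, here is a minimal sketch on the discrete measure space X = ℕ with counting measure and weight w(x) = x, so that X_k = {k}; all helper names are ad hoc and sequences are truncated:

```python
# Sketch of the splitting in (51.10): R cuts f into its pieces on the X_k,
# L glues them back; here X = N, counting measure, w(x) = x, X_k = {k}, and
# ||f||_n = sum_x e^{n x} |f(x)|. Illustration only.
import math

def weighted_norm(f, n):
    return sum(math.exp(n * x) * abs(fx) for x, fx in enumerate(f))

def R(f):
    # Rf = (chi_k * f)_k : the k-th component is f restricted to X_k = {k}.
    return [[fx if x == k else 0.0 for x, fx in enumerate(f)] for k in range(len(f))]

def L(fs):
    # L((f_k)_k) = sum_k chi_k * f_k : glue the k-th piece of f_k at position k.
    return [fk[k] for k, fk in enumerate(fs)]

f = [math.exp(-2 * x) for x in range(20)]
assert L(R(f)) == f                       # L o R = Id
# R is tame of degree 0: sum_k e^{nk} ||chi_k f||_1 equals ||f||_n here.
n = 1
lhs = sum(math.exp(n * k) * sum(abs(v) for v in fk) for k, fk in enumerate(R(f)))
assert abs(lhs - weighted_norm(f, n)) < 1e-12
print("L o R = Id and the degree-0 estimate hold")
```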

51.11. Corollary. For every Banach space E the space s(E) is a tame Fréchet space.

Proof. This result follows immediately from the proposition (51.10) above, if one replaces L¹_Σ(X, μ, w) by the vector valued function space L¹_Σ(X, μ, w; E) and similarly the space L¹(X, μ) by the Banach space L¹(X, μ; E). Indeed, for X = ℕ with counting measure and w(k) := log(1 + k), the norms of L¹_Σ(X, μ, w; E) are Σ_k (1+k)^n ‖x_k‖, which by (51.8.1) give a grading tamely equivalent to that of s(E).

Now let us show the converse direction:

51.12. Proposition. For every Banach space E the space Σ(E) is a tame direct summand of s(E).

Proof. We define R: Σ(E) → s(E) and L: s(E) → Σ(E) by R((x_k)_k) := (y_k)_k, where y_{[e^k]} := x_k and y_j := 0 otherwise, and L((y_k)_k) := (y_{[e^k]})_k; note that k → [e^k] is strictly increasing, so these indices are pairwise distinct. The map R is well defined, linear and tame, since

‖(y_k)_k‖_n := Σ_k (1+k)^n ‖y_k‖ = Σ_j (1+[e^j])^n ‖x_j‖ ≤ Σ_j (2e^j)^n ‖x_j‖ = 2^n ‖(x_j)_j‖_n.

The map L is well defined, linear and tame, since

‖L((y_k)_k)‖_n = Σ_k e^{kn} ‖y_{[e^k]}‖ ≤ Σ_k (1+[e^k])^n ‖y_{[e^k]}‖ ≤ Σ_j (1+j)^n ‖y_j‖ = ‖y‖_n.

Obviously, L ∘ R = Id.
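The embedding by the index map k → [e^k] and both tame estimates are easy to observe on truncated sequences. A minimal sketch (ad hoc names, summed gradations, E = ℝ, illustration only):

```python
# Sketch of (51.12): embed Sigma(R) into s(R) by placing x_k at index [e^k].
import math

def R_embed(x, length):
    y = [0.0] * length
    for k, xk in enumerate(x):
        y[int(math.exp(k))] = xk      # y_{[e^k]} := x_k, zero elsewhere
    return y

def L_project(y, terms):
    return [y[int(math.exp(k))] for k in range(terms)]

def sigma_norm(x, n):
    return sum(math.exp(n * k) * abs(xk) for k, xk in enumerate(x))

def s_norm_sum(y, n):
    return sum((1 + k) ** n * abs(yk) for k, yk in enumerate(y))

x = [math.exp(-k * k) for k in range(5)]       # indices [e^k]: 1, 2, 7, 20, 54
y = R_embed(x, 60)
assert L_project(y, 5) == x                     # L o R = Id
for n in range(3):
    assert s_norm_sum(y, n) <= 2 ** n * sigma_norm(x, n)   # R is tame
    assert sigma_norm(x, n) <= s_norm_sum(y, n)            # L is tame
print("both tame estimates hold")
```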

51.13. Definition. A non-linear map f: E ⊇ U → F between graded Fréchet spaces is called tame of degree r and base b if it is continuous and every point in U has a neighborhood V such that

∀n ≥ b ∃C_n ∀x ∈ V: ‖f(x)‖_n ≤ C_n (1 + ‖x‖_{n+r}).

Remark. Every continuous map from a graded Fréchet space into a Banach space is tame.

For fixed x₀ ∈ U choose a constant C > ‖f(x₀)‖ and let V := {x : ‖f(x)‖ < C}. Then V is an open neighborhood of x₀, and for all n and all x ∈ V we have ‖f(x)‖_n = ‖f(x)‖ ≤ C ≤ C(1 + ‖x‖_n).

Every continuous map from a finite dimensional space into a graded Fréchet space is tame.

Choose a compact neighborhood V of x₀. Let C_n := max{‖f(x)‖_n : x ∈ V}. Then ‖f(x)‖_n ≤ C_n ≤ C_n (1 + ‖x‖).

It is easily checked that the composite of tame maps is tame. In fact,

‖f(g(x))‖_n ≤ C(1 + ‖g(x)‖_{n+r}) ≤ C(1 + C(1 + ‖x‖_{n+r+s})) ≤ C(1 + ‖x‖_{n+r+s})

for all x in an appropriately chosen neighborhood and n ≥ b_f and n + r ≥ b_g.


51.14. Proposition. The definition of tameness of degree r is coherent with the one for linear maps, but the base may change.

Proof. Let first f be linear and tame as a non-linear map. In particular, we have locally around 0

‖f(x)‖_n ≤ C(1 + ‖x‖_{n+r}) for all n ≥ b.

If we increase b, we may assume that the 0-neighborhood is of the form {x : ‖x‖_{b+r} ≤ μ} for some μ > 0. For y ≠ 0 let x := (μ/‖y‖_{b+r}) y, i.e., ‖x‖_{b+r} = μ. Thus, ‖f(x)‖_n ≤ C(1 + ‖x‖_{n+r}). By linearity of f, we get

‖f(y)‖_n = (‖y‖_{b+r}/μ) ‖f(x)‖_n ≤ C ((‖y‖_{b+r}/μ) + (‖y‖_{b+r}/μ) ‖x‖_{n+r}) = C (‖y‖_{b+r}/μ + ‖y‖_{n+r}).

Since ‖y‖_{b+r} ≤ ‖y‖_{n+r} for b ≤ n we get

‖f(y)‖_n ≤ C (1/μ + 1) ‖y‖_{n+r}.

Conversely, let f be a tame linear map. Then the inequality

‖f(x)‖_n ≤ C ‖x‖_{n+r} ≤ C(1 + ‖x‖_{n+r}) for all n ≥ b

is true.

Definition. For functions f of two variables we define tameness of bi-degree (r, s) and base b by requiring that locally

∀n ≥ b ∃C ∀x, y: ‖f(x, y)‖_n ≤ C(1 + ‖x‖_{n+r} + ‖y‖_{n+s});

and similarly for functions of several variables.

51.15. Lemma. Let f: U × E → F be linear in the second variable and tame of base b and degree (r, s) in a ‖·‖_{b+r} × ‖·‖_{b+s}-neighborhood. Then we have

∀n ≥ b ∃C: ‖f(x)h‖_n ≤ C(‖h‖_{n+s} + ‖x‖_{n+r} ‖h‖_{b+s})

for all x in a ‖·‖_{b+r}-neighborhood and all h.

If f: U × E₁ × E₂ → F is linear in each of the last two variables and tame of base b and degree (r, s, t) in a ‖·‖_{b+r} × ‖·‖_{b+s} × ‖·‖_{b+t}-neighborhood, then we have

‖f(x)(h, k)‖_n ≤ C(‖h‖_{n+s} ‖k‖_{b+t} + ‖h‖_{b+s} ‖k‖_{n+t} + ‖x‖_{n+r} ‖h‖_{b+s} ‖k‖_{b+t})

for all x in a ‖·‖_{b+r}-neighborhood and all h and k.

Proof. For arbitrary h ≠ 0 let h̄ := (μ/‖h‖_{b+s}) h. Then

‖f(x)h̄‖_n ≤ C(1 + ‖x‖_{n+r} + ‖h̄‖_{n+s}).


Therefore

‖f(x)h‖_n = (‖h‖_{b+s}/μ) ‖f(x)h̄‖_n ≤ (‖h‖_{b+s}/μ) C (1 + ‖x‖_{n+r} + (μ/‖h‖_{b+s}) ‖h‖_{n+s})
≤ (C/μ) ‖h‖_{b+s} + (C/μ) ‖h‖_{b+s} ‖x‖_{n+r} + C ‖h‖_{n+s}
≤ C (1/μ + 1) ‖h‖_{n+s} + (C/μ) ‖x‖_{n+r} ‖h‖_{b+s},

where we used ‖h‖_{b+s} ≤ ‖h‖_{n+s}. The second part is proved analogously.

51.16. Proposition. Interpolation formula for Σ(E).

‖x‖_n · ‖x‖_m ≤ ‖x‖_{n−r} · ‖x‖_{m+r} for 0 ≤ r ≤ n ≤ m.

Proof. Let us first consider the special case where m = n and r = 1. Then

‖x‖_{n−1} · ‖x‖_{n+1} − ‖x‖_n² =
= Σ_k e^{(n−1)k} ‖x_k‖ · Σ_l e^{(n+1)l} ‖x_l‖ − Σ_k e^{nk} ‖x_k‖ · Σ_l e^{nl} ‖x_l‖
= Σ_{k=l} (e^{(n−1)k} e^{(n+1)k} − e^{2nk}) ‖x_k‖²
+ Σ_{k<l} (e^{(n−1)k} e^{(n+1)l} + e^{(n+1)k} e^{(n−1)l} − 2 e^{n(k+l)}) ‖x_k‖ ‖x_l‖.

In the first subsummand the expression in brackets vanishes, and in the second it is non-negative, since

e^{(n−1)k} e^{(n+1)l} + e^{(n+1)k} e^{(n−1)l} − 2 e^{n(k+l)} = e^{n(k+l)} (e^{l−k} + e^{k−l} − 2) = 2 e^{n(k+l)} (cosh(l − k) − 1) ≥ 0.

By transitivity, it is enough to show the general case for r = 1. Without loss of generality we may assume x ≠ 0. Then this case is equivalent to

‖x‖_n / ‖x‖_{n−1} ≤ ‖x‖_{m+1} / ‖x‖_m for n ≤ m.

Again by transitivity it is enough to show this for m = n, which is the special case treated above.
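The interpolation formula says that n → log ‖x‖_n is convex. A quick numerical check on random truncated sequences (E = ℝ, illustration only):

```python
# Numerical check of the interpolation formula (51.16) for Sigma(E):
# ||x||_n * ||x||_m <= ||x||_{n-r} * ||x||_{m+r} for 0 <= r <= n <= m.
import math, random

def sigma_norm(x, n):
    return sum(math.exp(n * k) * abs(xk) for k, xk in enumerate(x))

random.seed(0)
x = [random.random() * math.exp(-4 * k) for k in range(25)]
for n in range(1, 4):
    for m in range(n, 4):
        for r in range(0, n + 1):
            assert sigma_norm(x, n) * sigma_norm(x, m) <= \
                   sigma_norm(x, n - r) * sigma_norm(x, m + r) * (1 + 1e-12)
print("interpolation inequality verified on a random sample")
```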

51.17. The Nash-Moser inverse function theorem. Let E and F be tame Fréchet spaces and let f: E ⊇ U → F be a tame smooth map. Suppose f′ has a tame smooth family Ψ of inverses. Then f is locally bijective, and the inverse of f is a tame smooth map.

The proof will take the rest of this section.

51.18. Proposition. Let E and F be tame Fréchet spaces and let f: E ⊇ U → F be a tame smooth map. Suppose f′ has a tame smooth family Ψ of linear left inverses. Then f is locally injective.


51.19. Proposition. Let E and F be tame Fréchet spaces and let f: E ⊇ U → F be a smooth tame map. Suppose f′ has a tame smooth family Ψ of linear right inverses. Then f is locally surjective (and locally has a smooth right inverse).

By a tame smooth mapping f we will for the moment understand an infinitely often Gâteaux differentiable map, for which the derivatives f^{(n)}(x) are multilinear and are tame as maps U × E^n → F.

By a tame smooth family of (one-sided) inverses of f′ we understand a family (Ψ(x))_{x∈U}: F → E of (one-sided) inverses of (f′(x))_{x∈U}, which gives a tame smooth map Ψ^∧: U × F → E.

Let us start with some preparatory remarks for the proofs. Contrary to good manners, the symbol C will almost never denote the same constant, not even in the same inequality. This constant may depend on the index n of the norm but not on any argument of the norms.

For all three proofs we may assume that the initial values are f: 0 → 0 (apply translations in the domain and the codomain).

Claim. We may assume that E = Σ(B) and F = Σ(C).

First for (51.18). In fact, E and F are direct summands in such spaces Σ(B) and Σ(C). We extend f to a smooth tame mapping f̃: Σ(B) ⊇ Ũ → Σ(B × C) ≅ Σ(B) × Σ(C) by setting Ũ := p⁻¹(U), where p: Σ(B) → E is the retraction, and f̃ := (Id − p, f ∘ p). Note that (Id − p) preserves exactly that part which gets annihilated by f ∘ p. More precisely, injectivity of f̃ implies that of f. In fact, f(x) = f(y) for x, y ∈ E implies x = p(x) and y = p(y), hence (Id − p)(x) = 0 = (Id − p)(y), and so f̃(x) = f̃(y). Since f̃′(x̃)(h) = ((Id − p)(h), f′(p(x̃)) · p(h)), let Ψ̃(x̃) := (Id − p) ∘ pr₁ + Ψ(p(x̃)) ∘ pr₂. Then

Ψ̃(x̃) ∘ f̃′(x̃) = (Id − p) ∘ (Id − p) + Ψ(p(x̃)) ∘ f′(p(x̃)) ∘ p = (Id − p) + p = Id.

Now for (51.19). Here we extend f to a smooth tame mapping f̃: Σ(B × C) ≅ Σ(B) × Σ(C) ⊇ Ũ → Σ(C) by setting Ũ := p⁻¹(U) × Σ(C) and f̃ := (f ∘ p) ⊕ (Id − q), where p: Σ(B) → E and q: Σ(C) → F are the retractions. Since f̃′(x̃, ỹ) = f′(p(x̃)) ∘ p ⊕ (Id − q), let Ψ̃(x̃, ỹ): Σ(C) → Σ(B) × Σ(C) be defined by Ψ̃(x̃, ỹ)(k) := (Ψ(p(x̃))(q(k)), (Id − q)(k)), i.e., Ψ̃(x̃, ỹ) := (Ψ(p(x̃)) ∘ q, Id − q). Then

f̃′(x̃, ỹ) ∘ Ψ̃(x̃, ỹ) = (f′(p(x̃)) ∘ p ⊕ (Id − q)) ∘ (Ψ(p(x̃)) ∘ q, Id − q)
= f′(p(x̃)) ∘ p ∘ Ψ(p(x̃)) ∘ q + (Id − q) ∘ (Id − q)
= q + (Id − 2q + q²) = Id,

where p ∘ Ψ(p(x̃)) = Ψ(p(x̃)), since Ψ takes values in E.

Claim. We may assume that x → f(x), (x, h) → f′(x)h, (x, h) → f″(x)(h, h), and (x, k) → Ψ(x)k satisfy tame estimates of degree 2r in x, of degree r in h, and of degree 0 in k (for some r), with base 0, on the set {‖x‖₀ ≤ 1}.

Consider on Σ(B) the linear operators ℓ_p defined by (ℓ_p x)_k := e^{pk} x_k. Then ‖ℓ_p x‖_n = ‖x‖_{n+p}. If f satisfies ‖f(x)‖_n ≤ C(1 + ‖x‖_{n+s}) on ‖x‖_a ≤ δ


for n ≥ b, then f̃ := ℓ_q ∘ f ∘ ℓ_{−p} satisfies ‖f̃(x)‖_m = ‖f(ℓ_{−p} x)‖_{m+q} ≤ C(1 + ‖ℓ_{−p} x‖_{m+q+s}) = C(1 + ‖x‖_{m+q+s−p}) on ‖x‖_{a−p} ≤ δ for m ≥ b − q.

Choosing q and p sufficiently large, we may assume that f, f′, f″, and Ψ satisfy tame estimates of base 0 (choose q large in comparison to b) on {x : ‖x‖₀ ≤ δ} (choose p large in comparison to a). Furthermore, we may achieve that (x, k) → Ψ(x)k is tame of degree 0 in k (since by linearity we do not need p for the neighborhood, which is now global, but we have to choose it so that m + q + s − p ≤ m); we cannot, however, achieve that this is also true for f′. Now take r sufficiently large such that the degrees are dominated by 2r and r, and finally replace f by x → f(cx) to obtain δ = 1.
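The grading-shift identity ‖ℓ_p x‖_n = ‖x‖_{n+p} behind this reduction is immediate from the definitions; a quick numeric check on truncated sequences (E = ℝ, illustration only):

```python
# The operators (l_p x)_k = e^{pk} x_k shift the Sigma(E)-grading:
# ||l_p x||_n = ||x||_{n+p}.
import math

def norm(x, n):
    return sum(math.exp(n * k) * abs(xk) for k, xk in enumerate(x))

def shift(x, p):
    return [math.exp(p * k) * xk for k, xk in enumerate(x)]

x = [math.exp(-6 * k) for k in range(30)]
for p in (1, 2):
    for n in (0, 1, 2):
        assert abs(norm(shift(x, p), n) - norm(x, n + p)) < 1e-9 * norm(x, n + p)
print("||l_p x||_n = ||x||_{n+p} verified")
```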

Claim. On {‖x‖_{2r} ≤ 1} we have for all n ≥ 0 a C_n > 0 such that

‖f(x)‖_n ≤ C_n ‖x‖_{n+2r},
‖f′(x)h‖_n ≤ C_n (‖h‖_{n+r} + ‖x‖_{n+2r} ‖h‖_r),
‖f″(x)(h₁, h₂)‖_n ≤ C_n (‖h₁‖_{n+r} ‖h₂‖_r + ‖h₁‖_r ‖h₂‖_{n+r} + ‖x‖_{n+2r} ‖h₁‖_r ‖h₂‖_r),
‖Ψ(x)k‖_n ≤ C_n (‖k‖_n + ‖x‖_{n+2r} ‖k‖₀).

The 2nd, 3rd and 4th inequalities follow from the corresponding tameness and (51.15), since the neighborhood is given by a norm with index higher than base plus degree. For the first inequality one would expect ‖f(x)‖_n ≤ C(1 + ‖x‖_{n+2r}), but since f(0) = 0 one can drop the 1, which follows from integration of the second estimate:

‖f(x)‖_n = ‖f(0) + ∫₀¹ f′(tx) x dt‖_n ≤ C (‖x‖_{n+r} + ½ ‖x‖_{n+2r} ‖x‖_r).

Since ‖x‖_{n+r} ≤ ‖x‖_{n+2r} and ‖x‖_r ≤ ‖x‖_{2r} ≤ 1 we are done.

Proof of (51.18). The idea comes from the 1-dimensional situation, where f(x) = f(y) implies by the mean value theorem that there exists an r ∈ [x, y] := {tx + (1−t)y : 0 ≤ t ≤ 1} with f′(r) = (f(x) − f(y))/(x − y) = 0.

51.20. Sublemma. There exists a δ > 0 such that for ‖x_j‖_{2r} ≤ δ we have ‖x₁ − x₀‖₀ ≤ C ‖f(x₁) − f(x₀)‖₀. In particular, f is injective on {x : ‖x‖_{2r} ≤ δ}.

Proof. Using the Taylor formula

f(x₁) = f(x₀) + f′(x₀)(x₁ − x₀) + ∫₀¹ (1 − t) f″(x₀ + t(x₁ − x₀))(x₁ − x₀)² dt

and Ψ(x₀) ∘ f′(x₀) = Id, we obtain that x₁ − x₀ = Ψ(x₀)(k), where

k := f(x₁) − f(x₀) − ∫₀¹ (1 − t) f″(x₀ + t(x₁ − x₀))(x₁ − x₀)² dt.


For ‖x_j‖_{2r} ≤ 1 we can use the tame estimates of f″ and interpolation to get

‖f″(x₀ + t(x₁ − x₀))(x₁ − x₀)²‖_n ≤
≤ C (‖x₁ − x₀‖_{n+r} ‖x₁ − x₀‖_r + (‖x₁‖_{n+2r} + ‖x₀‖_{n+2r}) ‖x₁ − x₀‖_r²)
≤ C (‖x₁ − x₀‖_{n+2r} ‖x₁ − x₀‖₀ + (‖x₁‖_{n+2r} + ‖x₀‖_{n+2r}) ‖x₁ − x₀‖_{2r} ‖x₁ − x₀‖₀)
≤ C ((‖x₁‖_{n+2r} + ‖x₀‖_{n+2r}) ‖x₁ − x₀‖₀ + (‖x₁‖_{n+2r} + ‖x₀‖_{n+2r}) · 2δ · ‖x₁ − x₀‖₀)
≤ C (‖x₁‖_{n+2r} + ‖x₀‖_{n+2r}) ‖x₁ − x₀‖₀.

Using the tame estimate

‖Ψ(x₀)k‖₀ ≤ C ‖k‖₀ (1 + ‖x₀‖_{2r}) ≤ C ‖k‖₀,

we thus get

‖x₁ − x₀‖₀ = ‖Ψ(x₀)k‖₀ ≤ C ‖k‖₀ ≤
≤ C (‖f(x₁) − f(x₀)‖₀ + ½ C (2 + ‖x₁‖_{2r} + ‖x₀‖_{2r}) ‖x₁ − x₀‖_r²)
≤ C (‖f(x₁) − f(x₀)‖₀ + ‖x₁ − x₀‖_r²)
≤ C (‖f(x₁) − f(x₀)‖₀ + ‖x₁ − x₀‖_{2r} · ‖x₁ − x₀‖₀).

Now use ‖x₁ − x₀‖_{2r} ≤ ‖x₁‖_{2r} + ‖x₀‖_{2r} ≤ 2δ to obtain

‖x₁ − x₀‖₀ ≤ C (‖f(x₁) − f(x₀)‖₀ + 2δ ‖x₁ − x₀‖₀).

Taking δ < 1/(2C) yields the result.

51.21. Corollary. Let ‖x_j‖_{2r} ≤ δ with δ as before. Then for n ≥ 0 we have

‖x₁ − x₀‖_n ≤ C (‖f(x₁) − f(x₀)‖_n + (‖x₁‖_{n+2r} + ‖x₀‖_{n+2r}) ‖f(x₁) − f(x₀)‖₀).

Proof. As before we have

‖f″(x₀ + t(x₁ − x₀))(x₁ − x₀)²‖_n ≤ C (‖x₁‖_{n+2r} + ‖x₀‖_{n+2r}) ‖x₁ − x₀‖₀.

Since Ψ is tame we obtain now

‖x₁ − x₀‖_n = ‖Ψ(x₀)(f(x₁) − f(x₀) − ∫₀¹ (1 − t) f″(x₀ + t(x₁ − x₀))(x₁ − x₀)² dt)‖_n
≤ ‖Ψ(x₀)(f(x₁) − f(x₀))‖_n + ‖Ψ(x₀)(∫₀¹ (1 − t) f″(x₀ + t(x₁ − x₀))(x₁ − x₀)² dt)‖_n


≤ C (‖f(x₁) − f(x₀)‖_n + ‖x₀‖_{n+2r} · ‖f(x₁) − f(x₀)‖₀) +
+ C (‖∫₀¹ (1 − t) f″(x₀ + t(x₁ − x₀))(x₁ − x₀)² dt‖_n + ‖x₀‖_{n+2r} · ‖∫₀¹ (1 − t) f″(x₀ + t(x₁ − x₀))(x₁ − x₀)² dt‖₀)
≤ C (‖f(x₁) − f(x₀)‖_n + (‖x₁‖_{n+2r} + ‖x₀‖_{n+2r}) ‖f(x₁) − f(x₀)‖₀) +
+ C (‖x₁‖_{n+2r} + ‖x₀‖_{n+2r}) ‖x₁ − x₀‖₀
≤ C (‖f(x₁) − f(x₀)‖_n + (‖x₁‖_{n+2r} + ‖x₀‖_{n+2r}) ‖f(x₁) − f(x₀)‖₀),

where in the middle step we used ‖x₀‖_{n+2r} (‖x₁‖_{2r} + ‖x₀‖_{2r}) ≤ 2δ ‖x₀‖_{n+2r}, and in the last step ‖x₁ − x₀‖₀ ≤ C ‖f(x₁) − f(x₀)‖₀ from (51.20).

Proof of (51.19). As in (51.18) we may assume that the initial condition is f: 0 → 0 and that E = Σ(B) and F = Σ(C).

The idea of the proof is to solve the equation f(x) = y via a differential equation for a curve t → x(t) whose image under f joins 0 and y affinely. More precisely, we consider the parameterization t → h(t) y of the segment [0, y], where h(t) := 1 − e^{−ct} is a smooth increasing function with h(0) = 0 and lim_{t→+∞} h(t) = 1. Differentiation of f(x(t)) = h(t) y yields f′(x(t)) · x′(t) = h′(t) y and (if f′(x) is invertible) x′(t) = c Ψ(x(t)) · e^{−ct} y. Substituting e^{−ct} y = (1 − h(t)) y = y − f(x(t)) gives

x′(t) = c Ψ(x(t)) · (y − f(x(t))).

In Fréchet spaces (like Σ(B)) we cannot guarantee that this differential equation with initial condition x(0) = 0 has a solution. The subspaces B_t := {(x_k)_k ∈ Σ(B) : x_k = 0 for k > t} however are Banach spaces (isomorphic to finite products of B), and they are direct summands with the obvious projections. So the idea is to modify the differential equation in such a way that for finite t it factors over B_t, and to prove that the solution of the modified equation still converges for t → ∞ to a solution x_∞ of f(x_∞) = y. Since t is a non-discrete parameter, we have to consider the spaces B_t as a continuous family of Banach spaces, and so we have to find a family (σ_t)_{t∈ℝ} of projections (called smoothing operators). For this we take a smooth function σ: ℝ → [0, 1] with σ(t) = 0 for t ≤ 0 and σ(t) = 1 for t ≥ 1. Then we set (σ_t x)_k := σ(t − k) · x_k.
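In finite dimensions, where Ψ(x) = f′(x)⁻¹ exists everywhere, the flow x′ = c Ψ(x)(y − f(x)) is the continuous Newton method, and along the exact flow f(x(t)) = (1 − e^{−ct}) y. A minimal sketch (hypothetical scalar example f(x) = x + x³, explicit Euler steps, illustration only):

```python
# Continuous Newton flow x'(t) = c * Psi(x) * (y - f(x)) with Psi(x) = 1/f'(x),
# integrated by explicit Euler; the residual y - f(x(t)) decays like e^{-ct}.
f = lambda x: x + x ** 3
fprime = lambda x: 1 + 3 * x ** 2

def solve(y, c=1.0, dt=1e-3, steps=20_000):
    x = 0.0
    for _ in range(steps):
        x += dt * c * (y - f(x)) / fprime(x)
    return x

x_inf = solve(10.0)
assert abs(f(x_inf) - 10.0) < 1e-4
print(x_inf)   # a root of x + x^3 = 10
```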

We have to show that σ_t → Id; more precisely, we have the

Claim. For n ≥ m there exists a c_{n,m} such that ‖σ_t x‖_n ≤ c_{n,m} e^{(n−m)t} ‖x‖_m and ‖(1 − σ_t)x‖_m ≤ c_{n,m} e^{(m−n)t} ‖x‖_n.

Recall that ‖x‖_n := Σ_k e^{nk} ‖x_k‖. Since ‖(σ_t x)_k‖ ≤ ‖x_k‖ for all t and k, and (σ_t x)_k = 0 for t ≤ k, and ((1 − σ_t)x)_k = 0 for t ≥ k + 1, we have

‖σ_t x‖_n = Σ_k e^{nk} ‖(σ_t x)_k‖ ≤ Σ_{k≤t} e^{nk} ‖x_k‖ = Σ_{k≤t} e^{(n−m)k} e^{mk} ‖x_k‖ ≤ e^{(n−m)t} ‖x‖_m,

‖(1 − σ_t)x‖_m ≤ Σ_{k≥t−1} e^{mk} ‖x_k‖ = Σ_{k≥t−1} e^{(m−n)k} e^{nk} ‖x_k‖ ≤ e^{n−m} e^{(m−n)t} ‖x‖_n.

Now we modify our differential equation by projecting the arguments of the defining function to B_t, i.e.

x′(t) = c Ψ(σ_t(x(t))) · (σ_t(y − f(x(t)))) with x(0) = 0.

Thus, our modified differential equation factors for finite t over some Banach space. The following sublemma now provides us with local solutions.

51.22. Sublemma. If a function f: F ⊇ U → F factors via smooth maps over a Banach space E (i.e., f = g ∘ h, where h: F ⊇ U → W ⊆ E and g: E ⊇ W → F are smooth maps), then the differential equation y′(t) = f(y(t)) has locally unique solutions depending continuously (smoothly) on the initial condition y₀ ∈ U.

(Diagram: the factorization f = g ∘ h of f: F ⊇ U → F through the Banach space E.)

Proof. Suppose y is a solution of the differential equation y′ = f ∘ y with initial condition y(0) = y₀, or equivalently y(t) = y₀ + ∫₀ᵗ f(y(s)) ds. The idea is to consider the curve x := h ∘ y in the Banach space E. Thus,

x(t) = h(y₀ + ∫₀ᵗ g(h(y(s))) ds) = h(y₀ + ∫₀ᵗ g(x(s)) ds).

Now conversely, if x is a solution of this integral equation, then t → y(t) := y₀ + ∫₀ᵗ g(x(s)) ds is a solution of the original integral equation and hence also of the differential equation, since x(t) = h(y₀ + ∫₀ᵗ g(x(s)) ds) = h(y(t)), and so y(t) = y₀ + ∫₀ᵗ g(x(s)) ds = y₀ + ∫₀ᵗ g(h(y(s))) ds = y₀ + ∫₀ᵗ f(y(s)) ds.

In order to show that x exists, we consider the map

k: x → (t → h(y₀ + ∫₀ᵗ g(x(s)) ds))

and show that it is a contraction.

Since h is smooth we can find a seminorm ‖·‖_q on F, a C > 0 and an η > 0 such that

‖h(y₁) − h(y₀)‖ ≤ C ‖y₁ − y₀‖_q for all ‖y_j‖_q ≤ η.


Furthermore, since g is smooth we find a constant C > 0 and a θ > 0 such that

‖g(x₁) − g(x₀)‖_q ≤ C ‖x₁ − x₀‖ for all ‖x_j‖ ≤ θ.

Since we may assume that h(0) = 0, that ‖g(0)‖_q ≤ C, and that θ ≤ 1, we obtain

‖h(y)‖ ≤ C ‖y‖_q for all ‖y‖_q ≤ η, and ‖g(x)‖_q ≤ 2C for all ‖x‖ ≤ θ.

Let Ũ := {y₀ ∈ F : ‖y₀‖_q ≤ δ}, let Ṽ := {x ∈ C([0, µ], E) : ‖x(t)‖ ≤ θ for all t}, and let k: F × C([0, µ], E) ⊇ Ũ × Ṽ → C([0, µ], E) be given by

k(y₀, x)(t) := h(y₀ + ∫₀ᵗ g(x(s)) ds).

Then k is continuous with values in Ṽ and is a C²µ-contraction with respect to x. In fact, ‖y₀ + ∫₀ᵗ g(x(s)) ds‖_q ≤ ‖y₀‖_q + µ sup{‖g(x(s))‖_q : s} ≤ δ + 2Cµ ≤ η for sufficiently small δ and µ. So ‖k(y₀, x)(t)‖ ≤ Cη ≤ θ for sufficiently small η. Hence, k(y₀, x) ∈ Ṽ. Furthermore,

‖k(y₀, x₁)(t) − k(y₀, x₀)(t)‖ ≤ C ‖∫₀ᵗ (g(x₁(s)) − g(x₀(s))) ds‖_q
≤ C µ sup{‖g(x₁(s)) − g(x₀(s))‖_q : s}