important point to keep in mind is that the detector is a classical object which, unlike the photon, has a well-defined location in space. This is what makes the detection amplitude a useful replacement for the missing photon wave function.

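We extend this approach to two photons by pretending that $|\mathbf{r}_1,s_1;\mathbf{r}_2,s_2\rangle = E^{(-)}_{s_1}(\mathbf{r}_1)\,E^{(-)}_{s_2}(\mathbf{r}_2)\,|0\rangle$ is a state with one photon at $\mathbf{r}_1$ (with polarization $\mathbf{e}_{s_1}$) and another at $\mathbf{r}_2$ (with polarization $\mathbf{e}_{s_2}$). For a two-photon state $|\Psi\rangle$ this suggests the effective wave function

$$
\begin{aligned}
\Psi(\mathbf{r}_1,s_1;\mathbf{r}_2,s_2) &= \langle \mathbf{r}_1,s_1;\mathbf{r}_2,s_2|\Psi\rangle \\
&= \langle 0|E^{(+)}_{s_1}(\mathbf{r}_1)\,E^{(+)}_{s_2}(\mathbf{r}_2)|\Psi\rangle \\
&= (\mathbf{e}^{*}_{s_1})_i\,(\mathbf{e}^{*}_{s_2})_j\,\Psi_{ij}(\mathbf{r}_1,\mathbf{r}_2)\,,
\end{aligned}
\tag{6.96}
$$

where

$$
\Psi_{ij}(\mathbf{r}_1,\mathbf{r}_2) = \langle 0|E^{(+)}_{i}(\mathbf{r}_1)\,E^{(+)}_{j}(\mathbf{r}_2)|\Psi\rangle\,. \tag{6.97}
$$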

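Applying the method used for $G^{(1)}$ to the evaluation of eqn (4.75) for the second-order correlation function (with all time arguments equal) yields

$$
\begin{aligned}
G^{(2)}_{klij}(\mathbf{r}'_1,\mathbf{r}'_2;\mathbf{r}_1,\mathbf{r}_2) &= \langle E^{(-)}_{k}(\mathbf{r}'_1)\,E^{(-)}_{l}(\mathbf{r}'_2)\,E^{(+)}_{i}(\mathbf{r}_1)\,E^{(+)}_{j}(\mathbf{r}_2)\rangle \\
&= \Psi_{ij}(\mathbf{r}_1,\mathbf{r}_2)\,\Psi^{*}_{kl}(\mathbf{r}'_1,\mathbf{r}'_2)\,,
\end{aligned}
\tag{6.98}
$$

which has the form of the two-particle density matrix corresponding to the pure two-particle wave function $\Psi_{ij}(\mathbf{r}_1,\mathbf{r}_2)$.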

The physical interpretation of $\Psi_{ij}(\mathbf{r}_1,\mathbf{r}_2)$ follows from the discussion of coincidence counting in Section 9.2.4, which shows that the coincidence-counting rate for two fast detectors placed at equal distances from the source of the field is proportional to

$$
(\mathbf{e}_{s_1})_k\,(\mathbf{e}_{s_2})_l\,(\mathbf{e}^{*}_{s_1})_i\,(\mathbf{e}^{*}_{s_2})_j\,G^{(2)}_{klij}(\mathbf{r}_1,\mathbf{r}_2;\mathbf{r}_1,\mathbf{r}_2) = |\Psi(\mathbf{r}_1,s_1;\mathbf{r}_2,s_2)|^{2}\,, \tag{6.99}
$$

where $\mathbf{e}_{s_1}$ and $\mathbf{e}_{s_2}$ are the polarizations admitted by the filters associated with the detectors. Since $|\Psi(\mathbf{r}_1,s_1;\mathbf{r}_2,s_2)|^{2}$ determines the two-photon counting rate, we will refer to $\Psi(\mathbf{r}_1,s_1;\mathbf{r}_2,s_2)$, or $\Psi_{ij}(\mathbf{r}_1,\mathbf{r}_2)$, as the two-photon detection amplitude.

6.6.3 Pure state entanglement defined by detection amplitudes

We are now ready to formulate an alternative definition of entanglement, for pure states of photons, that is directly related to observable counting rates. The detection amplitude for the two-photon state $|\Psi\rangle$, defined by eqn (6.83), can be evaluated by using eqns (3.69) and (6.85) in eqn (6.97), with the result:

$$
\Psi_{ij}(\mathbf{r}_1,\mathbf{r}_2) = \sqrt{2}\sum_{\mathbf{k}s,\mathbf{k}'s'} C_{\mathbf{k}s,\mathbf{k}'s'}\,F_k\,(\mathbf{e}_{\mathbf{k}s})_i\,e^{i\mathbf{k}\cdot\mathbf{r}_1}\,F_{k'}\,(\mathbf{e}_{\mathbf{k}'s'})_j\,e^{i\mathbf{k}'\cdot\mathbf{r}_2}\,. \tag{6.100}
$$

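This expansion for the detection amplitude can be inverted, by Fourier transforming with respect to $\mathbf{r}_1$ and $\mathbf{r}_2$ and projecting on the polarization basis, to get

$$
C_{\mathbf{k}s,\mathbf{k}'s'} = -\sqrt{\frac{(2\epsilon_0/\hbar)^{2}}{2\,\omega_k\,\omega_{k'}}}\;\Psi_{\mathbf{k}s,\mathbf{k}'s'}\,, \tag{6.101}
$$

where

$$
\Psi_{\mathbf{k}s,\mathbf{k}'s'} = \frac{1}{V}\int d^{3}r_1\int d^{3}r_2\; e^{-i\mathbf{k}\cdot\mathbf{r}_1}\,e^{-i\mathbf{k}'\cdot\mathbf{r}_2}\,(\mathbf{e}^{*}_{\mathbf{k}s})_i\,(\mathbf{e}^{*}_{\mathbf{k}'s'})_j\,\Psi_{ij}(\mathbf{r}_1,\mathbf{r}_2)\,. \tag{6.102}
$$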

According to eqns (6.100) and (6.101), the two-photon detection amplitude and the expansion coefficients $C_{\mathbf{k}s,\mathbf{k}'s'}$ provide equivalent descriptions of the two-photon state. From eqn (6.100) we see that factorization of the expansion coefficients, according to eqn (6.86), implies factorization of the detection amplitude, i.e.

$$
\Psi_{ij}(\mathbf{r}_1,\mathbf{r}_2) = \phi_i(\mathbf{r}_1)\,\phi_j(\mathbf{r}_2)\,, \tag{6.103}
$$

where

$$
\phi_i(\mathbf{r}) = 2^{1/4}\sum_{\mathbf{k}s}\gamma_{\mathbf{k}s}\,F_k\,(\mathbf{e}_{\mathbf{k}s})_i\,e^{i\mathbf{k}\cdot\mathbf{r}}\,. \tag{6.104}
$$

In other words, the detection amplitude for a separable state factorizes, just as a two-particle wave function does in nonrelativistic quantum mechanics. On the other hand, eqn (6.101) shows that factorization of the detection amplitude implies factorization of the expansion coefficients. Thus we are at liberty to use eqn (6.103) as a definition of a separable state that agrees with the definition (6.86). This approach has the decided advantage that the detection amplitude is closely related to directly observable events, e.g. current pulses emitted by the coincidence counter. The coincidence-counting rate is proportional to the square of the amplitude, so for separable states the coincidence rate is proportional to the product of the singles rates at the two detectors. This means that the random counting events at the two detectors are stochastically independent, i.e. the quantum fluctuations of the electromagnetic field at any pair of detectors are uncorrelated. This is the analogue of Theorem 6.3, which states that a separable state of two distinguishable particles yields uncorrelated quantum fluctuations for any pair of observables.

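For $\mathbf{k}s \neq \mathbf{k}'s'$ the state $|\Psi\rangle = |1_{\mathbf{k}s},1_{\mathbf{k}'s'}\rangle$ is entangled (according to the traditional definition), and evaluating eqn (6.100) in this case gives

$$
\Psi_{ij}(\mathbf{r}_1,\mathbf{r}_2) = F_k F_{k'}\left[(\mathbf{e}_{\mathbf{k}s})_i\,e^{i\mathbf{k}\cdot\mathbf{r}_1}\,(\mathbf{e}_{\mathbf{k}'s'})_j\,e^{i\mathbf{k}'\cdot\mathbf{r}_2} + (\mathbf{e}_{\mathbf{k}s})_j\,e^{i\mathbf{k}\cdot\mathbf{r}_2}\,(\mathbf{e}_{\mathbf{k}'s'})_i\,e^{i\mathbf{k}'\cdot\mathbf{r}_1}\right]. \tag{6.105}
$$

The definition (6.96) in turn yields

$$
\Psi(\mathbf{r}_1,s_1;\mathbf{r}_2,s_2) = \phi_{\mathbf{k}s}(\mathbf{r}_1,s_1)\,\phi_{\mathbf{k}'s'}(\mathbf{r}_2,s_2) + \phi_{\mathbf{k}s}(\mathbf{r}_2,s_2)\,\phi_{\mathbf{k}'s'}(\mathbf{r}_1,s_1)\,, \tag{6.106}
$$

where

$$
\phi_{\mathbf{k}s}(\mathbf{r},s_1) = F_k\,\mathbf{e}^{*}_{s_1}\cdot\mathbf{e}_{\mathbf{k}s}\,e^{i\mathbf{k}\cdot\mathbf{r}}\,. \tag{6.107}
$$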

This has the structure of an entangled-state wave function for two bosons, as shown in eqn (6.80), with similar physical consequences. In particular, if one photon is detected in the mode $\mathbf{k}s$, then a subsequent detection of the remaining photon is guaranteed to find it in the mode $\mathbf{k}'s'$. More generally, quantum fluctuations in the electromagnetic field at the two detectors are correlated. According to the general definition in Section 6.5.3, an entangled two-photon state is dynamically entangled if the detection amplitude cannot be expressed in the minimal form (6.106) required by Bose statistics.
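The correlation carried by the minimal form (6.106) can be made concrete with a small numerical sketch. It is only an illustration under simplifying assumptions that are not in the text: one spatial dimension, parallel polarizations, and $F_k = F_{k'} = 1$. The function names `two_photon_amplitude` and `coincidence_rate` are hypothetical.

```python
import cmath
import math

def two_photon_amplitude(k, k_prime, r1, r2):
    """Scalar, one-dimensional stand-in for eqn (6.106).

    Assumptions (not from the text): F_k = F_k' = 1 and parallel
    polarizations, so each phi reduces to a plane-wave phase factor.
    """
    return (cmath.exp(1j * (k * r1 + k_prime * r2))
            + cmath.exp(1j * (k * r2 + k_prime * r1)))

def coincidence_rate(k, k_prime, r1, r2):
    """Squared modulus of the detection amplitude, cf. eqn (6.99)."""
    return abs(two_photon_amplitude(k, k_prime, r1, r2)) ** 2

k, k_prime = 1.0, 1.5
# Detectors at the same point: the two terms add in phase.
same_point = coincidence_rate(k, k_prime, 0.0, 0.0)
# Detectors separated by pi/(k' - k): the two terms cancel.
half_fringe = coincidence_rate(k, k_prime, 0.0, math.pi / (k_prime - k))
print(same_point, half_fringe)
```

In this toy model the rate equals $2[1+\cos((k-k')(r_1-r_2))]$. It depends on the detector separation and has zeros, so it cannot be written as a product of a function of $r_1$ and a function of $r_2$, in contrast to the separable case (6.103).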

We saw in Section 6.4.1 that reduced density operators, defined by partial traces, are quite useful in the discussion of distinguishable particles, but systems of identical particles (such as photons) cannot be divided into distinguishable subsystems. The key to overcoming this difficulty is found in eqn (6.98), which shows that the second-order correlation function has the form of a density matrix corresponding to the two-photon detection amplitude $\Psi_{ij}(\mathbf{r}_1,\mathbf{r}_2)$. This suggests that the analogue of the reduced density matrix is the first-order correlation function $G^{(1)}_{ij}(\mathbf{r}';\mathbf{r})$, evaluated for the two-photon state $|\Psi\rangle$.

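The first evidence supporting this proposal is provided by considering a separable state defined by eqn (6.87). In this case

$$
\begin{aligned}
G^{(1)}_{ij}(\mathbf{r}';\mathbf{r}) &= \langle\Psi|E^{(-)}_{i}(\mathbf{r}')\,E^{(+)}_{j}(\mathbf{r})|\Psi\rangle \\
&= \frac{1}{2}\,\langle 0|\Gamma^{2}\,E^{(-)}_{i}(\mathbf{r}')\,E^{(+)}_{j}(\mathbf{r})\,\Gamma^{\dagger 2}|0\rangle \\
&= \frac{1}{2}\,\langle 0|\left[\Gamma^{2},E^{(-)}_{i}(\mathbf{r}')\right]\left[E^{(+)}_{j}(\mathbf{r}),\Gamma^{\dagger 2}\right]|0\rangle\,,
\end{aligned}
\tag{6.108}
$$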

where the last line follows from the identity $E^{(+)}_{j}(\mathbf{r})\,|0\rangle = 0$ and its adjoint. The field operators and the operators $\Gamma$ and $\Gamma^{\dagger}$ are both linear functions of the creation and annihilation operators, so

$$
\left[E^{(+)}_{j}(\mathbf{r}),\Gamma^{\dagger 2}\right] = 2\left[E^{(+)}_{j}(\mathbf{r}),\Gamma^{\dagger}\right]\Gamma^{\dagger}\,. \tag{6.109}
$$

The remaining commutator is a c-number which is evaluated by using the expansions (3.69) and (6.88) to get

$$
\left[E^{(+)}_{j}(\mathbf{r}),\Gamma^{\dagger}\right] = 2^{-1/4}\,\phi_j(\mathbf{r})\,, \tag{6.110}
$$

where $\phi_i(\mathbf{r})$ is defined by eqn (6.104). Substituting this result, and the corresponding expression for $\left[\Gamma,E^{(-)}_{i}(\mathbf{r}')\right]$, into eqn (6.108) yields

$$
G^{(1)}_{ij}(\mathbf{r}';\mathbf{r}) = \sqrt{2}\,\phi_j(\mathbf{r})\,\phi^{*}_{i}(\mathbf{r}')\,. \tag{6.111}
$$

The conclusion is that the first-order correlation function for a separable state factorizes. This is the analogue of Theorem 6.1 for distinguishable particles.

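Next let us consider a generic entangled state defined by $|\Psi\rangle = \Gamma^{\dagger}\Theta^{\dagger}|0\rangle$, where

$$
\Theta^{\dagger} = \sum_{\mathbf{k}s}\theta_{\mathbf{k}s}\,a^{\dagger}_{\mathbf{k}s} \tag{6.112}
$$

and

$$
\sum_{\mathbf{k}s}|\theta_{\mathbf{k}s}|^{2} = 1\,. \tag{6.113}
$$

For this argument, we can confine attention to operators satisfying $\left[\Gamma,\Theta^{\dagger}\right] = 0$, which is equivalent to the orthogonality of the classical wave packets:

$$
(\theta,\gamma) \equiv \sum_{\mathbf{k}s}\theta^{*}_{\mathbf{k}s}\,\gamma_{\mathbf{k}s} = 0\,. \tag{6.114}
$$

The first-order correlation function for this state is

$$
\begin{aligned}
G^{(1)}_{ij}(\mathbf{r}';\mathbf{r}) &= \langle\Psi|E^{(-)}_{i}(\mathbf{r}')\,E^{(+)}_{j}(\mathbf{r})|\Psi\rangle \\
&= \frac{1}{\sqrt{2}}\left\{\phi_j(\mathbf{r})\,\phi^{*}_{i}(\mathbf{r}') + \eta_j(\mathbf{r})\,\eta^{*}_{i}(\mathbf{r}')\right\},
\end{aligned}
\tag{6.115}
$$

where $\eta_j(\mathbf{r})$ is defined by replacing $\gamma_{\mathbf{k}s}$ with $\theta_{\mathbf{k}s}$ in eqn (6.104). Thus for the entangled, two-photon state $|\Psi\rangle$, the first-order correlation function (reduced density matrix) has the standard form of the density matrix for a one-particle mixed state. This is the analogue of Theorem 6.2 for distinguishable particles.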

6.7 Exercises

6.1 Proof of Theorem 6.1

(1) To prove assertion (a), use the expression for the density operator resulting from

eqns (6.40) and (2.81) to evaluate the reduced density operators.

(2) To prove assertion (b), assume that $|\Psi\rangle$ is entangled, so that it has Schmidt rank $r > 1$, and derive a contradiction.


6.2 Proof of Theorem 6.3

(1) For a separable state $|\Psi\rangle$ show that $\langle\Psi|\delta A\,\delta B|\Psi\rangle = 0$.

(2) Assume that $\langle\Psi|\delta A\,\delta B|\Psi\rangle = 0$ for all $A$ and $B$. Apply this to operators that are diagonal in the Schmidt basis for $|\Psi\rangle$ and thus show that $|\Psi\rangle$ must be separable.

6.3 Singlet spin state

(1) Use the standard treatments of the Pauli matrices, given in texts on quantum mechanics, to express the eigenstates of $\mathbf{n}\cdot\boldsymbol{\sigma}$ in the usual basis of eigenstates of $\sigma_z$.

(2) Show that the singlet state $|S=0\rangle$, given by eqn (6.37), has the same form for all choices of the quantization axis $\mathbf{n}$.

(3) Show that $\left(\mathbf{S}_A + \mathbf{S}_B\right)^{2}|S=0\rangle = 0$.

6.4 Correlations in a separable mixed state

Consider a system of two distinguishable spin-1/2 particles described by the ensemble

$$
\left\{\,|\Psi_1\rangle = |\!\uparrow\rangle_A\,|\!\downarrow\rangle_B\,,\quad |\Psi_2\rangle = |\!\downarrow\rangle_A\,|\!\uparrow\rangle_B\,\right\}
$$

of separable states, where the spin states are eigenstates of $s^{A}_{z}$ and $s^{B}_{z}$.

(1) Show that the density operator can be written as

$$
\rho = p\,|\Psi_1\rangle\langle\Psi_1| + (1-p)\,|\Psi_2\rangle\langle\Psi_2|\,,
$$

where $0 \le p \le 1$.

(2) Evaluate the correlation function $\langle\delta s^{A}_{z}\,\delta s^{B}_{z}\rangle$ and use the result to show that the spins are only uncorrelated for the extreme values $p = 0, 1$.

(3) For intermediate values of $p$, argue that the correlation is exactly what would be found for a pair of classical stochastic variables taking on the values $\pm 1/2$ with the same assignment of probabilities.
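As a numerical cross-check for Exercise 6.4, the correlation can be tabulated directly from the ensemble probabilities. This is a minimal sketch, assuming units with $\hbar = 1$ so that the spin eigenvalues are $\pm 1/2$; the function name `spin_correlation` is hypothetical. Because both ensemble members are product states of $s_z$ eigenstates, the same arithmetic describes a pair of classical random variables.

```python
def spin_correlation(p):
    """<ds_z^A ds_z^B> for rho = p|up,down><up,down| + (1-p)|down,up><down,up|.

    Assumes hbar = 1, so each spin takes the values +1/2 or -1/2.
    """
    # In both ensemble members the spins are anti-aligned: s_A * s_B = -1/4.
    mean_product = p * (0.5 * -0.5) + (1.0 - p) * (-0.5 * 0.5)
    mean_a = p * 0.5 + (1.0 - p) * (-0.5)
    mean_b = p * (-0.5) + (1.0 - p) * 0.5
    return mean_product - mean_a * mean_b

for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(p, spin_correlation(p))
```

The values follow $-p(1-p)$, which vanishes only at $p = 0$ and $p = 1$.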

7 Paraxial quantum optics

The generation and manipulation of paraxial beams of light forms the core of experimental practice in quantum optics; therefore, it is important to extend the classical treatment of paraxial optics to situations involving only a few photons, such as the photon pairs produced by spontaneous down-conversion. In addition to the interaction of quantized fields with standard optical elements, the theory of quantum paraxial propagation has applications to fundamental issues such as the generation and control of orbital angular momentum and the meaning of localization for photons.

In geometric optics a beam of light is a bundle of rays making small angles with a central ray directed along a unit vector $\mathbf{u}_0$. The constituent rays of the bundle are said to be paraxial. In wave optics, the bundle of rays is replaced by a bundle of unit vectors normal to the wavefront; so a paraxial wave is defined by a wavefront that is nearly flat. In this situation it is natural to describe the classical field amplitude, $\mathcal{E}(\mathbf{r},t)$, as a function of the propagation variable $\zeta = \mathbf{r}\cdot\mathbf{u}_0$, the transverse coordinates $\mathbf{r}_\top$ tangent to the wavefront, and the time $t$. Paraxial wave optics is more complicated than paraxial ray optics because of diffraction, which couples the $\mathbf{r}_\top$-, $\zeta$-, and $t$-dependencies of the field. For the most part, we will only consider a single paraxial wave; therefore, we can choose the $z$-axis along $\mathbf{u}_0$ and set $\zeta = z$.

The definite wavevector associated with the plane wave created by $a^{\dagger}_{s}(\mathbf{k})$ makes it possible to recast the geometric-optics picture in terms of photons in plane-wave states. This way of thinking about paraxial optics is useful but, as always, it must be treated with caution. As explained in Section 3.6.1, there is no physically acceptable way to define the position of a photon. This means that the natural tendency to visualize the photons as beads sliding along the rays at speed $c$ must be strictly suppressed. The beads in this naive picture must be replaced by wave packets containing energy $\hbar\omega$ and momentum $\hbar\mathbf{k}$, where $\mathbf{k}$ is directed along the normal to the paraxial wavefront.

In the following section, we begin with a very brief review of classical paraxial wave optics. In succeeding sections we will define a set of paraxial quantum states, and then use them to obtain approximate expressions for the energy, momentum, and photon number operators. This will be followed by the definition of a slowly-varying envelope operator that replaces the classical envelope field $\overline{\mathcal{E}}(\mathbf{r},t)$. Some more advanced topics (including the general paraxial expansion, angular momentum, and an approximate notion of photon localizability) will be presented in the remaining sections.

7.1 Classical paraxial optics

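As explained above, each photon is distributed over a wave packet, with energy $\hbar\omega$ and momentum $\hbar\mathbf{k}$, that propagates along the normal to the wavefront. However, this wave optics description must be approached with equal caution. The standard approach in classical, paraxial wave optics (Saleh and Teich, 1991, Sec. 2.2C) is to set

$$
\mathcal{E}(\mathbf{r},t) = \overline{\mathcal{E}}(\mathbf{r},t)\,e^{i(\mathbf{k}_0\cdot\mathbf{r}-\omega_0 t)}\,, \tag{7.1}
$$

where $\omega_0$ and $\mathbf{k}_0 = \mathbf{u}_0\,n(\omega_0)\,\omega_0/c$ are respectively the carrier frequency and the carrier wavevector. The four-dimensional Fourier transform, $\overline{\mathcal{E}}(\mathbf{k},\omega)$, of the slowly-varying envelope is assumed to be concentrated in a neighborhood of $\mathbf{k} = 0$, $\omega = 0$. The equivalent conditions in the space-time domain are

$$
\left|\frac{\partial^{2}\overline{\mathcal{E}}(\mathbf{r},t)}{\partial t^{2}}\right| \ll \omega_0\left|\frac{\partial\overline{\mathcal{E}}(\mathbf{r},t)}{\partial t}\right| \ll \omega_0^{2}\left|\overline{\mathcal{E}}(\mathbf{r},t)\right| \tag{7.2}
$$

and

$$
\left|\frac{\partial^{2}\overline{\mathcal{E}}(\mathbf{r},t)}{\partial z^{2}}\right| \ll k_0\left|\frac{\partial\overline{\mathcal{E}}(\mathbf{r},t)}{\partial z}\right| \ll k_0^{2}\left|\overline{\mathcal{E}}(\mathbf{r},t)\right|; \tag{7.3}
$$

in other words, $\overline{\mathcal{E}}(\mathbf{r},t)$ has negligible variation in time over an optical period and negligible variation in space over an optical wavelength. As we have already seen in the discussion of monochromatic fields, these conditions cannot be applied to the field operator $\mathbf{E}^{(+)}(\mathbf{r},t)$; instead, they must be interpreted as constraints on the allowed states of the field.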

7.2 Paraxial states

7.2.1 The paraxial ray bundle

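A paraxial beam associated with the carrier wavevector $\mathbf{k}_0$, i.e. a bundle of wavevectors $\mathbf{k}$ clustered around $\mathbf{k}_0$, is conveniently described in terms of relative wavevectors $\mathbf{q} = \mathbf{k} - \mathbf{k}_0$, with $|\mathbf{q}| \ll k_0$. For each $\mathbf{k} = \mathbf{k}_0 + \mathbf{q}$ the angle $\vartheta_k$ between $\mathbf{k}$ and $\mathbf{k}_0$ is given by

$$
\sin\vartheta_k = \frac{|\mathbf{k}_0\times\mathbf{k}|}{k_0\,k} = \frac{|\mathbf{k}_0\times\mathbf{q}|}{k_0\,|\mathbf{k}_0+\mathbf{q}|} = \frac{|\mathbf{q}_\top|}{k_0}\left[1 + O\!\left(\frac{q}{k_0}\right)\right], \tag{7.4}
$$

where $\mathbf{q}_\top = \mathbf{q} - q_z\tilde{\mathbf{k}}_0$, $q_z = \mathbf{q}\cdot\tilde{\mathbf{k}}_0$, and $\tilde{\mathbf{k}}_0 = \mathbf{k}_0/k_0$ is the unit vector along the carrier wavevector. This shows that $\vartheta_k \approx |\mathbf{q}_\top|/k_0$, and further suggests defining the small parameter for the paraxial beam as the maximum opening angle,

$$
\theta = \frac{\Delta q_\top}{k_0} \ll 1\,, \tag{7.5}
$$

where $0 < |\mathbf{q}_\top| < \Delta q_\top$ is the range of the transverse components of $\mathbf{q}$. Variations in the transverse coordinate $\mathbf{r}_\top$ occur over a characteristic distance $\Lambda_\top$ defined by the Fourier transform uncertainty relation $\Lambda_\top\,\Delta q_\top \sim 1$; consequently, a useful length scale for transverse variations is $\Lambda_\top = 1/\Delta q_\top = 1/(\theta k_0)$.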

A natural way to define the characteristic length $\Lambda_\parallel$ for longitudinal variations is to interpret the transverse length scale $\Lambda_\top$ as the radius of an effective circular aperture. The conventional longitudinal scale is then the distance over which a beam waist, initially equal to $\Lambda_\top$, doubles in size. At this point, a strictly correct argument would bring in classical diffraction theory; but the same end can be achieved, with only a little sleight of hand, with geometric optics. By combining the approximation $\tan\theta \approx \theta$ with elementary trigonometry, it is easy to show that the geometric image of the aperture on a screen at a distance $\Lambda_\parallel$ has the radius $\Lambda'_\top = \Lambda_\top + \theta\Lambda_\parallel$. The trick is to choose the longitudinal scale length $\Lambda_\parallel$ so that $\Lambda'_\top = 2\Lambda_\top$, and this requires

$$
\Lambda_\parallel = \frac{\Lambda_\top}{\theta} = k_0\Lambda_\top^{2} = \frac{1}{\theta^{2}k_0}\,. \tag{7.6}
$$

We will see in Section 7.4 that $\Lambda_\parallel = k_0\Lambda_\top^{2}$ is twice the Rayleigh range, as usually defined in classical diffraction theory, for the aperture $\Lambda_\top$. Thus our geometric-optics trick has achieved the same result as a proper diffraction theory argument. Since propagation occurs along the direction characterized by $\Lambda_\parallel$, the natural time scale is $T = \Lambda_\parallel/(c/n_0) = 1/(\theta^{2}\omega_0)$.
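To get a feel for the magnitudes, the three scales can be evaluated for a specific beam. The following is a minimal sketch; the function name `paraxial_scales` and the sample values (a 633 nm vacuum wavelength and an opening angle of 1 mrad) are illustrative assumptions, not values taken from the text.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s

def paraxial_scales(vacuum_wavelength, theta, n0=1.0):
    """Return the characteristic scales of Section 7.2.1 in SI units.

    vacuum_wavelength: carrier wavelength in vacuum (m), an assumed input
    theta: opening angle of the paraxial bundle (rad), eqn (7.5)
    n0: index of refraction at the carrier frequency
    """
    omega0 = 2.0 * math.pi * SPEED_OF_LIGHT / vacuum_wavelength
    k0 = n0 * omega0 / SPEED_OF_LIGHT            # carrier wavenumber in the medium
    transverse = 1.0 / (theta * k0)              # Lambda_perp = 1/(theta k0)
    longitudinal = 1.0 / (theta**2 * k0)         # Lambda_parallel, eqn (7.6)
    time_scale = longitudinal * n0 / SPEED_OF_LIGHT  # T = Lambda_parallel/(c/n0)
    return {"k0": k0, "omega0": omega0, "transverse": transverse,
            "longitudinal": longitudinal, "time": time_scale}

scales = paraxial_scales(633e-9, 1e-3)
for name, value in scales.items():
    print(f"{name}: {value:.4g}")
```

For these sample inputs the transverse scale comes out near a tenth of a millimetre and the longitudinal scale near ten centimetres, which illustrates how strongly the factor $1/\theta$ separates the two lengths.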

The spread, $\Delta q_z$, in the longitudinal component of $\mathbf{q}$ satisfies $\Lambda_\parallel\,\Delta q_z \sim 1$, so the longitudinal and transverse widths are related by

$$
\frac{\Delta q_z}{k_0} = \left(\frac{\Delta q_\top}{k_0}\right)^{2} = \theta^{2}\,, \tag{7.7}
$$

and the $\mathbf{q}$-vectors are effectively confined to a disk-shaped region defined by

$$
Q_0 = \left\{\mathbf{q}\ \text{satisfying}\ |\mathbf{q}_\top| \le \theta k_0\,,\ q_z \le \theta^{2}k_0\right\}. \tag{7.8}
$$

In a dispersive medium with index of refraction $n(\omega)$ the frequency $\omega_k$ is a solution of the dispersion relation $ck = \omega_k\,n(\omega_k)$, and wave packets propagate at the group velocity $v_g(\omega_k) = d\omega_k/dk$. The frequency width is therefore $\Delta\omega = v_{g0}\,\Delta k$, where $v_{g0}$ is the group velocity at the carrier frequency. The straightforward calculation outlined in Exercise 7.1 yields the estimate

$$
\frac{\Delta\omega}{\omega_0} \approx \frac{1}{2}\,\theta^{2} \ll 1\,, \tag{7.9}
$$

which is the criterion for a monochromatic field given by eqn (3.107).

7.2.2 The paraxial Hilbert space

The geometric-optics picture of a bundle of rays forming small angles with the central propagation vector $\mathbf{k}_0$ is realized in the quantum theory by a family of states that only contain photons with propagation vectors in the paraxial bundle. In order to satisfy the superposition principle, the family of states must be chosen as the paraxial space, $\mathfrak{H}(\mathbf{k}_0,\theta) \subset \mathfrak{H}_F$, spanned by the improper (continuum normalized) number states

$$
|\{\mathbf{q}s\}_M\rangle = \prod_{m=1}^{M} a^{\dagger}_{0s_m}(\mathbf{q}_m)\,|0\rangle\,,\quad M = 0, 1, \ldots, \tag{7.10}
$$

where $a_{0s}(\mathbf{q}) = a_s(\mathbf{k}_0+\mathbf{q})$, $\{\mathbf{q}s\}_M \equiv \{\mathbf{q}_1 s_1,\ldots,\mathbf{q}_M s_M\}$, and each relative propagation vector is constrained by the paraxial conditions (7.8). If the paraxial restriction were relaxed, eqn (7.10) would define a continuum basis set for the full Fock space, so the paraxial space is a subspace of $\mathfrak{H}_F$. The states satisfying the paraxiality condition (7.8) also satisfy the monochromaticity condition (3.107); consequently, $\mathfrak{H}(\mathbf{k}_0,\theta)$ is a subspace of the monochromatic space $\mathfrak{H}(\omega_0)$. A state $|\Psi\rangle$ belonging to $\mathfrak{H}(\mathbf{k}_0,\theta)$ is called a pure paraxial state, and a density operator $\rho$ describing an ensemble of pure paraxial states is called a mixed paraxial state. A useful way to characterize a paraxial state $\rho$ in $\mathfrak{H}(\mathbf{k}_0,\theta)$ is to note that the power spectrum

$$
p(\mathbf{k}) = \sum_{s}\left\langle a^{\dagger}_{s}(\mathbf{k})\,a_{s}(\mathbf{k})\right\rangle = \sum_{s}\mathrm{Tr}\left[\rho\,a^{\dagger}_{s}(\mathbf{k})\,a_{s}(\mathbf{k})\right] \tag{7.11}
$$

is strongly concentrated near $\mathbf{k} = \mathbf{k}_0$.

In the Schrödinger picture, a general paraxial state $|\Psi(0)\rangle$ has an expansion in the basis $\{|\{\mathbf{q}s\}_M\rangle\}$, and the time evolution is given by

$$
|\Psi(t)\rangle = e^{-itH/\hbar}\,|\Psi(0)\rangle\,, \tag{7.12}
$$

where $H$ is the total Hamiltonian, including interactions with atoms, etc. It is clear on physical grounds that an initial paraxial state will not in general remain paraxial. For example, a paraxial field injected into a medium containing strong scattering centers will experience large-angle scattering and thus become nonparaxial as it propagates through the medium. In more favorable cases, interaction with matter, e.g. transmission through lenses with moderate focal lengths, will conserve the paraxial property.

The only situation for which it is possible to make a rigorous general statement is free propagation. In this case the basis vectors $|\{\mathbf{q}s\}_M\rangle$ are eigenstates of the total Hamiltonian, $H = H_{\mathrm{em}}$, so that

$$
|\Psi(t)\rangle = \sum_{M=0}^{\infty}\sum_{s_1}\int\frac{d^{3}q_1}{(2\pi)^{3}}\cdots\sum_{s_M}\int\frac{d^{3}q_M}{(2\pi)^{3}}\;F(\{\mathbf{q}s\}_M)\,\exp\left[-i\sum_{m=1}^{M}\omega(|\mathbf{k}_0+\mathbf{q}_m|)\,t\right]|\{\mathbf{q}s\}_M\rangle\,, \tag{7.13}
$$

where $F(\{\mathbf{q}s\}_M) = \langle\{\mathbf{q}s\}_M|\Psi(0)\rangle$. Consequently, the state $|\Psi(t)\rangle$ remains in the paraxial space $\mathfrak{H}(\mathbf{k}_0,\theta)$ for all times.

For the sake of simplicity, we have analyzed the case of a single paraxial ray bundle, but in many applications several paraxial beams are simultaneously present. The reasons range from simple reflection by a mirror to complex wave mixing phenomena in nonlinear media. The necessary generalizations can be understood by considering two paraxial bundles with carrier waves $\mathbf{k}_1$ and $\mathbf{k}_2$ and opening angles $\theta_1$ and $\theta_2$. The two beams are said to be distinct if the vector $\Delta\mathbf{k} = \mathbf{k}_1 - \mathbf{k}_2$ satisfies

$$
|\Delta\mathbf{k}| \gg \max\left[\theta_1|\mathbf{k}_1|\,,\ \theta_2|\mathbf{k}_2|\right], \tag{7.14}
$$

i.e. the two bundles of wavevectors do not overlap. The multiparaxial space, $\mathfrak{H}(\mathbf{k}_1,\theta_1,\mathbf{k}_2,\theta_2)$, for two distinct paraxial ray bundles is spanned by the basis vectors

$$
\prod_{m=1}^{M} a^{\dagger}_{1s_m}(\mathbf{q}_m)\prod_{k=1}^{K} a^{\dagger}_{2s_k}(\mathbf{p}_k)\,|0\rangle\quad (M, K = 0, 1, \ldots)\,, \tag{7.15}
$$

where $a^{\dagger}_{\beta s}(\mathbf{q}) \equiv a^{\dagger}_{s}(\mathbf{k}_\beta+\mathbf{q})$ $(\beta = 1, 2)$ and the $\mathbf{q}$s and $\mathbf{p}$s are confined to the respective regions $Q_1$ and $Q_2$ defined by applying eqn (7.8) to each beam. The argument suggested in Exercise 7.6 shows that the paraxial spaces $\mathfrak{H}(\mathbf{k}_1,\theta_1)$ and $\mathfrak{H}(\mathbf{k}_2,\theta_2)$, which are subspaces of $\mathfrak{H}(\mathbf{k}_1,\theta_1,\mathbf{k}_2,\theta_2)$, may be treated as orthogonal within the paraxial approximation. This description is readily extended to any number of distinct beams.

7.2.3 Photon number, momentum, and energy

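The action of the number operator $N$ on the paraxial space $\mathfrak{H}(\mathbf{k}_0,\theta)$ is determined by its action on the basis states in eqn (7.10); consequently, the commutation relation, $\left[N, a^{\dagger}_{0s}(\mathbf{q})\right] = a^{\dagger}_{0s}(\mathbf{q})$, permits the use of the effective form

$$
N \simeq N_0 = \int_{Q_0}\frac{d^{3}q}{(2\pi)^{3}}\sum_{s} a^{\dagger}_{0s}(\mathbf{q})\,a_{0s}(\mathbf{q})\,. \tag{7.16}
$$

Applying the same idea to the momentum operator, given by the continuum version of eqn (3.153), leads to $\mathbf{P}_{\mathrm{em}} = \hbar\mathbf{k}_0 N_0 + \mathbf{P}_0$, where

$$
\mathbf{P}_0 = \int_{Q_0}\frac{d^{3}q}{(2\pi)^{3}}\sum_{s}\hbar\mathbf{q}\;a^{\dagger}_{0s}(\mathbf{q})\,a_{0s}(\mathbf{q}) \tag{7.17}
$$

is the paraxial momentum operator.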

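The continuum version of eqn (3.150) for the Hamiltonian in a dispersive medium can be approximated by

$$
H_{\mathrm{em}} = \int_{Q_0}\frac{d^{3}q}{(2\pi)^{3}}\sum_{s}\hbar\omega_{|\mathbf{k}_0+\mathbf{q}|}\;a^{\dagger}_{0s}(\mathbf{q})\,a_{0s}(\mathbf{q})\,, \tag{7.18}
$$

when acting on a paraxial state. The small spread in frequencies across the paraxial bundle, together with the weak dispersion condition (3.120), allows the dispersion relation $\omega_k = ck/n(\omega_k)$ to be approximated by

$$
\omega_k = \frac{ck}{n_0 + \left(\dfrac{dn}{d\omega}\right)_{0}(\omega_k-\omega_0)}\,, \tag{7.19}
$$

and a straightforward calculation yields

$$
\omega_{|\mathbf{k}_0+\mathbf{q}|} = \omega_0 + v_{g0}\,k_0\left(\left|\tilde{\mathbf{k}}_0 + \frac{\mathbf{q}}{k_0}\right| - 1\right) + \cdots. \tag{7.20}
$$

The conditions (7.8) allow the expansion

$$
\left|\tilde{\mathbf{k}}_0 + \frac{\mathbf{q}}{k_0}\right| = 1 + \frac{q_z}{k_0} + \frac{q_\top^{2}}{2k_0^{2}} + O\!\left(\theta^{2}\right), \tag{7.21}
$$

which in turn leads to the expression $H_{\mathrm{em}} = \hbar\omega_0 N_0 + H_P + O\!\left(\theta^{2}\right)$, where

$$
H_P = \int_{Q_0}\frac{d^{3}q}{(2\pi)^{3}}\sum_{s}\hbar\left[v_{g0}\,q_z + \frac{v_{g0}\,q_\top^{2}}{2k_0}\right]a^{\dagger}_{0s}(\mathbf{q})\,a_{0s}(\mathbf{q}) \tag{7.22}
$$

is the paraxial Hamiltonian for the space $\mathfrak{H}(\mathbf{k}_0,\theta)$.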
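The quality of the expansion (7.21) is easy to probe numerically. The sketch below compares the exact length of the unit carrier vector plus $\mathbf{q}/k_0$ with the two retained correction terms, for a relative wavevector sitting at the edge of the paraxial region (7.8). The helper names `exact_norm` and `paraxial_norm` are hypothetical, and choosing $|\mathbf{q}_\top| = \theta k_0$, $q_z = \theta^{2}k_0$ is an assumption made for the test.

```python
import math

def exact_norm(q_z, q_perp, k0):
    """Exact length of (unit vector along k0) + q/k0."""
    return math.hypot(1.0 + q_z / k0, q_perp / k0)

def paraxial_norm(q_z, q_perp, k0):
    """The terms retained on the right side of eqn (7.21)."""
    return 1.0 + q_z / k0 + q_perp**2 / (2.0 * k0**2)

k0 = 1.0  # work in units of the carrier wavenumber
for theta in (0.1, 0.05, 0.01):
    q_perp, q_z = theta * k0, theta**2 * k0   # edge of the region Q0
    error = abs(exact_norm(q_z, q_perp, k0) - paraxial_norm(q_z, q_perp, k0))
    print(f"theta={theta}: error={error:.3e}, error/theta^4={error / theta**4:.3f}")
```

For these sample points the neglected remainder shrinks like $\theta^{4}$, so it sits comfortably inside the error term quoted in eqn (7.21).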

The effective orthogonality of distinct paraxial spaces, which corresponds to the distinguishability of distinct paraxial beams, implies that the various global operators are additive. Thus the operators for the total photon number, momentum, and energy for a set of paraxial beams are

$$
N = \sum_{\beta}N_\beta\,,\quad \mathbf{P}_{\mathrm{em}} = \sum_{\beta}\left(\hbar\mathbf{k}_\beta N_\beta + \mathbf{P}_\beta\right),\quad H_{\mathrm{em}} = \sum_{\beta}\left(\hbar\omega_\beta N_\beta + H_{P\beta}\right), \tag{7.23}
$$

where $N_\beta$, $\mathbf{P}_\beta$, and $H_{P\beta}$ are respectively the paraxial number, momentum, and energy operators for the $\beta$th beam.

7.3 The slowly-varying envelope operator

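We next use the properties of the paraxial space $\mathfrak{H}(\mathbf{k}_0,\theta)$ to justify an approximation for the field operator, $\mathbf{A}^{(+)}(\mathbf{r},t)$, that replaces eqn (7.1) for the classical field. In order to emphasize the relation to the classical theory, we initially work in the Heisenberg picture. The slowly-varying envelope operator $\boldsymbol{\Phi}(\mathbf{r},t)$ is defined by

$$
\mathbf{A}^{(+)}(\mathbf{r},t) = \sqrt{\frac{\hbar\,(v_{g0}/c)}{2\epsilon_0 k_0 c}}\;\boldsymbol{\Phi}(\mathbf{r},t)\,e^{i(\mathbf{k}_0\cdot\mathbf{r}-\omega_0 t)}\,. \tag{7.24}
$$

Comparing this definition to the general plane-wave expansion (3.149) shows that

$$
\boldsymbol{\Phi}(\mathbf{r},t) = \int_{Q_0}\frac{d^{3}q}{(2\pi)^{3}}\sum_{s} f_q\,a_{0s}(\mathbf{q})\,\mathbf{e}_s(\mathbf{k}_0+\mathbf{q})\,e^{i(\mathbf{q}\cdot\mathbf{r}-\delta_q t)}\,, \tag{7.25}
$$

where

$$
\delta_q = \omega_{|\mathbf{k}_0+\mathbf{q}|} - \omega_0\quad\text{and}\quad f_q = \sqrt{\frac{v_g(|\mathbf{k}_0+\mathbf{q}|)\,k_0}{v_{g0}\,|\mathbf{k}_0+\mathbf{q}|}}\,. \tag{7.26}
$$

The corresponding expressions in the Schrödinger picture follow from the relation $\mathbf{A}^{(+)}(\mathbf{r}) = \mathbf{A}^{(+)}(\mathbf{r},t=0)$.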

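The envelope operator will only be slowly varying when applied to paraxial states in $\mathfrak{H}(\mathbf{k}_0,\theta)$, so we begin by using eqn (7.10) to evaluate the action of the envelope operator $\boldsymbol{\Phi}(\mathbf{r}) = \boldsymbol{\Phi}(\mathbf{r},0)$ on a typical basis vector of $\mathfrak{H}(\mathbf{k}_0,\theta)$:

$$
\begin{aligned}
\boldsymbol{\Phi}(\mathbf{r})\,|\{\mathbf{q}s\}_M\rangle &= \boldsymbol{\Phi}(\mathbf{r})\prod_{m=1}^{M}a^{\dagger}_{0s_m}(\mathbf{q}_m)\,|0\rangle \\
&= \left[\boldsymbol{\Phi}(\mathbf{r}),\prod_{m=1}^{M}a^{\dagger}_{0s_m}(\mathbf{q}_m)\right]|0\rangle \\
&= \sum_{m=1}^{M}\left[\boldsymbol{\Phi}(\mathbf{r}),a^{\dagger}_{0s_m}(\mathbf{q}_m)\right]\prod_{\substack{l=1\\ l\neq m}}^{M}a^{\dagger}_{0s_l}(\mathbf{q}_l)\,|0\rangle\,,
\end{aligned}
\tag{7.27}
$$

where the last line follows from the identity (C.49). Setting $t = 0$ in eqn (7.25) produces the Schrödinger-picture representation of the envelope operator,

$$
\boldsymbol{\Phi}(\mathbf{r}) = \int_{Q_0}\frac{d^{3}q}{(2\pi)^{3}}\sum_{s} f_q\,a_{0s}(\mathbf{q})\,\mathbf{e}_s(\mathbf{k}_0+\mathbf{q})\,e^{i\mathbf{q}\cdot\mathbf{r}}\,, \tag{7.28}
$$

and using this in the calculation of the commutator yields

$$
\begin{aligned}
\left[\boldsymbol{\Phi}(\mathbf{r}),a^{\dagger}_{0s_m}(\mathbf{q}_m)\right] &= f_{q_m}\,\mathbf{e}_{s_m}(\mathbf{k}_0+\mathbf{q}_m)\,e^{i\mathbf{q}_m\cdot\mathbf{r}} \\
&= \mathbf{e}_{s_m}(\mathbf{k}_0)\,e^{i\mathbf{q}_m\cdot\mathbf{r}} + O(\theta)\,.
\end{aligned}
\tag{7.29}
$$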

Thus when acting on paraxial states the exact representation (7.28) can be replaced by the approximate form

$$
\boldsymbol{\Phi}(\mathbf{r}) = \sum_{s}\phi_s(\mathbf{r})\,\mathbf{e}_{0s} + O(\theta)\,, \tag{7.30}
$$

where $\mathbf{e}_{0s} = \mathbf{e}_s(\mathbf{k}_0)$, and

$$
\phi_s(\mathbf{r}) = \int_{Q_0}\frac{d^{3}q}{(2\pi)^{3}}\,a_{0s}(\mathbf{q})\,e^{i\mathbf{q}\cdot\mathbf{r}}\,. \tag{7.31}
$$

The subscript $Q_0$ on the integral is to remind us that the integration domain is restricted by eqn (7.8). This representation can only be used when the operator acts on a vector in the paraxial space. It is in this sense that the $z$-component of the envelope operator is small, i.e.

$$
\langle\Psi_1|\Phi_z(\mathbf{r})|\Psi_2\rangle = O(\theta)\,, \tag{7.32}
$$

for any pair of normalized vectors $|\Psi_1\rangle$ and $|\Psi_2\rangle$ that both belong to $\mathfrak{H}(\mathbf{k}_0,\theta)$. In the leading paraxial approximation, i.e. neglecting $O(\theta)$-terms, the electric field operator is

$$
\mathbf{E}^{(+)}(\mathbf{r},t) = i\sqrt{\frac{\hbar\omega_0\,(v_{g0}/c)}{2\epsilon_0 n_0}}\;\sum_{s}\mathbf{e}_{0s}\,\phi_s(\mathbf{r},t)\,e^{i(\mathbf{k}_0\cdot\mathbf{r}-\omega_0 t)}\,. \tag{7.33}
$$

The commutation relations for the transverse components of the envelope operator have the simple form

$$
\left[\Phi_i(\mathbf{r},t),\Phi^{\dagger}_{j}(\mathbf{r}',t)\right] = \delta_{ij}\,\delta(\mathbf{r}-\mathbf{r}')\quad (i, j = 1, 2)\,, \tag{7.34}
$$

which shows that the paraxial electromagnetic field is described by two independent operators $\Phi_1(\mathbf{r})$ and $\Phi_2(\mathbf{r})$ satisfying local commutation relations. This reflects the fact that the paraxial approximation eliminates the nonlocal features exhibited in the exact commutation relations (3.16) by effectively averaging the arguments $\mathbf{r}$ and $\mathbf{r}'$ over volumes large compared to $\lambda_0^{3}$. By the same token, the delta function appearing on the right side of eqn (7.34) is coarse-grained, i.e. it only gives correct results when applied to functions that vary slowly on the scale of the carrier wavelength. This feature will be important when we return to the problem of photon localization.


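In most applications the operators $\phi_s(\mathbf{r},t)$, corresponding to definite polarization states, are more useful. They satisfy the commutation relations

$$
\left[\phi_s(\mathbf{r},t),\phi^{\dagger}_{s'}(\mathbf{r}',t)\right] = \delta_{ss'}\,\delta(\mathbf{r}-\mathbf{r}')\quad (s, s' = \pm\ \text{or}\ 1, 2)\,. \tag{7.35}
$$

The approximate expansion (7.31) can be inverted to get

$$
a_{0s}(\mathbf{q}) = \int d^{3}r\,\phi_s(\mathbf{r})\,e^{-i\mathbf{q}\cdot\mathbf{r}} = \int d^{3}r\,\mathbf{e}^{*}_{s}(\mathbf{k}_0)\cdot\boldsymbol{\Phi}(\mathbf{r})\,e^{-i\mathbf{q}\cdot\mathbf{r}}\,, \tag{7.36}
$$

which is valid for $\mathbf{q}$ in the paraxial region $Q_0$. By using this inversion formula the operators $N_0$, $\mathbf{P}_0$, and $H_P$ can be expressed in terms of the slowly-varying envelope operator:

$$
N_0 = \int d^{3}r\sum_{s}\phi^{\dagger}_{s}(\mathbf{r})\,\phi_s(\mathbf{r})\,, \tag{7.37}
$$

$$
\mathbf{P}_0 = \int d^{3}r\sum_{s}\phi^{\dagger}_{s}(\mathbf{r})\,\frac{\hbar}{i}\nabla\phi_s(\mathbf{r})\,, \tag{7.38}
$$

$$
H_P = \int d^{3}r\sum_{s}\phi^{\dagger}_{s}(\mathbf{r})\left[v_{g0}\,\frac{\hbar}{i}\nabla_z - \frac{\hbar\,v_{g0}\,\nabla_\top^{2}}{2k_0}\right]\phi_s(\mathbf{r})\,. \tag{7.39}
$$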

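We can gain a better understanding of the paraxial Hamiltonian by substituting eqns (7.24) and (7.22) into the Heisenberg equation

$$
i\hbar\,\frac{\partial}{\partial t}\mathbf{A}^{(+)}(\mathbf{r},t) = \left[\mathbf{A}^{(+)}(\mathbf{r},t),H_{\mathrm{em}}\right] \tag{7.40}
$$

to get

$$
\hbar\omega_0\,\boldsymbol{\Phi}(\mathbf{r},t) + i\hbar\,\frac{\partial}{\partial t}\boldsymbol{\Phi}(\mathbf{r},t) = \hbar\omega_0\left[\boldsymbol{\Phi}(\mathbf{r},t),N_0\right] + \left[\boldsymbol{\Phi}(\mathbf{r},t),H_P\right]. \tag{7.41}
$$

Since the envelope operator $\boldsymbol{\Phi}(\mathbf{r},t)$ is a sum of annihilation operators, it satisfies $\left[\boldsymbol{\Phi}(\mathbf{r},t),N_0\right] = \boldsymbol{\Phi}(\mathbf{r},t)$. Consequently, the term $\hbar\omega_0\left[\boldsymbol{\Phi}(\mathbf{r},t),N_0\right]$ is canceled by the time derivative of the carrier wave. The Heisenberg equation for the envelope field $\boldsymbol{\Phi}(\mathbf{r},t)$ is therefore

$$
i\hbar\,\frac{\partial}{\partial t}\boldsymbol{\Phi}(\mathbf{r},t) = \left[\boldsymbol{\Phi}(\mathbf{r},t),H_P\right]. \tag{7.42}
$$

This shows that the paraxial Hamiltonian generates the time translation of the envelope field. By using the explicit form (7.22) of $H_P$ and the commutation relations (7.34), it is simple to see that the Heisenberg equation can be written in the equivalent forms

$$
i\left(\nabla_z + \frac{1}{v_{g0}}\frac{\partial}{\partial t}\right)\boldsymbol{\Phi}(\mathbf{r},t) + \frac{1}{2k_0}\nabla_\top^{2}\,\boldsymbol{\Phi}(\mathbf{r},t) = 0 \tag{7.43}
$$

or

$$
i\left(\nabla_z + \frac{1}{v_{g0}}\frac{\partial}{\partial t}\right)\phi_s(\mathbf{r},t) + \frac{1}{2k_0}\nabla_\top^{2}\,\phi_s(\mathbf{r},t) = 0\,. \tag{7.44}
$$

Multiplying eqn (7.43) by the normalization factor in eqn (7.24) and passing to the classical limit $\left(\mathbf{A}^{(+)}(\mathbf{r},t) \to \boldsymbol{\mathcal{A}}(\mathbf{r},t)\exp\left[i(\mathbf{k}_0\cdot\mathbf{r}-\omega_0 t)\right]\right)$ yields the standard paraxial wave equation of the classical theory.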


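The single-beam argument can be applied to each of the distinct beams to give the Schrödinger-picture representation,

$$
\mathbf{A}^{(+)}(\mathbf{r}) = \sum_{\beta s}\sqrt{\frac{\hbar\,(v_{g\beta}/c)}{2\epsilon_0 k_\beta c}}\;\mathbf{e}_{\beta s}\,\phi_{\beta s}(\mathbf{r})\,e^{i\mathbf{k}_\beta\cdot\mathbf{r}}\,, \tag{7.45}
$$

where $\mathbf{e}_{\beta s} = \mathbf{e}_s(\mathbf{k}_\beta)$, $\omega_\beta = \omega(k_\beta) = ck_\beta/n_\beta$, $v_{g\beta}$ is the group velocity for the $\beta$th carrier wave,

$$
\phi_{\beta s}(\mathbf{r}) = \int_{Q_\beta}\frac{d^{3}q}{(2\pi)^{3}}\,a_{\beta s}(\mathbf{q})\,e^{i\mathbf{q}\cdot\mathbf{r}}\,, \tag{7.46}
$$

and

$$
\left[\phi_{\beta s}(\mathbf{r}),\phi^{\dagger}_{\beta' s'}(\mathbf{r}')\right] \approx \delta_{\beta\beta'}\,\delta_{ss'}\,\delta(\mathbf{r}-\mathbf{r}')\quad (s, s' = \pm\ \text{or}\ 1, 2)\,. \tag{7.47}
$$

The last result, which is established in Exercise 7.3, means that the envelope fields for distinct beams represent independent degrees of freedom.

The corresponding expression for the electric field operator in the paraxial approximation is

$$
\mathbf{E}^{(+)}(\mathbf{r}) = i\sum_{\beta s}\sqrt{\frac{\hbar\omega_\beta\,(v_{g\beta}/c)}{2\epsilon_0 n_\beta}}\;\mathbf{e}_{\beta s}\,\phi_{\beta s}(\mathbf{r})\,e^{i\mathbf{k}_\beta\cdot\mathbf{r}}\,. \tag{7.48}
$$

The operators for the photon number $N_\beta$, the momentum $\mathbf{P}_\beta$, and the paraxial Hamiltonian $H_{P\beta}$ of the individual beams are obtained by applying eqns (7.37)-(7.39) to each beam.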

7.4 Gaussian beams and pulses

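It is clear from the relation $\mathbf{E} = -\partial\mathbf{A}/\partial t$ that the electric field also satisfies the paraxial wave equation. For the special case of propagation along the $z$-axis through vacuum, we find

$$
\frac{1}{2k_0}\nabla_\top^{2}\,\overline{\boldsymbol{\mathcal{E}}} + i\left(\frac{\partial\overline{\boldsymbol{\mathcal{E}}}}{\partial z} + \frac{1}{c}\frac{\partial\overline{\boldsymbol{\mathcal{E}}}}{\partial t}\right) = 0\,. \tag{7.49}
$$

For fields with pulse duration much longer than any relevant time scale (or equivalently with spectral width much smaller than any relevant frequency) the time dependence of the slowly-varying envelope function can be neglected; that is, one can set $\partial\overline{\boldsymbol{\mathcal{E}}}/\partial t = 0$ in eqn (7.49). The most useful time-independent solutions of the paraxial equation are those which exhibit minimal diffractive spreading. The fundamental solution with these properties, which is called a Gaussian beam or a Gaussian mode (Yariv, 1989, Sec. 6.6), is

$$
\overline{\boldsymbol{\mathcal{E}}}(\mathbf{r},t) = \overline{\boldsymbol{\mathcal{E}}}_0(\mathbf{r}_\top,z) = E_0\,\mathbf{e}_0\,\frac{w_0\,e^{-i\phi(z)}}{w(z)}\,\exp\left[ik_0\frac{\rho^{2}}{2R(z)}\right]\exp\left[-\frac{\rho^{2}}{w^{2}(z)}\right], \tag{7.50}
$$

where the polarization vector $\mathbf{e}_0$ is in the $x$-$y$ plane and $\rho = |\mathbf{r}_\top|$. The functions of $z$ on the right side are defined by

$$
w(z) = w_0\sqrt{1 + \left(\frac{z-z_w}{Z_R}\right)^{2}}\,, \tag{7.51}
$$

$$
R(z) = z - z_w + \frac{Z_R^{2}}{z-z_w}\,, \tag{7.52}
$$

$$
\phi(z) = \tan^{-1}\left(\frac{z-z_w}{Z_R}\right), \tag{7.53}
$$

where the Rayleigh range $Z_R$ is

$$
Z_R = \frac{\pi w_0^{2}}{\lambda_0} > 0\,. \tag{7.54}
$$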

The function $w(z)$, which defines the width of the transverse Gaussian profile, has the minimum value $w_0$ (the spot size) at $z = z_w$ (the beam waist). The solution is completely characterized by $\mathbf{e}_0$, $E_0$, $w_0$, and $z_w$. The function $R(z)$, which represents the radius of curvature of the phase front, is negative for $z < z_w$, and positive for $z > z_w$. The picture is of waves converging from the left and diverging to the right of the focal point at the waist. The definition (7.51) shows that

$$
w(z_w + Z_R) = \sqrt{2}\,w_0\,, \tag{7.55}
$$

so the Rayleigh range measures the distance required for diffraction to double the area of the spot. There are also higher-order Gaussian modes that are not invariant under rotations around the beam axis (Yariv, 1989, Sec. 6.9).
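A finite-difference check is a quick way to confirm that the Gaussian mode really satisfies the time-independent paraxial equation. The sketch below evaluates the scalar part of eqn (7.50) and the residual of $\frac{1}{2k_0}\nabla_\top^{2}u + i\,\partial u/\partial z$ at one sample point. The function names, the choice $E_0 = 1$, the omission of the polarization vector, and the use of lengths measured in units of the wavelength are all assumptions made for illustration.

```python
import cmath
import math

def beam_width(z, w0, wavelength, z_w=0.0):
    """w(z) from eqn (7.51), with Z_R from eqn (7.54)."""
    z_r = math.pi * w0**2 / wavelength
    return w0 * math.sqrt(1.0 + ((z - z_w) / z_r) ** 2)

def gaussian_envelope(rho, z, w0, wavelength, z_w=0.0):
    """Scalar part of eqn (7.50) with E_0 = 1 (polarization vector omitted)."""
    k0 = 2.0 * math.pi / wavelength
    z_r = math.pi * w0**2 / wavelength
    dz = z - z_w
    w = beam_width(z, w0, wavelength, z_w)
    inv_radius = dz / (dz**2 + z_r**2)   # 1/R(z) from eqn (7.52); finite at the waist
    gouy = math.atan(dz / z_r)           # phi(z) from eqn (7.53)
    phase = cmath.exp(1j * (k0 * rho**2 * inv_radius / 2.0 - gouy))
    return (w0 / w) * math.exp(-(rho / w) ** 2) * phase

def paraxial_residual(rho, z, w0, wavelength, h=1e-3):
    """Central-difference estimate of (1/2k0) lap_perp u + i du/dz."""
    k0 = 2.0 * math.pi / wavelength

    def u(r, zz):
        return gaussian_envelope(r, zz, w0, wavelength)

    d2_rho = (u(rho + h, z) - 2.0 * u(rho, z) + u(rho - h, z)) / h**2
    d1_rho = (u(rho + h, z) - u(rho - h, z)) / (2.0 * h)
    laplacian = d2_rho + d1_rho / rho    # radial Laplacian, axially symmetric u
    d_z = (u(rho, z + h) - u(rho, z - h)) / (2.0 * h)
    return laplacian / (2.0 * k0) + 1j * d_z

# Lengths in units of the wavelength: spot size 10, sample point rho = 5, z = 100.
print(abs(paraxial_residual(5.0, 100.0, 10.0, 1.0)))
```

The residual is limited only by the finite-difference step, and `beam_width` reproduces eqn (7.55) when evaluated one Rayleigh range from the waist.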

The assumption $\partial\overline{\boldsymbol{\mathcal{E}}}/\partial t = 0$ means that the Gaussian beam represents an infinitely long pulse, so we should expect that it is not a normalizable solution. This is readily verified by showing that the normalization integral over the transverse coordinates has the $z$-independent value

$$
\int d^{2}r_\top\left|\overline{\boldsymbol{\mathcal{E}}}_0(\mathbf{r}_\top,z)\right|^{2} = \frac{\pi w_0^{2}}{2}\,|E_0|^{2}\,, \tag{7.56}
$$

so that the $z$-integral diverges. A more realistic description is based on the observation that

$$
\overline{\boldsymbol{\mathcal{E}}}_P(\mathbf{r},t) = F_P(z-ct)\,\overline{\boldsymbol{\mathcal{E}}}_0(\mathbf{r}_\top,z) \tag{7.57}
$$

is a time-dependent solution of eqn (7.49) for any choice of the function $F_P(z)$. If $F_P(z)$ is normalizable, then the Gaussian pulse (or Gaussian wave packet) $\overline{\boldsymbol{\mathcal{E}}}_P(\mathbf{r},t)$ is normalizable at all times. The pulse-envelope function is frequently chosen to be Gaussian also, i.e.

$$
F_P(z) = F_{P0}\,\exp\left[-\frac{(z-z_0)^{2}}{L_P^{2}}\right], \tag{7.58}
$$

where $L_P$ is the pulse length and $T_P = L_P/c$ is the pulse duration.

7.5 The paraxial expansion*

The approach to the quantum paraxial approximation presented above is sufficient for most practical purposes, but it does not provide any obvious way to calculate corrections. A systematic expansion scheme is desirable for at least two reasons.

(1) It is not wise to depend on an approximation in the absence of any method for estimating the errors involved.

(2) There are some questions of principle, e.g. the issue of photon localizability, which require the evaluation of higher-order terms.

We will therefore very briefly outline a systematic expansion in powers of $\theta$ (Deutsch and Garrison, 1991a) which is an extension of a method developed by Lax et al. (1974) for the classical theory. In the interests of simplicity, only propagation in the vacuum will be considered.

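In order to construct a consistent expansion in powers of $\theta$, it is first necessary to normalize all physical quantities by using the characteristic lengths introduced in Section 7.2.1. The first step is to define a characteristic volume

$$V_0 = \Lambda_\perp^2 \Lambda_\parallel = \theta^{-4} \left(\frac{\lambda_0}{2\pi}\right)^3, \tag{7.59}$$

and a dimensionless wavevector $\bar{\mathbf{q}} = \bar{\mathbf{q}}_\perp + \bar{q}_z \tilde{\mathbf{k}}_0$, with $\bar{\mathbf{q}}_\perp = \mathbf{q}_\perp \Lambda_\perp$ and $\bar{q}_z = q_z \Lambda_\parallel$. In terms of the scaled wavevector $\bar{\mathbf{q}}$, the paraxial constraints (7.8) are

$$Q_0 = \left\{ \bar{\mathbf{q}} \ \text{satisfying} \ |\bar{\mathbf{q}}_\perp| \le 1 ,\ \bar{q}_z \le 1 \right\}. \tag{7.60}$$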
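The scaling can be made concrete with a few lines of arithmetic. The sketch below assumes $\Lambda_\perp = 1/(\theta k_0)$ and $\Lambda_\parallel = 1/(\theta^2 k_0)$, which is the choice consistent with eqn (7.59); the function names and sample numbers are hypothetical. The membership test bounds the magnitude of $\bar{q}_z$, which is an interpretive choice for the region $Q_0$.

```python
import math

def paraxial_scales(wavelength, theta):
    """Return (k0, lam_perp, lam_par, v0) for a carrier wavelength and opening angle.

    Assumes Lambda_perp = 1/(theta*k0) and Lambda_par = 1/(theta**2*k0).
    """
    k0 = 2.0 * math.pi / wavelength
    lam_perp = 1.0 / (theta * k0)
    lam_par = 1.0 / (theta ** 2 * k0)
    v0 = lam_perp ** 2 * lam_par  # characteristic volume, eqn (7.59)
    return k0, lam_perp, lam_par, v0

def scale_wavevector(q_x, q_y, q_z, wavelength, theta):
    """Map a relative wavevector q to the dimensionless wavevector q_bar."""
    _, lam_perp, lam_par, _ = paraxial_scales(wavelength, theta)
    return q_x * lam_perp, q_y * lam_perp, q_z * lam_par

def in_paraxial_region(q_bar):
    """Membership test for Q0 (bounding |q_bar_z| is an assumption of this sketch)."""
    q_x, q_y, q_z = q_bar
    return math.hypot(q_x, q_y) <= 1.0 and abs(q_z) <= 1.0

wavelength, theta = 1.0e-6, 0.01
k0, lam_perp, lam_par, v0 = paraxial_scales(wavelength, theta)

# Both expressions for V0 in eqn (7.59) should agree.
print(v0, theta ** -4 * (wavelength / (2.0 * math.pi)) ** 3)

# A wavevector well inside the paraxial region, and one outside it.
inside = scale_wavevector(0.5 * theta * k0, 0.0, 0.3 * theta ** 2 * k0, wavelength, theta)
outside = scale_wavevector(5.0 * theta * k0, 0.0, 0.0, wavelength, theta)
print(inside, in_paraxial_region(inside))
print(outside, in_paraxial_region(outside))
```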

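The operators $a_s^\dagger(\mathbf{k})$ have dimensions $L^{3/2}$, so the dimensionless operators $\bar{a}_s^\dagger(\bar{\mathbf{q}}) = V_0^{-1/2} a_s^\dagger(\mathbf{k}_0 + \mathbf{q})$ satisfy the commutation relation

$$\left[ \bar{a}_s(\bar{\mathbf{q}}) , \bar{a}_{s'}^\dagger(\bar{\mathbf{q}}') \right] = \delta_{ss'} (2\pi)^3 \delta(\bar{\mathbf{q}} - \bar{\mathbf{q}}'). \tag{7.61}$$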

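In the space–time domain, the operator $\boldsymbol{\Phi}(\mathbf{r}, t)$ has dimensions $L^{-3/2}$, so it is natural to define a dimensionless envelope field by $\bar{\boldsymbol{\Phi}}(\bar{\mathbf{r}}, t) = \sqrt{V_0}\, \boldsymbol{\Phi}(\mathbf{r}, t)$, where $\bar{\mathbf{r}} = \bar{\mathbf{r}}_\perp + \bar{z} \tilde{\mathbf{k}}_0$ and $\bar{\mathbf{r}}_\perp = \mathbf{r}_\perp/\Lambda_\perp$, $\bar{z} = z/\Lambda_\parallel$. The scaled position-space variables satisfy $\bar{\mathbf{q}} \cdot \bar{\mathbf{r}} = \mathbf{q} \cdot \mathbf{r} = \bar{\mathbf{q}}_\perp \cdot \bar{\mathbf{r}}_\perp + \bar{q}_z \bar{z}$. The operator $\bar{\boldsymbol{\Phi}}(\bar{\mathbf{r}}, t)$ is related to $\bar{a}_s(\bar{\mathbf{q}})$ by

$$\bar{\boldsymbol{\Phi}}(\bar{\mathbf{r}}) = \int_{Q_0} \frac{d^3 \bar{q}}{(2\pi)^3} \sum_s \bar{a}_s(\bar{\mathbf{q}})\, \mathbf{X}_s(\bar{\mathbf{q}}, \theta)\, e^{i \bar{\mathbf{q}} \cdot \bar{\mathbf{r}}}, \tag{7.62}$$

where $\mathbf{X}_s(\bar{\mathbf{q}}, \theta)$ is the c-number function:

$$\mathbf{X}_s(\bar{\mathbf{q}}, \theta) = \sqrt{\frac{k_0}{|\mathbf{k}_0 + \mathbf{q}|}}\; \mathbf{e}_s(\mathbf{k}_0 + \mathbf{q}) = \sum_{n=0}^{\infty} \theta^n \mathbf{X}_s^{(n)}(\bar{\mathbf{q}}). \tag{7.63}$$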

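Substituting this expansion into eqn (7.62) and exchanging the sum over $n$ with the integral over $\bar{\mathbf{q}}$ yields

$$\bar{\boldsymbol{\Phi}}(\bar{\mathbf{r}}) = \sum_{n=0}^{\infty} \theta^n \bar{\boldsymbol{\Phi}}^{(n)}(\bar{\mathbf{r}}), \tag{7.64}$$

where the $n$th-order coefficient is

$$\bar{\boldsymbol{\Phi}}^{(n)}(\bar{\mathbf{r}}) = \int \frac{d^3 \bar{q}}{(2\pi)^3} \sum_s \bar{a}_s(\bar{\mathbf{q}})\, \mathbf{X}_s^{(n)}(\bar{\mathbf{q}})\, e^{i \bar{\mathbf{q}} \cdot \bar{\mathbf{r}}}. \tag{7.65}$$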

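The zeroth-order relation

$$\bar{\boldsymbol{\Phi}}^{(0)}(\bar{\mathbf{r}}) = \int \frac{d^3 \bar{q}}{(2\pi)^3} \sum_s \bar{a}_s(\bar{\mathbf{q}})\, \mathbf{e}_s(\mathbf{k}_0)\, e^{i \bar{\mathbf{q}} \cdot \bar{\mathbf{r}}} \tag{7.66}$$

agrees with the previous paraxial approximation (7.31), and it can be inverted to give

$$\bar{a}_s(\bar{\mathbf{q}}) = \int d^3 \bar{r}\; \bar{\boldsymbol{\Phi}}^{(0)}(\bar{\mathbf{r}}) \cdot \mathbf{e}_s^*(\mathbf{k}_0)\, e^{-i \bar{\mathbf{q}} \cdot \bar{\mathbf{r}}}. \tag{7.67}$$

Carrying out Exercise 7.5 shows that all higher-order coefficients can be expressed in terms of $\bar{\boldsymbol{\Phi}}^{(0)}(\bar{\mathbf{r}})$.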

We can justify the operator expansion (7.64) by calculating the action of the exact envelope operator on a typical basis vector in $\mathcal{H}(\mathbf{k}_0, \theta)$, and showing that the expansion of the resulting vector in $\theta$ agrees, order by order, with the result of applying the operator expansion. In the same way it can be shown that the operator expansion reproduces the exact commutation relations (Deutsch and Garrison, 1991a).
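The size of the truncation error in such an expansion can be probed numerically. The sketch below looks only at the scalar factor $\sqrt{k_0/|\mathbf{k}_0 + \mathbf{q}|}$, assuming that this square-root form is the prefactor in eqn (7.63); the polarization vector is not treated. Written in scaled variables, the factor expands as $1 - \theta^2(\bar{q}_z/2 + \bar{q}_\perp^2/4) + O(\theta^4)$, so the error of the second-order truncation divided by $\theta^4$ should stay bounded. All function names and sample values are hypothetical.

```python
import math

def exact_prefactor(q_bar_perp, q_bar_z, theta):
    """Exact sqrt(k0 / |k0 + q|) written in scaled variables; k0 cancels out."""
    # |k0 + q| / k0 = sqrt((1 + theta^2 q_bar_z)^2 + (theta q_bar_perp)^2)
    length_ratio = math.hypot(1.0 + theta ** 2 * q_bar_z, theta * q_bar_perp)
    return 1.0 / math.sqrt(length_ratio)

def truncated_prefactor(q_bar_perp, q_bar_z, theta):
    """Expansion through O(theta^2): 1 - theta^2 (q_bar_z / 2 + q_bar_perp^2 / 4)."""
    return 1.0 - theta ** 2 * (q_bar_z / 2.0 + q_bar_perp ** 2 / 4.0)

def scaled_error(q_bar_perp, q_bar_z, theta):
    """Truncation error divided by theta^4; bounded if the expansion is correct."""
    diff = exact_prefactor(q_bar_perp, q_bar_z, theta) - truncated_prefactor(q_bar_perp, q_bar_z, theta)
    return abs(diff) / theta ** 4

# Sample point inside the scaled paraxial region (|q_bar_perp| <= 1, q_bar_z <= 1).
for theta in (0.2, 0.1, 0.05):
    print(theta, scaled_error(0.8, 0.5, theta))
```

This is only an error estimate for one factor, but it illustrates the practical use of a systematic expansion: each truncation comes with a definite power of $\theta$ controlling what has been dropped.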

7.6 Paraxial wave packets*

The use of non-normalizable basis states to define the paraxial space can be avoided by employing wave packet creation operators. For this purpose, we restrict the polarization amplitudes, $w_s(\mathbf{k})$, (introduced in Section 3.5.1) to those that have the form $w_s(\mathbf{k}_0 + \mathbf{q}) = V_0^{1/2}\, \bar{w}_s(\bar{\mathbf{q}})$. Instead of confining the relative wavevectors $\mathbf{q}$ to the region $Q_0$ described by eqn (7.60), we define a paraxial wave packet (with carrier wavevector $\mathbf{k}_0$ and opening angle $\theta$) by the assumption that $\bar{w}_s(\bar{\mathbf{q}})$ vanishes rapidly outside $Q_0$, i.e. $\bar{w}_s(\bar{\mathbf{q}})$ belongs to the space

$$P(\mathbf{k}_0, \theta) = \left\{ \bar{w}_s(\bar{\mathbf{q}}) \ \text{such that} \ \lim_{|\bar{\mathbf{q}}| \to \infty} |\bar{\mathbf{q}}|^n\, |\bar{w}_s(\bar{\mathbf{q}})| = 0 \ \text{for all} \ n \ge 0 \right\}. \tag{7.68}$$

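The inner product for this space of classical wave packets is defined by

$$(w, v) = \int \frac{d^3 q}{(2\pi)^3} \sum_s w_s^*(\mathbf{k}_0 + \mathbf{q})\, v_s(\mathbf{k}_0 + \mathbf{q}). \tag{7.69}$$

Since the two wave packets belong to the same space, this can be written in terms of scaled variables as

$$(w, v) = \int \frac{d^3 \bar{q}}{(2\pi)^3} \sum_s \bar{w}_s^*(\bar{\mathbf{q}})\, \bar{v}_s(\bar{\mathbf{q}}). \tag{7.70}$$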


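For a paraxial wave packet, we set $\mathbf{k} = \mathbf{k}_0 + \mathbf{q}$ in the general definition (3.191) to get

$$a^\dagger[w] = \int \frac{d^3 q}{(2\pi)^3} \sum_s a_s^\dagger(\mathbf{k}_0 + \mathbf{q})\, w_s(\mathbf{k}_0 + \mathbf{q}) = \int \frac{d^3 \bar{q}}{(2\pi)^3} \sum_s \bar{a}_{0s}^\dagger(\bar{\mathbf{q}})\, \bar{w}_s(\bar{\mathbf{q}}). \tag{7.71}$$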

The paraxial space defined by eqn (7.10) can equally well be built up from the vacuum by forming all linear combinations of states of the form

$$|\{w\}_P\rangle = \prod_{p=1}^{P} a^\dagger[w_p]\, |0\rangle, \tag{7.72}$$

where $\{w\}_P = \{w_1, \ldots, w_P\}$, $P = 0, 1, 2, \ldots$, and the $w_p$s range over all of $P(\mathbf{k}_0, \theta)$. The only difference from the construction of the full Fock space is the restriction of the wave packets to the paraxial space $P(\mathbf{k}_0, \theta) \subset \Gamma_{em}$, where $\Gamma_{em}$ is the electromagnetic phase space of classical wave packets defined by eqn (3.189).

The multiparaxial Hilbert spaces introduced in Section 7.2.2 can also be described in wave packet terms. The distinct paraxial beams considered there correspond to the wave packet spaces $P(\mathbf{k}_1, \theta_1)$ and $P(\mathbf{k}_2, \theta_2)$. Paraxial wave packets, $w \in P(\mathbf{k}_1, \theta_1)$ and $v \in P(\mathbf{k}_2, \theta_2)$, are concentrated around $\mathbf{k}_1$ and $\mathbf{k}_2$ respectively, so it is eminently plausible that $w$ and $v$ are effectively orthogonal. More precisely, it is shown in Exercise 7.6 that

$$\lim_{\theta_2 \to 0} \frac{1}{(\theta_2)^n}\, |(w, v)| = 0 \quad \text{for all} \ n \ge 1, \tag{7.73}$$

i.e. $|(w, v)|$ vanishes faster than any power of $\theta_2$. The symmetry of the inner product guarantees that the same conclusion holds for $\theta_1$; consequently, the wave packet spaces $P(\mathbf{k}_1, \theta_1)$ and $P(\mathbf{k}_2, \theta_2)$ can be treated as orthogonal to any finite order in $\theta_1$ or $\theta_2$.
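The phrase "vanishes faster than any power" can be illustrated with a deliberately simplified model. The sketch below is a one-dimensional toy, not the three-dimensional calculation of Exercise 7.6: both wave packets are taken to be normalized Gaussians in $k$ with the same width $\sigma = \theta k_0$ (that is, equal opening angles, an assumption made here for simplicity), centered at hypothetical carriers `k1` and `k2`. For this model the overlap has the closed form $\exp[-(k_1 - k_2)^2/(8\sigma^2)]$, which the code compares with a direct numerical integration.

```python
import math

def gaussian_amplitude(k, center, sigma):
    """Normalized 1D amplitude: the integral of |w(k)|^2 over k equals 1."""
    norm = (2.0 * math.pi * sigma ** 2) ** -0.25
    return norm * math.exp(-(k - center) ** 2 / (4.0 * sigma ** 2))

def overlap(k1, k2, theta, k0=1.0, n_steps=40000):
    """Midpoint-rule estimate of the inner product of two Gaussian packets of width theta*k0."""
    sigma = theta * k0
    lo = min(k1, k2) - 12.0 * sigma
    hi = max(k1, k2) + 12.0 * sigma
    dk = (hi - lo) / n_steps
    total = 0.0
    for i in range(n_steps):
        k = lo + (i + 0.5) * dk
        total += gaussian_amplitude(k, k1, sigma) * gaussian_amplitude(k, k2, sigma) * dk
    return total

def closed_form(k1, k2, theta, k0=1.0):
    """Analytic overlap of the two toy Gaussians."""
    sigma = theta * k0
    return math.exp(-(k1 - k2) ** 2 / (8.0 * sigma ** 2))

k1, k2 = 1.0, 1.5  # hypothetical carrier wavenumbers in units of k0
for theta in (0.1, 0.05, 0.025):
    s = overlap(k1, k2, theta)
    # The overlap divided by theta^5 still decreases as theta shrinks.
    print(theta, s, closed_form(k1, k2, theta), s / theta ** 5)
```

In this toy model the overlap decays like $\exp(-\text{const}/\theta^2)$, so dividing by any fixed power of $\theta$ cannot prevent it from going to zero.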

The approximate orthogonality of the wave packets $w$ and $v$ combined with the general rule (3.192) implies

$$\left[ a[w] , a^\dagger[v] \right] = 0 \tag{7.74}$$

whenever $w$ and $v$ belong to distinct paraxial wave packet spaces. From this it is easy to see that the quantum paraxial spaces $\mathcal{H}(\mathbf{k}_1, \theta_1)$ and $\mathcal{H}(\mathbf{k}_2, \theta_2)$ are orthogonal to any finite order in the small parameters $\theta_1$ and $\theta_2$. In the paraxial approximation, distinct paraxial wave packets behave as though they were truly orthogonal modes. This means that the multiparaxial Hilbert space describing the situation in which several distinct paraxial beams are present is generated from the vacuum by generalizing eqn (7.72) to

$$|\{w_1\}_{P_1}, \{w_2\}_{P_2}, \ldots\rangle = \prod_{\beta} \prod_{p=1}^{P_\beta} a^\dagger[w_{\beta p}]\, |0\rangle, \tag{7.75}$$

where $P_\beta = 0, 1, \ldots$, and the $w_{\beta p}$s are chosen from $P(\mathbf{k}_\beta, \theta_\beta)$.

7.7 Angular momentum*

The derivation of the paraxial approximation for the angular momentum $\mathbf{J} = \mathbf{L} + \mathbf{S}$ is complicated by the fact (discussed in Section 3.4) that the operator $\mathbf{L}$ does not have a convenient expression in terms of plane waves. Fortunately, the argument used to show that the energy and the linear momentum are additive also applies to the angular momentum; therefore, we can restrict attention to a single paraxial space. Let us begin by rewriting the expression (3.58) for the helicity operator $\mathbf{S}$ as

$$\mathbf{S} = \hbar \int_P \frac{d^3 \bar{q}}{(2\pi)^3}\, \frac{\tilde{\mathbf{k}}_0 + \mathbf{q}/k_0}{\left| \tilde{\mathbf{k}}_0 + \mathbf{q}/k_0 \right|} \left[ \bar{a}_+^\dagger(\bar{\mathbf{q}})\, \bar{a}_+(\bar{\mathbf{q}}) - \bar{a}_-^\dagger(\bar{\mathbf{q}})\, \bar{a}_-(\bar{\mathbf{q}}) \right]. \tag{7.76}$$

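The ratio $\mathbf{q}/k_0$ can be expressed as

$$\frac{\mathbf{q}}{k_0} = \frac{\Lambda_\perp \mathbf{q}_\perp}{\Lambda_\perp k_0} + \frac{\Lambda_\parallel q_z}{\Lambda_\parallel k_0}\, \tilde{\mathbf{k}}_0 = \theta\, \bar{\mathbf{q}}_\perp + \theta^2\, \bar{q}_z \tilde{\mathbf{k}}_0, \tag{7.77}$$

so expanding in powers of $\theta$ gives the simple result

$$\mathbf{S}_0 = \tilde{\mathbf{k}}_0 S_0 + O(\theta), \tag{7.78}$$

where

$$S_0 = \hbar \int_P \frac{d^3 \bar{q}}{(2\pi)^3} \left[ \bar{a}_+^\dagger(\bar{\mathbf{q}})\, \bar{a}_+(\bar{\mathbf{q}}) - \bar{a}_-^\dagger(\bar{\mathbf{q}})\, \bar{a}_-(\bar{\mathbf{q}}) \right] = \hbar \int d^3 r \left[ \phi_+^\dagger(\mathbf{r})\, \phi_+(\mathbf{r}) - \phi_-^\dagger(\mathbf{r})\, \phi_-(\mathbf{r}) \right]. \tag{7.79}$$

Thus, to lowest order, the helicity has only a longitudinal component; the leading transverse component is $O(\theta)$. This is the natural consequence of the fact that each photon has a wavevector close to $\mathbf{k}_0$.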

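To develop the approximation for $\mathbf{L}$ we substitute the paraxial representation (7.24) and the corresponding expression (7.48) for $\mathbf{E}^{(+)}(\mathbf{r}, t)$ into eqn (3.57) to get

$$\begin{aligned}
\mathbf{L}_0 &= 2i\epsilon_0 \int d^3 r\, E_j^{(-)} \left( \mathbf{r} \times \frac{1}{i} \nabla \right) A_j^{(+)} \\
&= \hbar \int d^3 r\, \Phi_j^\dagger(\mathbf{r}, t)\, e^{-i \mathbf{k}_0 \cdot \mathbf{r}} \left( \mathbf{r} \times \frac{1}{i} \nabla \right) \Phi_j(\mathbf{r}, t)\, e^{i \mathbf{k}_0 \cdot \mathbf{r}} \\
&= \hbar \int d^3 r\, \Phi_j^\dagger(\mathbf{r}, t) \left( \mathbf{r} \times \mathbf{k}_0 + \mathbf{r} \times \frac{1}{i} \nabla \right) \Phi_j(\mathbf{r}, t),
\end{aligned} \tag{7.80}$$

where the last line follows from the identity

$$e^{-i \mathbf{k}_0 \cdot \mathbf{r}}\, \nabla\, e^{i \mathbf{k}_0 \cdot \mathbf{r}}\, \Phi_j(\mathbf{r}, t) = (\nabla + i \mathbf{k}_0)\, \Phi_j(\mathbf{r}, t). \tag{7.81}$$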
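The identity (7.81) is just the product rule applied to the carrier wave times the envelope, and it can be spot-checked with a finite-difference derivative. The sketch below works in one dimension along $z$; the test envelope and the value of `k0` are arbitrary illustrative choices.

```python
import cmath
import math

def envelope(z):
    """Hypothetical smooth complex envelope, used only for this check."""
    return math.exp(-z * z) * (1.0 + 0.5j * z)

def d_dz(f, z, h=1.0e-5):
    """Central finite-difference derivative of a complex-valued function."""
    return (f(z + h) - f(z - h)) / (2.0 * h)

def lhs(z, k0):
    """exp(-i k0 z) d/dz [exp(i k0 z) Phi(z)], the left side of the identity."""
    def carrier_times_envelope(x):
        return cmath.exp(1j * k0 * x) * envelope(x)
    return cmath.exp(-1j * k0 * z) * d_dz(carrier_times_envelope, z)

def rhs(z, k0):
    """(d/dz + i k0) Phi(z), the right side of the identity."""
    return d_dz(envelope, z) + 1j * k0 * envelope(z)

k0 = 20.0
for z in (-0.7, 0.0, 0.4):
    print(z, abs(lhs(z, k0) - rhs(z, k0)))
```

The two sides agree up to the discretization error of the finite difference, which is why the rapidly varying carrier can be traded for the constant shift $i\mathbf{k}_0$ acting on the slowly varying envelope.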

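This remaining gradient term can be written as

$$\begin{aligned}
\mathbf{r} \times \frac{\hbar}{i} \nabla &= \mathbf{r} \times \frac{\hbar}{i} \left( \tilde{\mathbf{k}}_0 \nabla_z + \nabla_\perp \right) \\
&= \mathbf{r}_\perp \times \tilde{\mathbf{k}}_0\, \frac{\hbar}{i} \nabla_z + z\, \tilde{\mathbf{k}}_0 \times \frac{\hbar}{i} \nabla_\perp + \mathbf{r}_\perp \times \frac{\hbar}{i} \nabla_\perp,
\end{aligned} \tag{7.82}$$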


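so that

$$\mathbf{L}_0 = \mathbf{L}_{0\perp} + \tilde{\mathbf{k}}_0 L_{0z}, \tag{7.83}$$

where the transverse part is given by

$$\mathbf{L}_{0\perp} = \int d^3 r\, \Phi_j^\dagger(\mathbf{r}) \left[ \mathbf{r} \times \hbar \mathbf{k}_0 + \mathbf{r}_\perp \times \tilde{\mathbf{k}}_0\, \frac{\hbar}{i} \nabla_z + z\, \tilde{\mathbf{k}}_0 \times \frac{\hbar}{i} \nabla_\perp \right] \Phi_j(\mathbf{r}), \tag{7.84}$$

and the longitudinal component is

$$L_{0z} = \int d^3 r\, \Phi_j^\dagger(\mathbf{r}) \left[ r_1 \frac{\hbar}{i} \nabla_2 - r_2 \frac{\hbar}{i} \nabla_1 \right] \Phi_j(\mathbf{r}). \tag{7.85}$$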

The transverse part $\mathbf{L}_{0\perp}$ is dominated by the term proportional to $k_0$. After expressing the integral in terms of the scaled variable $\bar{\mathbf{r}}$ and scaled field $\bar{\boldsymbol{\Phi}}$, one finds that $\mathbf{L}_{0\perp} = O(1/\theta)$. The similar terms $\hbar\omega_0 N_0$ and $\hbar\mathbf{k}_0 N_0$ in the energy and momentum are $O(1/\theta^2)$, so they are even larger. This apparently singular behavior is physically harmless; it simply represents the fact that all photons in the wave packet have energies close to $\hbar\omega_0$ and momenta close to $\hbar\mathbf{k}_0$.

For the angular momentum the situation is different. The angular momenta of individual photons in plane-wave modes $\mathbf{k}_0 + \mathbf{q}$ must exhibit large fluctuations due to the tight constraints on the polar angle $\vartheta_k$ given by eqn (7.4). These fluctuations are not conjugate to the longitudinal component $J_{0z}$, since rotations around the $z$-axis leave $\vartheta_k$ unchanged. On the other hand, the transverse components $\mathbf{L}_{0\perp}$ generate rotations around the transverse axes which do change the value of $\vartheta_k$. Thus we should expect large fluctuations in the transverse components of the angular momentum, which are described by the large transverse term $\mathbf{L}_{0\perp}$. Consequently, only the longitudinal component $L_{0z}$ is meaningful for a paraxial state. By combining eqns (7.85) and (7.79), we see that the lowest-order paraxial angular momentum operator is purely longitudinal,

$$\mathbf{J}_0 = \tilde{\mathbf{k}}_0 \left[ L_{0z} + S_0 \right]. \tag{7.86}$$

7.8 Approximate photon localizability*

Mandel's local number operator, defined by eqn (3.204), displays peculiar nonlocal properties. Despite this apparent flaw, Mandel was able to demonstrate that $N(V)$ behaves approximately like a local number operator in the limit $V \gg \lambda_0^3$, where $\lambda_0$ is the characteristic wavelength for a monochromatic field state. The important role played by this limit suggests using the paraxial expansion to investigate the alternative definitions of the local number operator in a systematic way. To this end we first introduce a scaled version of the Mandel detection operator by

$$\mathbf{M}(\mathbf{r}) = \frac{1}{\sqrt{V_0}}\, \bar{\mathbf{M}}(\bar{\mathbf{r}})\, e^{i k_0 z}. \tag{7.87}$$

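By combining the definition (3.203) with the expansion (7.64), the identity (7.81), and the scaled gradient

$$\frac{\nabla}{k_0} = \frac{1}{k_0} \nabla_\perp + \mathbf{u}_3 \frac{1}{k_0} \frac{\partial}{\partial z} = \theta\, \bar{\nabla}_\perp + \theta^2\, \mathbf{u}_3 \bar{\nabla}_z, \tag{7.88}$$

one finds

$$\bar{\mathbf{M}} = \bar{\mathbf{M}}^{(0)} + \theta\, \bar{\mathbf{M}}^{(1)} + \theta^2\, \bar{\mathbf{M}}^{(2)} + O(\theta^3), \tag{7.89}$$

where $\bar{\mathbf{M}}^{(0)} = \bar{\boldsymbol{\Phi}}^{(0)}$, $\bar{\mathbf{M}}^{(1)} = \bar{\boldsymbol{\Phi}}^{(1)}$, and

$$\bar{\mathbf{M}}^{(2)} = \bar{\boldsymbol{\Phi}}^{(2)} - \frac{1}{4} \left( \bar{\nabla}_\perp^2 + 2i \bar{\nabla}_z \right) \bar{\boldsymbol{\Phi}}^{(0)}. \tag{7.90}$$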

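The corresponding expansion for $N(V)$ is

$$N(V) = N^{(0)}(V) + \theta^2 N^{(2)}(V) + O(\theta^4), \tag{7.91}$$

where

$$\begin{aligned}
N^{(0)}(V) &= \int_V d^3 \bar{r}\; \bar{\boldsymbol{\Phi}}^{(0)\dagger}(\bar{\mathbf{r}}) \cdot \bar{\boldsymbol{\Phi}}^{(0)}(\bar{\mathbf{r}}), \\
N^{(2)}(V) &= \int_V d^3 \bar{r} \left\{ \bar{\mathbf{M}}^{(1)\dagger} \cdot \bar{\mathbf{M}}^{(1)} + \left( \bar{\mathbf{M}}^{(0)\dagger} \cdot \bar{\mathbf{M}}^{(2)} + \mathrm{HC} \right) \right\}.
\end{aligned} \tag{7.92}$$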

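A simple calculation using the local commutation relations (7.34) for the zeroth-order envelope field yields

$$\left[ N^{(0)}(V) , N^{(0)}(V') \right] = 0 \tag{7.93}$$

for nonoverlapping volumes, and

$$\left[ N^{(0)}(V) , \bar{\boldsymbol{\Phi}}^\dagger(\bar{\mathbf{r}}) \right] = \chi_V(\bar{\mathbf{r}})\, \bar{\boldsymbol{\Phi}}^\dagger(\bar{\mathbf{r}}), \tag{7.94}$$

where the characteristic function $\chi_V(\bar{\mathbf{r}})$ is defined by

$$\chi_V(\bar{\mathbf{r}}) = \begin{cases} 1 & \text{for } \bar{\mathbf{r}} \in V, \\ 0 & \text{for } \bar{\mathbf{r}} \notin V. \end{cases} \tag{7.95}$$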

Thus $N^{(0)}(V)$ acts like a genuine local number operator. The nonlocal features discussed in Section 3.6.2 will only appear in the higher-order terms. It is, however, important to remember that the delta function in the zeroth-order commutation relation (7.34) is really coarse-grained with respect to the carrier wavelength $\lambda_0$. For this reason the localization volume $V$ must satisfy $V \gg \lambda_0^3$.

The paraxial expansion of the alternative operator $G(V)$, introduced in eqn (3.210), shows (Deutsch and Garrison, 1991a) that the two definitions agree in lowest order, $G^{(0)}(V) = N^{(0)}(V)$, but disagree in second order, $G^{(2)}(V) \ne N^{(2)}(V)$. This disagreement between equally plausible definitions for the local photon number operator is a consequence of the fact that a photon with wavelength $\lambda_0$ cannot be localized to a volume of order $\lambda_0^3$. Since most experiments are well described by the paraxial approximation, it is usually permissible to think of the photons as localized, provided that the diameter of the localization region is larger than a wavelength.

The negative frequency part $A_i^{(-)}(\mathbf{r})$ is a sum over creation operators, so it is tempting to interpret $A_i^{(-)}(\mathbf{r})$ as creating a photon at the point $\mathbf{r}$. In view of the impossibility of localizing photons, this temptation must be sternly resisted. On the other hand, the cavity operator $a_\kappa^\dagger$ can be interpreted as creating a photon described by the cavity mode $\boldsymbol{\mathcal{E}}_\kappa(\mathbf{r})$, since the mode function extends over the entire cavity. In the same way, the plane-wave operator $a_{\mathbf{k}s}^\dagger$ can be interpreted as creating a photon in the (box-normalized) plane-wave state with wavenumber $\mathbf{k}$ and polarization $\mathbf{e}_{\mathbf{k}s}$. Finally, the wave packet operator $a^\dagger[w]$ can be interpreted as creating a photon described by the classical wave packet $w$, but it would be wrong to think of the photon as strictly localized in the region where $w(\mathbf{r})$ is large. With this caution in mind, one can regard the pulse-envelope $w(\mathbf{r})$ as an effective photon wave function, provided that the pulse duration contains many optical periods and the transverse profile is large compared to a wavelength.

There are other aspects of the averaged operators that also require some caution. The operator $N[w] = a^\dagger[w]\, a[w]$ satisfies

$$\left[ N[w] , a^\dagger[w] \right] = a^\dagger[w], \qquad \left[ N[w] , a[w] \right] = -a[w], \tag{7.96}$$

so it serves as a number operator for $w$-photons, but these number operators are not mutually commutative, since

$$\left[ N[w] , N[u] \right] = (w, u) \left\{ a^\dagger[u]\, a[w] - a^\dagger[w]\, a[u] \right\}. \tag{7.97}$$

Thus distinct $w$ photons and $u$ photons cannot be independently counted unless the classical wave packets $w$ and $u$ are orthogonal. This lack of commutativity can be important in situations that require the use of non-orthogonal modes (Deutsch et al., 1991).

7.9 Exercises

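7.1 Frequency spread for a paraxial beam

(1) Show that the fractional change in the index of refraction across a paraxial beam is

$$\frac{\Delta n}{n_0} = \frac{\Delta k}{k_0}\, \frac{\dfrac{\omega_0}{n_0} \left( \dfrac{dn}{d\omega} \right)_0}{1 + \dfrac{\omega_0}{n_0} \left( \dfrac{dn}{d\omega} \right)_0},$$

where $n_0 = n(\omega_0) = \sqrt{\epsilon(\omega_0)/\epsilon_0}$ and $(dn/d\omega)_0$ is evaluated at the carrier frequency.

(2) Combine the relation $k = \sqrt{k_0^2 + |\mathbf{q}_\perp|^2 + q_z^2}$ with eqns (7.5) and (7.7) to get

$$\frac{\Delta k}{k_0} = \frac{1}{2} \left( \frac{\Delta q_\perp}{k_0} \right)^2 + O(\theta^4) = \frac{1}{2} \theta^2 + \cdots.$$

(3) Combine this with $\Delta\omega = v_{g0} \Delta k$ to find

$$\frac{\Delta\omega}{\omega_0} = \frac{n_0 v_{g0}}{c k_0}\, k_0\, \frac{1}{2} \theta^2 = \frac{n_0}{n_0 + \omega_0 \left( \dfrac{dn}{d\omega} \right)_0}\, \frac{1}{2} \theta^2 < \frac{1}{2} \theta^2.$$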

7.2 Distinct paraxial Hilbert spaces are effectively orthogonal

Consider the paraxial subspaces $\mathcal{H}(\mathbf{k}_1, \theta_1)$ and $\mathcal{H}(\mathbf{k}_2, \theta_2)$ discussed in Section 7.2.2.

(1) For a typical basis vector $|\{\mathbf{q}s\}_\kappa\rangle$ in $\mathcal{H}(\mathbf{k}_1, \theta_1)$ show that $a_s(\mathbf{k})\, |\{\mathbf{q}s\}_\kappa\rangle \approx 0$ whenever $|\mathbf{k} - \mathbf{k}_1| \gg \theta_1 |\mathbf{k}_1|$.

(2) Use this result to argue that each basis vector in $\mathcal{H}(\mathbf{k}_2, \theta_2)$ is approximately orthogonal to every basis vector in $\mathcal{H}(\mathbf{k}_1, \theta_1)$.

7.3 Distinct paraxial fields are independent

Combine the definition (7.46) with the definition (7.14) for distinct beams to show that eqn (7.47) is satisfied in the same sense that distinct paraxial spaces are orthogonal.

7.4 An analogy to many-body physics*

Consider a special paraxial state such that the $z$-dependence of the field $\phi_s(\mathbf{r})$ can be neglected and only one polarization is excited, so that $\phi_s(\mathbf{r}) \to \phi(\mathbf{r}_\perp)$. Define an effective photon mass $M_0$ such that the paraxial Hamiltonian $H_P$ for this problem is formally identical to a second quantized description of a two-dimensional, nonrelativistic, many-particle system of bosons with mass $M_0$ (Huang, 1963, Appendix A.3; Feynman, 1972). This feature leads to interesting analogies between quantum optics and many-body physics (Chiao et al., 1991; Deutsch et al., 1992; Wright et al., 1994).

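7.5 Paraxial expansion*

(1) Expand $\mathbf{X}_s(\bar{\mathbf{q}}, \theta)$ through $O(\theta^2)$.

(2) Show that $\bar{\boldsymbol{\Phi}}^{(1)}(\bar{\mathbf{r}}) = i \tilde{\mathbf{k}}_0\, \bar{\nabla}_\perp \cdot \bar{\boldsymbol{\Phi}}^{(0)}$.

(3) Show that $\bar{\boldsymbol{\Phi}}^{(2)}(\bar{\mathbf{r}}) = \frac{1}{2} \bar{\nabla}_\perp \left( \bar{\nabla}_\perp \cdot \bar{\boldsymbol{\Phi}}^{(0)} \right) + \frac{1}{4} \left( \bar{\nabla}_\perp^2 + 2i \bar{\nabla}_z \right) \bar{\boldsymbol{\Phi}}^{(0)}$.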

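7.6 Distinct paraxial wave packet spaces are effectively orthogonal*

Consider two paraxial wave packets, $w \in P(\mathbf{k}_1, \theta_1)$ and $v \in P(\mathbf{k}_2, \theta_2)$, where $\mathbf{k}_1$ and $\mathbf{k}_2$ satisfy eqn (7.14).

(1) Apply the definitions of $\bar{\mathbf{q}}$ (Section 7.5) and $\bar{w}_s^*(\bar{\mathbf{q}})$ (Section 7.6) to show that

$$(w, v) = \sqrt{\frac{V_2}{V_1}} \int \frac{d^3 \bar{q}}{(2\pi)^3} \sum_s \bar{w}_s^*(\bar{\mathbf{q}})\, \bar{v}_s\!\left( \bar{\mathbf{q}} + \overline{\Delta\mathbf{k}} \right),$$

where $\Delta\mathbf{k} = \mathbf{k}_1 - \mathbf{k}_2$ and the arguments of $\bar{w}_s^*$ and $\bar{v}_s$ are scaled with $\theta_1$ and $\theta_2$ respectively.

(2) Calculate $\overline{\Delta\mathbf{k}}$, explain why $|\overline{\Delta\mathbf{k}}| \gg |\bar{\mathbf{q}}|$, and combine this with the rapid fall off condition in eqn (7.68) to conclude that $\theta_2^{-n} (w, v) \to 0$ as $\theta_2 \to 0$ for any value of $n$.

(3) Show that $\theta_2^{-n} \left[ a[w] , a^\dagger[v] \right] \to 0$ as $\theta_2 \to 0$.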

8 Linear optical devices

The manipulation of light beams by passive linear devices, such as lenses, mirrors, stops, and beam splitters, is the backbone of experimental optics. In typical arrangements the individual devices are separated by regions called propagation segments in which the light propagates through air or vacuum. The index of refraction is usually piece-wise constant, i.e. it is uniform in each device and in each propagation segment. In most arrangements each device or propagation segment has an axis of symmetry (the optic axis), and the angle between the rays composing the beam and the local optic axis is usually small. The light beams are then said to be piece-wise paraxial. Under these circumstances, it is useful to treat the interaction of a light beam with a single device as a scattering problem in which the incident and scattered fields both propagate in vacuum. The optical properties of the device determine a linear relation between the complex amplitudes of the incident and scattered classical waves. After a brief review of this classical approach, we will present a phenomenological description of quantized electromagnetic fields interacting with linear optical devices. This approach will show that, at the quantum level, linear optical effects can be viewed (in a qualitative sense) as the propagation of photons guided by classical scattered waves. The scattered waves are a rough analogue of wave functions for particles, so the associated classical rays may be loosely considered as photon trajectories. These classical analogies are useful for visualizing the interaction of photons with linear optical devices but, as is always the case with applications of quantum theory, they must be used with care. A more precise wave-function-like description of quantum propagation through optical systems is given in Section 6.6.2.

8.1 Classical scattering

The general setting for this discussion is a situation in which one or more paraxial beams interact with an optical device to produce several scattered paraxial beams. Both the incident and the scattered beams are assumed to be mutually distinct, in the sense defined by eqn (7.14). Under these circumstances, the paraxial beams will be called scattering channels; the incident classical fields are input channels and the scattered beams are output channels. Since this process is linear in the fields, the initial and final beams can be resolved into plane waves. The conventional classical description of propagation through optical elements pieces together plane-wave solutions of Maxwell's equations by applying the appropriate boundary conditions at the interfaces between media with different indices of refraction, as shown in Fig. 8.1(a). This procedure yields a linear relation between the Fourier coefficients of the incident and scattered waves that is similar to the description of scattering in terms of stationary states in quantum theory (Bransden and Joachain, 1989, Chap. 4). From the viewpoint of scattering theory, the classical piecing procedure is simply a way to construct the scattering matrix relating the incident and scattered fields. Before considering the general case, we analyze two simple examples: a propagation segment and a thin slab of dielectric.

Fig. 8.1 (a) A plane wave $\alpha_{\mathbf{k}_I} \exp(i \mathbf{k}_I \cdot \mathbf{r})$ incident on a dielectric slab. The reflected and transmitted waves are respectively $\alpha_{\mathbf{k}_R} \exp(i \mathbf{k}_R \cdot \mathbf{r})$ and $\alpha_{\mathbf{k}_T} \exp(i \mathbf{k}_T \cdot \mathbf{r})$. (b) The time reversed version of (a). The extra wave at $-\mathbf{k}_{TR}$ is discussed in the text.

For the propagation segment, an incident plane wave $\alpha \exp(i \mathbf{k} \cdot \mathbf{r})$ (the input channel) simply acquires the phase $kL$, where $L$ is the length of the segment along the propagation direction, i.e. the relation between the incident amplitude $\alpha$ and the scattered amplitude $\alpha'$ (representing the output channel) is

$$\alpha' = e^{ikL} \alpha = e^{i\omega L/c} \alpha. \tag{8.1}$$

In some applications the propagation segment through vacuum is replaced by a length $L$ of dielectric. If the end faces of the dielectric sample are antireflection coated, then the scattering relation is

$$\alpha' = e^{ik(\omega)L} \alpha = e^{i n(\omega) \omega L/c} \alpha, \tag{8.2}$$

where $n(\omega)$ is the index of refraction for the dielectric. Since the transmitted wave can be expressed as

$$\alpha' e^{i(kz - \omega t)} = \alpha\, e^{i[kz - \omega(t - \Delta t)]}, \tag{8.3}$$

where $\Delta t = n(\omega) L/c$, the dielectric medium is called a retarder plate, or sometimes a phase shifter.
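The equivalence between the phase factor in eqn (8.2) and the time delay in eqn (8.3) is easy to confirm numerically. In the sketch below the function names and the sample frequency, thickness, and index are hypothetical values chosen for illustration.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in vacuum, m/s

def output_amplitude(alpha, omega, length, n=1.0):
    """Scattered amplitude alpha' = exp(i n omega L / c) * alpha, eqns (8.1) and (8.2)."""
    return cmath.exp(1j * n * omega * length / C) * alpha

def time_delay(length, n=1.0):
    """Retardation time Delta t = n L / c appearing in eqn (8.3)."""
    return n * length / C

omega = 2.0 * math.pi * 3.0e14   # angular frequency, rad/s
length = 1.0e-3                  # plate thickness, m
n = 1.5                          # index of refraction at omega
alpha = 0.6 + 0.8j               # incident amplitude

alpha_out = output_amplitude(alpha, omega, length, n)
dt = time_delay(length, n)

# Compare the two sides of eqn (8.3) at an arbitrary point (z, t).
k = omega / C
z, t = 1.0e-4, 1.0e-12
left = alpha_out * cmath.exp(1j * (k * z - omega * t))
right = alpha * cmath.exp(1j * (k * z - omega * (t - dt)))
print(abs(left - right), abs(alpha_out), abs(alpha))
```

The modulus of the amplitude is unchanged, which reflects the assumption of a lossless, antireflection-coated plate: the only effect is a shift in phase, or equivalently in arrival time.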

We next turn to the example of a plane wave incident on a thin dielectric slab (which is not antireflection coated) as shown in Fig. 8.1(a). Ordinary ray tracing, using Snell's law and the law of reflection at each interface between the dielectric and vacuum, determines the directions of the propagation vectors $\mathbf{k}_R$ and $\mathbf{k}_T$ (where $R$ and $T$ stand for the reflected and transmitted waves respectively) relative to the propagation vector $\mathbf{k}_I$ of the incoming wave. Since the transmitted wave crosses the dielectric–vacuum interface twice, we find the familiar result $\mathbf{k}_T = \mathbf{k}_I$, i.e. the incident and transmitted waves are described by the same spatial mode.

The plane of incidence is defined by the vectors $\mathbf{k}_I$ and $\mathbf{n}$, where $\mathbf{n}$ is the unit vector normal to the slab. Every incident electromagnetic plane wave can be resolved into two polarization components: the TE- (or S-) polarization, with electric vector perpendicular to the plane of incidence, and the TM- (or P-) polarization, with electric vector in the plane of incidence. For optically isotropic dielectrics, these two polarizations are preserved by reflection and refraction. Since scattering is a linear process, we lose nothing by assuming that the incident wave is either TE- or TM-polarized. This allows us to simplify the vector problem to a scalar problem by suppressing the polarization vectors. The three waves outside the slab are then $\alpha_{\mathbf{k}_I} \exp(i \mathbf{k}_I \cdot \mathbf{r})$, $\alpha_{\mathbf{k}_R} \exp(i \mathbf{k}_R \cdot \mathbf{r})$, and $\alpha_{\mathbf{k}_T} \exp(i \mathbf{k}_T \cdot \mathbf{r})$. The solution of Maxwell's equations inside the slab is a linear combination of the transmitted wave at the first interface and the reflected wave from the second interface. Applying the boundary conditions at each interface (Jackson, 1999, Sec. 7.3) yields a set of equations relating the coefficients, and eliminating the coefficients for the interior solution leads to

$$\alpha_{\mathbf{k}_R} = r\, \alpha_{\mathbf{k}_I}, \qquad \alpha_{\mathbf{k}_T} = t\, \alpha_{\mathbf{k}_I}, \tag{8.4}$$

where the complex parameters $r$ and $t$ are respectively the amplitude reflection and transmission coefficients for the slab. This is the simplest example of the general piecing procedure discussed above.
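For a concrete instance of the coefficients in eqn (8.4), the sketch below evaluates the standard multiple-reflection (Airy) result for a lossless slab at normal incidence. This closed form is not derived in the text and is used here as an assumption: the index `n` is real, the slab is surrounded by vacuum, `r` is referred to the front face, and `t` is referred to the back face. The function name and the sample values are hypothetical.

```python
import cmath

def slab_coefficients(n, omega_d_over_c):
    """Amplitude reflection and transmission coefficients of a lossless slab.

    Normal incidence, real index n, vacuum on both sides (assumptions of this sketch).
    omega_d_over_c is the dimensionless combination omega * d / c for thickness d.
    """
    r12 = (1.0 - n) / (1.0 + n)        # Fresnel reflection amplitude, vacuum to dielectric
    delta = n * omega_d_over_c         # phase accumulated in one pass through the slab
    round_trip = cmath.exp(2j * delta)
    denom = 1.0 - r12 ** 2 * round_trip
    r = r12 * (1.0 - round_trip) / denom
    t = (1.0 - r12 ** 2) * cmath.exp(1j * delta) / denom
    return r, t

r, t = slab_coefficients(1.5, 2.3)
print(abs(r) ** 2 + abs(t) ** 2)   # energy balance for a lossless slab
print((r * t.conjugate()).real)    # real part of r t*, which vanishes for this slab
```

The two printed combinations, $|r|^2 + |t|^2$ and $\mathrm{Re}(r t^*)$, are the quantities that energy conservation and time-reversal symmetry constrain for a lossless symmetric slab.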

Important constraints on the coefficients $r$ and $t$ follow from the time-reversal invariance of Maxwell's equations. What this means is that the time-reversed final field will evolve into the time-reversed initial field. This situation is shown in Fig. 8.1(b), where the incident waves have propagation vectors $-\mathbf{k}_R$ and $-\mathbf{k}_T$ and the scattered waves have $-\mathbf{k}_I$ and $-\mathbf{k}_{TR}$. The amplitudes for this case are written as $\alpha^T_{\mathbf{q}}$, where $T$ stands for time reversal. The usual calculation gives the scattered waves as

$$\begin{aligned}
\alpha^T_{-\mathbf{k}_I} &= r\, \alpha^T_{-\mathbf{k}_R} + t\, \alpha^T_{-\mathbf{k}_T}, \\
\alpha^T_{-\mathbf{k}_{TR}} &= t\, \alpha^T_{-\mathbf{k}_R} + r\, \alpha^T_{-\mathbf{k}_T}.
\end{aligned} \tag{8.5}$$

In Appendix B.3.3 it is shown that the linear polarization basis can be chosen so that the time-reversed amplitudes are related to the original amplitudes by eqn (B.80). In the present case, this yields $\alpha^T_{-\mathbf{k}_I} = \alpha^*_{\mathbf{k}_I}$, $\alpha^T_{-\mathbf{k}_R} = \alpha^*_{\mathbf{k}_R}$, $\alpha^T_{-\mathbf{k}_T} = \alpha^*_{\mathbf{k}_T}$, and $\alpha^T_{-\mathbf{k}_{TR}} = \alpha^*_{\mathbf{k}_{TR}}$. Substituting these relations into eqn (8.5) and taking the complex conjugate gives a second set of relations between the amplitudes $\alpha_{\mathbf{k}_I}$, $\alpha_{\mathbf{k}_R}$, and $\alpha_{\mathbf{k}_T}$:

$$\begin{aligned}
\alpha_{\mathbf{k}_I} &= r^* \alpha_{\mathbf{k}_R} + t^* \alpha_{\mathbf{k}_T}, \\
\alpha_{\mathbf{k}_{TR}} &= t^* \alpha_{\mathbf{k}_R} + r^* \alpha_{\mathbf{k}_T}.
\end{aligned} \tag{8.6}$$

There is an apparent discrepancy here, since the original problem had no wave with propagation vector $\mathbf{k}_{TR}$. Time-reversal invariance for the original problem therefore