

of this fact, we will refer to $\Psi(\mathbf{r}, s)$ as the one-photon detection amplitude. The
important point to keep in mind is that the detector is a classical object which, unlike
the photon, has a well-defined location in space. This is what makes the detection
amplitude a useful replacement for the missing photon wave function.
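We extend this approach to two photons by pretending that
$|\mathbf{r}_1, s_1; \mathbf{r}_2, s_2\rangle = E^{(-)}_{s_1}(\mathbf{r}_1)\, E^{(-)}_{s_2}(\mathbf{r}_2)\, |0\rangle$
is a state with one photon at $\mathbf{r}_1$ (with polarization $\mathbf{e}_{s_1}$) and
another at $\mathbf{r}_2$ (with polarization $\mathbf{e}_{s_2}$). For a two-photon state
$|\Psi\rangle$ this suggests the effective wave function

\[
\begin{aligned}
\Psi(\mathbf{r}_1, s_1; \mathbf{r}_2, s_2)
  &= \langle \mathbf{r}_1, s_1; \mathbf{r}_2, s_2 | \Psi \rangle \\
  &= \langle 0 |\, E^{(+)}_{s_1}(\mathbf{r}_1)\, E^{(+)}_{s_2}(\mathbf{r}_2)\, | \Psi \rangle \\
  &= e^{*}_{s_1 i}\, e^{*}_{s_2 j}\, \Psi_{ij}(\mathbf{r}_1, \mathbf{r}_2) ,
\end{aligned}
\tag{6.96}
\]

where

\[
\Psi_{ij}(\mathbf{r}_1, \mathbf{r}_2)
  = \langle 0 |\, E^{(+)}_{i}(\mathbf{r}_1)\, E^{(+)}_{j}(\mathbf{r}_2)\, | \Psi \rangle .
\tag{6.97}
\]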

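Applying the method used for $G^{(1)}$ to the evaluation of eqn (4.75) for the second-order
correlation function (with all time arguments equal) yields

\[
\begin{aligned}
G^{(2)}_{klij}(\mathbf{r}'_1, \mathbf{r}'_2; \mathbf{r}_1, \mathbf{r}_2)
  &= \left\langle E^{(-)}_{k}(\mathbf{r}'_1)\, E^{(-)}_{l}(\mathbf{r}'_2)\,
       E^{(+)}_{i}(\mathbf{r}_1)\, E^{(+)}_{j}(\mathbf{r}_2) \right\rangle \\
  &= \Psi_{ij}(\mathbf{r}_1, \mathbf{r}_2)\, \Psi^{*}_{kl}(\mathbf{r}'_1, \mathbf{r}'_2) ,
\end{aligned}
\tag{6.98}
\]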

which has the form of the two-particle density matrix corresponding to the pure two-particle
wave function $\Psi_{ij}(\mathbf{r}_1, \mathbf{r}_2)$.
The physical interpretation of $\Psi_{ij}(\mathbf{r}_1, \mathbf{r}_2)$ follows from the discussion of coincidence
counting in Section 9.2.4, which shows that the coincidence-counting rate for two fast
detectors placed at equal distances from the source of the field is proportional to

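\[
(\mathbf{e}_{s_1})_k\, (\mathbf{e}_{s_2})_l\, (\mathbf{e}^{*}_{s_1})_i\, (\mathbf{e}^{*}_{s_2})_j\,
G^{(2)}_{klij}(\mathbf{r}_1, \mathbf{r}_2; \mathbf{r}_1, \mathbf{r}_2)
  = \left| \Psi(\mathbf{r}_1, s_1; \mathbf{r}_2, s_2) \right|^2 ,
\tag{6.99}
\]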

where $\mathbf{e}_{s_1}$ and $\mathbf{e}_{s_2}$ are the polarizations admitted by the filters associated with the
detectors. Since $|\Psi(\mathbf{r}_1, s_1; \mathbf{r}_2, s_2)|^2$ determines the two-photon counting rate, we will
refer to $\Psi(\mathbf{r}_1, s_1; \mathbf{r}_2, s_2)$, or $\Psi_{ij}(\mathbf{r}_1, \mathbf{r}_2)$, as the two-photon detection amplitude.

6.6.3 Pure state entanglement defined by detection amplitudes
We are now ready to formulate an alternative definition of entanglement, for pure
states of photons, that is directly related to observable counting rates. The detection
amplitude for the two-photon state $|\Psi\rangle$, defined by eqn (6.83), can be evaluated by
using eqns (3.69) and (6.85) in eqn (6.97), with the result:

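\[
\Psi_{ij}(\mathbf{r}_1, \mathbf{r}_2)
  = \sqrt{2} \sum_{\mathbf{k}s,\, \mathbf{k}'s'} C_{\mathbf{k}s, \mathbf{k}'s'}\,
    F_k\, (\mathbf{e}_{\mathbf{k}s})_i\, e^{i\mathbf{k}\cdot\mathbf{r}_1}\,
    F_{k'}\, (\mathbf{e}_{\mathbf{k}'s'})_j\, e^{i\mathbf{k}'\cdot\mathbf{r}_2} .
\tag{6.100}
\]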

This expansion for the detection amplitude can be inverted, by Fourier transforming
with respect to r1 and r2 and projecting on the polarization basis, to get

(2 0 / )2

Cks,k s Ψks,k s , (6.101)
2ωk ωk
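\[
\Psi_{\mathbf{k}s, \mathbf{k}'s'}
  = \int d^3r_1 \int d^3r_2\; e^{-i\mathbf{k}\cdot\mathbf{r}_1}\, e^{-i\mathbf{k}'\cdot\mathbf{r}_2}\,
    (\mathbf{e}^{*}_{\mathbf{k}s})_i\, (\mathbf{e}^{*}_{\mathbf{k}'s'})_j\,
    \Psi_{ij}(\mathbf{r}_1, \mathbf{r}_2) .
\tag{6.102}
\]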

According to eqns (6.100) and (6.101), the two-photon detection amplitude and
the expansion coefficients $C_{\mathbf{k}s, \mathbf{k}'s'}$ provide equivalent descriptions of the two-photon
state. From eqn (6.100) we see that factorization of the expansion coefficients, according
to eqn (6.86), implies factorization of the detection amplitude, i.e.

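\[
\Psi_{ij}(\mathbf{r}_1, \mathbf{r}_2) = \phi_i(\mathbf{r}_1)\, \phi_j(\mathbf{r}_2) ,
\tag{6.103}
\]

where

\[
\phi_i(\mathbf{r}) = 2^{1/4} \sum_{\mathbf{k}s} \gamma_{\mathbf{k}s}\, F_k\,
  (\mathbf{e}_{\mathbf{k}s})_i\, e^{i\mathbf{k}\cdot\mathbf{r}} .
\tag{6.104}
\]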

In other words, the detection amplitude for a separable state factorizes, just as a two-particle
wave function does in nonrelativistic quantum mechanics. On the other hand,
eqn (6.101) shows that factorization of the detection amplitude implies factorization of
the expansion coefficients. Thus we are at liberty to use eqn (6.103) as a definition of
a separable state that agrees with the definition (6.86). This approach has the decided

advantage that the detection amplitude is closely related to directly observable events,
e.g. current pulses emitted by the coincidence counter. The coincidence-counting rate
is proportional to the square of the amplitude, so for separable states the coincidence
rate is proportional to the product of the singles rates at the two detectors. This means
that the random counting events at the two detectors are stochastically independent,
i.e. the quantum fluctuations of the electromagnetic field at any pair of detectors are
uncorrelated. This is the analogue of Theorem 6.3, which states that a separable state
of two distinguishable particles yields uncorrelated quantum fluctuations for any pair
of observables.
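For $\mathbf{k}s \neq \mathbf{k}'s'$ the state $|\Psi\rangle = |1_{\mathbf{k}s}, 1_{\mathbf{k}'s'}\rangle$ is entangled, according to the traditional
definition, and evaluating eqn (6.100) in this case gives

\[
\Psi_{ij}(\mathbf{r}_1, \mathbf{r}_2) = F_k F_{k'} \left[
  (\mathbf{e}_{\mathbf{k}s})_i\, e^{i\mathbf{k}\cdot\mathbf{r}_1}\, (\mathbf{e}_{\mathbf{k}'s'})_j\, e^{i\mathbf{k}'\cdot\mathbf{r}_2}
  + (\mathbf{e}_{\mathbf{k}s})_j\, e^{i\mathbf{k}\cdot\mathbf{r}_2}\, (\mathbf{e}_{\mathbf{k}'s'})_i\, e^{i\mathbf{k}'\cdot\mathbf{r}_1}
\right] .
\tag{6.105}
\]

The definition (6.96) in turn yields

\[
\Psi(\mathbf{r}_1, s_1; \mathbf{r}_2, s_2)
  = \phi_{\mathbf{k}s}(\mathbf{r}_1, s_1)\, \phi_{\mathbf{k}'s'}(\mathbf{r}_2, s_2)
  + \phi_{\mathbf{k}s}(\mathbf{r}_2, s_2)\, \phi_{\mathbf{k}'s'}(\mathbf{r}_1, s_1) ,
\tag{6.106}
\]

where

\[
\phi_{\mathbf{k}s}(\mathbf{r}, s_1) = F_k\, \mathbf{e}^{*}_{s_1} \cdot \mathbf{e}_{\mathbf{k}s}\, e^{i\mathbf{k}\cdot\mathbf{r}} .
\tag{6.107}
\]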

This has the structure of an entangled-state wave function for two bosons, as shown in
eqn (6.80), with similar physical consequences. In particular, if one photon is detected
in the mode $\mathbf{k}s$, then a subsequent detection of the remaining photon is guaranteed
to find it in the mode $\mathbf{k}'s'$. More generally, quantum fluctuations in the electromagnetic
field at the two detectors are correlated. According to the general definition in
Section 6.5.3, an entangled two-photon state is dynamically entangled if the detection
amplitude cannot be expressed in the minimal form (6.106) required by Bose statistics.
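The contrast between a factorized amplitude of the form (6.103) and the symmetrized
amplitude (6.106) can be checked numerically. The following is a minimal sketch under
simplifying assumptions, not a calculation from the text: the vector mode functions are
replaced by scalar plane waves in one dimension, and the helper names `mode`,
`symmetrized_amplitude`, and `factorization_defect` are hypothetical. It relies on the fact
that a product function f(x1) g(x2) has a vanishing 2 x 2 determinant on any grid of
sample points.

```python
import cmath


def mode(k, x):
    """Scalar one-dimensional stand-in for a mode function: exp(i k x)."""
    return cmath.exp(1j * k * x)


def symmetrized_amplitude(k, kp, x1, x2):
    """Bose-symmetrized two-photon amplitude with the structure of eqn (6.106)."""
    return mode(k, x1) * mode(kp, x2) + mode(k, x2) * mode(kp, x1)


def factorization_defect(psi, xa, xb, ya, yb):
    """2 x 2 determinant of psi sampled on the grid (xa, xb) x (ya, yb).

    The determinant is zero for every choice of sample points whenever
    psi(x1, x2) = f(x1) * g(x2), so a nonzero value rules out factorization.
    """
    return psi(xa, ya) * psi(xb, yb) - psi(xa, yb) * psi(xb, ya)


def same_mode(x1, x2):
    # Both photons in one mode: the amplitude reduces to a product.
    return symmetrized_amplitude(1.0, 1.0, x1, x2)


def different_modes(x1, x2):
    # Photons in two different modes: the amplitude does not factorize.
    return symmetrized_amplitude(1.0, 2.0, x1, x2)


if __name__ == "__main__":
    print(abs(factorization_defect(same_mode, 0.0, 1.0, 0.0, 1.0)))
    print(abs(factorization_defect(different_modes, 0.0, 1.0, 0.0, 1.0)))
```

In this toy model the defect vanishes (to rounding error) when both photons occupy the same
mode and is clearly nonzero for two different modes, which mirrors the distinction between
separable and entangled detection amplitudes.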
We saw in Section 6.4.1 that reduced density operators, defined by partial traces,
are quite useful in the discussion of distinguishable particles, but systems of identical
particles, such as photons, cannot be divided into distinguishable subsystems. The
key to overcoming this difficulty is found in eqn (6.98), which shows that the second-order
correlation function has the form of a density matrix corresponding to the two-photon
detection amplitude $\Psi_{ij}(\mathbf{r}_1, \mathbf{r}_2)$. This suggests that the analogue of the reduced
density matrix is the first-order correlation function $G^{(1)}_{ij}(\mathbf{r}'; \mathbf{r})$, evaluated for the
two-photon state $|\Psi\rangle$.
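The first evidence supporting this proposal is provided by considering a separable
state defined by eqn (6.87). In this case

\[
\begin{aligned}
G^{(1)}_{ij}(\mathbf{r}'; \mathbf{r})
  &= \langle \Psi |\, E^{(-)}_{i}(\mathbf{r}')\, E^{(+)}_{j}(\mathbf{r})\, | \Psi \rangle \\
  &= \tfrac{1}{2} \langle 0 |\, \Gamma^{2}\, E^{(-)}_{i}(\mathbf{r}')\, E^{(+)}_{j}(\mathbf{r})\, \Gamma^{\dagger 2}\, | 0 \rangle \\
  &= \tfrac{1}{2} \langle 0 |\, \left[ \Gamma^{2}, E^{(-)}_{i}(\mathbf{r}') \right]
     \left[ E^{(+)}_{j}(\mathbf{r}), \Gamma^{\dagger 2} \right] | 0 \rangle ,
\end{aligned}
\tag{6.108}
\]

where the last line follows from the identity $E^{(+)}_{j}(\mathbf{r})\,|0\rangle = 0$ and its adjoint. The field
operators and the operators $\Gamma$ and $\Gamma^{\dagger}$ are both linear functions of the creation and
annihilation operators, so

\[
\left[ E^{(+)}_{j}(\mathbf{r}), \Gamma^{\dagger 2} \right]
  = 2 \left[ E^{(+)}_{j}(\mathbf{r}), \Gamma^{\dagger} \right] \Gamma^{\dagger} .
\tag{6.109}
\]

The remaining commutator is a c-number which is evaluated by using the expansions
(3.69) and (6.88) to get

\[
\left[ E^{(+)}_{j}(\mathbf{r}), \Gamma^{\dagger} \right] = 2^{-1/4}\, \phi_j(\mathbf{r}) ,
\tag{6.110}
\]

where $\phi_i(\mathbf{r})$ is defined by eqn (6.104). Substituting this result, and the corresponding
expression for $[\Gamma, E^{(-)}_{i}(\mathbf{r}')]$, into eqn (6.108) yields

\[
G^{(1)}_{ij}(\mathbf{r}'; \mathbf{r}) = \sqrt{2}\, \phi_j(\mathbf{r})\, \phi^{*}_{i}(\mathbf{r}') .
\tag{6.111}
\]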

The conclusion is that the first-order correlation function for a separable state factorizes.
This is the analogue of Theorem 6.1 for distinguishable particles.
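The step leading to eqn (6.109) uses only the fact that the commutator of two operators
that are linear in the creation and annihilation operators is a c-number. As a worked
version of that step, written for generic operators $A$ and $B$ (placeholder symbols, not
notation from the text) whose commutator $[A, B]$ commutes with $B$:

```latex
\begin{aligned}
[A, B^{2}] &= [A, B]\,B + B\,[A, B] \\
           &= 2\,[A, B]\,B
  \qquad \text{since } [A, B] \text{ is a c-number.}
\end{aligned}
```

Setting $A = E^{(+)}_{j}(\mathbf{r})$ and $B = \Gamma^{\dagger}$ gives eqn (6.109).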
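Next let us consider a generic entangled state defined by $|\Psi\rangle = \Gamma^{\dagger} \Theta^{\dagger} |0\rangle$, where

\[
\Theta^{\dagger} = \sum_{\mathbf{k}s} \theta_{\mathbf{k}s}\, a^{\dagger}_{\mathbf{k}s} ,
\tag{6.112}
\]

\[
\sum_{\mathbf{k}s} |\theta_{\mathbf{k}s}|^{2} = 1 .
\tag{6.113}
\]

For this argument, we can confine attention to operators satisfying $[\Gamma, \Theta^{\dagger}] = 0$, which
is equivalent to the orthogonality of the classical wave packets:

\[
(\theta, \gamma) \equiv \sum_{\mathbf{k}s} \theta^{*}_{\mathbf{k}s}\, \gamma_{\mathbf{k}s} = 0 .
\tag{6.114}
\]

The first-order correlation function for this state is

\[
\begin{aligned}
G^{(1)}_{ij}(\mathbf{r}'; \mathbf{r})
  &= \langle \Psi |\, E^{(-)}_{i}(\mathbf{r}')\, E^{(+)}_{j}(\mathbf{r})\, | \Psi \rangle \\
  &= \frac{1}{\sqrt{2}} \left\{ \phi_j(\mathbf{r})\, \phi^{*}_{i}(\mathbf{r}')
     + \eta_j(\mathbf{r})\, \eta^{*}_{i}(\mathbf{r}') \right\} ,
\end{aligned}
\]

where $\eta_j(\mathbf{r})$ is defined by replacing $\gamma_{\mathbf{k}s}$ with $\theta_{\mathbf{k}s}$ in eqn (6.104). Thus for the entangled,
two-photon state $|\Psi\rangle$, the first-order correlation function (reduced density matrix) has
the standard form of the density matrix for a one-particle mixed state. This is the
analogue of Theorem 6.2 for distinguishable particles.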

6.7 Exercises
6.1 Proof of Theorem 6.1
(1) To prove assertion (a), use the expression for the density operator resulting from
eqns (6.40) and (2.81) to evaluate the reduced density operators.
(2) To prove assertion (b), assume that $|\Psi\rangle$ is entangled, so that it has Schmidt rank
$r > 1$, and derive a contradiction.

6.2 Proof of Theorem 6.3
(1) For a separable state $|\Psi\rangle$ show that $\langle \Psi |\, \delta A\, \delta B\, | \Psi \rangle = 0$.
(2) Assume that $\langle \Psi |\, \delta A\, \delta B\, | \Psi \rangle = 0$ for all $A$ and $B$. Apply this to operators that are
diagonal in the Schmidt basis for $|\Psi\rangle$ and thus show that $|\Psi\rangle$ must be separable.

6.3 Singlet spin state
(1) Use the standard treatments of the Pauli matrices, given in texts on quantum
mechanics, to express the eigenstates of $\mathbf{n} \cdot \boldsymbol{\sigma}$ in the usual basis of eigenstates of
$\sigma_z$.
(2) Show that the singlet state $|S = 0\rangle$, given by eqn (6.37), has the same form for all
choices of the quantization axis $\mathbf{n}$.
(3) Show that $(\mathbf{S}^{A} + \mathbf{S}^{B})\, |S = 0\rangle = 0$.

6.4 Correlations in a separable mixed state
Consider a system of two distinguishable spin-1/2 particles described by the ensemble

\[
\left\{ |\Psi_1\rangle = |{\uparrow}\rangle_A\, |{\downarrow}\rangle_B ,\;
        |\Psi_2\rangle = |{\downarrow}\rangle_A\, |{\uparrow}\rangle_B \right\}
\]

of separable states, where the spin states are eigenstates of $s^{A}_{z}$ and $s^{B}_{z}$.

(1) Show that the density operator can be written as

\[
\rho = p\, |\Psi_1\rangle \langle \Psi_1| + (1 - p)\, |\Psi_2\rangle \langle \Psi_2| ,
\]

where $0 \le p \le 1$.
(2) Evaluate the correlation function $\langle \delta s^{A}_{z}\, \delta s^{B}_{z} \rangle$ and use the result to show that the
spins are only uncorrelated for the extreme values $p = 0, 1$.
(3) For intermediate values of p, argue that the correlation is exactly what would be
found for a pair of classical stochastic variables taking on the values ±1/2 with
the same assignment of probabilities.
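Part (3) of Exercise 6.4 can be explored numerically before attempting the operator
calculation. The sketch below is an illustration only: it treats the two members of the
ensemble as a classical probability table for a pair of variables taking the values ±1/2,
and the function name `spin_correlation` is a hypothetical helper, not notation from the
text.

```python
def spin_correlation(p):
    """Covariance of two classical variables taking the values +1/2 and -1/2.

    With probability p the pair is (+1/2, -1/2); with probability 1 - p it is
    (-1/2, +1/2). These are the same weights as in the ensemble of Exercise 6.4.
    """
    outcomes = [(p, 0.5, -0.5), (1.0 - p, -0.5, 0.5)]
    mean_a = sum(weight * a for weight, a, _ in outcomes)
    mean_b = sum(weight * b for weight, _, b in outcomes)
    return sum(weight * (a - mean_a) * (b - mean_b) for weight, a, b in outcomes)


if __name__ == "__main__":
    for p in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(p, spin_correlation(p))
```

For this classical table the covariance works out to $-p(1 - p)$, which vanishes only at the
end points; comparing it with the quantum correlation function is the content of the
exercise.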
Paraxial quantum optics

The generation and manipulation of paraxial beams of light forms the core of experimental
practice in quantum optics; therefore, it is important to extend the classical
treatment of paraxial optics to situations involving only a few photons, such as the
photon pairs produced by spontaneous down-conversion. In addition to the interaction
of quantized fields with standard optical elements, the theory of quantum paraxial
propagation has applications to fundamental issues such as the generation and control
of orbital angular momentum and the meaning of localization for photons.
In geometric optics a beam of light is a bundle of rays making small angles with a
central ray directed along a unit vector $\mathbf{u}_0$. The constituent rays of the bundle are said
to be paraxial. In wave optics, the bundle of rays is replaced by a bundle of unit vectors
normal to the wavefront; so a paraxial wave is defined by a wavefront that is nearly
flat. In this situation it is natural to describe the classical field amplitude, $\boldsymbol{\mathcal{E}}(\mathbf{r}, t)$, as a
function of the propagation variable $\zeta = \mathbf{r} \cdot \mathbf{u}_0$, the transverse coordinates $\mathbf{r}_{\top}$ tangent to
the wavefront, and the time $t$. Paraxial wave optics is more complicated than paraxial
ray optics because of diffraction, which couples the $\mathbf{r}_{\top}$-, $\zeta$-, and $t$-dependencies of the
field. For the most part, we will only consider a single paraxial wave; therefore, we can
choose the $z$-axis along $\mathbf{u}_0$ and set $\zeta = z$.
The definite wavevector associated with the plane wave created by $a^{\dagger}(\mathbf{k})$ makes it
possible to recast the geometric-optics picture in terms of photons in plane-wave states.
This way of thinking about paraxial optics is useful but, as always, it must be treated
with caution. As explained in Section 3.6.1, there is no physically acceptable way to
define the position of a photon. This means that the natural tendency to visualize the
photons as beads sliding along the rays at speed $c$ must be strictly suppressed. The
beads in this naive picture must be replaced by wave packets containing energy $\hbar\omega$
and momentum $\hbar\mathbf{k}$, where $\mathbf{k}$ is directed along the normal to the paraxial wavefront.
In the following section, we begin with a very brief review of classical paraxial
wave optics. In succeeding sections we will define a set of paraxial quantum states,
and then use them to obtain approximate expressions for the energy, momentum,
and photon number operators. This will be followed by the definition of a slowly-varying
envelope operator that replaces the classical envelope field $\overline{\boldsymbol{\mathcal{E}}}(\mathbf{r}, t)$. Some more
advanced topics, including the general paraxial expansion, angular momentum, and
an approximate notion of photon localizability, will be presented in the remaining
sections.

7.1 Classical paraxial optics
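As explained above, each photon is distributed over a wave packet, with energy $\hbar\omega$ and
momentum $\hbar\mathbf{k}$, that propagates along the normal to the wavefront. However, this wave
optics description must be approached with equal caution. The standard approach in
classical, paraxial wave optics (Saleh and Teich, 1991, Sec. 2.2C) is to set

\[
\boldsymbol{\mathcal{E}}(\mathbf{r}, t) = \overline{\boldsymbol{\mathcal{E}}}(\mathbf{r}, t)\, e^{i(\mathbf{k}_0 \cdot \mathbf{r} - \omega_0 t)} ,
\tag{7.1}
\]

where $\omega_0$ and $\mathbf{k}_0 = \mathbf{u}_0\, n(\omega_0)\, \omega_0 / c$ are respectively the carrier frequency and the carrier
wavevector. The four-dimensional Fourier transform, $\overline{\boldsymbol{\mathcal{E}}}(\mathbf{k}, \omega)$, of the slowly-varying
envelope is assumed to be concentrated in a neighborhood of $\mathbf{k} = 0$, $\omega = 0$. The
equivalent conditions in the space-time domain are

\[
\left| \frac{\partial^2 \overline{\boldsymbol{\mathcal{E}}}(\mathbf{r}, t)}{\partial t^2} \right|
  \ll \omega_0 \left| \frac{\partial \overline{\boldsymbol{\mathcal{E}}}(\mathbf{r}, t)}{\partial t} \right|
  \ll \omega_0^2 \left| \overline{\boldsymbol{\mathcal{E}}}(\mathbf{r}, t) \right| ,
\tag{7.2}
\]

\[
\left| \frac{\partial^2 \overline{\boldsymbol{\mathcal{E}}}(\mathbf{r}, t)}{\partial z^2} \right|
  \ll k_0 \left| \frac{\partial \overline{\boldsymbol{\mathcal{E}}}(\mathbf{r}, t)}{\partial z} \right|
  \ll k_0^2 \left| \overline{\boldsymbol{\mathcal{E}}}(\mathbf{r}, t) \right| ;
\tag{7.3}
\]

in other words, $\overline{\boldsymbol{\mathcal{E}}}(\mathbf{r}, t)$ has negligible variation in time over an optical period and
negligible variation in space over an optical wavelength. As we have already seen in
the discussion of monochromatic fields, these conditions cannot be applied to the field
operator $\mathbf{E}^{(+)}(\mathbf{r}, t)$; instead, they must be interpreted as constraints on the allowed
states of the field.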

7.2 Paraxial states
7.2.1 The paraxial ray bundle
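A paraxial beam associated with the carrier wavevector $\mathbf{k}_0$, i.e. a bundle of wavevectors
$\mathbf{k}$ clustered around $\mathbf{k}_0$, is conveniently described in terms of relative wavevectors
$\mathbf{q} = \mathbf{k} - \mathbf{k}_0$, with $|\mathbf{q}| \ll k_0$. For each $\mathbf{k} = \mathbf{k}_0 + \mathbf{q}$ the angle $\vartheta_{\mathbf{k}}$ between $\mathbf{k}$ and $\mathbf{k}_0$ is given by

\[
\sin \vartheta_{\mathbf{k}}
  = \frac{|\mathbf{k}_0 \times \mathbf{k}|}{k_0\, k}
  = \frac{|\mathbf{k}_0 \times \mathbf{q}|}{k_0\, |\mathbf{k}_0 + \mathbf{q}|}
  = \frac{|\mathbf{q}_{\top}|}{k_0} \left[ 1 + O\!\left( \frac{|\mathbf{q}|}{k_0} \right) \right] ,
\tag{7.4}
\]

where $\mathbf{q}_{\top} = \mathbf{q} - q_z \widetilde{\mathbf{k}}_0$ and $q_z = \mathbf{q} \cdot \widetilde{\mathbf{k}}_0$, with $\widetilde{\mathbf{k}}_0$ the unit vector along $\mathbf{k}_0$. This shows that
$\vartheta_{\mathbf{k}} \approx |\mathbf{q}_{\top}| / k_0$, and further suggests defining the small parameter for the paraxial
beam as the maximum opening angle

\[
\theta = \frac{\Delta q_{\top}}{k_0} \ll 1 ,
\tag{7.5}
\]

where $0 < |\mathbf{q}_{\top}| < \Delta q_{\top}$ is the range of the transverse components of $\mathbf{q}$. Variations in
the transverse coordinate $\mathbf{r}_{\top}$ occur over a characteristic distance $\Lambda_{\top}$ defined by the
Fourier transform uncertainty relation $\Lambda_{\top} \Delta q_{\top} \sim 1$; consequently, a useful length scale
for transverse variations is $\Lambda_{\top} = 1/\Delta q_{\top} = 1/(\theta k_0)$.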
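The estimate in eqn (7.4) is easy to test numerically. The sketch below is illustrative
only; it assumes $\mathbf{k}_0$ lies along the $z$-axis, and the function names `opening_angle` and
`paraxial_angle` are hypothetical helpers.

```python
import math


def opening_angle(k0, q_perp, q_z):
    """Exact angle between k = k0 + q and k0, with k0 taken along the z-axis.

    q_perp is the magnitude of the transverse part of q and q_z is its
    longitudinal component.
    """
    return math.atan2(abs(q_perp), k0 + q_z)


def paraxial_angle(k0, q_perp):
    """Leading-order estimate |q_perp| / k0 from eqn (7.4)."""
    return abs(q_perp) / k0


if __name__ == "__main__":
    k0, q_perp, q_z = 1.0, 1.0e-2, 1.0e-4
    exact = opening_angle(k0, q_perp, q_z)
    approx = paraxial_angle(k0, q_perp)
    print(exact, approx, abs(exact - approx) / approx)
```

For wavevectors well inside the bundle the relative difference between the two angles is
itself of order $|\mathbf{q}|/k_0$, in line with the correction term in eqn (7.4).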
A natural way to define the characteristic length $\Lambda_{\parallel}$ for longitudinal variations
is to interpret the transverse length scale $\Lambda_{\top}$ as the radius of an effective circular

aperture. The conventional longitudinal scale is then the distance over which a beam
waist, initially equal to $\Lambda_{\top}$, doubles in size. At this point, a strictly correct argument
would bring in classical diffraction theory; but the same end can be achieved, with
only a little sleight of hand, with geometric optics. By combining the approximation
$\tan\theta \approx \theta$ with elementary trigonometry, it is easy to show that the geometric image
of the aperture on a screen at a distance $\Lambda_{\parallel}$ has the radius $\Lambda'_{\top} = \Lambda_{\top} + \theta \Lambda_{\parallel}$. The trick
is to choose the longitudinal scale length $\Lambda_{\parallel}$ so that $\Lambda'_{\top} = 2\Lambda_{\top}$, and this requires

\[
\Lambda_{\parallel} = \frac{\Lambda_{\top}}{\theta} = k_0 \Lambda_{\top}^{2} = \frac{1}{\theta^{2} k_0} .
\tag{7.6}
\]

We will see in Section 7.4 that $\Lambda_{\parallel} = k_0 \Lambda_{\top}^{2}$ is twice the Rayleigh range, as usually
defined in classical diffraction theory, for the aperture $\Lambda_{\top}$. Thus our geometric-optics
trick has achieved the same result as a proper diffraction theory argument. Since
propagation occurs along the direction characterized by $\Lambda_{\parallel}$, the natural time scale is
$T = \Lambda_{\parallel} / (c/n_0) = 1/(\theta^{2} \omega_0)$.
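To get a feel for the magnitudes involved, the transverse, longitudinal, and time scales
can be evaluated for a concrete beam. The sketch below is a minimal illustration; the
function name `paraxial_scales`, the dictionary keys, and the example values (an 800 nm
vacuum wavelength and an opening angle of 0.01) are assumptions chosen for the example,
not values taken from the text.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def paraxial_scales(vacuum_wavelength, theta, n0=1.0):
    """Characteristic scales of a paraxial beam with opening angle theta.

    Uses k0 = n0 * omega0 / c, the transverse scale 1 / (theta * k0), the
    longitudinal scale of eqn (7.6), and the time scale T quoted after it.
    """
    omega0 = 2.0 * math.pi * SPEED_OF_LIGHT / vacuum_wavelength
    k0 = n0 * omega0 / SPEED_OF_LIGHT
    lambda_perp = 1.0 / (theta * k0)
    lambda_par = lambda_perp / theta
    period = lambda_par / (SPEED_OF_LIGHT / n0)
    return {
        "k0": k0,
        "omega0": omega0,
        "lambda_perp": lambda_perp,
        "lambda_par": lambda_par,
        "T": period,
    }


if __name__ == "__main__":
    for name, value in paraxial_scales(800e-9, 0.01).items():
        print(f"{name}: {value:.4e}")
```

The returned values satisfy the two identities stated above, namely that the longitudinal
scale equals $k_0$ times the square of the transverse scale and that $T = 1/(\theta^2 \omega_0)$.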
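The spread, $\Delta q_z$, in the longitudinal component of $\mathbf{q}$ satisfies $\Lambda_{\parallel} \Delta q_z \sim 1$, so the
longitudinal and transverse widths are related by

\[
\frac{\Delta q_z}{k_0} = \left( \frac{\Delta q_{\top}}{k_0} \right)^{2} = \theta^{2} ,
\tag{7.7}
\]

and the $\mathbf{q}$-vectors are effectively confined to a disk-shaped region defined by

\[
Q_0 = \left\{ \mathbf{q} \ \text{satisfying} \ |\mathbf{q}_{\top}| \le \theta k_0 ,\ q_z \le \theta^{2} k_0 \right\} .
\tag{7.8}
\]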
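The paraxial conditions (7.8) amount to a simple membership test on the relative
wavevector. The sketch below is illustrative; the function name `in_paraxial_region` is
hypothetical, and it assumes that the longitudinal bound is meant to apply to the
magnitude of $q_z$, which is an interpretation rather than something stated explicitly
in the text.

```python
def in_paraxial_region(q_perp, q_z, k0, theta):
    """Return True if the relative wavevector lies in the disk-shaped region Q0.

    q_perp is the magnitude of the transverse part of q. The bound on q_z is
    applied to its absolute value (an assumption made for this sketch).
    """
    return abs(q_perp) <= theta * k0 and abs(q_z) <= theta ** 2 * k0


if __name__ == "__main__":
    k0, theta = 1.0, 0.01
    print(in_paraxial_region(5.0e-3, 5.0e-5, k0, theta))  # inside both bounds
    print(in_paraxial_region(2.0e-2, 0.0, k0, theta))     # transverse part too large
```

Because the longitudinal bound scales as $\theta^2$ while the transverse bound scales as
$\theta$, the allowed region is much thinner along the beam axis than across it.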
In a dispersive medium with index of refraction $n(\omega)$ the frequency $\omega_k$ is a solution
of the dispersion relation $ck = \omega_k\, n(\omega_k)$, and wave packets propagate at the group
velocity $v_g(\omega_k) = d\omega_k / dk$. The frequency width is therefore $\Delta\omega = v_{g0} \Delta k$, where $v_{g0}$
is the group velocity at the carrier frequency. The straightforward calculation outlined
in Exercise 7.1 yields the estimate

\[
\frac{\Delta\omega}{\omega_0} \approx \frac{1}{2} \theta^{2} \ll 1 ,
\tag{7.9}
\]

which is the criterion for a monochromatic field given by eqn (3.107).

7.2.2 The paraxial Hilbert space
The geometric-optics picture of a bundle of rays forming small angles with the central
propagation vector $\mathbf{k}_0$ is realized in the quantum theory by a family of states that only
contain photons with propagation vectors in the paraxial bundle. In order to satisfy
the superposition principle, the family of states must be chosen as the paraxial space,
$\mathfrak{H}(\mathbf{k}_0, \theta) \subset \mathfrak{H}_F$, spanned by the improper (continuum normalized) number states

\[
|\{\mathbf{q}s\}_M\rangle = \prod_{m=1}^{M} a^{\dagger}_{0 s_m}(\mathbf{q}_m)\, |0\rangle , \qquad M = 0, 1, \ldots ,
\tag{7.10}
\]

where $a_{0s}(\mathbf{q}) = a_s(\mathbf{k}_0 + \mathbf{q})$, $\{\mathbf{q}s\}_M \equiv \{\mathbf{q}_1 s_1, \ldots, \mathbf{q}_M s_M\}$, and each relative propagation
vector is constrained by the paraxial conditions (7.8). If the paraxial restriction were

relaxed, eqn (7.10) would define a continuum basis set for the full Fock space, so the
paraxial space is a subspace of $\mathfrak{H}_F$. The states satisfying the paraxiality condition
(7.8) also satisfy the monochromaticity condition (3.107); consequently, $\mathfrak{H}(\mathbf{k}_0, \theta)$ is
a subspace of the monochromatic space $\mathfrak{H}(\omega_0)$. A state $|\Psi\rangle$ belonging to $\mathfrak{H}(\mathbf{k}_0, \theta)$ is
called a pure paraxial state, and a density operator $\rho$ describing an ensemble of
pure paraxial states is called a mixed paraxial state. A useful way to characterize
a paraxial state $\rho$ in $\mathfrak{H}(\mathbf{k}_0, \theta)$ is to note that the power spectrum

\[
p(\mathbf{k}) = \sum_s \left\langle a^{\dagger}_s(\mathbf{k})\, a_s(\mathbf{k}) \right\rangle
  = \sum_s \mathrm{Tr}\left[ \rho\, a^{\dagger}_s(\mathbf{k})\, a_s(\mathbf{k}) \right]
\tag{7.11}
\]

is strongly concentrated near $\mathbf{k} = \mathbf{k}_0$.
In the Schrödinger picture, a general paraxial state $|\Psi(0)\rangle$ has an expansion in the
basis $\{|\{\mathbf{q}s\}_M\rangle\}$, and the time evolution is given by

\[
|\Psi(t)\rangle = e^{-itH/\hbar}\, |\Psi(0)\rangle ,
\tag{7.12}
\]

where $H$ is the total Hamiltonian, including interactions with atoms, etc. It is clear on
physical grounds that an initial paraxial state will not in general remain paraxial. For
example, a paraxial field injected into a medium containing strong scattering centers
will experience large-angle scattering and thus become nonparaxial as it propagates
through the medium. In more favorable cases, interaction with matter, e.g. transmission
through lenses with moderate focal lengths, will conserve the paraxial property.
The only situation for which it is possible to make a rigorous general statement
is free propagation. In this case the basis vectors $|\{\mathbf{q}s\}_M\rangle$ are eigenstates of the total
Hamiltonian, $H = H_{\mathrm{em}}$, so that

\[
\begin{aligned}
|\Psi(t)\rangle = \sum_{M=0}^{\infty}
  &\int \frac{d^3q_1}{(2\pi)^3} \cdots \int \frac{d^3q_M}{(2\pi)^3}
   \sum_{s_1} \cdots \sum_{s_M} F(\{\mathbf{q}s\}_M) \\
  &\times \exp\left[ -i \sum_{m=1}^{M} \omega(|\mathbf{k}_0 + \mathbf{q}_m|)\, t \right] |\{\mathbf{q}s\}_M\rangle ,
\end{aligned}
\tag{7.13}
\]

where $F(\{\mathbf{q}s\}_M) = \langle \{\mathbf{q}s\}_M | \Psi(0) \rangle$. Consequently, the state $|\Psi(t)\rangle$ remains in the
paraxial space $\mathfrak{H}(\mathbf{k}_0, \theta)$ for all times.
For the sake of simplicity, we have analyzed the case of a single paraxial ray bundle,
but in many applications several paraxial beams are simultaneously present. The
reasons range from simple reflection by a mirror to complex wave mixing phenomena
in nonlinear media. The necessary generalizations can be understood by considering
two paraxial bundles with carrier waves $\mathbf{k}_1$ and $\mathbf{k}_2$ and opening angles $\theta_1$ and $\theta_2$. The
two beams are said to be distinct if the vector $\Delta\mathbf{k} = \mathbf{k}_1 - \mathbf{k}_2$ satisfies

\[
|\Delta\mathbf{k}| \gg \max\left[ \theta_1 |\mathbf{k}_1| ,\ \theta_2 |\mathbf{k}_2| \right] ,
\tag{7.14}
\]

i.e. the two bundles of wavevectors do not overlap. The multiparaxial space,
$\mathfrak{H}(\mathbf{k}_1, \theta_1, \mathbf{k}_2, \theta_2)$, for two distinct paraxial ray bundles is spanned by the basis vectors

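\[
\prod_{m=1}^{M} a^{\dagger}_{1 s_m}(\mathbf{q}_m) \prod_{k=1}^{K} a^{\dagger}_{2 s_k}(\mathbf{p}_k)\, |0\rangle
\qquad (M, K = 0, 1, \ldots) ,
\tag{7.15}
\]

where $a^{\dagger}_{\beta s}(\mathbf{q}) \equiv a^{\dagger}_s(\mathbf{k}_{\beta} + \mathbf{q})$ $(\beta = 1, 2)$ and the $\mathbf{q}$s and $\mathbf{p}$s are confined to the respective
regions $Q_1$ and $Q_2$ defined by applying eqn (7.8) to each beam. The argument suggested
in Exercise 7.6 shows that the paraxial spaces $\mathfrak{H}(\mathbf{k}_1, \theta_1)$ and $\mathfrak{H}(\mathbf{k}_2, \theta_2)$, which
are subspaces of $\mathfrak{H}(\mathbf{k}_1, \theta_1, \mathbf{k}_2, \theta_2)$, may be treated as orthogonal within the paraxial
approximation. This description is readily extended to any number of distinct beams.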

7.2.3 Photon number, momentum, and energy
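The action of the number operator $N$ on the paraxial space $\mathfrak{H}(\mathbf{k}_0, \theta)$ is determined by
its action on the basis states in eqn (7.10); consequently, the commutation relation,
$[N, a^{\dagger}_{0s}(\mathbf{q})] = a^{\dagger}_{0s}(\mathbf{q})$, permits the use of the effective form

\[
N \simeq N_0 = \int_{Q_0} \frac{d^3q}{(2\pi)^3} \sum_s a^{\dagger}_{0s}(\mathbf{q})\, a_{0s}(\mathbf{q}) .
\tag{7.16}
\]

Applying the same idea to the momentum operator, given by the continuum version
of eqn (3.153), leads to $\mathbf{P}_{\mathrm{em}} = \hbar\mathbf{k}_0 N_0 + \mathbf{P}_0$, where

\[
\mathbf{P}_0 = \int_{Q_0} \frac{d^3q}{(2\pi)^3} \sum_s \hbar\mathbf{q}\; a^{\dagger}_{0s}(\mathbf{q})\, a_{0s}(\mathbf{q})
\tag{7.17}
\]

is the paraxial momentum operator.
The continuum version of eqn (3.150) for the Hamiltonian in a dispersive medium
can be approximated by

\[
H_{\mathrm{em}} = \int_{Q_0} \frac{d^3q}{(2\pi)^3} \sum_s \hbar\omega_{|\mathbf{k}_0 + \mathbf{q}|}\; a^{\dagger}_{0s}(\mathbf{q})\, a_{0s}(\mathbf{q}) ,
\tag{7.18}
\]

when acting on a paraxial state. The small spread in frequencies across the paraxial
bundle, together with the weak dispersion condition (3.120), allows the dispersion
relation $\omega_k = ck / n(\omega_k)$ to be approximated by

\[
\omega_k = \frac{ck}{n_0 + \left( \dfrac{dn}{d\omega} \right)_{0} (\omega_k - \omega_0)} ,
\tag{7.19}
\]

and a straightforward calculation yields

\[
\omega_{|\mathbf{k}_0 + \mathbf{q}|} = \omega_0 + v_{g0}\, k_0 \left( \left| \widetilde{\mathbf{k}}_0 + \frac{\mathbf{q}}{k_0} \right| - 1 \right) + \cdots .
\tag{7.20}
\]

The conditions (7.8) allow the expansion

\[
\left| \widetilde{\mathbf{k}}_0 + \frac{\mathbf{q}}{k_0} \right|
  = 1 + \frac{q_z}{k_0} + \frac{q_{\top}^{2}}{2k_0^{2}} + O\left( \theta^{2} \right) ,
\tag{7.21}
\]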

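which in turn leads to the expression $H_{\mathrm{em}} = \hbar\omega_0 N_0 + H_P + O(\theta^{2})$, where

\[
H_P = \int_{Q_0} \frac{d^3q}{(2\pi)^3} \sum_s \hbar \left[ v_{g0}\, q_z + \frac{v_{g0}\, q_{\top}^{2}}{2k_0} \right]
      a^{\dagger}_{0s}(\mathbf{q})\, a_{0s}(\mathbf{q})
\tag{7.22}
\]

is the paraxial Hamiltonian for the space $\mathfrak{H}(\mathbf{k}_0, \theta)$.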
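The quality of the expansion behind the paraxial Hamiltonian can be checked numerically in
the simplest case. The sketch below is an illustration under stated assumptions: it
treats propagation in vacuum, where $\omega_k = ck$ and $v_{g0} = c$, takes $\mathbf{k}_0$ along the
$z$-axis, and uses the hypothetical helper names `exact_detuning` and `paraxial_detuning`.

```python
import math


def exact_detuning(k0, q_perp, q_z, c=1.0):
    """Exact vacuum detuning c * |k0 + q| - c * k0, with k0 along the z-axis."""
    return c * math.hypot(q_perp, k0 + q_z) - c * k0


def paraxial_detuning(k0, q_perp, q_z, c=1.0):
    """Paraxial estimate c * q_z + c * q_perp**2 / (2 * k0), the one-photon energy
    in eqn (7.22) divided by hbar, with v_g0 = c."""
    return c * q_z + c * q_perp ** 2 / (2.0 * k0)


if __name__ == "__main__":
    k0, theta = 1.0, 0.01
    q_perp, q_z = theta * k0, theta ** 2 * k0  # edge of the paraxial region
    exact = exact_detuning(k0, q_perp, q_z)
    approx = paraxial_detuning(k0, q_perp, q_z)
    print(exact, approx, abs(exact - approx))
```

In this vacuum example both detunings are of order $\theta^2 c k_0$, while their difference
at the edge of the paraxial region is of order $\theta^4 c k_0$; in a dispersive medium the
neglected terms depend on the dispersion and are not covered by this sketch.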
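The effective orthogonality of distinct paraxial spaces, which corresponds to the
distinguishability of distinct paraxial beams, implies that the various global operators
are additive. Thus the operators for the total photon number, momentum, and energy
for a set of paraxial beams are

\[
N = \sum_{\beta} N_{\beta} , \qquad
\mathbf{P}_{\mathrm{em}} = \sum_{\beta} \left( \hbar\mathbf{k}_{\beta} N_{\beta} + \mathbf{P}_{\beta} \right) , \qquad
H_{\mathrm{em}} = \sum_{\beta} \left( \hbar\omega_{\beta} N_{\beta} + H_{P\beta} \right) ,
\tag{7.23}
\]

where $N_{\beta}$, $\mathbf{P}_{\beta}$, and $H_{P\beta}$ are respectively the paraxial number, momentum, and energy
operators for the $\beta$th beam.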

7.3 The slowly-varying envelope operator
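We next use the properties of the paraxial space $\mathfrak{H}(\mathbf{k}_0, \theta)$ to justify an approximation
for the field operator, $\mathbf{A}^{(+)}(\mathbf{r}, t)$, that replaces eqn (7.1) for the classical field. In order
to emphasize the relation to the classical theory, we initially work in the Heisenberg
picture. The slowly-varying envelope operator $\boldsymbol{\Phi}(\mathbf{r}, t)$ is defined by

\[
\mathbf{A}^{(+)}(\mathbf{r}, t) = \sqrt{\frac{\hbar\, (v_{g0}/c)}{2\epsilon_0 k_0 c}}\;
  \boldsymbol{\Phi}(\mathbf{r}, t)\, e^{i(\mathbf{k}_0 \cdot \mathbf{r} - \omega_0 t)} .
\tag{7.24}
\]

Comparing this definition to the general plane-wave expansion (3.149) shows that

\[
\boldsymbol{\Phi}(\mathbf{r}, t) = \int_{Q_0} \frac{d^3q}{(2\pi)^3} \sum_s f_{\mathbf{q}}\, a_{0s}(\mathbf{q})\,
  \mathbf{e}_s(\mathbf{k}_0 + \mathbf{q})\, e^{i(\mathbf{q} \cdot \mathbf{r} - \delta_{\mathbf{q}} t)} ,
\tag{7.25}
\]

\[
\delta_{\mathbf{q}} = \omega_{|\mathbf{k}_0 + \mathbf{q}|} - \omega_0
\quad \text{and} \quad
f_{\mathbf{q}} = \sqrt{\frac{v_g(|\mathbf{k}_0 + \mathbf{q}|)\, k_0}{v_{g0}\, |\mathbf{k}_0 + \mathbf{q}|}} .
\tag{7.26}
\]

The corresponding expressions in the Schrödinger picture follow from the relation
$\mathbf{A}^{(+)}(\mathbf{r}) = \mathbf{A}^{(+)}(\mathbf{r}, t = 0)$.
The envelope operator will only be slowly varying when applied to paraxial states
in $\mathfrak{H}(\mathbf{k}_0, \theta)$, so we begin by using eqn (7.10) to evaluate the action of the envelope
operator $\boldsymbol{\Phi}(\mathbf{r}) = \boldsymbol{\Phi}(\mathbf{r}, 0)$ on a typical basis vector of $\mathfrak{H}(\mathbf{k}_0, \theta)$:

\[
\begin{aligned}
\boldsymbol{\Phi}(\mathbf{r})\, |\{\mathbf{q}s\}_M\rangle
  &= \boldsymbol{\Phi}(\mathbf{r}) \prod_{m=1}^{M} a^{\dagger}_{0 s_m}(\mathbf{q}_m)\, |0\rangle \\
  &= \left[ \boldsymbol{\Phi}(\mathbf{r}), \prod_{m=1}^{M} a^{\dagger}_{0 s_m}(\mathbf{q}_m) \right] |0\rangle \\
  &= \sum_{m=1}^{M} \left[ \boldsymbol{\Phi}(\mathbf{r}), a^{\dagger}_{0 s_m}(\mathbf{q}_m) \right]
     \prod_{\substack{l=1 \\ l \neq m}}^{M} a^{\dagger}_{0 s_l}(\mathbf{q}_l)\, |0\rangle ,
\end{aligned}
\tag{7.27}
\]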

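where the last line follows from the identity (C.49). Setting $t = 0$ in eqn (7.25) produces
the Schrödinger-picture representation of the envelope operator,

\[
\boldsymbol{\Phi}(\mathbf{r}) = \int_{Q_0} \frac{d^3q}{(2\pi)^3} \sum_s f_{\mathbf{q}}\, a_{0s}(\mathbf{q})\,
  \mathbf{e}_s(\mathbf{k}_0 + \mathbf{q})\, e^{i\mathbf{q} \cdot \mathbf{r}} ,
\tag{7.28}
\]

and using this in the calculation of the commutator yields

\[
\begin{aligned}
\left[ \boldsymbol{\Phi}(\mathbf{r}), a^{\dagger}_{0 s_m}(\mathbf{q}_m) \right]
  &= f_{\mathbf{q}_m}\, \mathbf{e}_{s_m}(\mathbf{k}_0 + \mathbf{q}_m)\, e^{i\mathbf{q}_m \cdot \mathbf{r}} \\
  &= \mathbf{e}_{s_m}(\mathbf{k}_0)\, e^{i\mathbf{q}_m \cdot \mathbf{r}} + O(\theta) .
\end{aligned}
\tag{7.29}
\]

Thus when acting on paraxial states the exact representation (7.28) can be replaced
by the approximate form

\[
\boldsymbol{\Phi}(\mathbf{r}) = \sum_s \phi_s(\mathbf{r})\, \mathbf{e}_{0s} + O(\theta) ,
\tag{7.30}
\]

where $\mathbf{e}_{0s} = \mathbf{e}_s(\mathbf{k}_0)$, and

\[
\phi_s(\mathbf{r}) = \int_{Q_0} \frac{d^3q}{(2\pi)^3}\, a_{0s}(\mathbf{q})\, e^{i\mathbf{q} \cdot \mathbf{r}} .
\tag{7.31}
\]

The subscript $Q_0$ on the integral is to remind us that the integration domain is restricted
by eqn (7.8). This representation can only be used when the operator acts on
a vector in the paraxial space. It is in this sense that the $z$-component of the envelope
operator is small, i.e.

\[
\langle \Psi_1 |\, \Phi_z(\mathbf{r})\, | \Psi_2 \rangle = O(\theta) ,
\tag{7.32}
\]

for any pair of normalized vectors $|\Psi_1\rangle$ and $|\Psi_2\rangle$ that both belong to $\mathfrak{H}(\mathbf{k}_0, \theta)$. In the
leading paraxial approximation, i.e. neglecting $O(\theta)$-terms, the electric field operator is

\[
\mathbf{E}^{(+)}(\mathbf{r}, t) = i \sqrt{\frac{\hbar\omega_0\, (v_{g0}/c)}{2\epsilon_0 n_0}}
  \sum_s \mathbf{e}_{0s}\, \phi_s(\mathbf{r}, t)\, e^{i(\mathbf{k}_0 \cdot \mathbf{r} - \omega_0 t)} .
\tag{7.33}
\]

The commutation relations for the transverse components of the envelope operator
have the simple form

\[
\left[ \Phi_i(\mathbf{r}, t), \Phi^{\dagger}_j(\mathbf{r}', t) \right] = \delta_{ij}\, \delta(\mathbf{r} - \mathbf{r}') \qquad (i, j = 1, 2) ,
\tag{7.34}
\]

which shows that the paraxial electromagnetic field is described by two independent
operators $\Phi_1(\mathbf{r})$ and $\Phi_2(\mathbf{r})$ satisfying local commutation relations. This reflects the
fact that the paraxial approximation eliminates the nonlocal features exhibited in the
exact commutation relations (3.16) by effectively averaging the arguments $\mathbf{r}$ and $\mathbf{r}'$
over volumes large compared to $\lambda^3$. By the same token, the delta function appearing
on the right side of eqn (7.34) is coarse-grained, i.e. it only gives correct results
when applied to functions that vary slowly on the scale of the carrier wavelength. This
feature will be important when we return to the problem of photon localization.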

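In most applications the operators $\phi_s(\mathbf{r}, t)$, corresponding to definite polarization
states, are more useful. They satisfy the commutation relations

\[
\left[ \phi_s(\mathbf{r}, t), \phi^{\dagger}_{s'}(\mathbf{r}', t) \right] = \delta_{ss'}\, \delta(\mathbf{r} - \mathbf{r}')
\qquad (s, s' = \pm \ \text{or} \ 1, 2) .
\tag{7.35}
\]

The approximate expansion (7.31) can be inverted to get

\[
a_{0s}(\mathbf{q}) = \int d^3r\, \phi_s(\mathbf{r})\, e^{-i\mathbf{q} \cdot \mathbf{r}}
  = \int d^3r\, \mathbf{e}^{*}_s(\mathbf{k}_0) \cdot \boldsymbol{\Phi}(\mathbf{r})\, e^{-i\mathbf{q} \cdot \mathbf{r}} ,
\tag{7.36}
\]

which is valid for $\mathbf{q}$ in the paraxial region $Q_0$. By using this inversion formula the
operators $N_0$, $\mathbf{P}_0$, and $H_P$ can be expressed in terms of the slowly-varying envelope
operators:

\[
N_0 = \int d^3r \sum_s \phi^{\dagger}_s(\mathbf{r})\, \phi_s(\mathbf{r}) ,
\tag{7.37}
\]

\[
\mathbf{P}_0 = \int d^3r \sum_s \phi^{\dagger}_s(\mathbf{r})\, \frac{\hbar}{i} \nabla \phi_s(\mathbf{r}) ,
\tag{7.38}
\]

\[
H_P = \int d^3r \sum_s \phi^{\dagger}_s(\mathbf{r}) \left[ v_{g0}\, \frac{\hbar}{i} \nabla_z
      - \frac{\hbar v_{g0}}{2k_0} \nabla_{\top}^{2} \right] \phi_s(\mathbf{r}) .
\tag{7.39}
\]

We can gain a better understanding of the paraxial Hamiltonian by substituting
eqns (7.24) and (7.22) into the Heisenberg equation

\[
i\hbar \frac{\partial}{\partial t} \mathbf{A}^{(+)}(\mathbf{r}, t) = \left[ \mathbf{A}^{(+)}(\mathbf{r}, t), H_{\mathrm{em}} \right]
\tag{7.40}
\]

to get

\[
\hbar\omega_0 \boldsymbol{\Phi}(\mathbf{r}, t) + i\hbar \frac{\partial}{\partial t} \boldsymbol{\Phi}(\mathbf{r}, t)
  = \hbar\omega_0 \left[ \boldsymbol{\Phi}(\mathbf{r}, t), N_0 \right] + \left[ \boldsymbol{\Phi}(\mathbf{r}, t), H_P \right] .
\tag{7.41}
\]

Since the envelope operator $\boldsymbol{\Phi}(\mathbf{r}, t)$ is a sum of annihilation operators, it satisfies
$[\boldsymbol{\Phi}(\mathbf{r}, t), N_0] = \boldsymbol{\Phi}(\mathbf{r}, t)$. Consequently, the term $\hbar\omega_0 [\boldsymbol{\Phi}(\mathbf{r}, t), N_0]$ is canceled by the
time derivative of the carrier wave. The Heisenberg equation for the envelope field
$\boldsymbol{\Phi}(\mathbf{r}, t)$ is therefore

\[
i\hbar \frac{\partial}{\partial t} \boldsymbol{\Phi}(\mathbf{r}, t) = \left[ \boldsymbol{\Phi}(\mathbf{r}, t), H_P \right] .
\tag{7.42}
\]

This shows that the paraxial Hamiltonian generates the time translation of the envelope
field. By using the explicit form (7.22) of $H_P$ and the commutation relations
(7.34), it is simple to see that the Heisenberg equation can be written in the equivalent
forms

\[
i \left( \nabla_z + \frac{1}{v_{g0}} \frac{\partial}{\partial t} \right) \boldsymbol{\Phi}(\mathbf{r}, t)
  + \frac{1}{2k_0} \nabla_{\top}^{2} \boldsymbol{\Phi}(\mathbf{r}, t) = 0 ,
\tag{7.43}
\]

\[
i \left( \nabla_z + \frac{1}{v_{g0}} \frac{\partial}{\partial t} \right) \phi_s(\mathbf{r}, t)
  + \frac{1}{2k_0} \nabla_{\top}^{2} \phi_s(\mathbf{r}, t) = 0 .
\tag{7.44}
\]

Multiplying eqn (7.43) by the normalization factor in eqn (7.24) and passing to the
classical limit ($\mathbf{A}^{(+)}(\mathbf{r}, t) \to \overline{\boldsymbol{\mathcal{A}}}(\mathbf{r}, t) \exp[i(\mathbf{k}_0 \cdot \mathbf{r} - \omega_0 t)]$) yields the standard paraxial
wave equation of the classical theory.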

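The single-beam argument can be applied to each of the distinct beams to give the
Schrödinger-picture representation,

\[
\mathbf{A}^{(+)}(\mathbf{r}) = \sum_{\beta} \sqrt{\frac{\hbar\, (v_{g\beta}/c)}{2\epsilon_0 k_{\beta} c}}
  \sum_s \mathbf{e}_{\beta s}\, \phi_{\beta s}(\mathbf{r})\, e^{i\mathbf{k}_{\beta} \cdot \mathbf{r}} ,
\tag{7.45}
\]

where $\mathbf{e}_{\beta s} = \mathbf{e}_s(\mathbf{k}_{\beta})$, $\omega_{\beta} = \omega(k_{\beta}) = ck_{\beta}/n_{\beta}$, $v_{g\beta}$ is the group velocity for the $\beta$th
carrier wave,

\[
\phi_{\beta s}(\mathbf{r}) = \int_{Q_{\beta}} \frac{d^3q}{(2\pi)^3}\, a_{\beta s}(\mathbf{q})\, e^{i\mathbf{q} \cdot \mathbf{r}} ,
\tag{7.46}
\]

\[
\left[ \phi_{\beta s}(\mathbf{r}), \phi^{\dagger}_{\beta' s'}(\mathbf{r}') \right] \approx \delta_{\beta\beta'}\, \delta_{ss'}\, \delta(\mathbf{r} - \mathbf{r}')
\qquad (s, s' = \pm \ \text{or} \ 1, 2) .
\tag{7.47}
\]

The last result, which is established in Exercise 7.3, means that the envelope fields
for distinct beams represent independent degrees of freedom.
The corresponding expression for the electric field operator in the paraxial approximation
is

\[
\mathbf{E}^{(+)}(\mathbf{r}) = i \sum_{\beta} \sqrt{\frac{\hbar\omega_{\beta}\, (v_{g\beta}/c)}{2\epsilon_0 n_{\beta}}}
  \sum_s \mathbf{e}_{\beta s}\, \phi_{\beta s}(\mathbf{r})\, e^{i\mathbf{k}_{\beta} \cdot \mathbf{r}} .
\tag{7.48}
\]

The operators for the photon number $N_{\beta}$, the momentum $\mathbf{P}_{\beta}$, and the paraxial Hamiltonian
$H_{\beta P}$ of the individual beams are obtained by applying eqns (7.37)-(7.39) to
each beam.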

7.4 Gaussian beams and pulses
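It is clear from the relation $\boldsymbol{\mathcal{E}} = -\partial\boldsymbol{\mathcal{A}}/\partial t$ that the electric field also satisfies the
paraxial wave equation. For the special case of propagation along the $z$-axis through
vacuum, we find

\[
\frac{1}{2k_0} \nabla_{\top}^{2} \overline{\boldsymbol{\mathcal{E}}}
  + i \left( \frac{\partial \overline{\boldsymbol{\mathcal{E}}}}{\partial z} + \frac{1}{c} \frac{\partial \overline{\boldsymbol{\mathcal{E}}}}{\partial t} \right) = 0 .
\tag{7.49}
\]

For fields with pulse duration much longer than any relevant time scale, or equivalently
with spectral width much smaller than any relevant frequency, the time dependence
of the slowly-varying envelope function can be neglected; that is, one can set
$\partial\overline{\boldsymbol{\mathcal{E}}}/\partial t = 0$ in eqn (7.49). The most useful time-independent solutions of the paraxial
equation are those which exhibit minimal diffractive spreading. The fundamental solution
with these properties, which is called a Gaussian beam or a Gaussian mode
(Yariv, 1989, Sec. 6.6), is

\[
\overline{\boldsymbol{\mathcal{E}}}(\mathbf{r}, t) = \overline{\boldsymbol{\mathcal{E}}}_0(\mathbf{r}_{\top}, z)
  = \mathcal{E}_0\, \mathbf{e}_0\, \frac{w_0\, e^{-i\phi(z)}}{w(z)}
    \exp\left[ ik_0 \frac{\rho^{2}}{2R(z)} \right] \exp\left[ -\frac{\rho^{2}}{w^{2}(z)} \right] ,
\tag{7.50}
\]

where the polarization vector $\mathbf{e}_0$ is in the $x$-$y$ plane and $\rho = |\mathbf{r}_{\top}|$. The functions of $z$
on the right side are defined by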

\[
w(z) = w_0 \sqrt{1 + \frac{(z - z_w)^{2}}{Z_R^{2}}} ,
\tag{7.51}
\]

\[
R(z) = z - z_w + \frac{Z_R^{2}}{z - z_w} ,
\tag{7.52}
\]

\[
\phi(z) = \tan^{-1}\left( \frac{z - z_w}{Z_R} \right) ,
\tag{7.53}
\]

where the Rayleigh range $Z_R$ is

\[
Z_R = \frac{k_0 w_0^{2}}{2} > 0 .
\tag{7.54}
\]

The function $w(z)$, which defines the width of the transverse Gaussian profile, has
the minimum value $w_0$ (the spot size) at $z = z_w$ (the beam waist). The solution is
completely characterized by $\mathbf{e}_0$, $\mathcal{E}_0$, $w_0$, and $z_w$. The function $R(z)$, which represents
the radius of curvature of the phase front, is negative for $z < z_w$, and positive for
$z > z_w$. The picture is of waves converging from the left and diverging to the right of
the focal point at the waist. The definition (7.51) shows that

\[
w(z_w + Z_R) = \sqrt{2}\, w_0 ,
\tag{7.55}
\]

so the Rayleigh range measures the distance required for diffraction to double the area
of the spot. There are also higher-order Gaussian modes that are not invariant under
rotations around the beam axis (Yariv, 1989, Sec. 6.9).
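The three functions that characterize the Gaussian beam are simple to tabulate. The sketch
below is illustrative; it assumes the standard definition $Z_R = k_0 w_0^2/2$ for the
Rayleigh range, and the function names (`rayleigh_range`, `beam_width`,
`curvature_radius`, `gouy_phase`) are hypothetical helpers rather than notation from the
text.

```python
import math


def rayleigh_range(k0, w0):
    """Rayleigh range, assuming the standard definition Z_R = k0 * w0**2 / 2."""
    return 0.5 * k0 * w0 ** 2


def beam_width(z, w0, z_w, z_r):
    """Width w(z) of the transverse Gaussian profile, eqn (7.51)."""
    return w0 * math.sqrt(1.0 + ((z - z_w) / z_r) ** 2)


def curvature_radius(z, z_w, z_r):
    """Radius of curvature R(z) of the phase front, eqn (7.52).

    At the waist the phase front is flat, which is returned as math.inf.
    """
    dz = z - z_w
    if dz == 0.0:
        return math.inf
    return dz + z_r ** 2 / dz


def gouy_phase(z, z_w, z_r):
    """Phase phi(z) of eqn (7.53)."""
    return math.atan((z - z_w) / z_r)


if __name__ == "__main__":
    k0, w0, z_w = 2.0 * math.pi / 800e-9, 1.0e-3, 0.0
    z_r = rayleigh_range(k0, w0)
    for z in (-z_r, 0.0, z_r):
        print(z, beam_width(z, w0, z_w, z_r), curvature_radius(z, z_w, z_r),
              gouy_phase(z, z_w, z_r))
```

One Rayleigh range from the waist the width has grown by a factor of $\sqrt{2}$, the radius
of curvature equals $2Z_R$, and the phase $\phi$ equals $\pi/4$.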
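The assumption $\partial\overline{\boldsymbol{\mathcal{E}}}/\partial t = 0$ means that the Gaussian beam represents an infinitely
long pulse, so we should expect that it is not a normalizable solution. This is readily
verified by showing that the normalization integral over the transverse coordinates has
the $z$-independent value

\[
\int d^2r_{\top} \left| \overline{\boldsymbol{\mathcal{E}}}_0(\mathbf{r}_{\top}, z) \right|^{2} = \frac{\pi w_0^{2}}{2} \left| \mathcal{E}_0 \right|^{2} ,
\tag{7.56}
\]

so that the $z$-integral diverges. A more realistic description is based on the observation
that

\[
\overline{\boldsymbol{\mathcal{E}}}_P(\mathbf{r}, t) = F_P(z - ct)\, \overline{\boldsymbol{\mathcal{E}}}_0(\mathbf{r}_{\top}, z)
\tag{7.57}
\]

is a time-dependent solution of eqn (7.49) for any choice of the function $F_P(z)$.
If $F_P(z)$ is normalizable, then the Gaussian pulse (or Gaussian wave packet)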
$\overline{\boldsymbol{\mathcal{E}}}_P(\mathbf{r}, t)$ is normalizable at all times. The pulse-envelope function is frequently chosen
to be Gaussian also, i.e.

\[
F_P(z) = F_{P0} \exp\left[ -\frac{(z - z_0)^{2}}{L_P^{2}} \right] ,
\tag{7.58}
\]

where $L_P$ is the pulse length and $T_P = L_P / c$ is the pulse duration.
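The statement that the transverse normalization integral of the Gaussian beam does not
depend on $z$ can be checked by direct numerical integration. The sketch below is an
illustration only: it integrates the intensity profile implied by eqn (7.50) in polar
coordinates with a midpoint rule, and the function name `transverse_norm` and its default
parameters are assumptions made for the example.

```python
import math


def transverse_norm(z, w0=1.0, z_w=0.0, z_r=2.0, e0=1.0, steps=4000):
    """Integral of |E_0(r_perp, z)|**2 over the transverse plane (midpoint rule).

    The intensity of the Gaussian beam (7.50) is
    (e0 * w0 / w(z))**2 * exp(-2 * rho**2 / w(z)**2), with w(z) from eqn (7.51).
    """
    w = w0 * math.sqrt(1.0 + ((z - z_w) / z_r) ** 2)
    rho_max = 8.0 * w  # the integrand is negligible beyond a few widths
    d_rho = rho_max / steps
    total = 0.0
    for i in range(steps):
        rho = (i + 0.5) * d_rho
        intensity = (e0 * w0 / w) ** 2 * math.exp(-2.0 * rho ** 2 / w ** 2)
        total += 2.0 * math.pi * rho * intensity * d_rho
    return total


if __name__ == "__main__":
    for z in (0.0, 1.0, 5.0, 20.0):
        print(z, transverse_norm(z))
```

The printed values agree with one another to within the accuracy of the quadrature even
though the beam width changes substantially over the sampled range, so integrating over
$z$ as well cannot converge unless a normalizable pulse envelope $F_P$ is included.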

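The paraxial expansion*
The approach to the quantum paraxial approximation presented above is sufficient
for most practical purposes, but it does not provide any obvious way to calculate
corrections. A systematic expansion scheme is desirable for at least two reasons.
(1) It is not wise to depend on an approximation in the absence of any method for
estimating the errors involved.
(2) There are some questions of principle, e.g. the issue of photon localizability, which
require the evaluation of higher-order terms.
We will therefore very briefly outline a systematic expansion in powers of $\theta$ (Deutsch
and Garrison, 1991a) which is an extension of a method developed by Lax et al. (1974)
for the classical theory. In the interests of simplicity, only propagation in the vacuum
will be considered.
In order to construct a consistent expansion in powers of $\theta$, it is first necessary
to normalize all physical quantities by using the characteristic lengths introduced in
Section 7.2.1. The first step is to define a characteristic volume

\[
V_0 = \Lambda_{\top}^{2} \Lambda_{\parallel} = \theta^{-4} k_0^{-3} ,
\tag{7.59}
\]

and a dimensionless wavevector $\overline{\mathbf{q}} = \overline{\mathbf{q}}_{\top} + \overline{q}_z \widetilde{\mathbf{k}}_0$, with $\overline{\mathbf{q}}_{\top} = \mathbf{q}_{\top} \Lambda_{\top}$ and $\overline{q}_z = q_z \Lambda_{\parallel}$. In
terms of the scaled wavevector $\overline{\mathbf{q}}$, the paraxial constraints (7.8) are

\[
\overline{Q}_0 = \left\{ \overline{\mathbf{q}} \ \text{satisfying} \ |\overline{\mathbf{q}}_{\top}| \le 1 ,\ \overline{q}_z \le 1 \right\} .
\tag{7.60}
\]

The operators $a^{\dagger}_s(\mathbf{k})$ have dimensions $L^{3/2}$, so the dimensionless operators
$\overline{a}^{\dagger}_s(\overline{\mathbf{q}}) = V_0^{-1/2}\, a^{\dagger}_s(\mathbf{k}_0 + \mathbf{q})$ satisfy the commutation relation

\[
\left[ \overline{a}_s(\overline{\mathbf{q}}), \overline{a}^{\dagger}_{s'}(\overline{\mathbf{q}}') \right] = \delta_{ss'}\, (2\pi)^3\, \delta(\overline{\mathbf{q}} - \overline{\mathbf{q}}') .
\tag{7.61}
\]

In the space-time domain, the operator $\boldsymbol{\Phi}(\mathbf{r}, t)$ has dimensions $L^{-3/2}$, so it is
natural to define a dimensionless envelope field by $\overline{\boldsymbol{\Phi}}(\overline{\mathbf{r}}, \overline{t}) = V_0^{1/2}\, \boldsymbol{\Phi}(\mathbf{r}, t)$, where
$\overline{\mathbf{r}} = \overline{\mathbf{r}}_{\top} + \overline{z}\, \widetilde{\mathbf{k}}_0$ and $\overline{\mathbf{r}}_{\top} = \mathbf{r}_{\top}/\Lambda_{\top}$, $\overline{z} = z/\Lambda_{\parallel}$. The scaled position-space variables satisfy
$\overline{\mathbf{q}} \cdot \overline{\mathbf{r}} = \mathbf{q} \cdot \mathbf{r} = \overline{\mathbf{q}}_{\top} \cdot \overline{\mathbf{r}}_{\top} + \overline{q}_z \overline{z}$. The operator $\overline{\boldsymbol{\Phi}}(\overline{\mathbf{r}}, \overline{t})$ is related to $\overline{a}_s(\overline{\mathbf{q}})$ by

\[
\overline{\boldsymbol{\Phi}}(\overline{\mathbf{r}}) = \int_{\overline{Q}_0} \frac{d^3\overline{q}}{(2\pi)^3} \sum_s \overline{a}_s(\overline{\mathbf{q}})\,
  \mathbf{X}_s(\overline{\mathbf{q}}, \theta)\, e^{i\overline{\mathbf{q}} \cdot \overline{\mathbf{r}}} ,
\tag{7.62}
\]

where $\mathbf{X}_s(\overline{\mathbf{q}}, \theta)$ is the c-number function:

\[
\mathbf{X}_s(\overline{\mathbf{q}}, \theta) = \sqrt{\frac{k_0}{|\mathbf{k}_0 + \mathbf{q}|}}\; \mathbf{e}_s(\mathbf{k}_0 + \mathbf{q})
  = \sum_{n=0}^{\infty} \theta^{n}\, \mathbf{X}^{(n)}_s(\overline{\mathbf{q}}) .
\tag{7.63}
\]

Substituting this expansion into eqn (7.62) and exchanging the sum over $n$ with the
integral over $\overline{\mathbf{q}}$ yields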
Paraxial wave packets— ¾¾

θn ¦
¦ (r) = (r) , (7.64)

where the nth-order coe¬cient is
d3 q
as (q) X(n) (q) eiq·r .
¦ (r) = (7.65)
(2π) s

The zeroth-order relation
d3 q
as (q) es (k0 ) eiq·r
¦ (r) = (7.66)
(2π) s

agrees with the previous paraxial approximation (7.31), and it can be inverted to give
(r) · e— (k0 ) e’iq·r .
d3 r¦
as (q) = (7.67)

Carrying out Exercise 7.5 shows that all higher-order coe¬cients can be expressed in
terms of ¦0 (r).
We can justify the operator expansion (7.64) by calculating the action of the exact
envelope operator on a typical basis vector in H (k0 , θ), and showing that the expansion
of the resulting vector in θ agrees, order by order, with the result of applying the
operator expansion. In the same way it can be shown that the operator expansion
reproduces the exact commutation relations (Deutsch and Garrison, 1991a).

7.6 Paraxial wave packets*
The use of non-normalizable basis states to define the paraxial space can be avoided
by employing wave packet creation operators. For this purpose, we restrict the polarization
amplitudes, $w_s(\mathbf{k})$ (introduced in Section 3.5.1), to those that have the form
$w_s(\mathbf{k}_0 + \mathbf{q}) = V_0^{1/2} \bar{w}_s(\bar{\mathbf{q}})$. Instead of confining the relative wavevectors $\mathbf{q}$ to the region
$\bar{Q}_0$ described by eqn (7.60), we define a paraxial wave packet (with carrier
wavevector $\mathbf{k}_0$ and opening angle $\theta$) by the assumption that $\bar{w}_s(\bar{\mathbf{q}})$ vanishes rapidly
outside $\bar{Q}_0$, i.e. $\bar{w}_s(\bar{\mathbf{q}})$ belongs to the space
$$ P(\mathbf{k}_0,\theta) = \left\{ \bar{w}_s(\bar{\mathbf{q}}) \ \text{such that}\ \lim_{|\bar{\mathbf{q}}| \to \infty} |\bar{\mathbf{q}}|^n |\bar{w}_s(\bar{\mathbf{q}})| = 0 \ \text{for all}\ n \ge 0 \right\} . \tag{7.68} $$
The inner product for this space of classical wave packets is defined by
$$ (w,v) = \int \frac{d^3q}{(2\pi)^3} \sum_s w_s^*(\mathbf{k}_0 + \mathbf{q}) v_s(\mathbf{k}_0 + \mathbf{q}) . \tag{7.69} $$
Since the two wave packets belong to the same space, this can be written in terms of
scaled variables as
$$ (w,v) = \int \frac{d^3\bar{q}}{(2\pi)^3} \sum_s \bar{w}_s^*(\bar{\mathbf{q}}) \bar{v}_s(\bar{\mathbf{q}}) . \tag{7.70} $$

For a paraxial wave packet, we set $\mathbf{k} = \mathbf{k}_0 + \mathbf{q}$ in the general definition (3.191) to get
$$ a^\dagger[w] = \int \frac{d^3q}{(2\pi)^3} \sum_s a_s^\dagger(\mathbf{k}_0 + \mathbf{q}) w_s(\mathbf{k}_0 + \mathbf{q}) = \int \frac{d^3\bar{q}}{(2\pi)^3} \sum_s \bar{a}_s^\dagger(\bar{\mathbf{q}}) \bar{w}_s(\bar{\mathbf{q}}) . \tag{7.71} $$
The paraxial space defined by eqn (7.10) can equally well be built up from the vacuum
by forming all linear combinations of states of the form
$$ |\{w\}_P\rangle = \prod_{p=1}^{P} a^\dagger[w_p] |0\rangle , \tag{7.72} $$
where $\{w\}_P = \{w_1, \ldots, w_P\}$, $P = 0, 1, 2, \ldots$, and the $w_p$ range over all of $P(\mathbf{k}_0,\theta)$.
The only difference from the construction of the full Fock space is the restriction of the
wave packets to the paraxial space $P(\mathbf{k}_0,\theta) \subset \Gamma_{em}$, where $\Gamma_{em}$ is the electromagnetic
phase space of classical wave packets defined by eqn (3.189).
The multiparaxial Hilbert spaces introduced in Section 7.2.2 can also be described
in wave packet terms. The distinct paraxial beams considered there correspond to the
wave packet spaces $P(\mathbf{k}_1,\theta_1)$ and $P(\mathbf{k}_2,\theta_2)$. Paraxial wave packets, $w \in P(\mathbf{k}_1,\theta_1)$
and $v \in P(\mathbf{k}_2,\theta_2)$, are concentrated around $\mathbf{k}_1$ and $\mathbf{k}_2$ respectively, so it is eminently
plausible that $w$ and $v$ are effectively orthogonal. More precisely, it is shown in Exercise
7.6 that
$$ \lim_{\theta_2 \to 0} \frac{|(w,v)|}{(\theta_2)^n} = 0 \ \text{for all}\ n \ge 1 , \tag{7.73} $$
i.e. $|(w,v)|$ vanishes faster than any power of $\theta_2$. The symmetry of the inner product
guarantees that the same conclusion holds for $\theta_1$; consequently, the wave packet spaces
$P(\mathbf{k}_1,\theta_1)$ and $P(\mathbf{k}_2,\theta_2)$ can be treated as orthogonal to any finite order in $\theta_1$ or $\theta_2$.
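The faster-than-any-power behavior can be made concrete with a deliberately simplified toy model; it is a sketch under stated assumptions, not the calculation of Exercise 7.6. The assumptions are: one-dimensional Gaussian wave packets, both with the same width $\theta k_0$ (that is, $\theta_1 = \theta_2 = \theta$), and a fixed carrier separation equal to $k_0$. For this model the overlap is $\exp[-1/(4\theta^2)]$, which the code estimates by direct quadrature. All names are hypothetical.

```python
import math

def gaussian_packet(k, center, sigma):
    """Normalized 1D Gaussian amplitude: the integral of |w|^2 dk equals 1."""
    return (math.pi * sigma**2) ** -0.25 * math.exp(-(k - center)**2 / (2.0 * sigma**2))

def overlap(k1, k2, sigma1, sigma2, n=20001):
    """Trapezoid-rule estimate of (w, v) = integral of w(k) v(k) dk."""
    pad = 10.0 * max(sigma1, sigma2)
    lo, hi = min(k1, k2) - pad, max(k1, k2) + pad
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        k = lo + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * gaussian_packet(k, k1, sigma1) * gaussian_packet(k, k2, sigma2)
    return total * h

k0 = 1.0
delta_k = 1.0  # fixed carrier separation in units of k0 (illustrative)
for theta in (0.2, 0.1, 0.05):
    sigma = theta * k0
    ov = overlap(0.0, delta_k, sigma, sigma)
    # Even after dividing by theta^6 the overlap still collapses as theta -> 0.
    print(theta, ov, ov / theta**6)
```

Replacing the exponent 6 by any other fixed power gives the same qualitative picture, because the Gaussian overlap is exponentially small in $1/\theta^2$.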
The approximate orthogonality of the wave packets $w$ and $v$ combined with the
general rule (3.192) implies
$$ \left[ a[w] , a^\dagger[v] \right] = 0 \tag{7.74} $$
whenever $w$ and $v$ belong to distinct paraxial wave packet spaces. From this it is easy
to see that the quantum paraxial spaces $H(\mathbf{k}_1,\theta_1)$ and $H(\mathbf{k}_2,\theta_2)$ are orthogonal to any
finite order in the small parameters $\theta_1$ and $\theta_2$. In the paraxial approximation, distinct
paraxial wave packets behave as though they were truly orthogonal modes. This means
that the multiparaxial Hilbert space describing the situation in which several distinct
paraxial beams are present is generated from the vacuum by generalizing eqn (7.72) to
$$ |\{w_1\}_{P_1}, \{w_2\}_{P_2}, \ldots\rangle = \prod_{\beta} \prod_{p=1}^{P_\beta} a^\dagger[w_{\beta p}] |0\rangle , \tag{7.75} $$
where $P_\beta = 0, 1, \ldots$, and the $w_{\beta p}$ are chosen from $P(\mathbf{k}_\beta,\theta_\beta)$.

7.7 Angular momentum*
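The derivation of the paraxial approximation for the angular momentum $\mathbf{J} = \mathbf{L} + \mathbf{S}$
is complicated by the fact (discussed in Section 3.4) that the operator $\mathbf{L}$ does not
have a convenient expression in terms of plane waves. Fortunately, the argument used
to show that the energy and the linear momentum are additive also applies to the
angular momentum; therefore, we can restrict attention to a single paraxial space. Let
us begin by rewriting the expression (3.58) for the helicity operator $\mathbf{S}$ as
$$ \mathbf{S} = \hbar \int \frac{d^3\bar{q}}{(2\pi)^3} \frac{\tilde{\mathbf{k}}_0 + \mathbf{q}/k_0}{|\tilde{\mathbf{k}}_0 + \mathbf{q}/k_0|} \left[ \bar{a}_+^\dagger(\bar{\mathbf{q}}) \bar{a}_+(\bar{\mathbf{q}}) - \bar{a}_-^\dagger(\bar{\mathbf{q}}) \bar{a}_-(\bar{\mathbf{q}}) \right] , \tag{7.76} $$
where $\tilde{\mathbf{k}}_0 = \mathbf{k}_0/k_0$ is the unit vector along the carrier wavevector. The ratio $\mathbf{q}/k_0$ can be expressed as
$$ \frac{\mathbf{q}}{k_0} = \frac{\Lambda_\perp \mathbf{q}_\perp}{k_0 \Lambda_\perp} + \frac{\Lambda_\parallel q_z}{k_0 \Lambda_\parallel} \tilde{\mathbf{k}}_0 = \theta \bar{\mathbf{q}}_\perp + \theta^2 \bar{q}_z \tilde{\mathbf{k}}_0 , \tag{7.77} $$
so expanding in powers of $\theta$ gives the simple result
$$ \mathbf{S} = \tilde{\mathbf{k}}_0 S_0 + O(\theta) , \tag{7.78} $$
where
$$ S_0 = \hbar \int \frac{d^3\bar{q}}{(2\pi)^3} \left[ \bar{a}_+^\dagger(\bar{\mathbf{q}}) \bar{a}_+(\bar{\mathbf{q}}) - \bar{a}_-^\dagger(\bar{\mathbf{q}}) \bar{a}_-(\bar{\mathbf{q}}) \right] = \hbar \int d^3\bar{r} \left[ \bar{\phi}_+^\dagger(\bar{\mathbf{r}}) \bar{\phi}_+(\bar{\mathbf{r}}) - \bar{\phi}_-^\dagger(\bar{\mathbf{r}}) \bar{\phi}_-(\bar{\mathbf{r}}) \right] . \tag{7.79} $$
Thus, to lowest order, the helicity has only a longitudinal component; the leading
transverse component is $O(\theta)$. This is the natural consequence of the fact that each
photon has a wavevector close to $\mathbf{k}_0$.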
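To develop the approximation for $\mathbf{L}$ we substitute the paraxial representation (7.24)
and the corresponding expression (7.48) for $\mathbf{E}^{(+)}(\mathbf{r},t)$ into eqn (3.57) to get
$$ \begin{aligned} \mathbf{L}_0 &= 2i\epsilon_0 \int d^3r\, E_j^{(-)} \left( \mathbf{r} \times \frac{1}{i}\nabla \right) A_j^{(+)} \\ &= \hbar \int d^3r\, \Phi_j^\dagger(\mathbf{r},t) e^{-i\mathbf{k}_0 \cdot \mathbf{r}} \left( \mathbf{r} \times \frac{1}{i}\nabla \right) \Phi_j(\mathbf{r},t) e^{i\mathbf{k}_0 \cdot \mathbf{r}} \\ &= \int d^3r\, \Phi_j^\dagger(\mathbf{r},t) \left[ \hbar\, \mathbf{r} \times \mathbf{k}_0 + \mathbf{r} \times \frac{\hbar}{i}\nabla \right] \Phi_j(\mathbf{r},t) , \end{aligned} \tag{7.80} $$
where the last line follows from the identity
$$ e^{-i\mathbf{k}_0 \cdot \mathbf{r}} \nabla e^{i\mathbf{k}_0 \cdot \mathbf{r}} \Phi_j(\mathbf{r},t) = (\nabla + i\mathbf{k}_0) \Phi_j(\mathbf{r},t) . \tag{7.81} $$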
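The carrier-stripping identity (7.81) is just the product rule, and a one-dimensional finite-difference check makes that explicit. This is a minimal sketch: the Gaussian envelope, the step size, and the function names are hypothetical choices, and the derivative is replaced by a central difference.

```python
import cmath
import math

def envelope(x):
    """Hypothetical smooth envelope standing in for one component of the field."""
    return math.exp(-x**2)

def lhs(x, k0, h=1e-5):
    """exp(-i k0 x) d/dx [exp(i k0 x) f(x)], by central difference."""
    g = lambda y: cmath.exp(1j * k0 * y) * envelope(y)
    return cmath.exp(-1j * k0 * x) * (g(x + h) - g(x - h)) / (2.0 * h)

def rhs(x, k0, h=1e-5):
    """(d/dx + i k0) f(x), with the derivative taken by central difference."""
    df = (envelope(x + h) - envelope(x - h)) / (2.0 * h)
    return df + 1j * k0 * envelope(x)

x, k0 = 0.3, 5.0
print(abs(lhs(x, k0) - rhs(x, k0)))  # small: limited only by the finite-difference error
```

The same cancellation of the carrier phase is what turns the gradient acting on the full field into the combination $\nabla + i\mathbf{k}_0$ acting on the envelope.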

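This remaining gradient term can be written as
$$ \begin{aligned} \mathbf{r} \times \frac{\hbar}{i}\nabla &= \mathbf{r} \times \frac{\hbar}{i} \left( \tilde{\mathbf{k}}_0 \nabla_z + \nabla_\perp \right) \\ &= \mathbf{r}_\perp \times \tilde{\mathbf{k}}_0 \frac{\hbar}{i}\nabla_z + z \tilde{\mathbf{k}}_0 \times \frac{\hbar}{i}\nabla_\perp + \mathbf{r}_\perp \times \frac{\hbar}{i}\nabla_\perp , \end{aligned} \tag{7.82} $$
so that
$$ \mathbf{L}_0 = \mathbf{L}_{0\perp} + \tilde{\mathbf{k}}_0 L_{0z} , \tag{7.83} $$
where the transverse part is given by
$$ \mathbf{L}_{0\perp} = \int d^3r\, \Phi_j^\dagger(\mathbf{r}) \left[ \hbar\, \mathbf{r} \times \mathbf{k}_0 + \mathbf{r}_\perp \times \tilde{\mathbf{k}}_0 \frac{\hbar}{i}\nabla_z + z \tilde{\mathbf{k}}_0 \times \frac{\hbar}{i}\nabla_\perp \right] \Phi_j(\mathbf{r}) , \tag{7.84} $$
and the longitudinal component is
$$ L_{0z} = \int d^3r\, \Phi_j^\dagger(\mathbf{r}) \left[ r_1 \frac{\hbar}{i}\nabla_2 - r_2 \frac{\hbar}{i}\nabla_1 \right] \Phi_j(\mathbf{r}) . \tag{7.85} $$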

The transverse part $\mathbf{L}_{0\perp}$ is dominated by the term proportional to $k_0$. After
expressing the integral in terms of the scaled variable $\bar{\mathbf{r}}$ and scaled field $\bar{\boldsymbol{\Phi}}$, one finds
that $\mathbf{L}_{0\perp} = O(1/\theta)$. The similar terms $\hbar\omega_0 N_0$ and $\hbar\mathbf{k}_0 N_0$ in the energy and momentum
are $O(1/\theta^2)$, so they are even larger. This apparently singular behavior is physically
harmless; it simply represents the fact that all photons in the wave packet have energies
close to $\hbar\omega_0$ and momenta close to $\hbar\mathbf{k}_0$.
For the angular momentum the situation is different. The angular momenta of individual
photons in plane-wave modes $\mathbf{k}_0 + \mathbf{q}$ must exhibit large fluctuations due to the
tight constraints on the polar angle $\vartheta_k$ given by eqn (7.4). These fluctuations are not
conjugate to the longitudinal component $J_{0z}$, since rotations around the z-axis leave
$\vartheta_k$ unchanged. On the other hand, the transverse components $\mathbf{L}_{0\perp}$ generate rotations
around the transverse axes which do change the value of $\vartheta_k$. Thus we should expect
large fluctuations in the transverse components of the angular momentum, which are
described by the large transverse term $\mathbf{L}_{0\perp}$. Consequently, only the longitudinal component $L_{0z}$
is meaningful for a paraxial state. By combining eqns (7.85) and (7.79), we see that
the lowest-order paraxial angular momentum operator is purely longitudinal,
$$ \mathbf{J}_0 = \tilde{\mathbf{k}}_0 \left[ L_{0z} + S_0 \right] . \tag{7.86} $$

7.8 Approximate photon localizability*
Mandel's local number operator, defined by eqn (3.204), displays peculiar nonlocal
properties. Despite this apparent flaw, Mandel was able to demonstrate that $N(V)$
behaves approximately like a local number operator in the limit $V \gg \lambda_0^3$, where $\lambda_0$
is the characteristic wavelength for a monochromatic field state. The important role
played by this limit suggests using the paraxial expansion to investigate the alternative
definitions of the local number operator in a systematic way. To this end we first
introduce a scaled version of the Mandel detection operator by
$$ \mathbf{M}(\mathbf{r}) = \frac{1}{\sqrt{V_0}} \bar{\mathbf{M}}(\bar{\mathbf{r}}) e^{ik_0 z} . \tag{7.87} $$

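By combining the definition (3.203) with the expansion (7.64), the identity (7.81), and
the scaled gradient
$$ \frac{\nabla}{k_0} = \frac{1}{k_0}\nabla_\perp + \mathbf{u}_3 \frac{1}{k_0}\frac{\partial}{\partial z} = \theta \bar{\nabla}_\perp + \theta^2 \mathbf{u}_3 \bar{\nabla}_z , \tag{7.88} $$
one finds
$$ \bar{\mathbf{M}} = \bar{\mathbf{M}}^{(0)} + \theta \bar{\mathbf{M}}^{(1)} + \theta^2 \bar{\mathbf{M}}^{(2)} + O(\theta^3) , \tag{7.89} $$
where $\bar{\mathbf{M}}^{(0)} = \bar{\boldsymbol{\Phi}}^{(0)}$, $\bar{\mathbf{M}}^{(1)} = \bar{\boldsymbol{\Phi}}^{(1)}$, and
$$ \bar{\mathbf{M}}^{(2)} = \bar{\boldsymbol{\Phi}}^{(2)} - \frac{1}{4} \left( \bar{\nabla}_\perp^2 + 2i\bar{\nabla}_z \right) \bar{\boldsymbol{\Phi}}^{(0)} . \tag{7.90} $$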
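The corresponding expansion for $N(V)$ is
$$ N(V) = N^{(0)}(V) + \theta^2 N^{(2)}(V) + O(\theta^4) , \tag{7.91} $$
where
$$ \begin{aligned} N^{(0)}(V) &= \int_V d^3\bar{r}\, \bar{\boldsymbol{\Phi}}^{(0)\dagger}(\bar{\mathbf{r}}) \cdot \bar{\boldsymbol{\Phi}}^{(0)}(\bar{\mathbf{r}}) , \\ N^{(2)}(V) &= \int_V d^3\bar{r} \left\{ \bar{\mathbf{M}}^{(1)\dagger} \cdot \bar{\mathbf{M}}^{(1)} + \left( \bar{\mathbf{M}}^{(0)\dagger} \cdot \bar{\mathbf{M}}^{(2)} + \mathrm{HC} \right) \right\} . \end{aligned} \tag{7.92} $$
A simple calculation using the local commutation relations (7.34) for the zeroth-order
envelope field yields
$$ \left[ N^{(0)}(V) , N^{(0)}(V') \right] = 0 \tag{7.93} $$
for nonoverlapping volumes, and
$$ \left[ N^{(0)}(V) , \bar{\boldsymbol{\Phi}}^{(0)\dagger}(\bar{\mathbf{r}}) \right] = \chi_V(\bar{\mathbf{r}}) \bar{\boldsymbol{\Phi}}^{(0)\dagger}(\bar{\mathbf{r}}) , \tag{7.94} $$
where the characteristic function $\chi_V(\mathbf{r})$ is defined by
$$ \chi_V(\mathbf{r}) = \begin{cases} 1 & \text{for } \mathbf{r} \in V , \\ 0 & \text{for } \mathbf{r} \notin V . \end{cases} \tag{7.95} $$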

Thus $N^{(0)}(V)$ acts like a genuine local number operator. The nonlocal features discussed
in Section 3.6.2 will only appear in the higher-order terms. It is, however,
important to remember that the delta function in the zeroth-order commutation relation
(7.34) is really coarse-grained with respect to the carrier wavelength $\lambda_0$. For this
reason the localization volume $V$ must satisfy $V \gg \lambda_0^3$.
The paraxial expansion of the alternative operator $G(V)$, introduced in eqn (3.210),
shows (Deutsch and Garrison, 1991a) that the two definitions agree in lowest order,
$G^{(0)}(V) = N^{(0)}(V)$, but disagree in second order, $G^{(2)}(V) \neq N^{(2)}(V)$. This disagreement
between equally plausible definitions for the local photon number operator is a
consequence of the fact that a photon with wavelength $\lambda_0$ cannot be localized to a
volume of order $\lambda_0^3$. Since most experiments are well described by the paraxial approximation,
it is usually permissible to think of the photons as localized, provided that
the diameter of the localization region is larger than a wavelength.
The negative frequency part $A_i^{(-)}(\mathbf{r})$ is a sum over creation operators, so it is
tempting to interpret $A_i^{(-)}(\mathbf{r})$ as creating a photon at the point $\mathbf{r}$. In view of the
impossibility of localizing photons, this temptation must be sternly resisted. On the
other hand, the cavity operator $a_\kappa^\dagger$ can be interpreted as creating a photon described by
the cavity mode $\boldsymbol{\mathcal{E}}_\kappa(\mathbf{r})$, since the mode function extends over the entire cavity. In the
same way, the plane-wave operator $a_{\mathbf{k}s}^\dagger$ can be interpreted as creating a photon in the
(box-normalized) plane-wave state with wavenumber $\mathbf{k}$ and polarization $\mathbf{e}_{\mathbf{k}s}$. Finally,
the wave packet operator $a^\dagger[w]$ can be interpreted as creating a photon described by
the classical wave packet $w$, but it would be wrong to think of the photon as strictly
localized in the region where $w(\mathbf{r})$ is large. With this caution in mind, one can regard
the pulse-envelope $w(\mathbf{r})$ as an effective photon wave function, provided that the pulse
duration contains many optical periods and the transverse profile is large compared
to a wavelength.
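There are other aspects of the averaged operators that also require some caution.
For a normalized wave packet, $(w,w) = 1$, the operator $N[w] = a^\dagger[w] a[w]$ satisfies
$$ \left[ N[w] , a^\dagger[w] \right] = a^\dagger[w] , \quad \left[ N[w] , a[w] \right] = -a[w] , \tag{7.96} $$
so it serves as a number operator for w-photons, but these number operators are not
mutually commutative, since the rule $[a[w], a^\dagger[u]] = (w,u)$ gives
$$ \left[ N[w] , N[u] \right] = (w,u)\, a^\dagger[w] a[u] - (u,w)\, a^\dagger[u] a[w] . \tag{7.97} $$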
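The commutator of two wave packet number operators can be checked in a finite-mode toy model. This is a sketch under stated assumptions: wave packets are vectors of amplitudes over a few orthonormal modes, with $a^\dagger[w] = \sum_i w_i a_i^\dagger$ and $[a[w], a^\dagger[u]] = (w,u) = \sum_i w_i^* u_i$. Since the bilinears $a_i^\dagger a_j$ obey the same commutation relations as the matrix units $E_{ij}$, the operator $a^\dagger[w] a[u]$ can be represented by the matrix with entries $w_i u_j^*$. The code then compares $[N[w], N[u]]$ with $(w,u)\, a^\dagger[w] a[u] - (u,w)\, a^\dagger[u] a[w]$, the form that follows from that commutation rule. All function names and sample amplitudes are hypothetical.

```python
def inner(w, u):
    """Inner product (w, u) = sum_i conj(w_i) u_i over a finite set of modes."""
    return sum(wi.conjugate() * ui for wi, ui in zip(w, u))

def bilinear(w, u):
    """Matrix standing in for a^dagger[w] a[u]: entries w_i * conj(u_j)."""
    return [[wi * uj.conjugate() for uj in u] for wi in w]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(ab, ba)]

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def number_commutator_rhs(w, u):
    """(w,u) a^dagger[w] a[u] - (u,w) a^dagger[u] a[w] in the matrix representation."""
    wu, uw = bilinear(w, u), bilinear(u, w)
    c_wu, c_uw = inner(w, u), inner(u, w)
    return [[c_wu * x - c_uw * y for x, y in zip(rx, ry)] for rx, ry in zip(wu, uw)]

# Two normalized, non-orthogonal wave packets on three modes (illustrative values).
w = [0.6 + 0j, 0.8j, 0j]
u = [0.6 + 0j, 0j, 0.8 + 0j]
lhs = commutator(bilinear(w, w), bilinear(u, u))
print(max_abs_diff(lhs, number_commutator_rhs(w, u)))  # agreement up to rounding
```

The commutator is proportional to the overlap, so it vanishes when the two wave packets are orthogonal; only then can the two kinds of photons be counted independently.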

Thus distinct w photons and u photons cannot be independently counted unless the
classical wave packets w and u are orthogonal. This lack of commutativity can be
important in situations that require the use of non-orthogonal modes (Deutsch et al.,

7.9 Exercises
7.1 Frequency spread for a paraxial beam
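(1) Show that the fractional change in the index of refraction across a paraxial beam is
$$ \frac{\Delta n}{n_0} = \frac{\Delta k}{k_0}\, \frac{\dfrac{\omega_0}{n_0} \left( \dfrac{dn}{d\omega} \right)_0}{1 + \dfrac{\omega_0}{n_0} \left( \dfrac{dn}{d\omega} \right)_0} , $$
where $n_0 = n(\omega_0) = \sqrt{\epsilon(\omega_0)/\epsilon_0}$ and $(dn/d\omega)_0$ is evaluated at the carrier frequency.
(2) Combine the relation $k = \sqrt{k_0^2 + |\mathbf{q}_\perp|^2 + q_z^2}$ with eqns (7.5) and (7.7) to get
$$ \frac{\Delta k}{k_0} = \frac{1}{2} \left( \frac{\Delta q_\perp}{k_0} \right)^2 + O(\theta^4) = \frac{1}{2}\theta^2 + \cdots . $$
(3) Combine this with $\Delta\omega = v_{g0} \Delta k$ to find
$$ \frac{\Delta\omega}{\omega_0} = \frac{n_0 v_{g0}}{c k_0}\, \frac{1}{2} k_0 \theta^2 = \frac{n_0}{n_0 + \omega_0 \left( \dfrac{dn}{d\omega} \right)_0}\, \frac{1}{2}\theta^2 < \frac{1}{2}\theta^2 . $$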

7.2 Distinct paraxial Hilbert spaces are effectively orthogonal
Consider the paraxial subspaces $H(\mathbf{k}_1,\theta_1)$ and $H(\mathbf{k}_2,\theta_2)$ discussed in Section 7.2.2.
(1) For a typical basis vector $|\{\mathbf{q}s\}_\kappa\rangle$ in $H(\mathbf{k}_1,\theta_1)$ show that $a_s(\mathbf{k}) |\{\mathbf{q}s\}_\kappa\rangle \approx 0$ whenever
$|\mathbf{k} - \mathbf{k}_1| \gg \theta_1 |\mathbf{k}_1|$.
(2) Use this result to argue that each basis vector in $H(\mathbf{k}_2,\theta_2)$ is approximately orthogonal
to every basis vector in $H(\mathbf{k}_1,\theta_1)$.

7.3 Distinct paraxial fields are independent
Combine the definition (7.46) with the definition (7.14) for distinct beams to show that
eqn (7.47) is satisfied in the same sense that distinct paraxial spaces are orthogonal.

7.4 An analogy to many-body physics*
Consider a special paraxial state such that the z-dependence of the field $\phi_s(\mathbf{r})$ can
be neglected and only one polarization is excited, so that $\phi_s(\mathbf{r}) \to \phi(\mathbf{r}_\perp)$. Define an
effective photon mass $M_0$ such that the paraxial Hamiltonian $H_P$ for this problem
is formally identical to a second quantized description of a two-dimensional, nonrelativistic,
many-particle system of bosons with mass $M_0$ (Huang, 1963, Appendix A.3;
Feynman, 1972). This feature leads to interesting analogies between quantum optics
and many-body physics (Chiao et al., 1991; Deutsch et al., 1992; Wright et al., 1994).

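7.5 Paraxial expansion*
(1) Expand $\mathbf{X}_s(\bar{\mathbf{q}},\theta)$ through $O(\theta^2)$.
(2) Show that $\bar{\boldsymbol{\Phi}}^{(1)}(\bar{\mathbf{r}}) = i \tilde{\mathbf{k}}_0\, \bar{\nabla}_\perp \cdot \bar{\boldsymbol{\Phi}}^{(0)}$.
(3) Show that $\bar{\boldsymbol{\Phi}}^{(2)}(\bar{\mathbf{r}}) = \frac{1}{2} \bar{\nabla}_\perp \left( \bar{\nabla}_\perp \cdot \bar{\boldsymbol{\Phi}}^{(0)} \right) + \frac{1}{4} \left( \bar{\nabla}_\perp^2 + 2i\bar{\nabla}_z \right) \bar{\boldsymbol{\Phi}}^{(0)}$.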

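7.6 Distinct paraxial wave packet spaces are effectively orthogonal*
Consider two paraxial wave packets, $w \in P(\mathbf{k}_1,\theta_1)$ and $v \in P(\mathbf{k}_2,\theta_2)$, where $\mathbf{k}_1$ and
$\mathbf{k}_2$ satisfy eqn (7.14).
(1) Apply the definitions of $\bar{\mathbf{q}}$ (Section 7.5) and $\bar{w}(\bar{\mathbf{q}})$ (Section 7.6) to show that
$$ (w,v) = \sqrt{\frac{V_2}{V_1}} \int \frac{d^3\bar{q}}{(2\pi)^3} \sum_s \bar{w}_s^*(\bar{\mathbf{q}})\, \bar{v}_s(\bar{\mathbf{q}} + \Delta\mathbf{k}) , $$
where $\Delta\mathbf{k} = \mathbf{k}_1 - \mathbf{k}_2$ and the arguments of $\bar{w}_s^*$ and $\bar{v}_s$ are scaled with $\theta_1$ and $\theta_2$, respectively.
(2) Calculate $\Delta\mathbf{k}$, explain why $|\Delta\mathbf{k}| \gg |\bar{\mathbf{q}}|$, and combine this with the rapid fall off
condition in eqn (7.68) to conclude that $\theta_2^{-n} (w,v) \to 0$ as $\theta_2 \to 0$ for any value of $n$.
(3) Show that $\theta_2^{-n} \left[ a[w] , a^\dagger[v] \right] \to 0$ as $\theta_2 \to 0$.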
Linear optical devices

The manipulation of light beams by passive linear devices, such as lenses, mirrors,
stops, and beam splitters, is the backbone of experimental optics. In typical arrange-
ments the individual devices are separated by regions called propagation segments
in which the light propagates through air or vacuum. The index of refraction is usually
piece-wise constant, i.e. it is uniform in each device and in each propagation segment.
In most arrangements each device or propagation segment has an axis of symmetry
(the optic axis), and the angle between the rays composing the beam and the local
optic axis is usually small. The light beams are then said to be piece-wise paraxial.
Under these circumstances, it is useful to treat the interaction of a light beam with a
single device as a scattering problem in which the incident and scattered fields both
propagate in vacuum. The optical properties of the device determine a linear relation
between the complex amplitudes of the incident and scattered classical waves. After
a brief review of this classical approach, we will present a phenomenological description
of quantized electromagnetic fields interacting with linear optical devices. This
approach will show that, at the quantum level, linear optical effects can be viewed, in
a qualitative sense, as the propagation of photons guided by classical scattered waves.
The scattered waves are a rough analogue of wave functions for particles, so the associated
classical rays may be loosely considered as photon trajectories. These classical
analogies are useful for visualizing the interaction of photons with linear optical devices
but, as is always the case with applications of quantum theory, they must be
used with care. A more precise wave-function-like description of quantum propagation
through optical systems is given in Section 6.6.2.

8.1 Classical scattering
The general setting for this discussion is a situation in which one or more paraxial
beams interact with an optical device to produce several scattered paraxial beams.
Both the incident and the scattered beams are assumed to be mutually distinct, in
the sense defined by eqn (7.14). Under these circumstances, the paraxial beams will
be called scattering channels; the incident classical fields are input channels and
the scattered beams are output channels. Since this process is linear in the fields, the
initial and final beams can be resolved into plane waves. The conventional classical
description of propagation through optical elements pieces together plane-wave solutions
of Maxwell's equations by applying the appropriate boundary conditions at the interfaces
between media with different indices of refraction, as shown in Fig. 8.1(a). This
procedure yields a linear relation between the Fourier coefficients of the incident and
scattered waves that is similar to the description of scattering in terms of stationary
states in quantum theory (Bransden and Joachain, 1989, Chap. 4). From the viewpoint
of scattering theory, the classical piecing procedure is simply a way to construct the
scattering matrix relating the incident and scattered fields. Before considering the
general case, we analyze two simple examples: a propagation segment and a thin slab
of dielectric.

Fig. 8.1 (a) A plane wave $\alpha_{\mathbf{k}_I} \exp(i\mathbf{k}_I \cdot \mathbf{r})$ incident on a dielectric slab. The reflected
and transmitted waves are respectively $\alpha_{\mathbf{k}_R} \exp(i\mathbf{k}_R \cdot \mathbf{r})$ and $\alpha_{\mathbf{k}_T} \exp(i\mathbf{k}_T \cdot \mathbf{r})$.
(b) The time-reversed version of (a). The extra wave at $-\mathbf{k}_{TR}$ is discussed in the text.
For the propagation segment, an incident plane wave $\alpha \exp(i\mathbf{k} \cdot \mathbf{r})$ (the input
channel) simply acquires the phase $kL$, where $L$ is the length of the segment along
the propagation direction, i.e. the relation between the incident amplitude $\alpha$ and the
scattered amplitude $\alpha'$ (representing the output channel) is
$$ \alpha' = e^{ikL} \alpha = e^{i\omega L/c} \alpha . \tag{8.1} $$
In some applications the propagation segment through vacuum is replaced by a length
$L$ of dielectric. If the end faces of the dielectric sample are antireflection coated, then
the scattering relation is
$$ \alpha' = e^{ik(\omega)L} \alpha = e^{in(\omega)\omega L/c} \alpha , \tag{8.2} $$
where $n(\omega)$ is the index of refraction for the dielectric. Since the transmitted wave
can be expressed as
$$ \alpha' e^{i(kz - \omega t)} = \alpha e^{i[kz - \omega(t - \Delta t)]} , \tag{8.3} $$
where $\Delta t = n(\omega) L/c$, the dielectric medium is called a retarder plate, or sometimes
a phase shifter.
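The two scattering relations and the retardation time can be tried out numerically. This is a minimal sketch: the function names, the 632.8 nm wavelength, the 1 mm length, and the index 1.5 are hypothetical values chosen only for illustration, and dispersion is ignored by treating the index as a fixed number.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in vacuum, m/s

def propagation_factor(omega, length, n=1.0):
    """Ratio alpha'/alpha = exp(i n omega L / c); n = 1 gives the vacuum segment."""
    return cmath.exp(1j * n * omega * length / C)

def retardation_time(length, n):
    """Delay Delta t = n L / c associated with a length L of index n."""
    return n * length / C

omega = 2.0 * math.pi * C / 632.8e-9   # illustrative optical frequency
length, n = 1.0e-3, 1.5                # 1 mm of glass-like dielectric
alpha = 1.0 + 0j
alpha_out = propagation_factor(omega, length, n) * alpha

# The transmitted wave equals the incident wave evaluated at the retarded time.
k, z, t = omega / C, 1.0e-6, 1.0e-15
wave_out = alpha_out * cmath.exp(1j * (k * z - omega * t))
wave_delayed = alpha * cmath.exp(1j * (k * z - omega * (t - retardation_time(length, n))))
print(abs(alpha_out), abs(wave_out - wave_delayed))
```

The modulus of the amplitude is unchanged, so the device only shifts the phase, which is why the names retarder plate and phase shifter are both used.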
We next turn to the example of a plane wave incident on a thin dielectric slab
(which is not antireflection coated), as shown in Fig. 8.1(a). Ordinary ray tracing,
using Snell's law and the law of reflection at each interface between the dielectric
and vacuum, determines the directions of the propagation vectors $\mathbf{k}_R$ and $\mathbf{k}_T$ (where
R and T stand for the reflected and transmitted waves respectively) relative to the
propagation vector $\mathbf{k}_I$ of the incoming wave. Since the transmitted wave crosses the
dielectric-vacuum interface twice, we find the familiar result $\mathbf{k}_T = \mathbf{k}_I$, i.e. the incident
and transmitted waves are described by the same spatial mode.
The plane of incidence is defined by the vectors $\mathbf{k}_I$ and $\mathbf{n}$, where $\mathbf{n}$ is the unit vector
normal to the slab. Every incident electromagnetic plane wave can be resolved into two
polarization components: the TE- (or S-) polarization, with electric vector perpendicular
to the plane of incidence, and the TM- (or P-) polarization, with electric vector
in the plane of incidence. For optically isotropic dielectrics, these two polarizations
are preserved by reflection and refraction. Since scattering is a linear process, we lose
nothing by assuming that the incident wave is either TE- or TM-polarized. This allows
us to simplify the vector problem to a scalar problem by suppressing the polarization
vectors. The three waves outside the slab are then $\alpha_{\mathbf{k}_I} \exp(i\mathbf{k}_I \cdot \mathbf{r})$, $\alpha_{\mathbf{k}_R} \exp(i\mathbf{k}_R \cdot \mathbf{r})$,
and $\alpha_{\mathbf{k}_T} \exp(i\mathbf{k}_T \cdot \mathbf{r})$. The solution of Maxwell's equations inside the slab is a linear
combination of the transmitted wave at the first interface and the reflected wave from
the second interface. Applying the boundary conditions at each interface (Jackson,
1999, Sec. 7.3) yields a set of equations relating the coefficients, and eliminating the
coefficients for the interior solution leads to
$$ \alpha_{\mathbf{k}_R} = r\, \alpha_{\mathbf{k}_I} , \quad \alpha_{\mathbf{k}_T} = t\, \alpha_{\mathbf{k}_I} , \tag{8.4} $$
where the complex parameters $r$ and $t$ are respectively the amplitude reflection and
transmission coefficients for the slab. This is the simplest example of the general piecing
procedure discussed above.
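Important constraints on the coefficients $r$ and $t$ follow from the time-reversal
invariance of Maxwell's equations. What this means is that the time-reversed final
field will evolve into the time-reversed initial field. This situation is shown in Fig.
8.1(b), where the incident waves have propagation vectors $-\mathbf{k}_R$ and $-\mathbf{k}_T$ and the
scattered waves have $-\mathbf{k}_I$ and $-\mathbf{k}_{TR}$. The amplitudes for this case are written as $\alpha^T_{\mathbf{q}}$,
where T stands for time reversal. The usual calculation gives the scattered waves as
$$ \begin{aligned} \alpha^T_{-\mathbf{k}_I} &= r\, \alpha_{-\mathbf{k}_R} + t\, \alpha_{-\mathbf{k}_T} , \\ \alpha^T_{-\mathbf{k}_{TR}} &= t\, \alpha_{-\mathbf{k}_R} + r\, \alpha_{-\mathbf{k}_T} . \end{aligned} \tag{8.5} $$
In Appendix B.3.3 it is shown that the linear polarization basis can be chosen so that
the time-reversed amplitudes are related to the original amplitudes by eqn (B.80). In
the present case, this yields $\alpha^T_{-\mathbf{k}_I} = \alpha^*_{\mathbf{k}_I}$, $\alpha_{-\mathbf{k}_R} = \alpha^*_{\mathbf{k}_R}$, $\alpha_{-\mathbf{k}_T} = \alpha^*_{\mathbf{k}_T}$, and
$\alpha^T_{-\mathbf{k}_{TR}} = \alpha^*_{\mathbf{k}_{TR}}$. Substituting these relations into eqn (8.5) and taking the complex conjugate
gives a second set of relations between the amplitudes $\alpha_{\mathbf{k}_I}$, $\alpha_{\mathbf{k}_R}$, and $\alpha_{\mathbf{k}_T}$:
$$ \begin{aligned} \alpha_{\mathbf{k}_I} &= r^* \alpha_{\mathbf{k}_R} + t^* \alpha_{\mathbf{k}_T} , \\ \alpha_{\mathbf{k}_{TR}} &= t^* \alpha_{\mathbf{k}_R} + r^* \alpha_{\mathbf{k}_T} . \end{aligned} \tag{8.6} $$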

There is an apparent discrepancy here, since the original problem had no wave with
propagation vector kT R . Time-reversal invariance for the original problem therefore

