

than two relevant levels, or it may be necessary to consider more than one atom at a
time. In either case the computational difficulty grows rapidly with the dimensionality
of the sample Hilbert space.

In general, a numerical simulation will take place in a sample Hilbert space with
some dimension $M$. The master equation is then an equation for an $M \times M$ matrix,
and the computational cost for solving the problem scales as $M^2$. This is an important
consideration, since increasing the accuracy of the simulation typically requires
enlarging the Hilbert space. On the other hand, if one could work with a state vector
instead of the density operator, the cost of a solution would only scale as $M$. This gain
alone justifies the development of the Monte Carlo wave function technique described
below.

The Monte Carlo wave function method*
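According to eqn (18.115), the change in the density operator over a time step $\Delta t$ is
$$\rho_S(t+\Delta t) = \rho_S(t) + \frac{\Delta t}{i\hbar}\,[H_S, \rho_S] + \Delta t\,\mathcal{L}_{\rm dis}\rho_S + O(\Delta t^2). \tag{18.144}$$
By combining the first two terms in eqn (18.117) for $\mathcal{L}_{\rm dis}$ with the Hamiltonian term,
this can be rewritten as
$$\rho_S(t+\Delta t) = \rho_S(t) - \frac{i\Delta t}{\hbar} H_{\rm dis}\,\rho_S(t) + \frac{i\Delta t}{\hbar}\,\rho_S(t)\,H_{\rm dis}^\dagger + \Delta t \sum_k C_k\,\rho_S(t)\,C_k^\dagger, \tag{18.145}$$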

where the dissipative Hamiltonian is
$$H_{\rm dis} = H_S - \frac{i\hbar}{2}\sum_k C_k^\dagger C_k. \tag{18.146}$$

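This suggests defining a dissipative, nonunitary time translation operator,
$$U_{\rm dis}(\Delta t) = e^{-i\Delta t H_{\rm dis}/\hbar} = 1 - \frac{i\Delta t}{\hbar} H_{\rm dis} + O(\Delta t^2), \tag{18.147}$$

and then using it to rewrite eqn (18.145) as
$$\rho_S(t+\Delta t) = U_{\rm dis}(\Delta t)\,\rho_S(t)\,U_{\rm dis}^\dagger(\Delta t) + \Delta t \sum_k C_k\,\rho_S(t)\,C_k^\dagger, \tag{18.148}$$

correct to $O(\Delta t)$.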
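As a check on this step, the first-order expansion can be written out explicitly. This is a worked expansion using only the definition of the dissipative time translation operator and its first-order form; it is a sketch of the intermediate algebra rather than part of the original derivation.

```latex
\begin{aligned}
U_{\rm dis}(\Delta t)\,\rho_S(t)\,U_{\rm dis}^\dagger(\Delta t)
 &= \Big(1 - \frac{i\Delta t}{\hbar} H_{\rm dis}\Big)\,\rho_S(t)\,
    \Big(1 + \frac{i\Delta t}{\hbar} H_{\rm dis}^\dagger\Big) + O(\Delta t^2) \\
 &= \rho_S(t) - \frac{i\Delta t}{\hbar} H_{\rm dis}\,\rho_S(t)
    + \frac{i\Delta t}{\hbar}\,\rho_S(t)\,H_{\rm dis}^\dagger + O(\Delta t^2).
\end{aligned}
```

Adding the remaining term $\Delta t \sum_k C_k \rho_S(t) C_k^\dagger$ reproduces the right side of eqn (18.145) up to terms of order $\Delta t^2$.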
The ensemble definition (2.116) of the density operator shows that this is equivalent to
$$\begin{aligned}
\sum_e P_e\,|\Psi_e(t+\Delta t)\rangle\langle\Psi_e(t+\Delta t)|
 &= \sum_e P_e\,U_{\rm dis}(\Delta t)\,|\Psi_e(t)\rangle\langle\Psi_e(t)|\,U_{\rm dis}^\dagger(\Delta t) \\
 &\quad + \sum_e P_e\,\Delta t \sum_{k=1}^{K} C_k\,|\Psi_e(t)\rangle\langle\Psi_e(t)|\,C_k^\dagger,
\end{aligned} \tag{18.149}$$
where the $P_e$s are the probabilities defining the initial state, and $|\Psi_e(0)\rangle = |\Theta_e\rangle$.
The first term on the right side of this equation evidently represents the dissipative
evolution of each state in the ensemble. This is closely related to the Weisskopf–Wigner
approach to perturbation theory, which we used in Section 11.2.2 to derive the decay
of an excited atomic state by spontaneous emission.
This is all very well, but what is the meaning of the second term on the right side
of eqn (18.149)? One way to answer this question is to fix attention on a single state
in the ensemble, say $|\Psi_e(t)\rangle$, and to define the normalized states
$$|\phi_{ek}(t)\rangle = \frac{C_k\,|\Psi_e(t)\rangle}{\sqrt{\langle\Psi_e(t)|C_k^\dagger C_k|\Psi_e(t)\rangle}}, \quad k = 1, \ldots, K. \tag{18.150}$$

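With this notation, the contribution of $|\Psi_e(t)\rangle$ to the second term in eqn (18.149) is
$(\Gamma_e(t)\,\Delta t)\,\rho^e_{\rm meas}(t)$, where
$$\rho^e_{\rm meas}(t) = \sum_k P^e_k\,|\phi_{ek}(t)\rangle\langle\phi_{ek}(t)|, \tag{18.151}$$
$$P^e_k = \frac{\langle\Psi_e(t)|C_k^\dagger C_k|\Psi_e(t)\rangle}{\Gamma_e(t)}, \tag{18.152}$$
and
$$\Gamma_e(t) = \sum_k \langle\Psi_e(t)|C_k^\dagger C_k|\Psi_e(t)\rangle \tag{18.153}$$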

is the total transition (quantum-jump) rate of $|\Psi_e(t)\rangle$ into the collection of normalized
states defined by eqn (18.150). Since the coefficients $P^e_k$ satisfy $0 \le P^e_k \le 1$ and
$$\sum_k P^e_k = 1, \tag{18.154}$$
they can be treated as probabilities.
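The quantities in eqns (18.150), (18.152), and (18.153) are easy to evaluate numerically. The following is a minimal sketch, assuming operators are stored as nested lists and state vectors as lists of amplitudes; the helper names (`apply`, `norm_sq`, `jump_statistics`) and the two-level example with its rates are illustrative assumptions, not taken from the text.

```python
import math

def apply(op, psi):
    """Matrix-vector product for a square matrix stored as nested lists."""
    return [sum(op[i][j] * psi[j] for j in range(len(psi))) for i in range(len(op))]

def norm_sq(psi):
    """Squared norm <psi|psi>."""
    return sum(abs(a) ** 2 for a in psi)

def jump_statistics(jump_ops, psi):
    """Return (total rate, branching probabilities, normalized jump states)."""
    # <Psi|C_k^dagger C_k|Psi> equals the squared norm of C_k|Psi>
    weights = [norm_sq(apply(c, psi)) for c in jump_ops]
    gamma = sum(weights)                    # total jump rate, eqn (18.153)
    probs = [w / gamma for w in weights]    # P_k, eqn (18.152); requires gamma > 0
    targets = [
        [a / math.sqrt(w) for a in apply(c, psi)] if w > 0 else None
        for c, w in zip(jump_ops, weights)
    ]                                       # normalized states, eqn (18.150)
    return gamma, probs, targets

# Hypothetical example: two-level system, basis ordered as (ground, excited).
g_decay, g_other = 1.0, 0.25                 # assumed rates, arbitrary units
C1 = [[0.0, math.sqrt(g_decay)], [0.0, 0.0]]   # excited -> ground
C2 = [[0.0, 0.0], [0.0, math.sqrt(g_other)]]   # second channel acting on the excited state
psi = [math.sqrt(0.5), math.sqrt(0.5)]
gamma, probs, targets = jump_statistics([C1, C2], psi)
```

For this state the excited population is 1/2, so the two channels contribute weights 0.5 and 0.125, and the branching probabilities sum to one as required by eqn (18.154).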
With this interpretation, $\rho^e_{\rm meas}$ has the form (2.127) of the mixed state describing
the sample after a measurement has been performed, but before the particular outcome
is known. This suggests that we interpret the second term on the right side of eqn
(18.149) as a wave packet reduction resulting from a measurement-like interaction
with the reservoir.
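After summing over the ensemble, eqn (18.148) becomes
$$\rho_S(t+\Delta t) = U_{\rm dis}(\Delta t)\,\rho_S(t)\,U_{\rm dis}^\dagger(\Delta t) + \Gamma(t)\,\Delta t\,\rho_{\rm meas}(t), \tag{18.155}$$
where
$$\rho_{\rm meas}(t) = \sum_e \mathcal{P}_e\,\rho^e_{\rm meas}(t), \tag{18.156}$$
$$\mathcal{P}_e = \frac{P_e\,\Gamma_e(t)}{\Gamma(t)}, \tag{18.157}$$
and
$$\Gamma(t) = \sum_e P_e\,\Gamma_e(t) \tag{18.158}$$
is the ensemble-averaged transition rate.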

A The Monte Carlo wave function algorithm
In quantum theory, a system evolves smoothly by the Schrödinger equation until a
measurement event forces a discontinuous change. This feature is the basis for the
procedure described here.
It is plausible to expect that only one of the two terms in eqn (18.155) (dissipative
evolution or wave packet reduction) will operate during a sufficiently small time step.
We will first describe the Monte Carlo wave function algorithm (MCWFA) that follows
from this assumption, and then show that the density operator calculated in this way
is an approximate solution of the master equation (18.115).
In order to simplify the presentation we assume that the initial ensemble is defined by
$$\text{states } \{|\Theta_1\rangle, \ldots, |\Theta_M\rangle\}, \qquad \text{probabilities } \{P_1, \ldots, P_M\}, \tag{18.159}$$
so that the index $e = 1, 2, \ldots, M$.
In each time step, a choice between dissipative evolution and wave packet reduction,
i.e. a quantum jump, has to be made. For this purpose, we note that the probability
of a quantum jump during the interval $(t, t+\Delta t)$ is $\Delta P_e(t) = \Gamma_e(t)\,\Delta t$, where
$\Gamma_e(t)$ is the total transition rate defined by eqn (18.153). The discrete scheme will
only be accurate if the jump probability during a time step is small, i.e. $\Delta P_e(t) \ll 1$.
Consequently, the time step $\Delta t$ must satisfy $\Gamma_e(t)\,\Delta t \ll 1$.
With this preparation, we are now ready to state the algorithm for integrating the
master equation in the interval (0, T ).
(1) Set $e = 1$ and define the discrete times $t_n = (n-1)\,\Delta t$, where $1 \le n \le N$ and
$(N-1)\,\Delta t = T$.
(2) At the initial time $t = 0$, set $|\Psi(0)\rangle = |\Psi_e(0)\rangle = |\Theta_e\rangle$.
(3) For $n = 2, \ldots, N$ choose a random number $r$ in the interval $(0, 1)$. If $\Delta P_e(t_{n-1}) < r$
go to (a), and if $\Delta P_e(t_{n-1}) > r$ go to (b). Since we have imposed $\Delta P_e(t) \ll 1$,
this procedure guarantees that quantum jumps are relatively rare interruptions of
continuous evolution.
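(a) In this case there is no quantum jump, and the state vector is advanced from
$t_{n-1}$ to $t_n$ by dissipative evolution followed by normalization:
$$\begin{aligned}
|\Psi_e(t_n)\rangle
 &= \frac{U_{\rm dis}(\Delta t)\,|\Psi_e(t_{n-1})\rangle}
         {\sqrt{\langle\Psi_e(t_{n-1})|U_{\rm dis}^\dagger(\Delta t)\,U_{\rm dis}(\Delta t)|\Psi_e(t_{n-1})\rangle}} \\
 &= \frac{\big(1 - \frac{i\Delta t}{\hbar} H_{\rm dis}\big)\,|\Psi_e(t_{n-1})\rangle}
         {\sqrt{1 - \Delta P_e(t_{n-1})}},
\end{aligned} \tag{18.160}$$
where the last line follows from the definition (18.147) of $U_{\rm dis}(\Delta t)$.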

(b) In this case there is a quantum jump, and the new state vector is defined
by choosing $k$ randomly from $\{1, 2, \ldots, K\}$, conditioned by the probability
distribution $P^e_k$ defined in eqn (18.152), and setting
$$|\Psi_e(t_n)\rangle = |\phi_{ek}(t_{n-1})\rangle, \tag{18.161}$$
i.e. $|\Psi_e(t_n)\rangle$ jumps to one of the states permitted by the second term in eqn
(18.149).
(4) Repeat step (3) $N_{\rm traj}$ times to get $N_{\rm traj}$ discrete representations
$$\{|\Psi_{ej}(t_n)\rangle,\ 1 \le n \le N\}, \quad j = 1, \ldots, N_{\rm traj} \tag{18.162}$$
of the state vector. These representations are distinct, due to the random choices
made in each time step. The density operator that evolves from the original pure
state $|\Theta_e\rangle$ is then given by
$$\rho_e(t_n) = \frac{1}{N_{\rm traj}} \sum_{j=1}^{N_{\rm traj}} |\Psi_{ej}(t_n)\rangle\langle\Psi_{ej}(t_n)|. \tag{18.163}$$

(5) Replace $e$ by $e + 1$. If $e + 1 \le M$ go to step (2). If $e + 1 > M$ go to step (6).
(6) The density operator $\rho(t)$ that evolves from the initial density operator $\rho(0)$,
defined by the ensemble (18.159), is given by
$$\rho(t_n) = \sum_e P_e\,\rho_e(t_n). \tag{18.164}$$
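The steps above can be sketched for the simplest possible case. The example below is an illustration under stated assumptions, not the authors' implementation: a two-level atom with $H_S = 0$, a single jump operator $C = \sqrt{\gamma}\,|g\rangle\langle e|$, and a pure initial state, so that steps (5) and (6) are not needed. The function name `mcwf_two_level` and all parameter values are hypothetical. The no-jump step uses the exact normalization in the first line of eqn (18.160) rather than the first-order factor.

```python
import math
import random

def mcwf_two_level(psi0, gamma, dt, n_steps, n_traj, seed=0):
    """Average excited-state population over n_traj Monte Carlo trajectories.

    psi0 = (ground amplitude, excited amplitude). With H_S = 0 the dissipative
    Hamiltonian is H_dis = -(i*hbar/2) * gamma * |e><e|.
    """
    rng = random.Random(seed)
    pop = [0.0] * (n_steps + 1)            # running sum of |<e|Psi(t_n)>|^2
    for _ in range(n_traj):                # step (4): repeat the trajectory
        g, e = complex(psi0[0]), complex(psi0[1])
        pop[0] += abs(e) ** 2
        for n in range(1, n_steps + 1):    # step (3): advance one time step
            dp = gamma * abs(e) ** 2 * dt  # jump probability Gamma_e * dt
            if rng.random() < dp:
                # (b) quantum jump: C|Psi> normalized is |g> up to a phase
                g, e = 1.0 + 0j, 0j
            else:
                # (a) no jump: apply (1 - i*dt*H_dis/hbar), then normalize
                e *= 1.0 - 0.5 * gamma * dt
                norm = math.sqrt(abs(g) ** 2 + abs(e) ** 2)
                g, e = g / norm, e / norm
            pop[n] += abs(e) ** 2
    return [p / n_traj for p in pop]       # trajectory average, cf. eqn (18.163)

populations = mcwf_two_level(
    psi0=(math.sqrt(0.5), math.sqrt(0.5)), gamma=1.0, dt=0.01, n_steps=100, n_traj=2000
)
exact_final = 0.5 * math.exp(-1.0)         # master-equation result at gamma * t = 1
```

For pure decay the master equation gives an excited population $\tfrac12 e^{-\gamma t}$, so the averaged trajectory result can be compared with `exact_final`; the agreement is limited by the $O(\Delta t)$ discretization error and by the statistical scatter of the finite number of trajectories.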

The computational cost of this method scales as $N_{\rm traj} N$, where $N$ here denotes the
dimensionality of the sample Hilbert space $\mathfrak{H}_S$ (not the number of time steps used
in the algorithm). Consequently, the MCWFA would not be very useful as a technique for
solving the master equation, if the required number of trials is itself of order $N$.
Fortunately, there are applications with large $N$ for which one can get good statistics
with $N_{\rm traj} \ll N$.

B Proof that the MCWFA generates a solution
If each of the density operators $\rho_e(t)$ satisfies the master equation, then so will the
overall density operator defined by eqn (18.164); therefore, it is sufficient to give the
proof for a single $\rho_e(t)$. For a sufficiently large number of trials, the evolution of the
pure state operators,
$$\rho_{ej}(t_n) = |\Psi_{ej}(t_n)\rangle\langle\Psi_{ej}(t_n)|, \tag{18.165}$$
is effectively given by step (3a) with probability $1 - \Delta P_e(t_{n-1})$ and by step (3b) with
probability $\Delta P_e(t_{n-1})$. In other words,

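$$\begin{aligned}
\rho_{ej}(t_n) &= \big(1 - \Delta P_e(t_{n-1})\big)\,|\Psi^{\rm dis}_{ej}(t_n)\rangle\langle\Psi^{\rm dis}_{ej}(t_n)| \\
 &\quad + \Delta P_e(t_{n-1}) \sum_k P^e_k(t_{n-1})\,|\phi_{ek}(t_{n-1})\rangle\langle\phi_{ek}(t_{n-1})|,
\end{aligned} \tag{18.166}$$
where
$$|\Psi^{\rm dis}_{ej}(t_n)\rangle = \frac{\big(1 - \frac{i\Delta t}{\hbar} H_{\rm dis}\big)\,|\Psi_{ej}(t_{n-1})\rangle}{\sqrt{1 - \Delta P_e(t_{n-1})}}. \tag{18.167}$$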
The $|\phi_{ek}(t_{n-1})\rangle$s are defined by substituting $|\Psi_{ej}(t_{n-1})\rangle$ for $|\Psi_e(t_{n-1})\rangle$ in eqn (18.150).
Using the definitions of $\Delta P_e$, $P^e_k$, and $H_{\rm dis}$ in this equation and neglecting
$O(\Delta t^2)$-terms leads to
$$\frac{\rho_{ej}(t_n) - \rho_{ej}(t_{n-1})}{\Delta t} = -\frac{i}{\hbar}\,[H_S, \rho_{ej}(t_{n-1})] + \mathcal{L}_{\rm dis}\,\rho_{ej}(t_{n-1}). \tag{18.168}$$
Averaging this result over the trials, according to eqn (18.163), and taking the limit
$\Delta t \to 0$ shows that $\rho_e(t)$ satisfies the master equation (18.115).
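The intermediate algebra can be sketched as follows. This worked step assumes the Lindblad form $\mathcal{L}_{\rm dis}\rho = -\tfrac12\sum_k (C_k^\dagger C_k\rho + \rho\,C_k^\dagger C_k - 2\,C_k\rho\,C_k^\dagger)$, which is the form implied by the definition of $H_{\rm dis}$; the abbreviation $\rho \equiv \rho_{ej}(t_{n-1})$ is introduced here only for brevity.

```latex
\begin{aligned}
(1-\Delta P_e)\,|\Psi^{\rm dis}_{ej}(t_n)\rangle\langle\Psi^{\rm dis}_{ej}(t_n)|
 &= \Big(1-\frac{i\Delta t}{\hbar}H_{\rm dis}\Big)\,\rho\,
    \Big(1+\frac{i\Delta t}{\hbar}H_{\rm dis}^\dagger\Big) \\
 &= \rho - \frac{i\Delta t}{\hbar}\,[H_S,\rho]
    - \frac{\Delta t}{2}\sum_k \big(C_k^\dagger C_k\,\rho + \rho\,C_k^\dagger C_k\big)
    + O(\Delta t^2), \\
\Delta P_e \sum_k P^e_k\,|\phi_{ek}\rangle\langle\phi_{ek}|
 &= \Gamma_e\,\Delta t \sum_k
    \frac{\langle C_k^\dagger C_k\rangle}{\Gamma_e}\,
    \frac{C_k\,\rho\,C_k^\dagger}{\langle C_k^\dagger C_k\rangle}
  = \Delta t \sum_k C_k\,\rho\,C_k^\dagger .
\end{aligned}
```

Adding the two contributions, subtracting $\rho$, and dividing by $\Delta t$ gives the commutator term plus $\mathcal{L}_{\rm dis}\rho$, up to corrections of order $\Delta t$.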
Laser-induced fluorescence*
For a concrete application of the MCWFA, we return to the trapped three-level ion
considered in Section 18.7.1. For this example, however, we replace the incoherent
source driving $3 \leftrightarrow 1$ by a coherent laser field $\boldsymbol{\mathcal{E}}_L e^{-i\omega_L t}$ that is close to resonance,
i.e. $|\omega_L - \omega_{31}| \ll \omega_L$. In the interests of simplicity, we also drop the field driving
$3 \leftrightarrow 2$. The semiclassical approximation for the laser is applied by substituting
$\mathbf{E}^{(+)} \to \boldsymbol{\mathcal{E}}_L e^{-i\omega_L t}$ in the general results (11.36) and (11.40) of Section 11.1.4.
In the resonant wave approximation, the Schrödinger-picture Hamiltonian is
$H_S = H_{S0} + H_{S1}$, where
$$H_{S0} = \sum_q \epsilon_q\,S_{qq}, \tag{18.169}$$
$$H_{S1} = \hbar\Omega_L\,S_{31}\,e^{-i\omega_L t} + {\rm HC}, \tag{18.170}$$
and $\Omega_L = -\mathbf{d}_{31}\cdot\boldsymbol{\mathcal{E}}_L/\hbar$ is the Rabi frequency for the laser driving the $1 \leftrightarrow 3$ transition.
The $S_{qp}$s are the atomic transition operators defined in Section 11.1.4, and the labels
$q$ and $p$ range over the values 1, 2, 3.
The form of the dissipative operator $\mathcal{L}_{\rm dis}$ for the three-level ion can be inferred from
the result (18.44) for the two-level atom, by identifying each pair of levels connected
by a decay channel with a two-level atom. For example, the lowering operator $\sigma_-$ in
eqn (18.44) will be replaced by $S_{13}$ for the $3 \to 1$ decay channel, and the remaining
transitions are treated in the same way.
There are two important simplifications in the present case. The first is that the
phase-changing collision term in eqn (18.44) is absent for an isolated ion. The second
simplification is the assumption that the reservoirs coupled to the three transitions,
i.e. the modes of the radiation field near resonance, are at zero temperature. This
approximation is generally accurate at optical frequencies, since $k_B T \ll \hbar\omega_{\rm opt}$ for any
reasonable temperature.
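One can use these features to show that $\mathcal{L}_{\rm dis}$ is defined by
$$\begin{aligned}
\mathcal{L}_{\rm dis}\rho_S = &-\frac{\Gamma_{31}}{2}\,(S_{31}S_{13}\rho_S + \rho_S S_{31}S_{13} - 2S_{13}\rho_S S_{31}) \\
 &-\frac{\Gamma_{32}}{2}\,(S_{32}S_{23}\rho_S + \rho_S S_{32}S_{23} - 2S_{23}\rho_S S_{32}) \\
 &-\frac{\Gamma_{21}}{2}\,(S_{21}S_{12}\rho_S + \rho_S S_{21}S_{12} - 2S_{12}\rho_S S_{21}).
\end{aligned} \tag{18.171}$$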

This expression for $\mathcal{L}_{\rm dis}$ can be cast into the general Lindblad form (18.117) by setting
$K = 3$ and defining the operators
$$C_1 = \sqrt{\Gamma_{31}}\,S_{13}, \quad C_2 = \sqrt{\Gamma_{32}}\,S_{23}, \quad C_3 = \sqrt{\Gamma_{21}}\,S_{12}, \tag{18.172}$$
corresponding respectively to the decay channels $3 \to 1$, $3 \to 2$, and $2 \to 1$.
The Rabi frequency $\Omega_L$ is small compared to the laser frequency $\omega_L$, so the
Schrödinger-picture master equation,
$$\frac{\partial}{\partial t}\rho_S(t) = \frac{1}{i\hbar}\,[H_S, \rho_S(t)] + \mathcal{L}_{\rm dis}\,\rho_S(t), \tag{18.173}$$
involves two very different time scales, $1/\omega_L \ll 1/\Omega_L$. Differential equations with this
feature are said to be stiff, and it is usually very difficult to obtain accurate numerical
solutions for them (Press et al., 1992, Sec. 16.6). In the case at hand, this difficulty
can be avoided by transforming to the interaction picture.
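The general results in Section 4.8 yield the transformed master equation
$$\frac{\partial}{\partial t}\rho^I_S(t) = \frac{1}{i\hbar}\,[H^I_{S1}(t), \rho^I_S(t)] + U_0^\dagger(t)\,\big(\mathcal{L}_{\rm dis}\,\rho_S(t)\big)\,U_0(t), \tag{18.174}$$
where $U_0(t) = \exp(-iH_{S0}t/\hbar)$ and the transform of any operator $X$ is
$X^I(t) = U_0^\dagger(t)\,X\,U_0(t)$. Applying this rule to the transition operators gives
$$S^I_{qp}(t) = U_0^\dagger(t)\,S_{qp}\,U_0(t) = e^{i\omega_{qp}t}\,S_{qp}, \tag{18.175}$$
and this in turn leads to
$$U_0^\dagger(t)\,\big(\mathcal{L}_{\rm dis}\,\rho_S(t)\big)\,U_0(t) = \mathcal{L}_{\rm dis}\,\rho^I_S(t). \tag{18.176}$$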

Thus we arrive at the useful conclusion that $\mathcal{L}_{\rm dis}$ has the same form in both pictures.
The transformed interaction Hamiltonian is
$$H^I_{S1}(t) = \hbar\Omega_L\,S_{31}\,e^{-i\delta t} + {\rm HC}, \tag{18.177}$$
where $\delta = \omega_L - \omega_{31}$. The interaction-picture master equation (18.174) is not stiff, but
it still has time-dependent coefficients. This annoyance can be eliminated by a further
transformation,
$$\bar\rho_S(t) = e^{itF}\,\rho^I_S(t)\,e^{-itF}, \tag{18.178}$$
where
$$F = \sum_q f_q\,S_{qq}. \tag{18.179}$$

The algebra involved here is essentially identical to the original transformation to
the interaction picture, and it is not difficult to show that the equation of motion
for $\bar\rho_S(t)$ will have constant coefficients provided that the parameters $f_q$ are chosen to
satisfy
$$f_3 - f_1 = \delta. \tag{18.180}$$
The simple solution $f_1 = f_2 = 0$, and $f_3 = \delta$, leads to
$$\frac{\partial}{\partial t}\bar\rho_S(t) = \frac{1}{i\hbar}\,[\bar H_{S1}, \bar\rho_S(t)] + \mathcal{L}_{\rm dis}\,\bar\rho_S(t), \tag{18.181}$$
where the transformed interaction Hamiltonian is
$$\bar H_{S1} = \hbar \begin{bmatrix} 0 & 0 & \Omega_L^* \\ 0 & 0 & 0 \\ \Omega_L & 0 & -\delta \end{bmatrix}. \tag{18.182}$$
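The algebra behind this result can be sketched in a few lines. The steps below are a worked outline, writing $\bar\rho_S(t) = e^{itF}\rho^I_S(t)e^{-itF}$ for the transformed density operator and using $S_{qq}S_{31} = \delta_{q3}S_{31}$ and $S_{31}S_{qq} = \delta_{q1}S_{31}$; the overbar notation is an assumption made here for the transformed quantities.

```latex
\begin{aligned}
\frac{\partial}{\partial t}\bar\rho_S(t)
 &= i\,[F,\bar\rho_S(t)]
  + e^{itF}\,\frac{\partial\rho^I_S(t)}{\partial t}\,e^{-itF}, \\
e^{itF}\,S_{31}\,e^{-itF} &= e^{i(f_3-f_1)t}\,S_{31}, \\
e^{itF}\,H^I_{S1}(t)\,e^{-itF}
 &= \hbar\Omega_L\,S_{31}\,e^{i(f_3-f_1-\delta)t} + {\rm HC}, \\
\bar H_{S1}
 &= e^{itF}\,H^I_{S1}(t)\,e^{-itF} - \hbar F
  = \hbar\Omega_L\,S_{31} + \hbar\Omega_L^*\,S_{13} - \hbar\delta\,S_{33}
  \qquad (f_1=f_2=0,\ f_3=\delta).
\end{aligned}
```

The term $i[F,\bar\rho_S]$ has been absorbed into the commutator by writing it as $(1/i\hbar)[-\hbar F,\bar\rho_S]$. The dissipative term is unchanged, because each product such as $S_{31}S_{13}$ or $S_{13}\,\rho\,S_{31}$ contains a transition operator together with its adjoint, so the phase factors cancel.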
We are now in a position to calculate all the bits and pieces that are needed for the
direct solution of the master equation (18.181), or the application of the MCWFA. We
leave the algebra as an exercise for the reader and proceed directly to the numerical
solution of the master equation. The density operator for this problem is represented
by a $3 \times 3$ hermitian matrix which is determined by nine real numbers. Thus the
master equation in this case consists of nine linear, ordinary differential equations
with constant coefficients. There are many packaged programs that can be used to
solve this problem.
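For readers who prefer to see the direct solution spelled out, the following is a minimal sketch that integrates the constant-coefficient master equation with a fourth-order Runge-Kutta step, using only the Python standard library. It assumes units with $\hbar = 1$ and time measured in units of $1/\Gamma_{31}$, the parameter values quoted for Fig. 18.7, a real Rabi frequency, and an ion that starts in level 1; the initial state, step size, and helper names are assumptions made for illustration.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def dagger(a):
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def lincomb(terms):
    """Sum of coeff * matrix over a list of (coeff, matrix) pairs."""
    n = len(terms[0][1])
    return [[sum(c * m[i][j] for c, m in terms) for j in range(n)] for i in range(n)]

def transition(q, p, amp=1.0):
    """amp * S_qp = amp * |q><p| for levels labelled 1, 2, 3."""
    m = [[0j] * 3 for _ in range(3)]
    m[q - 1][p - 1] = complex(amp)
    return m

def rhs(rho, h, jumps):
    """Right side of the master equation with hbar = 1: -i[H, rho] + L_dis rho."""
    terms = [(-1j, matmul(h, rho)), (1j, matmul(rho, h))]
    for c in jumps:
        cd = dagger(c)
        cdc = matmul(cd, c)
        terms += [(1.0, matmul(matmul(c, rho), cd)),
                  (-0.5, matmul(cdc, rho)), (-0.5, matmul(rho, cdc))]
    return lincomb(terms)

def rk4_step(rho, dt, h, jumps):
    k1 = rhs(rho, h, jumps)
    k2 = rhs(lincomb([(1, rho), (dt / 2, k1)]), h, jumps)
    k3 = rhs(lincomb([(1, rho), (dt / 2, k2)]), h, jumps)
    k4 = rhs(lincomb([(1, rho), (dt, k3)]), h, jumps)
    return lincomb([(1, rho), (dt / 6, k1), (dt / 3, k2), (dt / 3, k3), (dt / 6, k4)])

omega, delta = 0.5, 0.0                 # Rabi frequency and detuning (Fig. 18.7 values)
g31, g32, g21 = 1.0, 0.01, 0.001        # decay rates in units of Gamma_31
h = lincomb([(omega, transition(3, 1)), (omega, transition(1, 3)), (-delta, transition(3, 3))])
jumps = [transition(1, 3, g31 ** 0.5), transition(2, 3, g32 ** 0.5), transition(1, 2, g21 ** 0.5)]

rho = transition(1, 1)                  # assumed initial state: all population in level 1
dt, n_steps = 0.02, 500                 # integrate up to t = 10 decay times
p3 = [rho[2][2].real]                   # population of level 3 at each time
for _ in range(n_steps):
    rho = rk4_step(rho, dt, h, jumps)
    p3.append(rho[2][2].real)
trace = sum(rho[i][i].real for i in range(3))
```

The list `p3` plays the role of the smooth curve in a plot of the upper-level population, and `trace` provides a simple sanity check, since the master equation preserves the trace of the density operator.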
Of course, this means that we do not really need the MCWFA, but it is still useful
to have a solvable problem as a check on the method. In Fig. 18.7 we compare the
direct solution to the average over 48 trials of the MCWFA. The match between the
averaged results and the direct solution can be further improved by using more trials
in the average, but it should already be clear that the MCWFA is converging on a
solution of the master equation.
Following the general practice in physics, we assume, on the basis of this special
case, that the MCWFA can be confidently applied in all cases. In particular, this
includes those applications for which the dimension of the relevant Hilbert space is
large compared to the number of trials needed.

Quantum trajectories*
The results displayed in Fig. 18.7 show that the full-blown master equation, whether
solved directly or by averaging over repeated trials of the MCWFA, does no better
than the rate equations of Section 18.7.1 in describing the phenomenon of interrupted
fluorescence. This should not be a surprise, since the master equation describes the
evolution of the entire ensemble of state vectors for the ion.
What is needed for the description of quantum jumps (interrupted fluorescence)
is an improved version of the simple on-and-off model used to derive the random
telegraph signal in Fig. 18.3. This is where single trials of the MCWFA come into
play. Each trial yields a sequence of state vectors
$$|\Psi(t_1)\rangle,\ |\Psi(t_2)\rangle,\ \ldots,\ |\Psi(t_N)\rangle, \tag{18.183}$$
which is a discrete sampling of a continuous function $|\Psi(t)\rangle$. This has led to the use
of the name discrete quantum trajectory for each individual trial of the MCWFA.



Fig. 18.7 The population of $|\varepsilon_3\rangle$ as a function of time. The smooth curve represents the
direct solution of eqn (18.181) and the jagged curve is the result of averaging over 48 trials of
the Monte Carlo wave function algorithm. Time is measured in units of the decay time $1/\Gamma_{31}$
for the $3 \to 1$ transition. In these units $\Omega_L = 0.5$, $\delta = 0$, $\Gamma_{32} = 0.01$, and $\Gamma_{21} = 0.001$.

An example of the upper-level population $P_3$ obtained from a single quantum trajectory
is shown in Fig. 18.8. Once again, a judicious choice from the results for several
trajectories nicely exhibits the random telegraph signal characterizing interrupted
fluorescence.
According to the standard rules of quantum theory, the information from a completed
measurement (in particular, the collapse of the state vector) should be taken
into account immediately. In the algorithm presented in Section 18.7.3 the new
information is not used until the next time step at $t_n + \Delta t$, so single trials of the Monte
Carlo wave function method are approximations to the true quantum trajectory.

Fig. 18.8 The population of $|\varepsilon_3\rangle$ as a function of time for a single quantum trajectory. The
parameter values are the same as in Fig. 18.7.
A more refined treatment involves allowing for the projection or collapse event to
occur one or more times during the interval $\Delta t$, and using the dissipative Hamiltonian
to propagate the state vector in the subintervals between collapses. With this kind
of analysis, it can be shown that the Monte Carlo method is accurate to order $\Delta t$.
Increasing the accuracy to order $\Delta t^2$ requires the inclusion of jumps at both ends of
the interval and also the possibility that two jumps can occur in succession (Plenio
and Knight, 1998).
Results like that shown in Fig. 18.8 might tempt one to believe that the Monte Carlo
technique, or the more refined quantum trajectory method, provides a description of
single quantum events in isolated microscopic samples. Any such conclusion would be
completely false. A large sample of trials for the Monte Carlo technique will resemble
a corresponding set of experimental runs, but the relation between the two sets is
purely statistical. Both will yield the same expectation values, correlation functions,
etc. In other words, the Monte Carlo or quantum trajectory methods are still based
on ensembles. The difference between these methods and the full master equation is
that the ensembles are conditioned, i.e. reduced, by taking experimental results into
account.
Quantum state diffusion*
As explained above, the standard formulations of quantum theory do not apply to
individual microscopic samples, but rather to ensembles of identically prepared samples.
Several of the founders of the quantum theory, including Einstein (Einstein et al.,
1935) and Schrödinger (Schrödinger, 1935b), were not at all satisfied with this feature,
and there have been many subsequent efforts to reformulate the theory so that it applies
to individual microscopic objects. One approach, which has attracted a great deal
of attention, is to replace the Schrödinger equation for an ensemble by a stochastic
equation (e.g. a diffusion equation in the Hilbert space of quantum states) for an
individual system.
The universal empirical success of conventional quantum theory evidently requires
that the new stochastic equation should agree with the Schrödinger equation when applied
to ensembles. Many such equations are possible, but symmetry considerations
(see Gisin and Percival (1992) and references contained therein) have led to an
essentially unique form.
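For a sample described by the Lindblad master equation (18.115) the stochastic
equation for the state vector can be written as
$$\begin{aligned}
\frac{d}{dt}|\Psi(t)\rangle = \frac{1}{i\hbar}\,H_{\rm dis}\,|\Psi(t)\rangle
 &+ \sum_k \langle C_k^\dagger(t)\rangle_\Psi \Big[C_k(t) - \frac12\,\langle C_k(t)\rangle_\Psi\Big]\,|\Psi(t)\rangle \\
 &+ \sum_k \big[C_k(t) - \langle C_k(t)\rangle_\Psi\big]\,|\Psi(t)\rangle\,\zeta_k(t),
\end{aligned} \tag{18.184}$$
where $H_{\rm dis}$ is the dissipative Hamiltonian defined by eqn (18.146), and
$$\langle X\rangle_\Psi = \langle\Psi|X|\Psi\rangle \tag{18.185}$$
is the expectation value in the state $|\Psi\rangle$. The c-numbers $\zeta_k(t)$ are delta-correlated random
variables, i.e.
$$\langle\zeta_k(t)\,\zeta_{k'}(t')\rangle_P = \delta_{kk'}\,\delta(t - t'), \tag{18.186}$$
where the average $\langle\cdots\rangle_P$ is defined by the probability distribution $P$ for the random
variables $\zeta_k$.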
We have chosen to write the stochastic equation for the state vector so that it
resembles the operator Langevin equations discussed in Chapter 14, but most authors
prefer to use the more mathematically respectable Ito form (Gardiner, 1991). The
presence of the averages $\langle C_k(t)\rangle_\Psi$ makes this equation nonlinear, so that analytical
solutions are hard to come by.
In this approach, quantum jumps appear as smooth transitions between discrete
quantum states. The transitions occur on a short time scale that is determined by
the equation itself. Physical interactions describing measurements of an observable
lead to irreversible diffusion toward one of the eigenstates of the observable, so that
no separate collapse postulate is required. In applications, the numerical solution of
eqn (18.184) has the same kind of advantage over the direct solution of the master
equation as the Monte Carlo wave function method.
Given the close relation between the master equation, quantum jumps, and quantum
state diffusion, it is not very surprising to learn that quantum state diffusion
can be derived as a limiting case of the quantum-jump method. The limiting case is
that of infinitely many jumps, where each jump causes an infinitesimal change in the
state vector. This mathematical procedure is related to the experimental technique
of balanced heterodyne detection discussed in Section 9.3. Thus the quantum state
diffusion method can be regarded as a new conceptual approach to quantum theory,
or as a particular method for solving the master equation.

18.8 Exercises
18.1 Averaging over the environment
(1) Combine $\rho_W(0) = \rho_S(0)\,\rho_E(0)$ and the assumption $\langle b_{r\nu}\rangle_E = 0$ with eqn (18.14)
to derive eqn (18.15).
(2) Drop the assumption $\langle b_{r\nu}\rangle_E = 0$, and introduce the fluctuation operators
$\delta b_{r\nu} = b_{r\nu} - \langle b_{r\nu}\rangle_E$. Show how to redefine $H_S$ and $H_E$, so that eqn (18.15) will still be
valid.

18.2 Master equation for a cavity mode
(1) Use the discussion in Section 18.4.1 to argue that the general expression (18.20)
for the double commutator $C_2(t, t')$ can be replaced by $C_2(t, t') = [F^\dagger(t), G(t')]$
+ HC.
(2) Use the expression (18.25) for $F$ to show that ${\rm Tr}_E\,[F^\dagger(t), G(t')]$ can be expressed
in terms of the correlation functions in eqns (18.28) and (18.29).
(3) Put everything together to derive eqn (18.30). Do not forget the end-point rule.
(4) Transform back to the Schrödinger picture to derive eqns (18.31)–(18.33).

18.3 Master equation for a two-level atom
(1) Use the Markov assumptions (14.142) and (14.143) to verify eqns (18.40) and
(2) Use these expressions to evaluate the double commutator G2 .
(3) Given the assumptions made in Section 18.4.2, find out which terms in G2 have
vanishing traces over the environment.
(4) Evaluate the traces of the surviving terms and thus derive the master equation in
the environment picture.
(5) Transform back to the Schrödinger picture to derive eqns (18.42)–(18.44).

18.4 Thermal equilibrium for a cavity mode
(1) Derive eqn (18.34) from eqn (18.31).
(2) Solve the recursion relation (18.37), subject to eqn (18.38), to find eqn (18.39).

18.5 Fokker–Planck equation
(1) Carry out the chain rule calculation needed to derive eqn (18.81).
(2) Derive and solve the differential equations for the functions introduced in eqn
(3) Derive eqn (18.93).

Lindblad form for the two-level atom*
Determine the three operators C1 , C2 , and C3 for the two-level atom.

Evolution of the purity of a general state*
(1) Use the cyclic invariance of the trace operation to deduce eqn (18.119) from eqn
(2) Suppose that a single cavity mode is in thermal equilibrium with the cavity walls at
temperature T . At t = 0 the cavity walls are suddenly cooled to zero temperature.
Calculate the initial rate of change of the purity.
Bell's theorem and its optical tests

Since this is a book on quantum optics, we have assumed throughout that quantum
theory is correct in its entirety, including all its strange and counterintuitive predictions.
As far as we know, all of these predictions, even the most counterintuitive
ones, have been borne out by experiment. Einstein accepted the experimentally verified
predictions of quantum theory, but he did not believe that quantum mechanics
could be the entire story. His position was that there must be some underlying, more
fundamental theory, which satisfied the principles of locality and realism.
According to the principle of locality, a measurement occurring in a finite volume
of space in a given time interval could not possibly influence, or be influenced by,
measurements in a distant volume of space at a time before any light signal could
connect the two localities. In the language of special relativity, two such localities are
said to be space-like separated.
The principle of realism contains two ideas. The first is that the physical properties
of objects exist independently of any measurements or observations. This point of view
was summed up in Einstein's rhetorical question to Abraham Pais, while they were walking
one moonless night together on a path in Princeton: 'Is the Moon there when nobody
looks?' The second is the condition of spatial separability: the physical properties
of spatially-separated systems are mutually independent.
The combination of the principles of locality and realism with the EPR thought
experiment convinced Einstein that quantum theory must be an incomplete description
of physical reality.
For many years after the EPR paper, this discussion appeared to be more concerned
with philosophy than physics. The situation changed dramatically when Bell (1964)
showed that every local realistic theory, i.e. a theory satisfying a plausible interpretation
of the metaphysical principles of locality and realism favored by Einstein,
predicts that a certain linear combination of correlations is uniformly bounded. Bell
further showed that this inequality is violated by the predictions of quantum mechanics.
Subsequent work has led to various generalizations and reformulations of Bell's original
approach, but the common theme continues to be an inequality satisfied by some
linear combination of correlations. We will refer to these inequalities generically as
Bell inequalities. Most importantly, two-photon, coincidence-counting experiments
have shown that a particular Bell inequality is, in fact, violated by nature. One must
therefore give up one or the other, or possibly even both, of the principles of locality
and realism (Chiao and Garrison, 1999).

Bell thereby successfully transformed what seemed to be an essentially philosophical
problem into experimentally testable physical propositions. This resulted in what
Shimony has aptly called experimental metaphysics. The first experiment to test Bell's
theorem was performed by Freedman and Clauser (1972). This early experiment already
indicated that there must be something wrong with Einstein's fundamental
principles.
One of the most intriguing developments in recent years is that the Bell inequalities,
which began as part of an investigation into the conceptual foundations of quantum
theory, have turned out to have quite practical applications to fields like quantum
cryptography and quantum computing.
Quantum optics is an important tool for investigating the phenomenon of quantum
nonlocality connected with EPR states and the EPR paradox. Although Einstein,
Podolsky, and Rosen formulated their argument in the language of nonrelativistic
quantum mechanics, the problem they posed also arises in the case of two relativistic
particles flying off in different directions, for example, the two photons emitted in
spontaneous down-conversion.

19.1 The Einstein–Podolsky–Rosen paradox
The Einstein–Podolsky–Rosen paper (Einstein et al., 1935) adds two further ideas to
the principles of locality and realism presented above. The first is the definition of an
element of physical reality:
If, without in any way disturbing a system, we can predict with certainty (i.e. with
probability equal to unity) the value of a physical quantity, then there exists an
element of physical reality corresponding to this physical quantity.

The second is a criterion of completeness for a physical theory:
. . .every element of the physical reality must have a counterpart in the physical
theory.
The argument in the EPR paper was formulated in terms of the entangled two-body
wave function
$$\psi(x_A, x_B) = \int_{-\infty}^{\infty} \frac{dk}{2\pi}\,e^{ik(x_A - x_B - L)}, \tag{19.1}$$
which is a special case of the EPR states defined by eqn (6.1), but we will use a
simpler example due to Bohm (1951, Chap. 22), which more closely resembles the
actual experimental situations that we will study. Hints for carrying out the original
argument can be found in Exercise 19.1.
Bohm's example is modeled on the decay of a spin-zero particle into two distinguishable
spin-1/2 particles, and it, like the original EPR argument, is expressed in
the language of nonrelativistic quantum mechanics. In the rest frame of the parent
particle, conservation of the total linear momentum implies that the daughter particles
are emitted in opposite directions, and conservation of spin angular momentum
implies that the total spin must vanish.

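In this situation, the decay channel in which the particles travel along the z-axis,
with momenta $\hbar k_0$ and $-\hbar k_0$, is described by the two-body state
$$|\Psi\rangle = e^{ik_0(z_A - z_B)}\,|\Phi\rangle_{AB}, \tag{19.2}$$
where the spins $\sigma_A$ and $\sigma_B$ are described by the Bohm singlet state
$$|\Phi\rangle_{AB} = \frac{1}{\sqrt{2}}\,\{|\uparrow\rangle_A\,|\downarrow\rangle_B - |\downarrow\rangle_A\,|\uparrow\rangle_B\}, \tag{19.3}$$
which is expressed in the notation introduced in eqns (6.37) and (6.38).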
The choice of the quantization axis $\mathbf{n}$ is left open, since (as seen in Exercise 6.3)
the spherical symmetry of the Bohm singlet state guarantees that it has the same form
for any choice of $\mathbf{n}$. Since only spin measurements will be considered, the following
discussion will be carried out entirely in terms of the spin part $|\Phi\rangle_{AB}$ of the two-body
state vector.
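The statement that the singlet has the same form for any quantization axis can be checked numerically. The sketch below is an illustration under simplifying assumptions: the axes are restricted to the x-z plane so that all amplitudes are real, product states are stored as four-component lists, and the function names are hypothetical.

```python
import math

def spin_up(theta):
    """Spin-up state along an axis in the x-z plane at angle theta from z."""
    return (math.cos(theta / 2), math.sin(theta / 2))

def spin_down(theta):
    """Spin-down state along the same axis (orthogonal to spin_up(theta))."""
    return (-math.sin(theta / 2), math.cos(theta / 2))

def kron(a, b):
    """Amplitudes of the product state |a>_A |b>_B."""
    return [x * y for x in a for y in b]

def singlet(theta=0.0):
    """(|up>_A|down>_B - |down>_A|up>_B)/sqrt(2), built on the axis at angle theta."""
    ud = kron(spin_up(theta), spin_down(theta))
    du = kron(spin_down(theta), spin_up(theta))
    return [(x - y) / math.sqrt(2) for x, y in zip(ud, du)]

def joint_probability(state, alpha, beta):
    """Probability that A finds 'up' along alpha and B finds 'up' along beta."""
    proj = kron(spin_up(alpha), spin_up(beta))
    amp = sum(p * s for p, s in zip(proj, state))  # amplitudes are real here
    return amp ** 2

reference = singlet(0.0)
rotated = singlet(0.7)          # same state written in a tilted basis
same_axis = joint_probability(reference, 0.3, 0.3)
opposite_axes = joint_probability(reference, 0.0, math.pi)
```

Within this restricted family of axes the components of `rotated` agree with those of `reference`, and the joint probability for both magnets to register 'up' works out to $\tfrac12\sin^2[(\alpha-\beta)/2]$, which vanishes for parallel magnets.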
The spins of the daughter particles can be measured separately by means of two
Stern–Gerlach magnets placed to intercept them, as shown in Fig. 19.1. Correlations
between the spatially well-separated spin measurements can then be determined by
means of coincidence-counting circuitry connecting the four counters.
Let us first suppose that the magnetic fields, and consequently the spatial quantization
axes, of the two Stern–Gerlach magnets are directed along the x-axis, i.e.
$\mathbf{n} = \mathbf{u}_x$. A measurement of the spin component $S^A_x$ with the result $+1/2$ is signalled
by a click in the upper Geiger counter of the Stern–Gerlach apparatus A. Applying
von Neumann's projection postulate to the Bohm singlet state yields the reduced state
$$|\Phi\rangle_x = |\uparrow_x\rangle_A\,|\downarrow_x\rangle_B, \tag{19.4}$$
where $|\uparrow_x\rangle_A$ is an eigenstate of $S^A_x$ with eigenvalue $+1/2$, etc.

The reduced state is also an eigenstate of $S^B_x$ with eigenvalue $-1/2$; therefore, any
measurement of $S^B_x$ would certainly yield the value $-1/2$, corresponding to a click
in the lower counter of apparatus B. Since this prediction of a definite value for $S^B_x$
does not require any measurement at all, the system is not disturbed in any way.
Consequently, $S^B_x$ is an element of physical reality at B.

Fig. 19.1 The Bohm singlet version of the EPR experiment. $\sigma_A$ and $\sigma_B$ are spin-1/2 particles
in a singlet state, and $\alpha$ and $\beta$ are the angles of orientation of the two Stern–Gerlach magnets.
Now consider the alternative scenario in which the quantization axes are directed
along y. In this case, a measurement of $S^A_y$ with the outcome $+1/2$ leaves the system
in the reduced state
$$|\Phi\rangle_y = |\uparrow_y\rangle_A\,|\downarrow_y\rangle_B, \tag{19.5}$$
and this in turn implies that the value of $S^B_y$ is certainly $-1/2$. This prediction is also
possible without disturbing the system; therefore, $S^B_y$ is also an element of physical
reality at B.
From the local-realistic point of view, a believer in quantum theory now faces a
dilemma. The spin components $S^B_x$ and $S^B_y$ are represented by noncommuting operators,
$$[S^B_x, S^B_y] = i S^B_z \neq 0, \tag{19.6}$$
so they cannot be simultaneously predicted or measured. This leaves two alternatives.
(1) If $S_x$ and $S_y$ are both elements of physical reality, then quantum theory, which
cannot predict values for both of them, is incomplete.
(2) Two physical quantities, like $S_x$ and $S_y$, that are associated with noncommuting
operators cannot be simultaneously real.
The latter alternative implies a more restrictive definition of physical reality in
which, for example, two quantities cannot be simultaneously real unless they can be
simultaneously measured or predicted. This would, however, mean that the physical
reality of $S_x$ or $S_y$ at B depends on which measurement was carried out at the distant
apparatus A.
The state reductions in eqns (19.4) or (19.5), i.e. the replacement of the original
state $|\Phi\rangle_{AB}$ by $|\Phi\rangle_x$ or $|\Phi\rangle_y$ respectively, occur as soon as the measurement at
A is completed. This is true no matter where apparatus B is located; in particular,
when the light transit time from A to B is larger than the time required to complete
the measurement at A. Thus the global change in the state vector occurs before any
signal could travel from A to B. This evidently violates local realism.
In the words of Einstein, Podolsky, and Rosen, 'No reasonable definition of reality
could be expected to permit this.' On this basis, they concluded that quantum
theory is incomplete. In this connection, it is interesting to quote Einstein's reaction
to Schrödinger's introduction of the notion of entangled states. In a letter to Born,
written in 1948, Einstein wrote the following (Einstein, 1971):
There seems to me no doubt that those physicists who regard the descriptive meth-
ods of quantum mechanics as de¬nitive in principle would react to this line of thought
in the following way: they would drop the requirement for the independent existence
of the physical reality present in di¬erent parts of space; they would be justi¬ed in
pointing out that the quantum theory nowhere makes explicit use of this require-
ment. [Emphasis added]

19.2 The nature of randomness in the quantum world
If the EPR claim that quantum theory is incomplete is accepted, then the next step
would be to find some way to complete it. One advantage of such a construction would
be that the randomness of quantum phenomena, e.g. in radioactive decay, might be
explained by a mechanism similar to ordinary statistical mechanics.
In other words, there may exist some set of hidden variables within the radioac-
tive nucleus that evolve in a deterministic way. The apparent randomness of radioactive
decay would then be merely the result of our ignorance of the initial values of the hid-
den variables. From this point of view, there is no such thing as an uncaused random
event, and the characteristic randomness of the quantum world originates at the very
beginning of each microscopic event.
This should be contrasted with the quantum description, in which the state vector
evolves in a perfectly deterministic way from its initial value, and randomness enters
only at the time of measurement.
A simple example of a hidden variable theory is shown in Fig. 19.2. Imagine a
box containing many small, hard spheres that bounce elastically from the walls of the
box, and also scatter elastically from each other. The properties of such a system of
particles can be described by classical statistical mechanics.
Cutting a small hole into one of the walls of the box will result in an exponential
decay law for the number of particles remaining in the box as a function of time. In this
model for a nucleus undergoing radioactive decay, the apparent randomness is ascribed
to the observer's ignorance of the initial conditions of the balls, which obey completely
deterministic laws of motion. The unknown initial conditions are the hidden variables
responsible for the observed phenomenon of randomness.
For an alternative model, we jump from the nineteenth to the twentieth century,
and imagine that the box is equipped with a computer running a program generating
random numbers, which are used to decide whether or not a particle is emitted in a
given time interval. In this case the apparently random behavior is generated by a
deterministic algorithm, and the hidden variables are concealed in the program code
and the seed value used to begin it.
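The computer-in-a-box model can be sketched in a few lines. The following is only an illustration under stated assumptions: the function name simulate_decay, the per-interval emission probability p_emit, and the particular seed values are all hypothetical choices, not part of the model described above. The point is that the seed plays the role of the hidden variable: a fixed seed reproduces exactly the same apparently random decay record.

```python
import random


def simulate_decay(n_particles, p_emit, n_steps, seed):
    """Return the number of particles left in the box after each time step.

    The seed is the 'hidden variable': the whole record is a deterministic
    function of it, although it looks random to an observer who does not know it.
    """
    rng = random.Random(seed)  # deterministic pseudo-random generator
    remaining = n_particles
    history = [remaining]
    for _ in range(n_steps):
        # Each remaining particle is emitted with probability p_emit per interval.
        emitted = sum(1 for _ in range(remaining) if rng.random() < p_emit)
        remaining -= emitted
        history.append(remaining)
    return history


# Two runs with the same seed are identical; a different seed gives another record.
run_a = simulate_decay(n_particles=10_000, p_emit=0.05, n_steps=50, seed=1234)
run_b = simulate_decay(n_particles=10_000, p_emit=0.05, n_steps=50, seed=1234)
run_c = simulate_decay(n_particles=10_000, p_emit=0.05, n_steps=50, seed=4321)

if __name__ == "__main__":
    print(run_a[:5], run_a == run_b, run_a == run_c)
```

On average the record follows the exponential law $N(t) \approx N(0)(1 - p_{\rm emit})^{t}$, with fluctuations that are fixed entirely by the seed.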
Let us next consider a series of random events occurring in a time interval
$(t - \Delta t/2,\ t + \Delta t/2)$ at two distant points $r_1$ and $r_2$. If the two sets of
events are space-like separated, i.e. $|r_1 - r_2| > c\,\Delta t$, then the principle of local
realism requires that correlations between the random series can only occur as a result
of an earlier, common cause. We will call this the principle of statistical separability.
In the absence of a common cause, the separated random events are like independent
coin tosses, located at $r_1$ and $r_2$, so it would seem that they must obey a
common-sense factorization condition. For example, the joint probability of the
outcomes heads-at-$r_1$ and heads-at-$r_2$ should be the product of the independent
probabilities for heads at each location.

Fig. 19.2 A simple model for radioactive decay, consisting of small balls inside a large
box with a small hole cut into one of the walls. Einstein's 'hidden variables' would be
the unknown initial conditions of these balls.

In quantum mechanics, the factorizability of joint probabilities implies the
factorizability of joint probability amplitudes (up to a phase factor); for example, a
situation in which measurements at $r_1$ and $r_2$ are statistically independent is described
by a separable two-body wave function, i.e. the product of a wave function of $r_1$ and a
wave function of $r_2$. Conversely, the absolute square of a product wave function is the
product of two separate probabilities, just as for two independent coin tosses at $r_1$ and
$r_2$.
By contrast, an entangled state of two particles, e.g. a superposition of two product
wave functions, is not factorizable. The result is that the probability distribution
defined by an entangled state does not satisfy the principle of statistical separability,
even when the parts are far apart in space.
The EPR argument emphasizes the importance of these disparities between the
classical and quantum descriptions of the world, but it does not point the way to an
experimental method for deciding between the two views. Bell realized that the key is
the fact that the nonfactorizability of entangled states in quantum mechanics violates
the common-sense, independent-coin-toss rule for joint probabilities.
He then formulated the statistical separability condition in terms of a factorizability
condition on the joint probability for correlations between measurements on
two distant particles. Bell's analysis applies completely generally to all local realistic
theories, in a sense to be explained in the next section.

19.3 Local realism
Converting the qualitative disparities between the classical and quantum approaches
into experimentally testable differences requires a quantitative formulation of local
realism that does not depend on quantum theory. We will follow Shimony's version
(Shimony, 1990) of Bell's solution for this problem. This analysis can be presented in
a very general way, but it is easier to understand when it is described in terms of a
concrete experiment. For this purpose, we first sketch an optical version of the Bohm
singlet experiment.

19.3.1 Optical Bohm singlet experiment
As shown in Fig. 19.3, the entangled pair of spin-1/2 particles in Fig. 19.1 is replaced
by a pair of photons emitted back-to-back in an entangled state, and the Stern-Gerlach
magnets are replaced by calcite prisms that act as polarization analyzers. The beam of
unpolarized right-going photons $\gamma_A$ is split by the calcite prism A into an
extraordinary ray e and an ordinary ray o. Similarly, the beam of left-going photons
$\gamma_B$ is split by calcite prism B into e and o rays.
The ordinary-ray and extraordinary-ray output ports of the calcite prisms are
monitored by four counters. The two calcite prisms A and B can be independently
rotated around the common decay axis by the azimuthal angles $\alpha$ and $\beta$ respectively.
The values of $\alpha$ and $\beta$, which determine the division of the incident wave into e-
and o-waves, correspond to the direction of the magnetic field in a Stern-Gerlach
magnet.
Fig. 19.3 An optical implementation of the EPR experiment. Calcite prisms replace the
Stern-Gerlach magnets shown in Fig. 19.1. The source emits an entangled state of two
oppositely-directed photons, such as the Bell state $|\Psi^-\rangle$. The birefringent prisms
split the light into ordinary 'o' and extraordinary 'e' rays. The vertical dotted lines
inside the prisms indicate the optic axes of the calcite crystals. Coincidence-counting
circuitry connecting the Geiger counters is not shown.

The counters on each side of the apparatus are mounted rigidly with respect to
the calcite prisms, so that they corotate with the prisms. Thus the four counters will
constantly monitor the o and e outputs of the calcite prisms for all values of $\alpha$ and $\beta$.
The azimuthal angles $\alpha$ and $\beta$ are examples of what are called parameter settings,
or simply parameters, of the EPR experiment. The experimentalist on the right
side of the apparatus, Alice, is free to choose the parameter setting $\alpha$ (the azimuthal
angle of rotation of calcite prism A) as she pleases. Likewise, the experimentalist on
the left side, Bob, is free to choose the parameter setting $\beta$ (the azimuthal angle of
rotation of calcite prism B) as he pleases, independently of Alice's choice.

19.3.2 Conditions defining locality and realism
Bell's seminal paper has inspired many proposals for realizations of the metaphysical
notions of realism and locality, including both deterministic and stochastic forms of
hidden variables theories. In this section we present a general class of realizations by
specifying the conditions that a theory must satisfy in order to be called local and
realistic.
We will say that a theory is realistic if it describes all required elements of physical
reality for a system by means of a space, $\Lambda$, of completely specified states $\lambda$ (i.e. the
states of maximum information) satisfying the following two conditions.

Objective reality
$\Lambda$ is defined without reference to any measurements. (19.7)

Spatial separability
The state spaces $\Lambda_A$ and $\Lambda_B$ for the spatially-separated systems
A and B are independently defined. (19.8)

The only other condition imposed on $\Lambda$ is that it must support probability distributions
$\rho(\lambda)$ in order to describe situations in which maximum information is not available.

The only conditions imposed on an admissible distribution $\rho(\lambda)$ are that it be
positive definite, i.e. $\rho(\lambda) \ge 0$, normalized to unity,

$$\int d\lambda \, \rho(\lambda) = 1 , \tag{19.9}$$

and independent of the parameter values $\alpha$ and $\beta$. The last condition incorporates the
intuitive idea that the states $\lambda$ are determined at the source S, before any encounters
with the measuring devices at A and B.
One possible example for $\Lambda$ would be the classical phase space involved in the simple
model of radioactive decay presented above. In this case, the completely specified states
$\lambda$ are simply points in the phase space, and a probability distribution $\rho(\lambda)$ would be
the usual phase space distribution.
A much more surprising example comes from a disentangled version of quantum
theory, which is defined by excluding all entangled states of spatially-separated
systems. This mutilated theory violates the superposition principle, but by doing so it
allows us to identify $\Lambda$ with the Hilbert space $\mathcal{H}$ for the local system. An individual
state $\lambda$ is thereby identified with a pure state $|\psi\rangle$.
According to the standard interpretation of quantum theory, this choice of $\lambda$ gives
a complete description of the state of an isolated system. In this case $\rho(\lambda)$ is just the
distribution defining a mixed state. The fact that the disentangled version of quantum
theory is realistic illustrates the central role played by entanglement in differentiating
the quantum view from the local realistic view.
We next turn to the task of developing a quantitative realization of locality. For this
purpose, we need a language for describing measurements at the spatially-separated
stations A and B, shown in Fig. 19.3. For the sake of simplicity, it is best to consider
experiments that have a discrete set of possible outcomes $\{A_m,\ m = 1, \ldots, M\}$ and
$\{B_n,\ n = 1, \ldots, N\}$ at the stations A and B respectively, e.g. $A_1$ could describe a
detector firing at station A during a certain time interval. With each outcome $A_m$, we
associate a numerical value, also written $A_m$, called an outcome parameter (the context
distinguishes an outcome from its numerical value). The definition of the outcome
parameters is at our disposal, so they can be chosen to satisfy the following
convenient conditions:
$$-1 \le A_m \le +1 \quad \text{and} \quad -1 \le B_n \le +1 . \tag{19.10}$$
For the two-calcite-prism experiment, sketched in Fig. 19.3, the indices m and n
can only assume the values o and e, corresponding respectively to the ordinary and the
extraordinary rays emerging from a given prism. The source S emits a pair of photons
prepared at birth in some state $\lambda$. The experimental signals in this case are clicks in
one of the counters, so one useful definition of the outcome parameters is
$$\begin{aligned}
A_e &= +1 \ \text{for outcome } A_e \ \text{(Alice's e-counter clicks)} , \\
A_o &= -1 \ \text{for outcome } A_o \ \text{(Alice's o-counter clicks)} , \\
B_e &= +1 \ \text{for outcome } B_e \ \text{(Bob's e-counter clicks)} , \\
B_o &= -1 \ \text{for outcome } B_o \ \text{(Bob's o-counter clicks)} .
\end{aligned} \tag{19.11}$$
The outcome $A_e$ occurs when a rightwards-propagating photon from the source S
is deflected through the e port of the calcite prism A, and subsequently registered by
Alice's e-counter, etc. In this thought experiment we imagine that all counters have
100% sensitivity; consequently, if an e-counter does not click, we can be sure that the
corresponding o-counter will click.
The following conditional probabilities will be useful.
$p(A_m|\lambda, \alpha, \beta)$ ≡ probability of outcome $A_m$, given
the system state $\lambda$ and parameter settings $\alpha, \beta$ . (19.12)
$p(B_n|\lambda, \alpha, \beta)$ ≡ probability of outcome $B_n$, given
the system state $\lambda$ and parameter settings $\alpha, \beta$ . (19.13)
$p(A_m|\lambda, \alpha, \beta, B_n)$ ≡ probability of outcome $A_m$, given
the system state $\lambda$, parameter settings $\alpha, \beta$,
and outcome $B_n$ . (19.14)
$p(B_n|\lambda, \alpha, \beta, A_m)$ ≡ probability of outcome $B_n$, given
the system state $\lambda$, parameter settings $\alpha, \beta$,
and outcome $A_m$ . (19.15)
$p(A_m, B_n|\lambda, \alpha, \beta)$ ≡ joint probability of outcomes $A_m$ and $B_n$,
given the system state $\lambda$ and
the parameter settings $\alpha, \beta$ . (19.16)
Following the work of Jarrett (1984), as presented by Shimony (1990), we will say
that a theory is local if it satisfies the following conditions.
Parameter independence
$$p(A_m|\lambda, \alpha, \beta) = p(A_m|\lambda, \alpha) , \tag{19.17}$$
$$p(B_n|\lambda, \alpha, \beta) = p(B_n|\lambda, \beta) . \tag{19.18}$$

Outcome independence
$$p(A_m|\lambda, \alpha, \beta, B_n) = p(A_m|\lambda, \alpha, \beta) , \tag{19.19}$$
$$p(B_n|\lambda, \alpha, \beta, A_m) = p(B_n|\lambda, \alpha, \beta) . \tag{19.20}$$
Parameter independence states that the parameter settings chosen by one observer
have no effect on the outcomes seen by the other. For example, eqn (19.17) tells us that
the probability distribution of the outcomes observed by Alice at A does not depend
on the parameter settings chosen by Bob at B.
This apparently innocuous statement is, in fact, extremely important. If parameter
independence were violated, then Bob (who might well be space-like separated from
Alice) could send her an instantaneous message by merely changing $\beta$, e.g. twisting
his calcite crystal. Such a possibility would violate the relativistic prohibition against
sending signals faster than light. Likewise, eqn (19.18) prohibits Alice from sending
instantaneous messages to Bob.
The principle of outcome independence states that the probability of outcomes seen
by one observer does not depend on which outcomes are actually seen by the other.
This is what one would expect for two independent coin tosses, since the outcome of
one coin toss is clearly independent of the outcome of the other, but eqns (19.19) and
(19.20) also seem to prohibit correlations due to a common cause, e.g. in the source S.

This incorrect interpretation stems from overlooking the assumption that $\lambda$ is a
complete description of the state, including any secret mechanism that builds in
correlations at the source (Bub, 1997, Chap. 2). With this in mind, the conditions (19.19)
and (19.20) simply reflect the fact that the actual outcomes $B_n$ or $A_m$ are superfluous,
if $\lambda$ is given as part of the conditions. We will return to the issue of correlations after
deriving Bell's strong-separability condition.
It is also important to realize that the individual events at A and B can be truly
random, even if they are correlated. This situation is exhibited in the experiment
sketched in Fig. 19.3. When the polarizations of photons $\gamma_A$ and $\gamma_B$, in the Bell state
$|\Psi^-\rangle$, are measured separately (i.e. without coincidence counting) they are randomly
polarized; that is, the individual sequences of e- or o-counts at A and B are each as
random as two independent sequences of coin tosses.
Finally, we note that a violation of outcome independence does not imply any
violations of relativity. The conditional probability $p(A_m|\lambda, \alpha, \beta, B_n)$ describes a
situation in which Bob has already performed a measurement and transmitted the result to
Alice by a respectably subluminal channel. Thus protecting the world from superluminal
messages and the accompanying causal anomalies is the responsibility of parameter
independence alone.

19.3.3 Strong separability
Bell's theorem is concerned with the strength of correlations between the random
outcomes at A and B, so the first step is to find the constraints imposed by the combined
effects of realism and locality, in the form of parameter and outcome independence,
on the joint probability $p(A_m, B_n|\lambda, \alpha, \beta)$ defined by eqn (19.16).
We begin by applying the compound probability rule (A.114) to find

$$p(A_m, B_n|\lambda, \alpha, \beta) = p(A_m|\lambda, \alpha, \beta, B_n)\, p(B_n|\lambda, \alpha, \beta) . \tag{19.21}$$

In other words, the joint probability for outcome $A_m$ and outcome $B_n$ is the product
of the probability for outcome $A_m$ (conditioned on the occurrence of the outcome $B_n$)
with the probability that outcome $B_n$ actually occurred. All three probabilities are
conditioned by the assumption that the state of the system was $\lambda$ and the parameter
settings were $\alpha$ and $\beta$. The situation is symmetrical in A and B, so we also find

$$p(A_m, B_n|\lambda, \alpha, \beta) = p(B_n|\lambda, \alpha, \beta, A_m)\, p(A_m|\lambda, \alpha, \beta) . \tag{19.22}$$

Applying outcome independence, eqn (19.19), to the right side of eqn (19.21) yields

$$p(A_m, B_n|\lambda, \alpha, \beta) = p(A_m|\lambda, \alpha, \beta)\, p(B_n|\lambda, \alpha, \beta) , \tag{19.23}$$

and applying parameter independence to both terms on the right side of this equation
results in the strong-separability condition:

$$p(A_m, B_n|\lambda, \alpha, \beta) = p(A_m|\lambda, \alpha)\, p(B_n|\lambda, \beta) . \tag{19.24}$$

This is the mathematical expression of the following, seemingly common-sense,
statement: for a given specification, $\lambda$, of the state, whatever Alice does or observes
must be independent of whatever Bob does or observes, since they could reside in
space-like separated regions.
Before using the strong-separability condition to prove Bell's theorem, we return
to the question of correlations that might be imposed by a common cause. In typical
experiments, the complete specification of the state represented by $\lambda$ is not available
(for example, the values of the hidden variables cannot be determined) so the
strong-separability condition must be averaged over a distribution $\rho(\lambda)$ that represents
the experimental information that is available.
The result is

$$p(A_m, B_n|\alpha, \beta) = \int d\lambda \, \rho(\lambda)\, p(A_m|\lambda, \alpha)\, p(B_n|\lambda, \beta) , \tag{19.25}$$

where the averaged joint probability is

$$p(A_m, B_n|\alpha, \beta) = \int d\lambda \, \rho(\lambda)\, p(A_m, B_n|\lambda, \alpha, \beta) . \tag{19.26}$$

The corresponding averaged probabilities for single outcomes are

$$p(A_m|\alpha) = \int d\lambda \, \rho(\lambda)\, p(A_m|\lambda, \alpha) , \qquad
p(B_n|\beta) = \int d\lambda \, \rho(\lambda)\, p(B_n|\lambda, \beta) ; \tag{19.27}$$

consequently, the condition for statistical independence,

$$p(A_m, B_n|\alpha, \beta) = p(A_m|\alpha)\, p(B_n|\beta) , \tag{19.28}$$

can only be satisfied, for general choices of $A_m$ and $B_n$, when $\rho(\lambda) = \delta(\lambda - \lambda_0)$.
A closer connection with experiment is afforded by defining Bell's expectation
values.
(1) The expectation value of outcomes seen by Alice is

$$E(\lambda, \alpha) = \sum_m p(A_m|\lambda, \alpha)\, A_m . \tag{19.29}$$

(2) The expectation value of outcomes seen by Bob is

$$E(\lambda, \beta) = \sum_n p(B_n|\lambda, \beta)\, B_n . \tag{19.30}$$

(3) The expectation value of joint outcomes seen by both Alice and Bob is

$$E(\lambda, \alpha, \beta) = \sum_m \sum_n p(A_m, B_n|\lambda, \alpha, \beta)\, A_m B_n . \tag{19.31}$$

The quantity $E(\lambda, \alpha, \beta)$ is the average value of joint outcomes as measured, for
example, in a coincidence-counting experiment. The bounds $|A_m| \le 1$ and $|B_n| \le 1$,
together with the normalization of the probabilities, imply that the absolute values of
all these expectation values are bounded by unity.

From Bell's strong-separability condition, it follows that the joint expectation
value, for a given complete state $\lambda$, also factorizes:
$$E(\lambda, \alpha, \beta) = E(\lambda, \alpha)\, E(\lambda, \beta) , \tag{19.32}$$
but in the absence of complete state information, the relevant expectation values are

$$E(\alpha) \equiv \int d\lambda \, \rho(\lambda)\, E(\lambda, \alpha) = \sum_m p(A_m|\alpha)\, A_m , \tag{19.33}$$

etc. Thus the correlation function
$$C(\alpha, \beta) = E(\alpha, \beta) - E(\alpha)\, E(\beta) \tag{19.34}$$
can only vanish in the extreme case, $\rho(\lambda) = \delta(\lambda - \lambda_0)$, of perfect information.
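As an illustration of how a common cause produces correlations, the following minimal sketch evaluates eqn (19.34) for a hidden-variable space containing only two states. The function name correlation and the two-state example are hypothetical choices made for this sketch; the integral over $\lambda$ is replaced by a finite sum, and the factorization (19.32) is assumed for each $\lambda$.

```python
def correlation(rho, exp_a, exp_b):
    """Correlation C = E(alpha, beta) - E(alpha) E(beta) for a discrete set of states.

    rho[i]   : probability of hidden state i (a discrete stand-in for rho(lambda))
    exp_a[i] : E(lambda_i, alpha), Alice's expectation value in state i
    exp_b[i] : E(lambda_i, beta), Bob's expectation value in state i
    The joint expectation for each state is taken to factorize, as in eqn (19.32).
    """
    e_joint = sum(r * ea * eb for r, ea, eb in zip(rho, exp_a, exp_b))
    e_alice = sum(r * ea for r, ea in zip(rho, exp_a))
    e_bob = sum(r * eb for r, eb in zip(rho, exp_b))
    return e_joint - e_alice * e_bob


# Two hidden states with perfectly anticorrelated outcomes (Alice +1, Bob -1, or the reverse).
c_mixture = correlation(rho=[0.5, 0.5], exp_a=[1, -1], exp_b=[-1, 1])

# Perfect information: all the weight is on one state, the discrete analogue of a delta function.
c_delta = correlation(rho=[1.0, 0.0], exp_a=[1, -1], exp_b=[-1, 1])

if __name__ == "__main__":
    print(c_mixture, c_delta)
```

In the equal mixture each observer sees outcomes with zero mean, yet the joint outcomes are perfectly anticorrelated; with all the weight on a single state the correlation function vanishes.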

19.4 Bell™s theorem
An evaluation of any one of Bell's expectation values, e.g. $E(\lambda, \alpha)$, would depend
on the details of the particular local realistic theory under consideration. One of the
consequences of Bell's original work (Bell, 1964) has been the discovery of various
linear combinations of expectation values, which have the useful property that upper
and lower bounds can be derived for the entire class of local realistic theories defined
above. We follow Shimony (1990), by considering the particular sum
$$S(\lambda) \equiv E(\lambda, \alpha_1, \beta_1) + E(\lambda, \alpha_1, \beta_2) + E(\lambda, \alpha_2, \beta_1) - E(\lambda, \alpha_2, \beta_2) , \tag{19.35}$$
which was first suggested by Clauser et al. (1969). With a fixed value, $\lambda$, of the hidden
variables, the four combinations $(\alpha_1, \beta_1)$, $(\alpha_1, \beta_2)$, $(\alpha_2, \beta_1)$, and $(\alpha_2, \beta_2)$ represent
independent choices $\alpha_1$ or $\alpha_2$ by Alice and $\beta_1$ or $\beta_2$ by Bob, as shown in Fig. 19.4.
For the typical situation in which the complete state $\lambda$ is not known, $S(\lambda)$ should
be replaced by the experimentally relevant quantity:
$$S \equiv E(\alpha_1, \beta_1) + E(\alpha_1, \beta_2) + E(\alpha_2, \beta_1) - E(\alpha_2, \beta_2) . \tag{19.36}$$
Bell's theorem is then stated as follows.

[Figure 19.4: Bob's two settings $\beta_1$, $\beta_2$ and Alice's two settings $\alpha_1$, $\alpha_2$, linked by the
correlations $E(\alpha_1, \beta_1)$, $E(\alpha_1, \beta_2)$, $E(\alpha_2, \beta_1)$ and the anticorrelation $-E(\alpha_2, \beta_2)$.]

Fig. 19.4 The four terms in the sum S defined in eqn (19.35). The dependence of the
expectation values $E(\lambda, \alpha, \beta)$ on the system state $\lambda$ has been suppressed in this figure.

Theorem 19.1 For all local realistic theories,
$$-2 \le E(\lambda, \alpha_1, \beta_1) + E(\lambda, \alpha_1, \beta_2) + E(\lambda, \alpha_2, \beta_1) - E(\lambda, \alpha_2, \beta_2) \le +2 . \tag{19.37}$$

Averaging over the distribution of states produces the Bell inequality:
$$-2 \le E(\alpha_1, \beta_1) + E(\alpha_1, \beta_2) + E(\alpha_2, \beta_1) - E(\alpha_2, \beta_2) \le +2 . \tag{19.38}$$
This result limits the total amount of correlation, as measured by S, that is allowed
for a local realistic theory. Experiments using coincidence-detection measurements
performed on two-photon decays have shown that this bound can be violated.

19.4.1 Mermin's lemma
In order to prove Bell's theorem, we first prove the following lemma due to Mermin.
Lemma 19.2 If $x_1, x_2, y_1, y_2$ are real numbers in the interval $[-1, +1]$, then the sum
$S \equiv x_1 y_1 + x_1 y_2 + x_2 y_1 - x_2 y_2$ lies in the interval $[-2, +2]$, i.e. $|S| \le 2$.

Proof Since S is a linear function of each of the four variables $x_1, x_2, y_1, y_2$, it must
take on its extreme values when the arguments of the function themselves are extrema,
i.e. when $(x_1, x_2, y_1, y_2) = (\pm 1, \pm 1, \pm 1, \pm 1)$, where the four $\pm$s are independent. There
are four terms in S, and each term is bounded between $-1$ and $+1$; consequently,
$|S| \le 4$. However, we can also rewrite S as
$$S = (x_1 + x_2)(y_1 + y_2) - 2 x_2 y_2 . \tag{19.39}$$
The extrema of $x_1 + x_2$ are 0 or $\pm 2$, and similarly for $y_1 + y_2$. Therefore the extrema
of the product $(x_1 + x_2)(y_1 + y_2)$ are 0 or $\pm 4$. The extrema for $2 x_2 y_2$ are $\pm 2$. Hence
the extrema for S are $\pm 2$ or $\pm 6$. The latter possibility is ruled out by the previously
determined limit $|S| \le 4$; therefore, the extrema of S are $\pm 2$, i.e. $|S| \le 2$.
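The lemma can also be checked by brute force. The sketch below is only a numerical sanity check, not a substitute for the proof; the names chsh_sum, vertex_values, and max_abs_on_samples are hypothetical. It evaluates S at the sixteen vertices $(\pm 1, \pm 1, \pm 1, \pm 1)$, where a function that is linear in each variable must take its extreme values, and then at randomly chosen interior points.

```python
import random
from itertools import product


def chsh_sum(x1, x2, y1, y2):
    """The sum S = x1*y1 + x1*y2 + x2*y1 - x2*y2 of Lemma 19.2."""
    return x1 * y1 + x1 * y2 + x2 * y1 - x2 * y2


# All values taken by S on the 16 vertices of the hypercube [-1, +1]^4.
vertex_values = sorted({chsh_sum(*v) for v in product((-1, 1), repeat=4)})


def max_abs_on_samples(n_samples, seed=0):
    """Largest |S| found on n_samples random points inside the hypercube."""
    rng = random.Random(seed)
    largest = 0.0
    for _ in range(n_samples):
        point = [rng.uniform(-1.0, 1.0) for _ in range(4)]
        largest = max(largest, abs(chsh_sum(*point)))
    return largest


if __name__ == "__main__":
    print(vertex_values)              # the only vertex values are -2 and +2
    print(max_abs_on_samples(10_000))
```

Only the values $\pm 2$ occur at the vertices, in agreement with the lemma, and no interior sample exceeds the bound.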

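19.4.2 Proof of Bell's theorem
Proof Bell's theorem now follows as a corollary of Mermin's lemma. With the
identifications
$$\begin{aligned}
x_1 &= E(\lambda, \alpha_1) , \ \text{where } |E(\lambda, \alpha_1)| \le 1 , \\
x_2 &= E(\lambda, \alpha_2) , \ \text{where } |E(\lambda, \alpha_2)| \le 1 , \\
y_1 &= E(\lambda, \beta_1) , \ \text{where } |E(\lambda, \beta_1)| \le 1 , \\
y_2 &= E(\lambda, \beta_2) , \ \text{where } |E(\lambda, \beta_2)| \le 1 ,
\end{aligned} \tag{19.40}$$
Lemma 19.2 implies
$$|E(\lambda, \alpha_1) E(\lambda, \beta_1) + E(\lambda, \alpha_1) E(\lambda, \beta_2) + E(\lambda, \alpha_2) E(\lambda, \beta_1) - E(\lambda, \alpha_2) E(\lambda, \beta_2)| \le 2 . \tag{19.41}$$
Using the strong-separability condition (19.32) for each term, i.e. $E(\lambda, \alpha, \beta) =
E(\lambda, \alpha) E(\lambda, \beta)$, we now arrive at
$$-2 \le E(\lambda, \alpha_1, \beta_1) + E(\lambda, \alpha_1, \beta_2) + E(\lambda, \alpha_2, \beta_1) - E(\lambda, \alpha_2, \beta_2) \le +2 , \tag{19.42}$$
and averaging over $\lambda$ yields eqn (19.38).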
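To see the bound at work, it may help to evaluate S for one explicit local realistic model. The model below is an assumption made purely for illustration and is not taken from the text: the hidden variable $\lambda$ is a polarization angle distributed uniformly on $[0, \pi)$, Alice's outcome parameter is the sign of $\cos 2(\alpha - \lambda)$, and Bob's is the opposite of what Alice would obtain at his setting. The function names and the midpoint sampling of $\lambda$ are likewise hypothetical choices.

```python
import math


def outcome_a(lam, alpha):
    """Alice's deterministic outcome (+1 or -1) for hidden angle lam and setting alpha."""
    return 1 if math.cos(2.0 * (alpha - lam)) >= 0.0 else -1


def outcome_b(lam, beta):
    """Bob's outcome: always opposite to what Alice would obtain at the same setting."""
    return -outcome_a(lam, beta)


def chsh_local_model(a1, a2, b1, b2, n_samples=3600):
    """Average of S(lambda) over lambda uniform on [0, pi), using midpoint sampling."""
    total = 0.0
    for i in range(n_samples):
        lam = math.pi * (i + 0.5) / n_samples
        x1, x2 = outcome_a(lam, a1), outcome_a(lam, a2)
        y1, y2 = outcome_b(lam, b1), outcome_b(lam, b2)
        # For every lambda this combination is +2 or -2, by Lemma 19.2.
        total += x1 * y1 + x1 * y2 + x2 * y1 - x2 * y2
    return total / n_samples


DEG = math.pi / 180.0
s_local = chsh_local_model(0.0, 45.0 * DEG, 22.5 * DEG, -22.5 * DEG)

if __name__ == "__main__":
    print(s_local)
```

Because each outcome depends only on $\lambda$ and the local setting, the model satisfies strong separability, and the averaged sum cannot leave the interval $[-2, +2]$ whatever settings are chosen.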

19.5 Quantum theory versus local realism
As a prelude to the experimental tests of local realism, we first support our previous
claim that quantum theory violates outcome independence and satisfies parameter
independence. In addition, we give an explicit example for which the quantum prediction
of the correlations violates Bell's theorem.

19.5.1 Quantum theory is not local
The issues of parameter independence and outcome independence will be studied by
considering an experiment simpler than the one presented in Section 19.3.1. In this
arrangement, shown in Fig. 19.5, pairs of polarization-entangled photons are produced
by down-conversion, and Alice and Bob are supplied with linear polarization filters and
a single counter apiece. This reduces the outcomes for Alice to: $A_{\rm yes}$ (Alice's detector
clicks) and $A_{\rm no}$ (there is no click). The corresponding outcome parameters are
$A_{\rm yes} = 1$ and $A_{\rm no} = 0$. Bob's outcomes and outcome parameters are defined in the same way.
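We begin by assuming that the source produces the entangled state
$$|\chi\rangle = F\, |h_A, v_B\rangle + G\, |v_A, h_B\rangle , \tag{19.43}$$
where
$$|h_A, v_B\rangle \equiv a^\dagger_{k_A h}\, a^\dagger_{k_B v} |0\rangle , \quad
|v_A, h_B\rangle \equiv a^\dagger_{k_A v}\, a^\dagger_{k_B h} |0\rangle , \tag{19.44}$$
the wavevectors $k_A$ and $k_B$ are directed toward Alice and Bob respectively, and h and v
label orthogonal polarizations: $e_h$ (horizontal) and $e_v$ (vertical). The parameters are the
angles $\alpha$ and $\beta$ defining the linear polarizations $e_\alpha$ and $e_\beta$ transmitted by the polarizers.
Since $a_{k_A h} \propto e_{k_A h} \cdot E^{(+)}$, etc., the annihilation operators in the (h, v)-basis are
related to the annihilation operators in the $(\alpha, \bar\alpha = \alpha + \pi/2)$-basis by
$$\begin{pmatrix} a_{k_A \alpha} \\ a_{k_A \bar\alpha} \end{pmatrix}
= \begin{pmatrix} \cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha \end{pmatrix}
\begin{pmatrix} a_{k_A h} \\ a_{k_A v} \end{pmatrix} . \tag{19.45}$$
The corresponding relation for Bob follows by letting $\alpha \to \beta$ and $k_A \to k_B$.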

A Parameter independence
Fig. 19.5 Schematic of an apparatus to measure the polarization correlations of the
entangled photon pair $\gamma_A$ and $\gamma_B$ emitted back-to-back from the source S. The
coincidence-counting circuitry connecting the two Geiger counters is not shown.

For this experiment, the role of $p(A_m|\lambda, \alpha, \beta)$ in eqn (19.17) is played by
$p(A_{\rm yes}|\chi, \alpha, \beta)$, the probability that Alice's detector clicks for the given state and
parameter settings. This is proportional to the detection rate for $e_\alpha$-polarized photons, i.e.

$$p(A_{\rm yes}|\chi, \alpha, \beta) \propto G^{(1)}(r_A, t_A; r_A, t_A) \propto
\langle\chi| a^\dagger_{k_A \alpha}\, a_{k_A \alpha} |\chi\rangle . \tag{19.46}$$

A calculation (see Exercise 19.2) using eqns (19.43)-(19.45) yields

$$p(A_{\rm yes}|\chi, \alpha, \beta) \propto |F|^2 \cos^2\alpha + |G|^2 \sin^2\alpha . \tag{19.47}$$

Thus the quantum result for the probability of a click of Alice's detector is independent
of the setting $\beta$ of Bob's polarizer, although it can depend on her own polarizer setting
$\alpha$. In other words, quantum theory (at least in this example) satisfies parameter
independence. The symmetry of the experimental arrangement guarantees that the
probability, $p(B_{\rm yes}|\chi, \alpha, \beta)$, seen by Bob is independent of $\alpha$.
This single example does not constitute a general proof that quantum theory satisfies
parameter independence, but the features of the calculation provide guidance
for crafting such a proof. In general, the calculation of outcome probabilities for Alice
takes the same form as in the example, i.e. the expectation value of an operator (which
may well depend on Alice's parameter settings) is evaluated by using the state vector
determined by the source. Neither Alice's operator nor the state vector depends on
Bob's parameter settings; therefore, parameter independence is guaranteed for quantum
theory.
For the special values $F = -G = 1/\sqrt{2}$, the entangled state $|\chi\rangle$ becomes the
singlet-like Bell state

$$|\Psi^-\rangle = \frac{1}{\sqrt{2}} \left\{ |h_A, v_B\rangle - |v_A, h_B\rangle \right\} , \tag{19.48}$$

first defined in Section 13.3.5. In this case, $p(A_{\rm yes}|\chi, \alpha, \beta)$ is independent of $\alpha$ as well
as $\beta$, so that Alice's singles-counting measurements are the same as expected from an
unpolarized beam. This supports our previous claim that the individual measurements
can be as random as coin tosses.
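The $\beta$-independence of Alice's click probability can be checked numerically. The sketch below makes several assumptions that are not in the text: the two-photon polarization state is stored as a 2 x 2 table of amplitudes in the (h, v) basis, the polarizers and detectors are ideal (so the proportionality in eqn (19.47) becomes an equality for a normalized state), and Alice's singles probability is obtained by summing the joint probabilities over Bob's two possible outcomes, transmission along $\beta$ or along the orthogonal direction, taken here as $\beta + \pi/2$. All function names are hypothetical.

```python
import math


def unit(theta):
    """Components (h, v) of the linear polarization vector at angle theta."""
    return (math.cos(theta), math.sin(theta))


def chi(F, G):
    """Amplitude table state[i][j] for F|h_A, v_B> + G|v_A, h_B>, with index 0 = h, 1 = v."""
    return [[0.0, F], [G, 0.0]]


def joint_probability(state, alpha, beta):
    """Probability that Alice's photon passes alpha and Bob's photon passes beta."""
    ea, eb = unit(alpha), unit(beta)
    amplitude = sum(ea[i] * eb[j] * state[i][j] for i in range(2) for j in range(2))
    return abs(amplitude) ** 2


def alice_click_probability(state, alpha, beta):
    """Alice's singles probability: sum over Bob's outcomes (passes beta, or blocked)."""
    return (joint_probability(state, alpha, beta)
            + joint_probability(state, alpha, beta + math.pi / 2.0))


if __name__ == "__main__":
    state = chi(0.6, 0.8)
    for beta in (0.0, 0.4, 1.3):
        print(beta, alice_click_probability(state, 0.3, beta))
```

For F = 0.6 and G = 0.8 the printed value is the same for every $\beta$ and equals $|F|^2 \cos^2\alpha + |G|^2 \sin^2\alpha$; for $F = -G = 1/\sqrt{2}$ it is 1/2 for all $\alpha$ and $\beta$.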

B Outcome independence
Checking outcome independence requires the evaluation of the conditional probability
$p(A_{\rm yes}|\lambda, \alpha, \beta, B_{\rm rslt})$ that Alice hears a click, given that Bob has observed the
outcome $B_{\rm rslt}$, where rslt = yes, no. In this case, we will simplify the calculation by
setting $|\chi\rangle = |\Psi^-\rangle$ at the beginning.
With the usual assumption of 100% detector sensitivity, both possible outcomes
for Bob, $B_{\rm yes}$ (click) or $B_{\rm no}$ (no click), constitute a measurement. According to von
Neumann's projection postulate, we must then replace the original state $|\Psi^-\rangle$ by the
reduced state $|\Psi^-\rangle_{\rm rslt}$, to find

$$p(A_{\rm yes}|\lambda, \alpha, \beta, B_{\rm rslt}) \propto
{}_{\rm rslt}\langle\Psi^-| a^\dagger_{k_A \alpha}\, a_{k_A \alpha} |\Psi^-\rangle_{\rm rslt} . \tag{19.49}$$

The reduced state for either of Bob's outcomes can be constructed by inverting
Bob's version of eqn (19.45) to express the creation operators in the (h, v)-basis in
terms of the creation operators in the $(\beta, \bar\beta)$-basis:

$$\begin{pmatrix} a^\dagger_{k_B h} \\ a^\dagger_{k_B v} \end{pmatrix}
= \begin{pmatrix} \cos\beta & -\sin\beta \\ \sin\beta & \cos\beta \end{pmatrix}
\begin{pmatrix} a^\dagger_{k_B \beta} \\ a^\dagger_{k_B \bar\beta} \end{pmatrix} . \tag{19.50}$$

Using this in the definition (19.44) exhibits the original states as superpositions of
states containing $\beta$-polarized photons and states containing $\bar\beta$-polarized photons.
For the outcome $B_{\rm yes}$ (Bob heard a click) the projection postulate instructs us
to drop the states containing the $\bar\beta$-polarized photons, since they are blocked by the
polarizer. This produces the reduced state
$$|\Psi^-\rangle_{\rm yes} = \frac{1}{\sqrt{2}} \left\{ \sin\beta\, |h_A, \beta_B\rangle - \cos\beta\, |v_A, \beta_B\rangle \right\} , \tag{19.51}$$

where $|\beta_B\rangle = a^\dagger_{k_B \beta} |0\rangle$. Substituting this into eqn (19.49) leads, by way of the
calculation in Exercise 19.3, to the simple result

$$p(A_{\rm yes}|\lambda, \alpha, \beta, B_{\rm yes}) \propto \sin^2(\alpha - \beta) . \tag{19.52}$$

For the opposite outcome, $B_{\rm no}$, the projection postulate tells us to drop the states
containing $\beta$-polarized photons instead, and the result is

$$p(A_{\rm yes}|\lambda, \alpha, \beta, B_{\rm no}) \propto \cos^2(\alpha - \beta) . \tag{19.53}$$

The conclusion is that quantum theory violates outcome independence, since the
probability that Alice hears a click depends on the outcome of Bob's previous
measurement. The fact that Alice's probabilities only depend on the difference in polarizer
settings follows from the assumption that the source produces the special state $|\Psi^-\rangle$,
which is invariant under rotations around the common propagation axis.
The violation of outcome independence implies that the two sets of experimental
outcomes must be correlated. The probability that both detectors click is proportional
to the coincidence-count rate, which (as we learnt in Section 9.2.4) is determined by
the second-order Glauber correlation function; consequently,

$$p(A_{\rm yes}, B_{\rm yes}|\Psi^-, \alpha, \beta) \propto G^{(2)}_{\alpha\beta}(r_1 t_1; r_2 t_2)
\propto \langle\Psi^-| a^\dagger_{k_A \alpha}\, a^\dagger_{k_B \beta}\, a_{k_B \beta}\, a_{k_A \alpha} |\Psi^-\rangle . \tag{19.54}$$

The techniques used above give

$$p(A_{\rm yes}, B_{\rm yes}|\Psi^-, \alpha, \beta) = \frac{1}{2} \sin^2(\alpha - \beta)
= \frac{1}{4} - \frac{1}{4} \cos(2\alpha - 2\beta) , \tag{19.55}$$
which describes an interference pattern, e.g. if $\beta$ is held fixed while $\alpha$ is varied.
Furthermore, this pattern has 100% visibility, since perfect nulls occur for the values
$\alpha = \beta, \beta + \pi, \beta + 2\pi, \ldots$, at which the planes of polarization of the two photons are
parallel. The surprise is that an interference pattern with 100% visibility occurs in the
second-order correlation function $G^{(2)}_{\alpha\beta}$ while the first-order functions $G^{(1)}_\alpha$ and
$G^{(1)}_\beta$ display zero visibility, i.e. no interference at all.
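The results (19.52), (19.53), and (19.55) can be reproduced with a short numerical sketch. As before, this is an illustration under assumptions that go beyond the text: ideal polarizers and detectors, a 2 x 2 amplitude table for the state $|\Psi^-\rangle$ in the (h, v) basis, the blocked polarization taken as the transmitted angle plus $\pi/2$, and conditional probabilities computed as the ratio of a joint probability to Bob's singles probability. The function names are hypothetical.

```python
import math

# Amplitudes of |Psi^-> = (|h_A, v_B> - |v_A, h_B>)/sqrt(2); index 0 = h, 1 = v.
SINGLET = [[0.0, 1.0 / math.sqrt(2.0)], [-1.0 / math.sqrt(2.0), 0.0]]


def unit(theta):
    """Components (h, v) of the linear polarization vector at angle theta."""
    return (math.cos(theta), math.sin(theta))


def joint(alpha, beta):
    """Probability that Alice's photon is polarized along alpha and Bob's along beta."""
    ea, eb = unit(alpha), unit(beta)
    amplitude = sum(ea[i] * eb[j] * SINGLET[i][j] for i in range(2) for j in range(2))
    return amplitude ** 2


def conditional_alice_yes(alpha, beta, bob_clicked):
    """p(A_yes | alpha, beta, B_yes) if bob_clicked is True, else p(A_yes | alpha, beta, B_no)."""
    # A click means Bob's photon passed beta; no click means it was polarized along beta + pi/2.
    bob_angle = beta if bob_clicked else beta + math.pi / 2.0
    p_bob = joint(alpha, bob_angle) + joint(alpha + math.pi / 2.0, bob_angle)
    return joint(alpha, bob_angle) / p_bob


if __name__ == "__main__":
    alpha, beta = 0.7, 0.2
    print(joint(alpha, beta), 0.5 * math.sin(alpha - beta) ** 2)
    print(conditional_alice_yes(alpha, beta, True), math.sin(alpha - beta) ** 2)
    print(conditional_alice_yes(alpha, beta, False), math.cos(alpha - beta) ** 2)
```

The two conditional probabilities differ unless $\sin^2(\alpha - \beta) = \cos^2(\alpha - \beta)$, which is the violation of outcome independence in numerical form.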

19.5.2 Quantum theory violates Bell's theorem
The results (19.52), (19.53), and (19.55) show that quantum theory violates outcome
independence and the strong-separability principle; consequently, quantum theory does
not satisfy the hypothesis of Bell's theorem. Nevertheless, it is still logically possible
that quantum theory could satisfy the conclusion of Bell's theorem, i.e. the inequality
(19.37). We will now dash this last, faint hope by exhibiting a specific example in
which the quantum prediction violates the Bell inequality (19.38).
For the experiment depicted in Fig. 19.3, let us now calculate what quantum theory
predicts for S (») when » is represented by the Bell state |Ψ’ . For general parameter
settings ± and β, the de¬nition (19.31) for Bell™s joint expectation value can be written

E(±, β) = pee (±, β)Ae Be + peo (±, β)Ae Bo + poe (±, β)Ao Be + poo (±, β)Ao Bo , (19.56)

where we have omitted the »-dependence of the expectation value, and adopted the
simpli¬ed notation
pmn (±, β) ≡ p(Am , Bn |», ±, β) (19.57)
for the joint probabilities.
In Exercise 19.4, the calculation of the probabilities is done by using the techniques
leading to eqn (19.55), with the result
sin2 (± ’ β) ,
pee (±, β) = poo (±, β) = (19.58)
cos2 (± ’ β) .
peo (±, β) = poe (±, β) = (19.59)
After combining these expressions for the probabilities with the definition (19.11) for
the outcome parameters, Bell's joint expectation value (19.56) becomes

$$ E(\alpha, \beta) = \sin^2(\alpha - \beta) - \cos^2(\alpha - \beta) = -\cos(2\alpha - 2\beta). \tag{19.60} $$
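The substitution behind eqn (19.60) can be spelled out term by term. The definition (19.11) is not reproduced in this section, so the outcome values used here, $A_e = B_e = +1$ and $A_o = B_o = -1$, are an assumption; they are the assignment consistent with the signs in eqn (19.60), given joint probabilities of $\frac{1}{2}\sin^2(\alpha - \beta)$ for like outcomes and $\frac{1}{2}\cos^2(\alpha - \beta)$ for unlike outcomes.

```latex
\begin{align*}
E(\alpha,\beta)
  &= (+1)(+1)\,p_{ee} + (+1)(-1)\,p_{eo} + (-1)(+1)\,p_{oe} + (-1)(-1)\,p_{oo} \\
  &= \bigl(p_{ee} + p_{oo}\bigr) - \bigl(p_{eo} + p_{oe}\bigr) \\
  &= 2\cdot\tfrac{1}{2}\sin^2(\alpha-\beta) - 2\cdot\tfrac{1}{2}\cos^2(\alpha-\beta) \\
  &= -\cos(2\alpha - 2\beta).
\end{align*}
```

The four probabilities sum to $\sin^2(\alpha - \beta) + \cos^2(\alpha - \beta) = 1$, as they must.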

Our objective is to choose values $(\alpha_1, \beta_1, \alpha_2, \beta_2)$ such that $S$ violates the inequality
$|S| \le 2$. A set of values that accomplishes this,

$$ \alpha_1 = 0^\circ, \quad \alpha_2 = 45^\circ, \quad \beta_1 = 22.5^\circ, \quad \beta_2 = -22.5^\circ, \tag{19.61} $$

is illustrated in Fig. 19.6.


Fig. 19.6 A choice of angular settings $\alpha_1$, $\alpha_2$, $\beta_1$, $\beta_2$ in the calcite-prism-pair
experiment (see Fig. 19.3) that maximizes the violation of Bell's bounds (19.42) by the
quantum theory.

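For these settings, eqn (19.60) gives the expectation values
$$ \begin{aligned}
E(\alpha_1 = 0^\circ, \beta_1 = 22.5^\circ) &= -\cos(-45^\circ) = -\frac{1}{\sqrt{2}}, \\
E(\alpha_1 = 0^\circ, \beta_2 = -22.5^\circ) &= -\cos(45^\circ) = -\frac{1}{\sqrt{2}}, \\
E(\alpha_2 = 45^\circ, \beta_1 = 22.5^\circ) &= -\cos(45^\circ) = -\frac{1}{\sqrt{2}}, \\
E(\alpha_2 = 45^\circ, \beta_2 = -22.5^\circ) &= -\cos(135^\circ) = +\frac{1}{\sqrt{2}},
\end{aligned} $$
so that $S = -2\sqrt{2}$. This violation of the bound $|S| \le 2$ by a factor of $\sqrt{2}$ shows that
quantum theory violates the Bell inequality (19.38) by a comfortable margin.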
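The arithmetic leading to $S = -2\sqrt{2}$ is easy to reproduce. The sketch below assumes the sum has the form $S = E(\alpha_1, \beta_1) + E(\alpha_1, \beta_2) + E(\alpha_2, \beta_1) - E(\alpha_2, \beta_2)$, which is the term ordering and sign pattern $(+, +, +, -)$ described for eqn (19.35) in the text that follows, and it uses the expectation value $E(\alpha, \beta) = -\cos(2\alpha - 2\beta)$ of eqn (19.60). The function names are hypothetical.

```python
import math

def expectation(alpha_deg, beta_deg):
    """Quantum expectation value -cos(2*alpha - 2*beta); angles in degrees."""
    return -math.cos(math.radians(2.0 * (alpha_deg - beta_deg)))

def bell_sum(a1, a2, b1, b2):
    """Sum S with the assumed sign pattern (+, +, +, -)."""
    return (expectation(a1, b1) + expectation(a1, b2)
            + expectation(a2, b1) - expectation(a2, b2))

# Angular settings of eqn (19.61), in degrees.
S = bell_sum(a1=0.0, a2=45.0, b1=22.5, b2=-22.5)

print(f"S           = {S:.6f}")
print(f"-2*sqrt(2)  = {-2.0 * math.sqrt(2.0):.6f}")
print(f"|S| / 2     = {abs(S) / 2.0:.6f}")  # ratio to the local-realistic bound
```

The last line prints the factor by which the bound $|S| \le 2$ is exceeded, which should agree with $\sqrt{2}$ to rounding error.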

Motivation for the definition of the sum S
What motivates the choice of four terms and the signs $(+, +, +, -)$ in eqn (19.35)?
The answer to this question now becomes clear in light of the above calculation. The
independent observers, Alice and Bob, need to make two independent choices in their
respective parameter settings $\alpha$ and $\beta$, in order to observe changes in the correlations
between the polarizations of the photons $\gamma_A$ and $\gamma_B$. This explains the four pairs of
parameter settings appearing in the definition of $S$, and pictured in Fig. 19.4.
The motivation for the choice of signs $(+, +, +, -)$ in $S$ can be explained by reference
to Fig. 19.6. Alice and Bob are free to choose the first three pairs of parameter
settings, $(\alpha_1, \beta_1)$, $(\alpha_1, \beta_2)$, and $(\alpha_2, \beta_1)$, so that all three pairs have the same setting
difference, $22.5^\circ$, and negative correlations. In the quantum theory calculation of $S$ for
the Bell state $|\Psi^-\rangle$, these choices yield the same negative correlation, $-1/\sqrt{2}$, since
the expectation values only depend on the difference in the polarizer settings.
By contrast, the fourth pair of settings, $(\alpha_2, \beta_2)$, describes the two angles that are
the farthest away from each other in Fig. 19.6, and it yields a positive expectation
value $E(\alpha_2, \beta_2) = +1/\sqrt{2}$. This arises from the fact that, for this particular pair of
angles ($\alpha_2 = 45^\circ$, $\beta_2 = -22.5^\circ$), the relative orientations of the planes of polarization
of the back-to-back photons $\gamma_A$ and $\gamma_B$ are almost orthogonal. The opposite sign of
this expectation value compared to the first three can be exploited by deliberately
choosing the opposite sign for this term in eqn (19.35). This stratagem ensures that
all four terms contribute with the same sign, and this gives the best chance of violating
the inequality.
It should be emphasized that the violation of this Bell inequality by quantum
theory is not restricted to this particular example. However, it turns out that this
special choice of angular settings defines an extremum for $S$ in the important case
of maximally entangled states. Consequently, these parameter settings maximize the
quantum theory violation of the Bell inequality (Su and Wódkiewicz, 1991).
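The extremal character of these settings can be probed numerically. The sketch below is a coarse grid scan, not a proof: it assumes the expectation value $E(\alpha, \beta) = -\cos(2\alpha - 2\beta)$ of eqn (19.60) and the sum $S = E(\alpha_1, \beta_1) + E(\alpha_1, \beta_2) + E(\alpha_2, \beta_1) - E(\alpha_2, \beta_2)$. The grid spacing of $7.5^\circ$ is an arbitrary choice made so that the settings (19.61) fall on grid points.

```python
import numpy as np

step = 7.5                            # grid spacing in degrees (arbitrary choice)
grid = np.arange(0.0, 180.0, step)    # E has period 180 degrees in each angle

def expectation(alpha_deg, beta_deg):
    """Quantum expectation value -cos(2*alpha - 2*beta); angles in degrees."""
    return -np.cos(np.deg2rad(2.0 * (alpha_deg - beta_deg)))

# Broadcast the four settings over a 4-dimensional grid, ordered (a1, a2, b1, b2).
a1 = grid[:, None, None, None]
a2 = grid[None, :, None, None]
b1 = grid[None, None, :, None]
b2 = grid[None, None, None, :]
S = (expectation(a1, b1) + expectation(a1, b2)
     + expectation(a2, b1) - expectation(a2, b2))

max_abs_S = np.abs(S).max()

def index_of(angle_deg):
    """Grid index of an angle, using the 180-degree periodicity."""
    return int(round((angle_deg % 180.0) / step))

# Value of S at the settings of eqn (19.61); -22.5 deg maps to 157.5 deg.
S_text = S[index_of(0.0), index_of(45.0), index_of(22.5), index_of(-22.5)]

print(f"largest |S| on the grid = {max_abs_S:.6f}")
print(f"S at settings (19.61)   = {S_text:.6f}")
print(f"2*sqrt(2)               = {2.0 * np.sqrt(2.0):.6f}")
```

On this grid no combination of settings exceeds $|S| = 2\sqrt{2}$, and the settings (19.61) attain that value; a finer grid or a continuous optimizer would be needed to examine the neighborhood of the extremum in more detail.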

19.6 Comparisons with experiments
19.6.1 Visibility of second-order interference fringes
For comparison with experiments with two counters, such as the one sketched in Fig.
19.5, the visibility of the second-order interference fringes observed in coincidence
detection can be defined, by analogy to eqn (10.26), as
$$ V \equiv \frac{G^{(2)}_{\alpha\beta}\big|_{\max} - G^{(2)}_{\alpha\beta}\big|_{\min}}{G^{(2)}_{\alpha\beta}\big|_{\max} + G^{(2)}_{\alpha\beta}\big|_{\min}}, \tag{19.63} $$
where $G^{(2)}_{\alpha\beta}\big|_{\max}$ and $G^{(2)}_{\alpha\beta}\big|_{\min}$ are respectively the maximum and minimum, with respect
to the angles $\alpha$ and $\beta$, of the second-order Glauber correlation function. Let us assume
that data analysis shows that an empirical fit to the second-order interference fringes
has the form

