. 3
( 7)


1 + ±1 2

This is the conditional mean $\mu_t$ in Equation (6.2). Moreover, $\ln h_t$ is
normally distributed,
$$\ln h_t \sim N\!\left(\mu_t,\ \frac{\sigma_\nu^2}{1+\alpha_1^2}\right), \quad\text{or}\quad \ln h_t \sim N(\mu_t, \sigma^2).$$

6.2.2 The parameter w
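First partition w as $\boldsymbol{\alpha} = (\alpha_0, \alpha_1)'$ and $\sigma_\nu^2$. The conditional posterior
distributions are:
(i) $f(\boldsymbol{\alpha} \mid R, H, \sigma_\nu^2) = f(\boldsymbol{\alpha} \mid H, \sigma_\nu^2)$. Note that $\ln h_t$ has an AR(1)
structure. Hence, if the prior distribution of $\boldsymbol{\alpha}$ is multivariate normal,
$\boldsymbol{\alpha} \sim MN(\boldsymbol{\alpha}_0, C_0)$ (here $\boldsymbol{\alpha}_0$ is the prior mean vector, not the scalar
intercept $\alpha_0$), then the posterior distribution $f(\boldsymbol{\alpha} \mid H, \sigma_\nu^2)$ is also
multivariate normal with mean $\boldsymbol{\alpha}_*$ and covariance $C_*$, where
$$C_*^{-1} = \frac{1}{\sigma_\nu^2}\sum_{t=2}^{n} z_t z_t' + C_0^{-1},$$
$$\boldsymbol{\alpha}_* = C_*\left(\frac{1}{\sigma_\nu^2}\sum_{t=2}^{n} z_t \ln h_t + C_0^{-1}\boldsymbol{\alpha}_0\right), \quad\text{and}$$
$$z_t = (1, \ln h_{t-1})'. \qquad (6.5)$$
(ii) $f(\sigma_\nu^2 \mid R, H, \boldsymbol{\alpha}) = f(\sigma_\nu^2 \mid H, \boldsymbol{\alpha})$. If the prior distribution of $\sigma_\nu^2$ is
$m\lambda/\sigma_\nu^2 \sim \chi_m^2$, then the conditional posterior distribution of $\sigma_\nu^2$ is an
inverted chi-squared distribution with $m + n - 1$ degrees of freedom, i.e.
$$\frac{m\lambda + \sum_{t=2}^{n}\nu_t^2}{\sigma_\nu^2} \sim \chi_{m+n-1}^2,$$
$$\nu_t = \ln h_t - \alpha_0 - \alpha_1 \ln h_{t-1} \quad\text{for } t = 2, \cdots, n. \qquad (6.6)$$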
Stochastic Volatility 63

Tsay (2002) suggested using the ARCH model parameter estimates as
the starting value for the MCMC simulation.

In a PhD thesis, Heynen (1995) finds that the SV forecast is best for a number
of stock indices across several continents. There are only six other SV
studies and the view about SV forecasting performance is by no means
unanimous at the time of writing this book.
Heynen and Kat (1994) forecast volatility for seven stock indices and
five exchange rates; they find SV provides the best forecast for indices
but produces forecast errors that are 10 times larger than EGARCH's and
GARCH's for exchange rates. Yu (2002) ranks SV top for forecasting
New Zealand stock market volatility, but the margin is very small, partly
because the evaluation is based on variance and not standard deviation.
Lopez (2001) finds no difference between SV and other time series
forecasts using conventional error statistics. All three papers have the
1987 crash in the in-sample period, and the impact of the 1987 crash
on the results is unclear.
Three other studies, Bluhm and Yu (2000), Dunis, Laws and Chauvin
(2000) and Hol and Koopman (2002), compare SV and other time
series forecasts with option implied volatility forecasts. Dunis, Laws and
Chauvin (2000) find that a combined forecast is the best for six exchange
rates so long as the SV forecast is excluded. Bluhm and Yu (2000) rank SV
equal to GARCH. Both Bluhm and Yu (2000) and Hol and Koopman
(2002) conclude that implied volatility is better than SV for forecasting
stock index volatility.
Multivariate Volatility Models

At the time of writing this book, there was no volatility forecasting
contest based on a multivariate volatility model. However, there
have been a number of studies that examined cross-border volatility
spillover in stock markets (Hamao, Masulis and Ng, 1989; King and
Wadhwani, 1990; Karolyi, 1995; Koutmos and Booth, 1995), exchange
rates (Baillie, Bollerslev and Redfearn, 1993; Hong, 2001), and interest
rates (Tse and Booth, 1996). The volatility spillover relationships are a
potential source of information for volatility forecasting, especially in
the very short term and during global turbulent periods. Several variants
of multivariate ARCH models have existed for a long time while
multivariate SV models are fewer and more recent. Truly multivariate
volatility models (i.e. beyond two or three returns variables) are not easy
to implement. The greatest challenges are parsimony, nonlinear
relationships between parameters, and keeping the variance–covariance matrix
positive definite. In the remainder of this short chapter, I will just
illustrate one of the more recent multivariate ARCH models that I use a lot
in my research. It is the asymmetric dynamic covariance (ADC) model
due to Kroner and Ng (1998). I must admit that I have not used ADC
to fit more than three variables! The ADC model encompasses many
older multivariate ARCH models as we will explain later. Readers who
are interested in multivariate SV models could refer to Liesenfeld and
Richard (2003).

In implementing multivariate volatility of returns from different coun-
tries, the adjustment for time zone differences is important. In this sec-
tion, I rely much on my joint work with Martin Martens that was pub-
lished in the Journal of Banking and Finance in 2001. The model we
66 Forecasting Financial Market Volatility

used is presented below

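$$r_t = \mu + \varepsilon_t + M\varepsilon_{t-1}, \qquad \varepsilon_t \sim N(0, H_t),$$
$$h_{iit} = \theta_{iit},$$
$$h_{ijt} = \rho_{ij}\sqrt{h_{iit}\,h_{jjt}} + \phi_{ij}\,\theta_{ijt},$$
$$\theta_{ijt} = \omega_{ij} + b_i' H_{t-1} b_j + a_i' \varepsilon_{t-1}\varepsilon_{t-1}' a_j + g_i' \eta_{t-1}\eta_{t-1}' g_j.$$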

The matrix M is used for adjusting nonsynchronous returns. It has non-
zero elements only in places where one market closes before another,
except in the case of the USA where the impact could be delayed till the
next day. So Japan has an impact on the European markets but not the
other way round. Europe has an impact on the USA market and the USA
has an impact on the Japanese market on the next day. The conditional
variance–covariance matrix $H_t$ has different specifications for its diagonal
(conditional variance $h_{iit}$) and off-diagonal (conditional covariance $h_{ijt}$)
elements.
Consider the following set of conditions:

(i) $a_i = \alpha_i e_i$ and $b_i = \beta_i e_i$ $\forall i$, where $e_i$ is the $i$th column of an $(n, n)$
identity matrix, and $\alpha_i$ and $\beta_i$, $i = 1, \cdots, n$, are scalars.
(ii) $A = \alpha(\omega\lambda')$ and $B = \beta(\omega\lambda')$ where $A = (a_1, \cdots, a_n)$, $B =
(b_1, \cdots, b_n)$, $\omega$ and $\lambda$ are $(n, 1)$ vectors and $\alpha$ and $\beta$ are scalars.

The ADC model reduces to:

(i) a restricted VECH model of Bollerslev, Engle and Wooldridge
(1988) if $\rho_{12} = 0$ and under condition (i) with the restrictions that
$\beta_{ij} = \beta_i \beta_j$;
(ii) the constant correlation model of Bollerslev (1990) if $\phi_{12} = 0$ and
under condition (i);
(iii) the BEKK model of Engle and Kroner (1995) if $\rho_{12} = 0$ and
$\phi_{12} = 1$;
(iv) the factor ARCH (FARCH) model of Engle, Ng and Rothschild
(1990) if $\rho_{12} = 0$, $\phi_{12} = 1$ and under condition (ii),

and, unlike most of its predecessors, it allows for volatility asymmetry
in the spillover effect as well through the last term in $\theta_{ijt}$.

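Take a two-variable case as an example:
$$\varepsilon_t = \begin{pmatrix}\varepsilon_{1t}\\ \varepsilon_{2t}\end{pmatrix}, \qquad \eta_t = \begin{pmatrix}\min(0, \varepsilon_{1t})\\ \min(0, \varepsilon_{2t})\end{pmatrix},$$
$$a_i = \begin{pmatrix}a_{1i}\\ a_{2i}\end{pmatrix}, \qquad b_i = \begin{pmatrix}b_{1i}\\ b_{2i}\end{pmatrix}, \qquad g_i = \begin{pmatrix}g_{1i}\\ g_{2i}\end{pmatrix}.$$
The conditional variance of, for example, the first return is
$$\begin{aligned}
h_{11t} &= \omega_{11} + B_{11} h_{11,t-1} + a_1' \varepsilon_{t-1}\varepsilon_{t-1}' a_1 + g_1' \eta_{t-1}\eta_{t-1}' g_1 \\
&= \omega_{11} + B_{11} h_{11,t-1} + a_{11}^2 \varepsilon_{1,t-1}^2 + 2 a_{11} a_{21} \varepsilon_{1,t-1}\varepsilon_{2,t-1} + a_{21}^2 \varepsilon_{2,t-1}^2 \\
&\quad + g_{11}^2 \eta_{1,t-1}^2 + 2 g_{11} g_{21} \eta_{1,t-1}\eta_{2,t-1} + g_{21}^2 \eta_{2,t-1}^2,
\end{aligned}$$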

and similarly for the second return from, for example, another country.
Here, we set $B_{11} h_{11,t-1}$ as a single element, although one could also have
B as a matrix, bringing in the previous day's conditional variance of returns
from the second country as $B_{21} h_{22,t-1}$. In the above specification, we
only allow spillover to permeate through $\varepsilon_{2,t-1}^2$ and assume that the impact
will then be passed on 'internally' through $h_{11,t-1}$.
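The conditional covariance $h_{ijt} = h_{12t}$ is slightly more complex. It
accommodates both constant and time-varying components as follows:
$$h_{12t} = \rho_{12}\sqrt{h_{11t}\,h_{22t}} + \phi_{12}\,\theta_{12t},$$
$$\begin{aligned}
\theta_{12t} &= \omega_{12} + B_{12} h_{12,t-1} + a_1' \varepsilon_{t-1}\varepsilon_{t-1}' a_2 + g_1' \eta_{t-1}\eta_{t-1}' g_2 \\
&= \omega_{12} + B_{12} h_{12,t-1} + a_{11} a_{12} \varepsilon_{1,t-1}^2 + a_{11} a_{22} \varepsilon_{1,t-1}\varepsilon_{2,t-1} \\
&\quad + a_{12} a_{21} \varepsilon_{1,t-1}\varepsilon_{2,t-1} + a_{21} a_{22} \varepsilon_{2,t-1}^2 + g_{11} g_{12} \eta_{1,t-1}^2 \\
&\quad + g_{11} g_{22} \eta_{1,t-1}\eta_{2,t-1} + g_{12} g_{21} \eta_{1,t-1}\eta_{2,t-1} + g_{21} g_{22} \eta_{2,t-1}^2,
\end{aligned}$$
with $\rho_{12}$ capturing the constant correlation and $\theta_{12t}$, a time-varying
covariance weighted by $\phi_{12}$. In Martens and Poon (2001), we calculate
time-varying correlation as
$$\rho_{12t} = \frac{h_{12t}}{\sqrt{h_{11t}\,h_{22t}}}.$$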
In the implementation, we first estimate all returns as univariate GJR-GARCH
and estimate the MA parameters for synchronization correction
independently. These parameter estimates are fed into the ADC model
as starting values.
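The two-variable recursion above can be traced numerically. What follows is a minimal Python sketch of a single updating step, not the estimation code used in Martens and Poon (2001). The parameter values, the lagged state, the dictionary keys and the helper names `quad` and `adc_step` are all hypothetical and chosen only for illustration; `theta12` stands for the time-varying covariance component that is weighted by $\phi_{12}$.

```python
from math import sqrt

def quad(u, v, x):
    """Return u' x x' v for 2-vectors u, v and x, i.e. (u'x)(x'v)."""
    return (u[0] * x[0] + u[1] * x[1]) * (x[0] * v[0] + x[1] * v[1])

def adc_step(eps_prev, h_prev, par):
    """One step of the two-variable ADC recursion (illustrative only)."""
    # eta keeps only the negative shocks; it drives the asymmetry terms
    eta_prev = [min(0.0, e) for e in eps_prev]
    a1, a2, g1, g2 = par["a1"], par["a2"], par["g1"], par["g2"]
    h11 = (par["w11"] + par["B11"] * h_prev["h11"]
           + quad(a1, a1, eps_prev) + quad(g1, g1, eta_prev))
    h22 = (par["w22"] + par["B22"] * h_prev["h22"]
           + quad(a2, a2, eps_prev) + quad(g2, g2, eta_prev))
    theta12 = (par["w12"] + par["B12"] * h_prev["h12"]
               + quad(a1, a2, eps_prev) + quad(g1, g2, eta_prev))
    # constant-correlation part plus weighted time-varying part
    h12 = par["rho12"] * sqrt(h11 * h22) + par["phi12"] * theta12
    return {"h11": h11, "h22": h22, "h12": h12,
            "rho12t": h12 / sqrt(h11 * h22)}

# Hypothetical parameters and lagged state, for illustration only
par = {"w11": 1e-6, "w22": 1e-6, "w12": 5e-7,
       "B11": 0.9, "B22": 0.9, "B12": 0.9,
       "a1": [0.20, 0.05], "a2": [0.05, 0.20],
       "g1": [0.10, 0.02], "g2": [0.02, 0.10],
       "rho12": 0.2, "phi12": 0.5}
eps_prev = [-0.01, 0.02]          # yesterday's return shocks
h_prev = {"h11": 1.0e-4, "h22": 1.5e-4, "h12": 5.0e-5}

out = adc_step(eps_prev, h_prev, par)
print(out)
```

Setting `rho12` to zero and `phi12` to one in this sketch leaves only the BEKK-type quadratic forms, which is one way to see how the special cases listed earlier arise.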

The future of multivariate volatility models very much depends on their
use. Their use in long horizon forecasting is restricted unless one adopts
a more parsimonious factor approach (see Sentana, 1998; Sentana and
Fiorentini, 2001). For capturing volatility spillover, multivariate
volatility models will continue to be useful for short horizon forecasts and
univariate risk management. The use of multivariate volatility models
for estimating conditional correlation and multivariate risk management
will be restrictive because correlation is a linear concept and
a poor measure of dependence, especially among large values (Poon,
Rockinger and Tawn, 2003, 2004). There are a lot of important details
in the modelling of multivariate extremes of financial asset returns
and we hope to see some new results soon. It will suffice to illustrate
here some of these issues with a simple example on linear relationship.
Let $Y_t$ be a stock return and $X_t$ be the return on the stock market
portfolio or another stock return from another country. The stock returns
regression gives

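$$Y_t = \alpha + \beta X_t + \varepsilon_t, \qquad (7.1)$$
$$\beta = \frac{\mathrm{Cov}(Y_t, X_t)}{\mathrm{Var}(X_t)} = \frac{\rho_{xy}\,\sigma_y}{\sigma_x}, \quad\text{or}\quad \rho_{xy} = \frac{\beta\,\sigma_x}{\sigma_y}.$$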
If the factor loading, $\beta$, in (7.1) remains constant, then $\rho_{xy}$ could increase
simply because $\sigma_x/\sigma_y$ increases during the high-volatility state. This is
the main point in Forbes and Rigobon (2002), who claim that findings in
many contagion studies are being driven by high volatility.
However, $\beta$, the factor loading, need not remain constant. One common
feature in financial crises is that many returns will move together
and jointly become more volatile. This means that idiosyncratic risk
will be small, $\sigma_\varepsilon^2 \to 0$, and from (7.1) the decomposition
$\sigma_y^2 = \beta^2\sigma_x^2 + \sigma_\varepsilon^2$ reduces to
$$\sigma_y^2 = \beta^2\sigma_x^2, \qquad \beta = \frac{\sigma_y}{\sigma_x}, \quad\text{and}\quad \rho_{xy} \to 1.$$
The difficulty in generalizing this relationship is that there are crises
that are local to a country or a region that have no worldwide impact or
impact on the neighbouring country. We do not yet have a model that
will make such a distinction, let alone one that will predict it.
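The volatility-driven rise in correlation can be made concrete with a few lines of arithmetic. The sketch below assumes that $X_t$ and the regression residual are uncorrelated, so that $\sigma_y^2 = \beta^2\sigma_x^2 + \sigma_\varepsilon^2$; the function name and the numerical inputs are hypothetical.

```python
from math import sqrt

def implied_correlation(beta, sigma_x, sigma_eps):
    """rho_xy = beta * sigma_x / sigma_y, with
    sigma_y^2 = beta^2 sigma_x^2 + sigma_eps^2 (X and residual uncorrelated)."""
    sigma_y = sqrt(beta**2 * sigma_x**2 + sigma_eps**2)
    return beta * sigma_x / sigma_y

beta, sigma_eps = 0.8, 0.15          # hypothetical, held fixed across states
rho_calm = implied_correlation(beta, 0.10, sigma_eps)     # low market volatility
rho_crisis = implied_correlation(beta, 0.30, sigma_eps)   # high market volatility
rho_limit = implied_correlation(beta, 0.30, 0.0)          # no idiosyncratic risk

print(rho_calm, rho_crisis, rho_limit)
```

With $\beta$ and the idiosyncratic volatility unchanged, the correlation is higher in the high-volatility state, and it reaches one when the idiosyncratic term vanishes.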
The study of univariate jump risk in option pricing is a hot topic
just now. The study of the joint occurrence of jumps and multivariate
volatility models will probably 'meet up' in the not-so-distant future.
Before we understand how large events jointly occur, the use of the
multivariate volatility model on its own in portfolio risk management
will be very dangerous. The same applies to asset allocation and portfolio
formation, although the impact here is over a long horizon, and hence
will be less severely affected by joint-tail events.

A European-style call (put) option is a right, but not an obligation, to
purchase (sell) an asset at the agreed strike price on option maturity date,
T . An American-style option is a European option that can be exercised
prior to T .

The Black–Scholes (BS) formula below is for pricing European call and
put options:
$$c = S_0 N(d_1) - K e^{-rT} N(d_2), \qquad (8.1)$$
$$p = K e^{-rT} N(-d_2) - S_0 N(-d_1),$$
$$d_1 = \frac{\ln(S_0/K) + \left(r + \frac{1}{2}\sigma^2\right) T}{\sigma\sqrt{T}}, \qquad (8.2)$$
$$d_2 = d_1 - \sigma\sqrt{T},$$
$$N(d_1) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{d_1} e^{-0.5 z^2}\, dz,$$
where $c$ ($p$) is the price of the European call (put), $S_0$ is the current
price of the underlying asset, $K$ is the strike or exercise price, $r$ is
the continuously compounded risk-free interest rate, and $T$ is the time
to option maturity. $N(d_1)$ is the cumulative probability distribution of
a standard normal distribution for the area below $d_1$, and $N(-d_1) =
1 - N(d_1)$.
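As $T \to 0$ (with $S_0 > K$),
$$d_1 \text{ and } d_2 \to \infty,$$
$$N(d_1) \text{ and } N(d_2) \to 1,$$
$$N(-d_1) \text{ and } N(-d_2) \to 0,$$
with the limits reversed when $S_0 < K$, which means
$$c \ge S_0 - K, \quad p \ge 0 \quad\text{for } S_0 > K, \qquad (8.3)$$
$$c \ge 0, \quad p \ge K - S_0 \quad\text{for } S_0 < K. \qquad (8.4)$$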

As $\sigma \to 0$, again
$$N(d_1) \text{ and } N(d_2) \to 1,$$
$$N(-d_1) \text{ and } N(-d_2) \to 0.$$
This will lead to
$$c \ge S_0 - K e^{-rT}, \quad\text{and} \qquad (8.5)$$
$$p \ge K e^{-rT} - S_0. \qquad (8.6)$$

The conditions (8.3), (8.4), (8.5) and (8.6) are the boundary conditions
for checking option prices before using them for empirical tests. These
conditions need not be specific to Black–Scholes. Options with market
prices (transaction or quote) violating these boundary conditions should
be discarded.
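As an illustration of this screening step, the sketch below applies conditions (8.3) to (8.6) literally to a handful of quotes and keeps only those that satisfy them. The quote tuples and the function names are hypothetical; a real data filter would also need rules for dividends, bid–ask quotes and non-synchronous prices, which are not covered here.

```python
from math import exp

def lower_bound(kind, S, K, r, T):
    """Largest lower bound implied by (8.3)-(8.6) for a call or a put."""
    disc_K = K * exp(-r * T)            # present value of the strike
    if kind == "call":
        return max(0.0, S - K, S - disc_K)
    if kind == "put":
        return max(0.0, K - S, disc_K - S)
    raise ValueError("kind must be 'call' or 'put'")

def passes_bounds(price, kind, S, K, r, T):
    """True if the observed price does not violate the boundary conditions."""
    return price >= lower_bound(kind, S, K, r, T)

# Hypothetical quotes: (price, kind, S, K, r, T)
quotes = [
    (9.0, "call", 100.0, 95.0, 0.05, 0.5),
    (0.5, "call", 100.0, 95.0, 0.05, 0.5),   # below S - K e^{-rT}
    (6.0, "put", 100.0, 105.0, 0.05, 0.5),
    (3.0, "put", 100.0, 105.0, 0.05, 0.5),   # below K - S
]
kept = [q for q in quotes if passes_bounds(*q)]
print(kept)
```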

8.1.1 The Black–Scholes assumptions
(i) For constant $\mu$ and $\sigma$, $dS = \mu S\,dt + \sigma S\,dz$.
(ii) Short sale is permitted with full use of proceeds.
(iii) No transaction costs or taxes; securities are infinitely divisible.
(iv) No dividend before option maturity.
(v) No arbitrage (i.e. market is at equilibrium).
(vi) Continuous trading (so that rebalancing of portfolio is done
instantaneously).
(vii) Constant risk-free interest rate, $r$.
(viii) Constant volatility, $\sigma$.

Empirical findings suggest that option pricing is not sensitive to the
assumption of a constant interest rate. There are now good approximating
solutions for pricing American-style options that can be exercised early
and options that encounter dividend payments before option maturity.
The impact of stochastic volatility on option pricing is much more
profound, an issue which we shall return to shortly. Apart from the constant
volatility assumption, the violation of any of the remaining assumptions
will result in the option price being traded within a band instead of at
the theoretical price.
Black“Scholes 73

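8.1.2 Black–Scholes implied volatility
Here, we first show that
$$\frac{\partial C_{BS}}{\partial \sigma} > 0,$$
which means that $C_{BS}$ is a monotonic function of $\sigma$ and there is a
one-to-one correspondence between $C_{BS}$ and $\sigma$.
From (8.1)
$$\frac{\partial C_{BS}}{\partial \sigma} = S \frac{\partial N}{\partial d_1}\frac{\partial d_1}{\partial \sigma} - K e^{-r(T-t)} \frac{\partial N}{\partial d_2}\frac{\partial d_2}{\partial \sigma}, \qquad (8.7)$$
$$\frac{\partial N(x)}{\partial x} = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}x^2},$$
$$\frac{\partial d_1}{\partial \sigma} = \sqrt{T-t} - \frac{d_1}{\sigma},$$
$$\frac{\partial d_2}{\partial \sigma} = \sqrt{T-t} - \frac{d_1}{\sigma} - \sqrt{T-t} = -\frac{d_1}{\sigma}.$$
Substitute these results into (8.7) and get
$$\begin{aligned}
\frac{\partial C_{BS}}{\partial \sigma}
&= \frac{S}{\sqrt{2\pi}} e^{-\frac{1}{2}d_1^2}\left(\sqrt{T-t} - \frac{d_1}{\sigma}\right) + \frac{K e^{-r(T-t)}}{\sqrt{2\pi}} e^{-\frac{1}{2}d_2^2}\,\frac{d_1}{\sigma} \\
&= \frac{S e^{-\frac{1}{2}d_1^2}\sqrt{T-t}}{\sqrt{2\pi}} + \frac{d_1}{\sigma\sqrt{2\pi}}\left[-S e^{-\frac{1}{2}d_1^2} + K e^{-r(T-t)} e^{-\frac{1}{2}\left(d_1 - \sigma\sqrt{T-t}\right)^2}\right] \\
&= \frac{S e^{-\frac{1}{2}d_1^2}\sqrt{T-t}}{\sqrt{2\pi}} + \frac{d_1}{\sigma\sqrt{2\pi}}\left[-S e^{-\frac{1}{2}d_1^2} + K e^{-r(T-t)} e^{\left(-\frac{1}{2}d_1^2 + d_1\sigma\sqrt{T-t} - \frac{1}{2}\sigma^2(T-t)\right)}\right] \\
&= \frac{S e^{-\frac{1}{2}d_1^2}\sqrt{T-t}}{\sqrt{2\pi}} + \frac{d_1 e^{-\frac{1}{2}d_1^2}}{\sigma\sqrt{2\pi}}\left[-S + K e^{\left(-r(T-t) + d_1\sigma\sqrt{T-t} - \frac{1}{2}\sigma^2(T-t)\right)}\right]. \qquad (8.8)
\end{aligned}$$
Also from (8.2)
$$d_1\sigma\sqrt{T-t} - \frac{1}{2}\sigma^2(T-t) - r(T-t) = \log\frac{S}{K} \qquad (8.9)$$
and substituting this result into (8.8), we get
$$\begin{aligned}
\frac{\partial C_{BS}}{\partial \sigma}
&= \frac{S e^{-\frac{1}{2}d_1^2}\sqrt{T-t}}{\sqrt{2\pi}} + \frac{d_1 e^{-\frac{1}{2}d_1^2}}{\sigma\sqrt{2\pi}}\left[-S + K\frac{S}{K}\right] \\
&= \frac{S e^{-\frac{1}{2}d_1^2}\sqrt{T-t}}{\sqrt{2\pi}} > 0.
\end{aligned}$$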
8.1.3 Black–Scholes implied volatility smile
Given an observed European call option price $C^{obs}$ for a contract with
strike price $K$ and expiration date $T$, the implied volatility $\sigma_{iv}$ is
defined as the input value of the volatility parameter to the Black–Scholes
formula such that
$$C_{BS}(t, S; K, T; \sigma_{iv}) = C^{obs}. \qquad (8.10)$$
The option implied volatility $\sigma_{iv}$ is often interpreted as a market's
expectation of volatility over the option's maturity, i.e. the period from $t$ to $T$.
We have shown in the previous section that there is a one-to-one
correspondence between prices and implied volatilities. Since $\partial C_{BS}/\partial\sigma > 0$,
the condition
$$C^{obs} = C_{BS}(t, S; K, T; \sigma_{iv}) > C_{BS}(t, S; K, T; 0)$$
means $\sigma_{iv} > 0$; i.e. implied volatility is always greater than zero. The
implied volatilities from put and call options of the same strike price
and time to maturity are the same because of put–call parity. Traders
often quote derivative prices in terms of $\sigma_{iv}$ rather than dollar prices, the
conversion to price being made through the Black–Scholes formula.
Suppose the true (unconditional) volatility is $\sigma$ over period $T$. If
Black–Scholes is correct, then
$$C_{BS}(t, S; K, T; \sigma_{iv}) = C_{BS}(t, S; K, T; \sigma)$$
for all strikes. That is, the function (or graph) of $\sigma_{iv}(K)$ against $K$ for
fixed $t$, $S$, $T$ and $r$, observed from market option prices, is supposed to be
a straight horizontal line. But it is well known that the Black–Scholes
$\sigma_{iv}$ differs across strikes. There is plenty of documented empirical
evidence that implied volatilities are different across options
of different strikes: when we plot Black–Scholes implied volatility $\sigma_{iv}$
against strike price $K$, the shape is anything but a straight line. Before
the 1987 stock market crash, $\sigma_{iv}(K)$ against
$K$ was often observed to be U-shaped, with the minimum located at
or near at-the-money options, $S = K e^{-r(T-t)}$. Hence, this gives rise to
the term 'smile effect'. After the stock market crash in 1987, $\sigma_{iv}(K)$
is typically downward sloping at and near the money and then curves
upward at high strikes. Such a shape is now known as a 'smirk'. The
smile/smirk usually 'flattens' out as $T$ gets longer. Moreover, implied
volatility from options is typically higher than historical volatility and
often decreases with time to maturity.
Since $\partial C_{BS}/\partial\sigma > 0$, the smile/smirk curve tells us that there is a
premium charged for options at low strikes (OTM puts and ITM calls; see
footnote 2) above their BS price as compared with the ATM options.
Although the market uses Black–Scholes implied volatility, $\sigma_{iv}$, as pricing
units, the market itself prices options as though the constant volatility
lognormal model fails to capture the probabilities of large downward
stock price movements, and so supplements the Black–Scholes price to
account for this.

8.1.4 Explanations for the 'smile'
There are at least two theoretical explanations (viz. distributional
assumption and stochastic volatility) for this puzzle. Other explanations
that are based on market microstructure and measurement errors (e.g.
liquidity, bid–ask spread and tick size) and investor risk preference (e.g.
model risk, lottery premium and portfolio insurance) have also been
proposed. In the next chapter on option pricing using stochastic volatility,
we will explain how violation of the distributional assumption and stochastic
volatility could induce the BS implied volatility smile. Here, we will
concentrate on understanding how the Black–Scholes distributional assumption
produces the volatility smile. Before we proceed, we need to make use of
the positive relationship between volatility and option price, and the
put–call parity (see footnote 1)
$$c_t + K e^{-r(T-t)} = p_t + S_t \qquad (8.11)$$
which establishes the positive relationship between call and put option
prices. Since implied volatility is positively related to option price,
Equation (8.11) suggests there is also a positive relationship between implied
volatilities derived from call and put options that have the same strike
price and the same time to maturity.

Footnote 1: The discussion here is based on Hull (2002).

As mentioned before, Black–Scholes requires the stock price to follow a
lognormal distribution or the logarithmic stock returns to have a normal
distribution. There is now widely documented empirical evidence that
risky financial asset returns have leptokurtic tails. In the case where the
strike price is very high, the call option is deep-out-of-the-money (see
footnote 2) and the probability for this option to be exercised is very low.
Nevertheless, a leptokurtic right tail will give this option a higher
probability, than that from a normal distribution, for the terminal asset
price to exceed the strike price and the call option to finish in the money.
This higher probability leads to a higher call price and a higher
Black–Scholes implied volatility at high strike.
Next, we look at the case when the strike price is low. First note
that option value has two components: intrinsic value and time value.
Intrinsic value reflects how deep the option is in the money. Time value
reflects the amount of uncertainty before the option expires, hence it
is most influenced by volatility. A deep-in-the-money call option has
high intrinsic value and little time value, and a small amount of bid–ask
spread or transaction tick size is sufficient to perturb the implied volatility
estimation. We could, however, make use of the previous argument but
apply it to an out-of-the-money (OTM) put option at low strike price. An
OTM put option has a close to nil intrinsic value and the put option price
is due mainly to time value. Again because of the thicker tail on the left,
we expect the probability that the OTM put option finishes in the money
to be higher than that for a normal distribution. Hence the put option
price (and hence the call option price through put–call parity) should be
greater than that predicted by Black–Scholes. If we use Black–Scholes to
invert volatility estimates from these option prices, the Black–Scholes
implied volatility will be higher than actual volatility. This results in a
volatility smile where implied volatility is much higher at very low and
very high strikes.
The above arguments apply readily to the currency market where
exchange rate returns exhibit thick tail distributions that are approximately
symmetrical. In the stock market, volatility skew (i.e. low implied at high
strike but high implied at low strike) is more common than volatility
smile after the October 1987 stock market crash. Since the distribution
is skewed to the far left, the right tail can be thinner than the normal
distribution. In this case implied volatility at high strike will be lower
than that expected from a volatility smile.

Footnote 2: In option terminology, an option is out of the money when it is
not profitable to exercise the option. For a call option, this happens when
S < X, where X is the strike price, and in the case of a put, the condition
is S > X. The reverse is true for an in-the-money option. A call or a put is
said to be at the money (ATM) when S = X. A near-the-money option is
an option that is not exactly ATM, but close to being ATM. Sometimes,
discounted values of S and X are used in the conditions.

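8.2.1 The stock price dynamics
The Black–Scholes model for pricing European equity options assumes
the stock price has the following dynamics:
$$dS = \mu S\,dt + \sigma S\,dz, \qquad (8.12)$$
and for the growth rate on stock:
$$\frac{dS}{S} = \mu\,dt + \sigma\,dz. \qquad (8.13)$$
From Ito's lemma, the logarithm of the stock price has the following
dynamics:
$$d\ln S = \left(\mu - \tfrac{1}{2}\sigma^2\right) dt + \sigma\,dz, \qquad (8.14)$$
which means that the stock price has a lognormal distribution or the
logarithm of the stock price has a normal distribution. In discrete time
$$d\ln S = \left(\mu - \tfrac{1}{2}\sigma^2\right) dt + \sigma\,dz,$$
$$\Delta\ln S = \left(\mu - \tfrac{1}{2}\sigma^2\right)\Delta t + \sigma\,\varepsilon\sqrt{\Delta t},$$
$$\ln S_T - \ln S_0 \sim N\!\left[\left(\mu - \tfrac{1}{2}\sigma^2\right)T,\ \sigma\sqrt{T}\right],$$
$$\ln S_T \sim N\!\left[\ln S_0 + \left(\mu - \tfrac{1}{2}\sigma^2\right)T,\ \sigma\sqrt{T}\right]. \qquad (8.15)$$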

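8.2.2 The Black–Scholes partial differential equation
The derivation of the Black–Scholes partial differential equation (PDE)
is based on the fundamental fact that the option price and the stock price
depend on the same underlying source of uncertainty. A portfolio can
then be created consisting of the stock and the option which eliminates
this source of uncertainty. Given that this portfolio is riskless, it must
therefore earn the risk-free rate of return. Here is how the logic works:
$$\Delta S = \mu S\,\Delta t + \sigma S\,\Delta z, \qquad (8.16)$$
$$\Delta f = \left(\frac{\partial f}{\partial S}\mu S + \frac{\partial f}{\partial t} + \frac{1}{2}\frac{\partial^2 f}{\partial S^2}\sigma^2 S^2\right)\Delta t + \frac{\partial f}{\partial S}\sigma S\,\Delta z. \qquad (8.17)$$
We set up a hedged portfolio, $\Pi$, consisting of $\partial f/\partial S$ number of shares
and short one unit of the derivative security. The change in portfolio
value is
$$\begin{aligned}
\Delta\Pi &= -\Delta f + \frac{\partial f}{\partial S}\Delta S \\
&= -\left(\frac{\partial f}{\partial S}\mu S + \frac{\partial f}{\partial t} + \frac{1}{2}\frac{\partial^2 f}{\partial S^2}\sigma^2 S^2\right)\Delta t - \frac{\partial f}{\partial S}\sigma S\,\Delta z \\
&\quad + \frac{\partial f}{\partial S}\mu S\,\Delta t + \frac{\partial f}{\partial S}\sigma S\,\Delta z \\
&= -\left(\frac{\partial f}{\partial t} + \frac{1}{2}\frac{\partial^2 f}{\partial S^2}\sigma^2 S^2\right)\Delta t.
\end{aligned}$$
Note that uncertainty due to $\Delta z$ is cancelled out and $\mu$, the premium for
risk (returns on $S$), is also cancelled out. Not only has $\Delta\Pi$ no
uncertainty, it is also preference-free and does not depend on $\mu$, a parameter
controlled by the investor's risk aversion.
If the portfolio value is fully hedged, then no arbitrage implies that it
must earn only a risk-free rate of return
$$r\,\Pi\,\Delta t = \Delta\Pi, \qquad \Pi = -f + \frac{\partial f}{\partial S}S,$$
$$r\left(-f + \frac{\partial f}{\partial S}S\right)\Delta t = -\left(\frac{\partial f}{\partial S}\mu S + \frac{\partial f}{\partial t} + \frac{1}{2}\frac{\partial^2 f}{\partial S^2}\sigma^2 S^2\right)\Delta t - \frac{\partial f}{\partial S}\sigma S\,\Delta z + \frac{\partial f}{\partial S}\left[\mu S\,\Delta t + \sigma S\,\Delta z\right],$$
$$\begin{aligned}
r(-f)\,\Delta t &= -rS\frac{\partial f}{\partial S}\Delta t - \frac{\partial f}{\partial S}\mu S\,\Delta t - \frac{\partial f}{\partial t}\Delta t - \frac{1}{2}\frac{\partial^2 f}{\partial S^2}\sigma^2 S^2\,\Delta t \\
&\quad - \frac{\partial f}{\partial S}\sigma S\,\Delta z + \frac{\partial f}{\partial S}\mu S\,\Delta t + \frac{\partial f}{\partial S}\sigma S\,\Delta z,
\end{aligned}$$
and finally we get the well-known Black–Scholes PDE
$$r f = rS\frac{\partial f}{\partial S} + \frac{\partial f}{\partial t} + \frac{1}{2}\frac{\partial^2 f}{\partial S^2}\sigma^2 S^2. \qquad (8.18)$$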
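As a numerical sanity check, one can verify that the Black–Scholes call formula (8.1) satisfies the PDE (8.18) up to finite-difference error. The sketch below is illustrative only: the step sizes `hS` and `ht`, the inputs and the function names are assumptions, and the partial derivatives are approximated by central differences instead of being computed analytically.

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S, K, r, sigma, tau):
    """Black-Scholes European call price with time to maturity tau."""
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)
    return S * norm_cdf(d1) - K * exp(-r * tau) * norm_cdf(d2)

def pde_residual(S, K, r, sigma, t, T, hS=1e-2, ht=1e-5):
    """r S f_S + f_t + 0.5 sigma^2 S^2 f_SS - r f, by central differences."""
    f = lambda s, now: bs_call(s, K, r, sigma, T - now)
    f_S = (f(S + hS, t) - f(S - hS, t)) / (2.0 * hS)
    f_SS = (f(S + hS, t) - 2.0 * f(S, t) + f(S - hS, t)) / hS**2
    f_t = (f(S, t + ht) - f(S, t - ht)) / (2.0 * ht)
    return r * S * f_S + f_t + 0.5 * sigma**2 * S**2 * f_SS - r * f(S, t)

# Hypothetical option: the residual should be numerically close to zero
residual = pde_residual(S=100.0, K=95.0, r=0.05, sigma=0.2, t=0.25, T=1.0)
print(residual)
```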

8.2.3 Solving the partial differential equation
There are many solutions to (8.18) corresponding to different derivatives,
$f$, with underlying asset $S$. In other words, without further constraints,
the PDE in (8.18) does not have a unique solution. The particular security
being valued is determined by the boundary conditions of the differential
equation. In the case of a European call, the value at expiry $c(S, T) =
\max(S - K, 0)$ serves as the final condition for the Black–Scholes PDE.
Here, we show how the BS formula can be derived using the risk-neutral
valuation relationship. We need the following facts:
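(i) From (8.15),
$$\ln S \sim N\!\left[\ln S_0 + \mu - \tfrac{1}{2}\sigma^2,\ \sigma\right].$$
Under the risk-neutral valuation relationship, $\mu = r$ and
$$\ln S \sim N\!\left[\ln S_0 + r - \tfrac{1}{2}\sigma^2,\ \sigma\right].$$
(ii) If $y$ is a normally distributed variable with mean $\mu_y$ and standard
deviation $\sigma_y$,
$$\int_a^{\infty} e^{y} f(y)\,dy = N\!\left(\frac{\mu_y - a}{\sigma_y} + \sigma_y\right) e^{\mu_y + \frac{1}{2}\sigma_y^2}.$$
(iii) From the definition of the cumulative normal distribution,
$$\int_a^{\infty} f(y)\,dy = 1 - N\!\left(\frac{a - \mu_y}{\sigma_y}\right) = N\!\left(\frac{\mu_y - a}{\sigma_y}\right).$$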

Now we are ready to solve the BS formula. First, the terminal value of
a call is
$$\begin{aligned}
c_T &= E\left[\max(S - K, 0)\right] \\
&= \int_K^{\infty} (S - K) f(S)\,dS \\
&= \int_{\ln K}^{\infty} e^{\ln S} f(\ln S)\,d\ln S - K\int_{\ln K}^{\infty} f(\ln S)\,d\ln S.
\end{aligned}$$
Substituting facts (ii) and (iii) and using information from (i) with
$$\mu_y = \ln S_0 + r - \tfrac{1}{2}\sigma^2, \qquad \sigma_y = \sigma, \qquad a = \ln K,$$
we get
$$\begin{aligned}
c_T &= S_0 e^{r} N\!\left(\frac{\ln S_0 + r + \frac{1}{2}\sigma^2 - \ln K}{\sigma}\right) - K N\!\left(\frac{\ln S_0 + r - \frac{1}{2}\sigma^2 - \ln K}{\sigma}\right) \\
&= S_0 e^{r} N(d_1) - K N(d_2), \qquad (8.19)
\end{aligned}$$
$$d_1 = \frac{\ln(S_0/K) + r + \frac{1}{2}\sigma^2}{\sigma}, \qquad d_2 = d_1 - \sigma.$$
The present value of the call option is derived by applying $e^{-r}$ to both
sides. The put option price can be derived using put–call parity or using
the same argument as above. The $\sigma$ in the above formula is volatility
over the option maturity. If we use $\sigma$ as the annualized volatility then
we replace $\sigma$ with $\sigma\sqrt{T}$ in the formula.
There are important insights from (8.19), all valid only in a
'risk-neutral' world:
(i) $N(d_2)$ is the probability that the option will be exercised.
(ii) Alternatively, $N(d_2)$ is the probability that the call finishes in the
money.
(iii) $X N(d_2)$ is the expected payment.
(iv) $S_0 e^{rT} N(d_1)$ is the expected value $E[S_T]^{+}$, where $E[\cdot]^{+}$ is the
expectation computed only over outcomes for which $S_T - X$ is positive.
(v) In other words, $S_0 e^{rT} N(d_1)$ is the risk-neutral expectation of $S_T$,
$E^{Q}[S_T]$, restricted to $S_T > X$.

In a highly simplified example, we assume a stock price can only move
up by one node or move down by one node over a 3-month period as
shown below. The option is a call option for the right to purchase the
share at $21 at the end of the period (i.e. in 3 months' time).

    Now:         stock price = 20, option price = c
    Up state:    stock price = 22, option price = 1
    Down state:  stock price = 18, option price = 0

Construct a portfolio consisting of $\Delta$ amount of shares and short one
call option. If we want to make sure that the value of this portfolio is the
same whether it is up state or down state, then
$$\$22 \times \Delta - \$1 = \$18 \times \Delta + \$0, \qquad \Delta = 0.25.$$

    Now:         stock price = 20, portfolio value = 4.5 e^{-0.12 × 3/12} = 4.367
    Up state:    stock price = 22, portfolio value = 22 × 0.25 − 1 = 4.5
    Down state:  stock price = 18, portfolio value = 18 × 0.25 = 4.5

Given that the portfolio's value is $4.367, this means that
$$\$20 \times 0.25 - f = \$4.367, \qquad f = \$0.633.$$
This is the value of the option under no arbitrage.
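The arithmetic of this example can be reproduced in a few lines. The sketch below follows the replication argument step by step, using the numbers given in the text (stock prices 20, 22 and 18, strike 21, a 12% interest rate and a 3-month period); only the variable names are new.

```python
from math import exp

S0, S_up, S_down = 20.0, 22.0, 18.0
K, r, T = 21.0, 0.12, 3.0 / 12.0

# Call payoffs in the two terminal states
f_up = max(S_up - K, 0.0)
f_down = max(S_down - K, 0.0)

# Number of shares that makes the portfolio value identical in both states
delta = (f_up - f_down) / (S_up - S_down)

# Riskless terminal value and its present value at the risk-free rate
terminal_value = S_up * delta - f_up
present_value = terminal_value * exp(-r * T)

# The option value is the cost of the shares less the portfolio value
f = S0 * delta - present_value
print(delta, round(present_value, 3), round(f, 3))
```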
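From the above simple example, we can make the following generalization:

    Now:         stock price = S0,   option price = f
    Up state:    stock price = S0 u, option price = fu
    Down state:  stock price = S0 d, option price = fd

The amount $\Delta$ is calculated using
$$S_0 u \times \Delta - f_u = S_0 d \times \Delta - f_d,$$
$$\Delta = \frac{f_u - f_d}{S_0 u - S_0 d}. \qquad (8.20)$$
Since the terminal value of the 'riskless' portfolio is the same in the up
state and in the down state, we could use any one of the values (say the
up state) to establish the following relationship
$$S_0 \times \Delta - f = \left(S_0 u \times \Delta - f_u\right) e^{-rT},$$
$$f = S_0 \times \Delta - \left(S_0 u \times \Delta - f_u\right) e^{-rT}. \qquad (8.21)$$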
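Substituting the value of $\Delta$ from (8.20) into (8.21), we get
$$\begin{aligned}
f &= S_0 \times \frac{f_u - f_d}{S_0 u - S_0 d} - \left(S_0 u \times \frac{f_u - f_d}{S_0 u - S_0 d} - f_u\right) e^{-rT} \\
&= \frac{f_u - f_d}{u - d} - \left(u \times \frac{f_u - f_d}{u - d} - f_u\right) e^{-rT} \\
&= \left[\frac{e^{rT}(f_u - f_d)}{u - d} - \frac{u(f_u - f_d)}{u - d} + \frac{u f_u - d f_u}{u - d}\right] e^{-rT} \\
&= \left[\frac{e^{rT} f_u - e^{rT} f_d}{u - d} + \frac{u f_d - d f_u}{u - d}\right] e^{-rT} \\
&= \left[\frac{e^{rT} - d}{u - d} f_u + \frac{u - e^{rT}}{u - d} f_d\right] e^{-rT}.
\end{aligned}$$
By letting $p = (e^{rT} - d)/(u - d)$, we get
$$f = e^{-rT}\left[p f_u + (1 - p) f_d\right] \qquad (8.22)$$
$$1 - p = \frac{u - d - e^{rT} + d}{u - d} = \frac{u - e^{rT}}{u - d}.$$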
We can see from (8.22) that although $p$ is not the real probability
distribution of the stock price, it has all the characteristics of a probability
measure (viz. sum to one and nonnegative). Moreover, when the
expectation is calculated based on $p$, the expected terminal payoff is
discounted using the risk-free interest rate. Hence, $p$ is called the
risk-neutral probability measure.
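Equation (8.22) gives the same option value as the replication argument without computing the hedge ratio. The sketch below applies it to the earlier numerical example, where $u = 22/20 = 1.1$ and $d = 18/20 = 0.9$; the function name and the explicit check that $d < e^{rT} < u$ are illustrative additions.

```python
from math import exp

def one_step_binomial(S0, u, d, r, T, payoff):
    """Value f = exp(-rT) [p f_u + (1 - p) f_d] with p = (exp(rT) - d)/(u - d)."""
    growth = exp(r * T)
    if not d < growth < u:
        # otherwise p falls outside (0, 1) and an arbitrage exists
        raise ValueError("need d < exp(rT) < u")
    p = (growth - d) / (u - d)
    f_u, f_d = payoff(S0 * u), payoff(S0 * d)
    return exp(-r * T) * (p * f_u + (1.0 - p) * f_d), p

call_payoff = lambda s: max(s - 21.0, 0.0)
f, p = one_step_binomial(20.0, 1.1, 0.9, 0.12, 0.25, call_payoff)

# Under p the stock itself grows at the risk-free rate
expected_growth = p * 1.1 + (1.0 - p) * 0.9
print(round(f, 3), round(p, 4), expected_growth)
```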
We can verify that the underlying asset $S$ also produces a risk-free
rate of returns under this risk-neutral measure.
$$S_0 e^{\mu T} = \frac{e^{rT} - d}{u - d} S_0 u + \frac{u - e^{rT}}{u - d} S_0 d,$$
$$e^{\mu T} = \frac{u e^{rT} - ud + ud - d e^{rT}}{u - d} = \frac{(u - d)\,e^{rT}}{u - d} = e^{rT},$$
$$\mu = r.$$
The actual return of the stock is no longer needed and neither is the
actual distribution of the terminal stock price. (This is a rather amazing
discovery in the study of derivative securities!!!)

8.3.1 Matching volatility with u and d
We have already seen in the previous section and Equation (8.22) that
the risk-neutral probability measure is set such that the expected growth
rate is the risk-free rate, $r$.
$$\begin{aligned}
f &= \left[\frac{e^{rT} - d}{u - d} f_u + \frac{u - e^{rT}}{u - d} f_d\right] e^{-rT} \\
&= \left[p f_u + (1 - p) f_d\right] e^{-rT},
\end{aligned}$$
$$p = \frac{e^{rT} - d}{u - d}.$$
This immediately leads to the question of how one sets the values
of $u$ and $d$. The key is that $u$ and $d$ are jointly determined such that the
volatility of the binomial process equals $\sigma$, which is given or can be
estimated from prices of the asset underlying the option contract. Given
that there are two unknowns and there is only one constraint (the given
$\sigma$), there are a number of ways to specify $u$ and $d$. The good or better
ways are those that guarantee the nodes recombine after an upstate followed
by a downstate, and vice versa. In Cox, Ross and Rubinstein (1979), $u$ and
$d$ are defined as follows:
$$u = e^{\sigma\sqrt{\delta t}}, \qquad d = e^{-\sigma\sqrt{\delta t}}.$$

It is easy to verify that the nodes recombine since $ud = du = 1$. So
after each up move and down move (and vice versa), the stock price will
return to $S_0$.
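To verify that the volatility of stock returns is approximately $\sigma\sqrt{\delta t}$
under the risk-neutral measure, we note that
$$\mathrm{Var}(x) = E\left[x^2\right] - \left[E(x)\right]^2,$$
$$\ln u = \sigma\sqrt{\delta t}, \quad\text{and}\quad \ln d = -\sigma\sqrt{\delta t}.$$
The expected stock return is
$$\begin{aligned}
E(x) &= p\ln\frac{S_0 u}{S_0} + (1 - p)\ln\frac{S_0 d}{S_0} \\
&= p\,\sigma\sqrt{\delta t} - (1 - p)\,\sigma\sqrt{\delta t} \\
&= (2p - 1)\,\sigma\sqrt{\delta t},
\end{aligned}$$
and
$$\begin{aligned}
E\left[x^2\right] &= p\left(\ln\frac{S_0 u}{S_0}\right)^2 + (1 - p)\left(\ln\frac{S_0 d}{S_0}\right)^2 \\
&= p\,\sigma^2\delta t + (1 - p)\,\sigma^2\delta t \\
&= \sigma^2\delta t.
\end{aligned}$$
Hence
$$\begin{aligned}
\mathrm{Var}(x) &= \sigma^2\delta t - (2p - 1)^2\sigma^2\delta t \\
&= \sigma^2\delta t\left(1 - 4p^2 + 4p - 1\right) \\
&= \sigma^2\delta t \times 4p(1 - p).
\end{aligned}$$
It has been shown elsewhere that as $\delta t \to 0$, $p \to 0.5$ and $\mathrm{Var}(x) \to \sigma^2\delta t$.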
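The limiting behaviour can be seen numerically by evaluating $p$ and $4p(1-p)$ for shrinking time steps. The sketch below uses the Cox, Ross and Rubinstein definitions of $u$ and $d$ given above; the volatility, interest rate and step sizes are hypothetical.

```python
from math import exp, sqrt

def crr_moments(sigma, r, dt):
    """Return (p, mean, variance) of the one-step log return under CRR."""
    u = exp(sigma * sqrt(dt))
    d = 1.0 / u
    p = (exp(r * dt) - d) / (u - d)
    mean = (2.0 * p - 1.0) * sigma * sqrt(dt)
    var = sigma**2 * dt * 4.0 * p * (1.0 - p)
    return p, mean, var

sigma, r = 0.20, 0.05                      # hypothetical annualized inputs
p_month, _, var_month = crr_moments(sigma, r, 1.0 / 12.0)
p_day, _, var_day = crr_moments(sigma, r, 1.0 / 252.0)

# Ratio of the binomial variance to the target sigma^2 * dt
ratio_month = var_month / (sigma**2 / 12.0)
ratio_day = var_day / (sigma**2 / 252.0)
print(p_month, ratio_month)
print(p_day, ratio_day)
```

The ratio equals $4p(1-p)$, so it is below one whenever $p \neq 0.5$ and moves towards one as the step shrinks.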

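8.3.2 A two-step binomial tree and American-style options

    Now:                      S0,     f
    After step 1:  up         S0 u,   fu    (probability p1)
                   down       S0 d,   fd    (probability 1 − p1)
    After step 2:  up-up      S0 u²,  fuu
                   up-down    S0 ud,  fud
                   down-down  S0 d²,  fdd
    (in step 2 each up move has probability p2 and each down move 1 − p2)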

The binomial tree is often constructed in such a way that the branches
recombine. If the volatilities in period 1 and period 2 are different, then,
in order to make the binomial tree recombine, $p_1 \neq p_2$. (This is a more
advanced topic in option pricing.) Here, we take the simple case where
volatility is constant, and $p_1 = p_2 = p$. Hence, to price a European
option, we simply take the expected terminal value under the risk-neutral
measure and discount it with a risk-free interest rate, as follows:
$$f = e^{-r \times 2\delta t}\left[p^2 f_{uu} + 2p(1 - p) f_{ud} + (1 - p)^2 f_{dd}\right]. \qquad (8.23)$$
Note that the hedge ratio for state 2 will be different depending on
whether state 1 is an up state or a down state
$$\Delta_0 = \frac{f_u - f_d}{S_0 u - S_0 d},$$
$$\Delta_u = \frac{f_{uu} - f_{ud}}{S_0 u^2 - S_0 ud},$$
$$\Delta_d = \frac{f_{ud} - f_{dd}}{S_0 ud - S_0 d^2}.$$
This also means that, for such a model to work in practice, one has to
be able to continuously and costlessly rebalance the composition of the
portfolio of stock and option. This is a very important assumption and
should not be overlooked.

We can see from (8.23) that the intermediate nodes are not required
for the pricing of European options. What are required are the range of
possible values for the terminal payoff and the risk-neutral probability
density for each node. This is not the case for the American option and all
the nodes in the intermediate stages are needed because of the possibility
of early exercise. As the number of nodes increases, the binomial tree
converges to a lognormal distribution for stock price.
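The difference between the two cases shows up clearly in code: a European option needs only the terminal payoffs, whereas an American option requires a check for early exercise at every intermediate node. The following is a minimal sketch of a recombining tree with the Cox, Ross and Rubinstein $u$ and $d$ and constant volatility; the function name `crr_price` and the option parameters are hypothetical.

```python
from math import exp, sqrt

def crr_price(S0, K, r, sigma, T, steps, kind="put", american=False):
    """Price an option on a recombining CRR tree by backward induction."""
    dt = T / steps
    u = exp(sigma * sqrt(dt))
    d = 1.0 / u
    p = (exp(r * dt) - d) / (u - d)          # risk-neutral probability
    disc = exp(-r * dt)
    if kind == "put":
        payoff = lambda s: max(K - s, 0.0)
    else:
        payoff = lambda s: max(s - K, 0.0)
    # terminal payoffs, indexed by the number j of up moves
    values = [payoff(S0 * u**j * d**(steps - j)) for j in range(steps + 1)]
    for n in range(steps - 1, -1, -1):
        for j in range(n + 1):
            value = disc * (p * values[j + 1] + (1.0 - p) * values[j])
            if american:
                # early exercise: compare with the immediate payoff
                value = max(value, payoff(S0 * u**j * d**(n - j)))
            values[j] = value
    return values[0]

# Hypothetical two-step put
S0, K, r, sigma, T = 50.0, 52.0, 0.05, 0.30, 2.0
european = crr_price(S0, K, r, sigma, T, steps=2)
american = crr_price(S0, K, r, sigma, T, steps=2, american=True)

# The European value computed directly from (8.23), for comparison
dt = T / 2
u = exp(sigma * sqrt(dt))
d = 1.0 / u
p = (exp(r * dt) - d) / (u - d)
f_uu, f_ud, f_dd = (max(K - S0 * x, 0.0) for x in (u * u, u * d, d * d))
closed_form = exp(-r * 2 * dt) * (p**2 * f_uu + 2 * p * (1 - p) * f_ud
                                  + (1 - p)**2 * f_dd)
print(european, closed_form, american)
```

Increasing `steps` refines the tree; the two-step case is used here only so that the result can be compared with (8.23) directly.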

Let $C = f(K, S, r, \sigma, T - t)$ denote the theoretical (or model) price
of the option, where $f$ is some option pricing model, e.g. $f_{BS}$ denotes
the Black–Scholes formula. At any one time, we have options of many
different strikes, $K$, and maturities, $T - t$. (Here we use $t$ and $T$ as dates;
$t$ is now and $T$ is the option maturity date, so the time to maturity is $T - t$.)
Since $\sigma_{t_1}$ and $\sigma_{t_2}$ need not be the same for $t_1 \neq t_2$, we tend to use only
options with the same maturity $T - t$ because volatility itself has a term
structure. Assume that there are three observed option prices $C_1^{obs}$, $C_2^{obs}$
and $C_3^{obs}$ (possibly market-traded option prices) associated with
three exercise prices $K_1$, $K_2$ and $K_3$. To find the theoretical option price
$C$, we need the five parameters $K$, $S$, $r$, $\sigma$ and $T - t$. Except for $\sigma$, the
other four parameters $K$, $S$, $r$ and $T - t$ can be determined accurately
and easily. We could estimate $\sigma$ from historical stock prices. The problem
with this approach is that when $C \neq C^{obs}$ (i.e. the model price is not the
same as the market price), we do not know if this is because we did not
estimate $\sigma$ properly or because the option pricing model $f(\cdot)$ is wrong.
A better approach is to use 'backward induction', i.e. use an iterative
procedure to find the $\sigma$ that minimizes the pricing errors

$$C_1 - C_1^{obs}, \qquad C_2 - C_2^{obs}, \qquad C_3 - C_3^{obs}.$$

The above is usually done by minimizing the unsigned errors
$$\sum_{i=1}^{n} W_i \left| C_i - C_i^{obs} \right|^m \tag{8.24}$$

with an optimization routine searching over all possible values of σ ; n
is the number of observed option prices (three in this case), and Wi is
the weight applied to observation i.
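A minimal sketch of the search in (8.24) is given below. It assumes that the model $f$ is the Black–Scholes call formula without dividends and uses a hand-written golden-section search as the optimization routine; both choices, the function names and the test data are illustrative assumptions rather than the procedure used in the text.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, r, sigma, tau):
    """Black-Scholes call price; tau is the time to maturity T - t."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)

def pricing_error(sigma, S, r, tau, strikes, observed, weights, m=2):
    """Objective (8.24): weighted sum of unsigned pricing errors raised to the power m."""
    return sum(w * abs(bs_call(S, K, r, sigma, tau) - c_obs) ** m
               for K, c_obs, w in zip(strikes, observed, weights))

def golden_section(func, lo, hi, tol=1e-8):
    """Minimize a unimodal function of one variable on [lo, hi]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c, d = b - g * (b - a), a + g * (b - a)
        if func(c) < func(d):
            b = d      # the minimum lies in [a, d]
        else:
            a = c      # the minimum lies in [c, b]
    return 0.5 * (a + b)

# Synthetic 'observed' prices generated with sigma = 0.25 (illustrative data only).
S, r, tau = 100.0, 0.05, 0.5
strikes = [90.0, 100.0, 110.0]
observed = [bs_call(S, K, r, 0.25, tau) for K in strikes]
weights = [1.0, 1.0, 1.0]
sigma_hat = golden_section(
    lambda s: pricing_error(s, S, r, tau, strikes, observed, weights), 0.01, 2.0)
print(sigma_hat)
```

Because the synthetic prices come from the same model, the search should return a value very close to 0.25; with market prices the minimized error would generally not be zero.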

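In the simplest case, $W_i = 1$ for all $i$. To give the greatest weight to
the ATM option, we could set
$$W_i = \begin{cases} 10{,}000 & \text{for } S = K e^{-r(T-t)}, \\[4pt] \left| \dfrac{S}{K e^{-r(T-t)}} - 1 \right|^{-1} & \text{for } S \neq K e^{-r(T-t)}. \end{cases}$$
$W_i = 10\,000$ corresponds to an option for which the ratio $S / (K e^{-r(T-t)})$ is 0.0001 away
from one, i.e. from being at the money.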
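The weighting scheme can be written as a small helper. This is a sketch under the reading that the weight is the reciprocal of the absolute distance of $S / (K e^{-r(T-t)})$ from one; the function name `atm_weight` and the argument `tau` (standing for $T - t$) are introduced here for illustration.

```python
import math

def atm_weight(S, K, r, tau, atm_value=10_000.0):
    """Weight that grows as the option gets closer to being at the money."""
    moneyness = S / (K * math.exp(-r * tau))
    if moneyness == 1.0:
        return atm_value                 # exactly at the money
    return 1.0 / abs(moneyness - 1.0)    # reciprocal distance from the money

print(atm_weight(100.0, 100.0, 0.0, 1.0))   # exactly at the money
print(atm_weight(110.0, 100.0, 0.0, 1.0))   # 10% away from the money
```

An exact floating-point comparison is used only to mirror the two cases of the formula; in practice one might prefer to cap the weight at 10,000 so that options extremely close to the money do not receive an even larger weight.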
The power term, m, is the control for large pricing error. The larger
the value of m, the greater the emphasis placed on large errors for errors
> 1. If a very large error is due to data error, then a large m means the
entire estimation will be driven by this data error. Typically, m is set
equal to 1 or 2, corresponding to 'absolute errors' and 'squared errors'.
Since the option price is much greater for an ITM option than for
an OTM option, the pricing error is likely to be of a greater magnitude
for an ITM option. Hence, for an ITM option and an OTM option that
are an equal distance from being ATM, the procedure in (8.24) will
place a greater weight on pricing an ITM option correctly and pay little
or negligible attention to OTM options. One way to overcome this is
to minimize the errors in their Black–Scholes implied volatilities instead. Here, we
are using BS as a conversion tool. So long as $\partial C_{BS}/\partial\sigma > 0$, there
is a one-to-one correspondence between option price and BS implied
volatility, and such a procedure does not require the assumption that the BS
model is correct.
To implement the new procedure, we start with an initial value $\sigma^*$
and get $C_1$, $C_2$ and $C_3$ from $f$, the option pricing model that we wish
to test. $f$ could even be Black–Scholes, if it is our intention to test
Black–Scholes. Use the Black–Scholes model $f_{BS}$ to invert BS implied
volatilities $IV_1$, $IV_2$ and $IV_3$ from the theoretical prices $C_1$, $C_2$ and $C_3$
calculated in the previous step. If $f$ in the previous step is indeed
Black–Scholes, then $IV_1 = IV_2 = IV_3 = \sigma^*$. Similarly, use the Black–Scholes model
$f_{BS}$ to invert BS implied volatilities $IV_1^{obs}$, $IV_2^{obs}$ and $IV_3^{obs}$ from the
market observed option prices $C_1^{obs}$, $C_2^{obs}$ and $C_3^{obs}$.
Finally, minimize the function
$$\sum_{i=1}^{n} W_i \left| IV_i - IV_i^{obs} \right|^m$$

using the algorithm and logic as before.
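The conversion from prices to implied volatilities can be sketched with a bisection search, which relies only on the Black–Scholes call price being increasing in $\sigma$. The function names, the search bounds and the sample numbers below are illustrative assumptions.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, r, sigma, tau):
    """Black-Scholes call price, used here purely as a conversion tool."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)

def implied_vol(price, S, K, r, tau, lo=1e-4, hi=5.0, tol=1e-10):
    """Invert the BS call formula for sigma by bisection."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, r, mid, tau) > price:
            hi = mid     # model price too high: volatility must be lower
        else:
            lo = mid
    return 0.5 * (lo + hi)

def iv_error(model_prices, observed_prices, S, r, tau, strikes, weights, m=2):
    """Weighted sum of unsigned differences between model and observed implied vols."""
    total = 0.0
    for c, c_obs, K, w in zip(model_prices, observed_prices, strikes, weights):
        iv_model = implied_vol(c, S, K, r, tau)
        iv_obs = implied_vol(c_obs, S, K, r, tau)
        total += w * abs(iv_model - iv_obs) ** m
    return total

# Illustrative check: a price generated with sigma = 0.3 should invert back to 0.3.
iv = implied_vol(bs_call(100.0, 100.0, 0.05, 0.3, 0.5), 100.0, 100.0, 0.05, 0.5)
print(iv)
```

Here `model_prices` may come from any pricing model under test; only the conversion step uses Black–Scholes.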

As option holders are not entitled to dividends, the option price should
be adjusted for known dividends to be distributed during the life of the
option and the fact that the option holder may have the right to exercise
early to receive the dividend.

8.5.1 Known and finite dividends
Assume that there is only one dividend at $\tau$. Should the call option
holder decide to exercise the option, she will receive $S_\tau - K$ at time $\tau$,
and if she decides not to exercise the option, her option value will be
worth $c(S_\tau - D_\tau, K, r, T, \sigma)$. The Black (1975) approximation involves
making such comparisons for each dividend date. If the decision is not to
exercise, then the option is priced now at $c(S_t - D_\tau e^{-r(\tau - t)}, K, r, T, \sigma)$.
If the decision is to exercise, then the option is priced according to
$c(S_t, K, r, \tau, \sigma)$. We note that if the decision is not to exercise, the
American call option will have the same value as the European call option
calculated by removing the discounted dividend from the stock price.
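A sketch of the comparison for a single dividend follows. It assumes that the approximation takes the larger of the two European values, that $t = 0$ so `T` and `tau` are times measured from now, and that $c(\cdot)$ is the Black–Scholes call without further dividends; the function names and the argument order are illustrative.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, r, sigma, T):
    """European Black-Scholes call with time to maturity T (no dividends)."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)

def black_approx_call(S, K, r, sigma, T, tau, D):
    """American call with one dividend D paid at time tau < T (both measured from now)."""
    # Not exercising: European call to T on the stock price net of the discounted dividend.
    hold = bs_call(S - D * math.exp(-r * tau), K, r, sigma, T)
    # Exercising at the dividend date: European call that matures at tau.
    exercise = bs_call(S, K, r, sigma, tau)
    return max(hold, exercise)

print(black_approx_call(100.0, 100.0, 0.05, 0.2, 1.0, 0.5, 2.0))
```

With several dividends the same comparison would be repeated for each dividend date, as the text describes.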
A more accurate formula that takes into account the probability of
early exercise is that by Roll (1977), Geske (1979), and Whaley (1981),
and presented in Hull (2002, appendix 11). These formulae (even the
Black-approximation) work quite well for American calls. In the case
of an American put, a better solution is to implement the Barone-Adesi
and Whaley (1987) formula (see Section 8.5.3).

8.5.2 Dividend yield method
When the dividend is in the form of a yield, it can easily be 'netted off'
from the risk-free interest rate, as in the case of a currency option. For
an index option, the dividend yield, $q$, is
the average annualized yield of dividends distributed during the life of
the option:
$$q = \frac{1}{t} \ln\left( \frac{S + \sum_i D_i e^{r(t - t_i)}}{S} \right)$$

where $D_i$ and $t_i$ are the amount and the timing of the $i$th dividend on
the index; $t_i$ should be annualized in the same way as $t$. The
dividend yield rate computed here is thus based on the actual dividends paid
during the option's life, and therefore accounts for the monthly
seasonality in dividend payments.
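The dividend yield calculation can be sketched as follows, reading the formula as $q = \frac{1}{t}\ln\left[(S + \sum_i D_i e^{r(t - t_i)})/S\right]$. The function name and the representation of dividends as `(amount, time)` pairs are assumptions made for this example.

```python
import math

def implied_dividend_yield(S, r, t, dividends):
    """Annualized dividend yield over an option life of t years.

    dividends: iterable of (D_i, t_i) pairs, with t_i in years and 0 < t_i <= t.
    """
    # Each dividend is compounded forward to the option maturity at the risk-free rate.
    future_value = sum(D * math.exp(r * (t - t_i)) for D, t_i in dividends)
    return math.log((S + future_value) / S) / t

# Hypothetical index at 100 paying two dividends during a six-month option life.
q = implied_dividend_yield(100.0, 0.05, 0.5, [(0.8, 0.1), (1.2, 0.4)])
print(q)
```

Because the actual dividend dates enter the sum, an option whose life covers a dividend-heavy month receives a higher yield than one that does not.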

8.5.3 Barone-Adesi and Whaley quadratic approximation
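Define $M = 2r/\sigma^2$ and $N = 2(r - q)/\sigma^2$; then,³ for an American call option,
$$C(S) = \begin{cases} c(S) + A_2 \left( \dfrac{S}{S^*} \right)^{q_2} & S < S^*, \\[4pt] S - K & S \geq S^*. \end{cases} \tag{8.25}$$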

The variable $S^*$ is the critical price of the index above which the option
should be exercised. It is estimated by solving the equation,

$$S^* - K = c(S^*) + \left\{ 1 - e^{-qt} N\left[ d_1(S^*) \right] \right\} \frac{S^*}{q_2},$$
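iteratively. The other variables are

$$q_2 = \frac{1}{2}\left[ 1 - N + \sqrt{(N - 1)^2 + \frac{4M}{1 - e^{-rt}}} \right],$$
$$A_2 = \frac{S^*}{q_2}\left\{ 1 - e^{-qt} N\left[ d_1(S^*) \right] \right\},$$
$$d_1(S^*) = \frac{\ln(S^*/K) + (r - q + 0.5\sigma^2)t}{\sigma\sqrt{t}}. \tag{8.26}$$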
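To compute delta and vega for hedging purposes:⁴

$$\frac{\partial C}{\partial S} = \begin{cases} e^{-qt} N[d_1(S)] + \dfrac{A_2 q_2}{S^*}\left( \dfrac{S}{S^*} \right)^{q_2 - 1} & \text{when } S < S^*, \\[4pt] 1 & \text{when } S \geq S^*, \end{cases}$$
$$\frac{\partial C}{\partial \sigma} = \begin{cases} S\sqrt{t}\, N'(d_1)\, e^{-qt} & \text{when } S < S^*, \\[4pt] 0 & \text{when } S \geq S^*. \end{cases} \tag{8.27}$$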

3 Note that in Barone-Adesi and Whaley (1987), $K(t)$ is $1 - e^{-rt}$, and $b$ is $(r - q)$.
4 Vega for the American options cannot be evaluated easily because $C$ partly depends on $S^*$, which itself is a
complex function of $\sigma$. The expression for vega in the case when $S < S^*$ in Equation (8.27) represents the vega
for the European component only. Vega for the American option could be derived using numerical methods.

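For an American put option, the valuation formula is:
$$P(S) = \begin{cases} p(S) + A_1 \left( \dfrac{S}{S^{**}} \right)^{q_1} & S > S^{**}, \\[4pt] K - S & S \leq S^{**}. \end{cases} \tag{8.27a}$$
The variable $S^{**}$ is the critical index price below which the option should
be exercised. It is estimated by solving the equation,
$$K - S^{**} = p(S^{**}) - \left\{ 1 - e^{-qt} N\left[ -d_1(S^{**}) \right] \right\} \frac{S^{**}}{q_1},$$
iteratively. The other variables are
$$q_1 = \frac{1}{2}\left[ 1 - N - \sqrt{(N - 1)^2 + \frac{4M}{1 - e^{-rt}}} \right],$$
$$A_1 = -\frac{S^{**}}{q_1}\left\{ 1 - e^{-qt} N\left[ -d_1(S^{**}) \right] \right\},$$
$$d_1(S^{**}) = \frac{\ln(S^{**}/K) + (r - q + 0.5\sigma^2)t}{\sigma\sqrt{t}}.$$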

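To compute delta and vega for hedging purposes:
$$\frac{\partial P}{\partial S} = \begin{cases} -e^{-qt} N[-d_1(S)] + \dfrac{A_1 q_1}{S^{**}}\left( \dfrac{S}{S^{**}} \right)^{q_1 - 1} & \text{when } S > S^{**}, \\[4pt] -1 & \text{when } S \leq S^{**}, \end{cases}$$
$$\frac{\partial P}{\partial \sigma} = \begin{cases} S\sqrt{t}\, N'(d_1)\, e^{-qt} & \text{when } S > S^{**}, \\[4pt] 0 & \text{when } S \leq S^{**}. \end{cases}$$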

Early studies of option implied volatility suffered many estimation
problems,⁵ such as the improper use of the Black–Scholes model for an
American-style option, the omission of dividend payments, the option
price and the underlying asset prices not being recorded at the same
time, or stale prices being used. Since transactions may take place at bid
or ask prices, transaction prices of the option and the underlying assets
are subject to bid–ask bounce, making the implied volatility estimation
unstable. Finally, in the case of an S&P100 OEX option, the privilege
of a wildcard option is often omitted.⁶ In more recent studies, many of
these measurement errors have been taken into account. Many studies
use futures and options on futures because these markets are more active
than the cash markets and hence there is a smaller risk of prices being
stale.

5 Mayhew (1995) gives a detailed discussion of such complications involved in estimating implied volatility
from option prices, and Hentschel (2001) provides a discussion of the confidence intervals for implied volatility.
Conditions in the Black–Scholes model include no arbitrage, zero
transaction costs and continuous trading. As mentioned before, the
lack of such a trading environment will result in options being traded
within a band around the theoretical price. This means that implied
volatility estimates extracted from market option prices will also lie
within a band even without the complications described in Chapter 10.
Figlewski (1997) shows that implied volatility estimates can differ by
several percentage points due to bid–ask spread and discrete tick size
alone. To smooth out errors caused by bid–ask bounce, Harvey and
Whaley (1992) use a nonlinear regression of ATM option prices, observed in
a 10-minute interval before the market close, on model prices.
A nonideal trading environment is usually reflected in poor
trading volume. This means implied volatility of options written on
different underlying assets will have different forecasting power. For
most option contracts, the ATM option has the largest trading volume. This
supports the popularity of ATM implied volatility referred to later in
Chapter 10.

8.6.1 Investor risk preference
In the Black–Scholes world, investor risk preference is irrelevant in
pricing options. Given that some of the Black–Scholes assumptions have
been shown to be invalid, there is now a model risk. Figlewski and
Green (1999) simulate option writers' positions in the S&P500, DM/$,
US LIBOR and T-Bond markets using actual cash data over a 25-year
period. The most striking result from the simulations is that delta hedged
short maturity options, with no transaction costs and a perfect knowledge
of realized volatility, finished with losses on average in all four markets.
This is clear evidence of Black–Scholes model risk. If option writers
are aware of this model risk and mark up option prices accordingly, the
Black–Scholes implied volatility will be greater than the true volatility.
6 This wildcard option arises because the stock market closes later than the option market. The option trader
is given the choice to decide, before the stock market closes, whether or not to trade on an option whose price is
fixed at an earlier closing time.

In some situations, investor risk preference may override the
risk-neutral valuation relationship. Figlewski (1997), for example, compares
the purchase of an OTM option to buying a lottery ticket. Investors are
willing to pay a price that is higher than the fair price because they like
the potential payoff and the option premium is so low that mispricing
becomes negligible. On the other hand, we also have fund managers who
are willing to buy comparatively expensive put options for fear of the
collapse of their portfolio value. Both types of behaviour could cause
the market price of options to be higher than the Black–Scholes price,
translating into a higher Black–Scholes implied volatility. Arbitrage
arguments do not apply here because these are unique risk preferences
(or aversions) associated with some groups of individuals. Franke,
Stapleton and Subrahmanyam (1998) provide a theoretical framework in
which such option trading behaviour may be analysed.

The determination of $S^*$ and $S^{**}$ in Equations (8.25) and (8.27a) is
not exactly straightforward. We have had some success in solving for $S^*$ and
$S^{**}$ using NAG routine C05NCF. Barone-Adesi and Whaley (1987, hereafter
referred to as BAW), however, have proposed an efficient method for
determining $S^*$, details of which can be found in BAW, pp. 309 to 310.
BAW claimed that convergence of
$S^*$ and $S^{**}$ can be achieved with three or fewer iterations.

American calls
The following are step-by-step procedures for implementing BAW's
efficient method for estimating $S^*$ of the American call.
Step 1. Make initial guess of σ and denote this initial guess as σ j with
j = 1.
Step 2. Make initial guess of $S^*$, $S_i$ (with $i = 1$), as follows, denoting
$S^*$ at $T = +\infty$ as $S^*(\infty)$:
$$S_1 = K + \left[ S^*(\infty) - K \right]\left[ 1 - e^{h_2} \right], \tag{8.28}$$
$$S^*(\infty) = \frac{K}{1 - \dfrac{1}{q_2(\infty)}}, \tag{8.29}$$
$$q_2(\infty) = \frac{1}{2}\left[ 1 - N + \sqrt{(N - 1)^2 + 4M} \right], \tag{8.30}$$
$$h_2 = -\left[ (r - q)t + 2\sigma\sqrt{t} \right] \frac{K}{S^*(\infty) - K}. \tag{8.31}$$
Note that the lower bound of $S^*$ is $K$. So if $S_1 < K$, reset
$S_1 = K$. However, the condition $S^* < K$ rarely occurs.
Step 3. Compute the l.h.s. and r.h.s. of Equation (8.25a) as follows:
$$\text{l.h.s.}(S_i) = S_i - K, \quad \text{and} \tag{8.32}$$
$$\text{r.h.s.}(S_i) = c(S_i) + \left\{ 1 - e^{-qt} N[d_1(S_i)] \right\} \frac{S_i}{q_2}. \tag{8.33}$$
Compute starting value of $c(S_i)$ using the simple Black–Scholes
Equation (8.25) and $d_1(S_i)$ using Equation (8.26). It will be
useful to set up a function (or subroutine) for $d_1$.
Step 4. Check tolerance level,
$$\left| \text{l.h.s.}(S_i) - \text{r.h.s.}(S_i) \right| / K < 0.00001. \tag{8.34}$$
Step 5. If Equation (8.34) is not satisfied, compute the slope of Equation
(8.33), $b_i$, and the next guess of $S^*$, $S_{i+1}$, as follows:
$$b_i = e^{-qt} N[d_1(S_i)]\left( 1 - \frac{1}{q_2} \right) + \frac{1}{q_2}\left[ 1 - \frac{e^{-qt}\, n[d_1(S_i)]}{\sigma\sqrt{t}} \right], \tag{8.35}$$
$$S_{i+1} = \frac{K + \text{r.h.s.}(S_i) - b_i S_i}{1 - b_i}, \tag{8.36}$$
where $n(\cdot)$ is the univariate normal density function. Repeat
from step 3.
Step 6. When Equation (8.34) is satisfied, compute $C(S)$ according to
Equation (8.25). If $C(S)$ is greater than the observed American
call price, try a smaller $\sigma_{j+1}$; otherwise try a larger $\sigma_{j+1}$. Repeat
steps 1 to 5 until $C(S)$ is the same as the observed American
call price. Step 6 could be handled by a NAG routine such as
C05ADF for a quick solution.
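Steps 1 to 6 for the American call can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: $c(\cdot)$ is taken to be the European Black–Scholes call with continuous dividend yield $q$, the strike is written `K` throughout, $r > 0$ and $q > 0$ are required, a plain bisection stands in for the NAG root-finder in step 6, and all function names and sample inputs are hypothetical.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def d1(S, K, r, q, sigma, t):
    return (math.log(S / K) + (r - q + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))

def euro_call(S, K, r, q, sigma, t):
    a = d1(S, K, r, q, sigma, t)
    return (S * math.exp(-q * t) * norm_cdf(a)
            - K * math.exp(-r * t) * norm_cdf(a - sigma * math.sqrt(t)))

def baw_call(S, K, r, q, sigma, t, tol=1e-5, max_iter=100):
    """Quadratic approximation for an American call (assumes r > 0 and q > 0)."""
    M = 2.0 * r / sigma**2
    N = 2.0 * (r - q) / sigma**2
    # Step 2: initial guess from the critical price of the perpetual option.
    q2_inf = 0.5 * (1.0 - N + math.sqrt((N - 1.0)**2 + 4.0 * M))
    S_inf = K / (1.0 - 1.0 / q2_inf)
    h2 = -((r - q) * t + 2.0 * sigma * math.sqrt(t)) * K / (S_inf - K)
    Si = max(K + (S_inf - K) * (1.0 - math.exp(h2)), K)
    q2 = 0.5 * (1.0 - N + math.sqrt((N - 1.0)**2 + 4.0 * M / (1.0 - math.exp(-r * t))))
    for _ in range(max_iter):
        a = d1(Si, K, r, q, sigma, t)
        # Step 3: both sides of the critical-price condition.
        lhs = Si - K
        rhs = euro_call(Si, K, r, q, sigma, t) + (1.0 - math.exp(-q * t) * norm_cdf(a)) * Si / q2
        # Step 4: tolerance check.
        if abs(lhs - rhs) / K < tol:
            break
        # Step 5: slope of the r.h.s. and the next guess.
        b = (math.exp(-q * t) * norm_cdf(a) * (1.0 - 1.0 / q2)
             + (1.0 - math.exp(-q * t) * norm_pdf(a) / (sigma * math.sqrt(t))) / q2)
        Si = (K + rhs - b * Si) / (1.0 - b)
    S_star = Si
    if S >= S_star:
        return S - K
    A2 = (S_star / q2) * (1.0 - math.exp(-q * t) * norm_cdf(d1(S_star, K, r, q, sigma, t)))
    return euro_call(S, K, r, q, sigma, t) + A2 * (S / S_star)**q2

def implied_vol_american_call(price, S, K, r, q, t, lo=0.01, hi=2.0, tol=1e-8):
    """Step 6 by bisection: lower sigma when the model price is above the observed price."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if baw_call(S, K, r, q, mid, t) > price:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

call_price = baw_call(100.0, 100.0, 0.08, 0.12, 0.20, 0.25)
print(call_price, implied_vol_american_call(call_price, 100.0, 100.0, 0.08, 0.12, 0.25))
```

The sample inputs ($r = 0.08$, $q = 0.12$, $\sigma = 0.20$, $t = 0.25$) are chosen so that early exercise has value; when $q \leq 0$ the American call without early-exercise premium is simply the European call and this routine should not be used.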
American puts
To approximate $S^{**}$ for American puts, steps 2, 3 and 5 have to be modified.
Step 1. Make initial guess of σ and denote this initial guess as σ j with
j = 1.

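Step 2. Make initial guess of $S^{**}$, $S_i$ (with $i = 1$), as follows, denoting
$S^{**}$ at $T = +\infty$ as $S^{**}(\infty)$:

$$S_1 = S^{**}(\infty) + \left[ K - S^{**}(\infty) \right] e^{h_1}, \tag{8.37}$$
$$S^{**}(\infty) = \frac{K}{1 - \dfrac{1}{q_1(\infty)}},$$
$$q_1(\infty) = \frac{1}{2}\left[ 1 - N - \sqrt{(N - 1)^2 + 4M} \right],$$
$$h_1 = \left[ (r - q)t - 2\sigma\sqrt{t} \right] \frac{K}{K - S^{**}(\infty)}. \tag{8.38}$$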

Note that the upper bound of $S^{**}$ is $K$. So if $S_1 > K$, reset
$S_1 = K$. Again, the condition $S^{**} > K$ rarely occurs. According
to Barone-Adesi and Whaley (1987, footnote 9), the influence
of $(r - q)$ must be bounded in the put exponent to ensure
critical prices monotonically decrease in $t$, for very large values
of $(r - q)$ and $t$. A reasonable bound on $(r - q)$ is $0.6\sigma\sqrt{t}$,
so the critical stock price declines with a minimum velocity
$e^{-1.4\sigma\sqrt{t}}$. This check is required before computing $h_1$ in
Equation (8.38).
Step 3. Compute the l.h.s. and r.h.s. of Equation (8.37) as follows:

$$\text{l.h.s.}(S_i) = K - S_i, \quad \text{and}$$
$$\text{r.h.s.}(S_i) = p(S_i) - \left\{ 1 - e^{-qt} N[-d_1(S_i)] \right\} \frac{S_i}{q_1}. \tag{8.39}$$

Step 4. Check tolerance level, as before,

$$\left| \text{l.h.s.}(S_i) - \text{r.h.s.}(S_i) \right| / K < 0.00001. \tag{8.40}$$

Step 5. If Equation (8.40) is not satisfied, compute the slope of Equation
(8.39), $b_i$, and the next guess of $S^{**}$, $S_{i+1}$, as follows:

$$b_i = -e^{-qt} N[-d_1(S_i)]\left( 1 - \frac{1}{q_1} \right) - \frac{1}{q_1}\left[ 1 + \frac{e^{-qt}\, n[d_1(S_i)]}{\sigma\sqrt{t}} \right],$$
$$S_{i+1} = \frac{K - \text{r.h.s.}(S_i) + b_i S_i}{1 + b_i}.$$

Repeat from step 3 above.

Step 6. When Equation (8.40) is satisfied, compute $P(S)$ using
Equation (8.27a). If $P(S)$ is greater than the observed American put
price, try a smaller $\sigma_{j+1}$; otherwise try a larger $\sigma_{j+1}$. Then repeat
steps 1 to 5 until $P(S)$ is the same as the observed American put
price. As in the case of the American call, step 6 could be
handled by a NAG routine such as C05ADF for a quick solution.
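The put procedure can be sketched in the same way. The code below is an illustration only: $p(\cdot)$ is assumed to be the European Black–Scholes put with dividend yield $q$, $r > 0$ is required, the bound on $(r - q)$ from BAW's footnote 9 is not implemented, and the function names and sample inputs are hypothetical.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def d1(S, K, r, q, sigma, t):
    return (math.log(S / K) + (r - q + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))

def euro_put(S, K, r, q, sigma, t):
    a = d1(S, K, r, q, sigma, t)
    return (K * math.exp(-r * t) * norm_cdf(-(a - sigma * math.sqrt(t)))
            - S * math.exp(-q * t) * norm_cdf(-a))

def baw_put(S, K, r, q, sigma, t, tol=1e-5, max_iter=100):
    """Quadratic approximation for an American put (assumes r > 0)."""
    M = 2.0 * r / sigma**2
    N = 2.0 * (r - q) / sigma**2
    # Step 2: initial guess from the critical price of the perpetual put.
    q1_inf = 0.5 * (1.0 - N - math.sqrt((N - 1.0)**2 + 4.0 * M))
    S_inf = K / (1.0 - 1.0 / q1_inf)
    h1 = ((r - q) * t - 2.0 * sigma * math.sqrt(t)) * K / (K - S_inf)
    Si = min(S_inf + (K - S_inf) * math.exp(h1), K)
    q1 = 0.5 * (1.0 - N - math.sqrt((N - 1.0)**2 + 4.0 * M / (1.0 - math.exp(-r * t))))
    for _ in range(max_iter):
        a = d1(Si, K, r, q, sigma, t)
        # Step 3: both sides of the critical-price condition.
        lhs = K - Si
        rhs = euro_put(Si, K, r, q, sigma, t) - (1.0 - math.exp(-q * t) * norm_cdf(-a)) * Si / q1
        # Step 4: tolerance check.
        if abs(lhs - rhs) / K < tol:
            break
        # Step 5: slope of the r.h.s. and the next guess.
        b = (-math.exp(-q * t) * norm_cdf(-a) * (1.0 - 1.0 / q1)
             - (1.0 + math.exp(-q * t) * norm_pdf(a) / (sigma * math.sqrt(t))) / q1)
        Si = (K - rhs + b * Si) / (1.0 + b)
    S_crit = Si
    if S <= S_crit:
        return K - S
    A1 = -(S_crit / q1) * (1.0 - math.exp(-q * t) * norm_cdf(-d1(S_crit, K, r, q, sigma, t)))
    return euro_put(S, K, r, q, sigma, t) + A1 * (S / S_crit)**q1

put_price = baw_put(100.0, 100.0, 0.08, 0.0, 0.20, 0.25)
print(put_price, euro_put(100.0, 100.0, 0.08, 0.0, 0.20, 0.25))
```

Step 6 can reuse the same bisection idea as for the call: the put price increases with $\sigma$, so a model price above the observed price calls for a smaller $\sigma$.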
Option Pricing with Stochastic

