If Black–Scholes (BS) is the correct option pricing model, then there can be only one BS implied volatility regardless of the strike price of the option, or whether the option is a call or a put. The BS implied volatility smile and skew are clear evidence that market option prices are not priced according to the BS formula. This raises the important question of the relationship between BS implied volatility and the true volatility. The BS option price is an increasing function of the volatility of the underlying asset. If the BS model is correct, then the market option price should be the same as the BS option price, and the BS implied volatility derived from the market option price will be the same as the true volatility. If the BS price is incorrect and is lower than the market price, then BS implied volatility overstates the true volatility. The reverse is true if the BS price is higher than the market price. The problem is complicated by the fact that BS implied volatility differs across strike prices. The theories that predict the relationship between the BS price and the market option price are all contingent on the proposed alternative option pricing model, or the proposed alternative pricing dynamic, being correct. Given that BS implied volatility, despite all its shortcomings, has been shown overwhelmingly to be the best forecast of volatility, it will be useful to understand the links between BS implied volatility bias and the true volatility. This is the objective of this chapter.
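The inversion from option price to implied volatility described above can be sketched in a few lines of Python. This is a minimal illustration, assuming a European call on a non-dividend-paying asset; the function names `bs_call` and `bs_implied_vol` and the bisection bounds are illustrative choices, not taken from the text. Because the BS price is increasing in volatility, a simple bisection is enough.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(S: float, K: float, r: float, T: float, sigma: float) -> float:
    """Black-Scholes price of a European call (no dividends assumed)."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)


def bs_implied_vol(price: float, S: float, K: float, r: float, T: float,
                   lo: float = 1e-6, hi: float = 5.0, tol: float = 1e-10) -> float:
    """Back out the BS implied volatility by bisection.

    The bracket [lo, hi] is an assumed search range; the method relies only
    on the BS call price being increasing in sigma.
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, r, T, mid) < price:
            lo = mid          # model price too low: volatility must be higher
        else:
            hi = mid          # model price too high: volatility must be lower
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Round trip: a price generated with sigma = 0.2 inverts back to 0.2.
true_sigma = 0.2
market_price = bs_call(100.0, 110.0, 0.0, 1.0, true_sigma)
print(bs_implied_vol(market_price, 100.0, 110.0, 0.0, 1.0))
```

If the price fed to `bs_implied_vol` is higher than the BS price computed at the true volatility, the recovered implied volatility exceeds the true volatility, which is the overstatement described in the paragraph above.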

Many attempts have been made to resolve the BS anomalies. The stochastic volatility (SV) option pricing model is one of the most important extensions of Black–Scholes. The SV option pricing model is motivated by the widespread evidence that volatility is stochastic and that the distribution of risky asset returns has tail(s) longer than that of a normal distribution. An SV model with correlated price and volatility innovations can address both anomalies. The SV option pricing model was developed over roughly a decade, with contributions from Johnson and Shanno (1987), Wiggins (1987), Hull and White (1987, 1988), Scott (1987), Stein and Stein (1991) and Heston (1993). It was in Heston (1993) that a closed form solution was derived using the characteristic

98 Forecasting Financial Market Volatility

function of the price distribution. Section 9.1 presents this landmark Heston SV option pricing model, while some details of the derivation are presented in the Appendix to this chapter (Section 9.5). In Section 9.2, we simulate a series of Heston option prices from a range of parameters. Then we use these option prices as if they were the market option prices to back out the corresponding BS implied volatilities. If market option prices are priced according to the Heston formula, the simulations in this section will give us some insight into the relationship between BS implied volatility bias and the true volatility. In Section 9.3, we analyse the usefulness and practicality of the Heston model by looking at the impact of Heston model parameters on skewness and kurtosis range and sensitivity, and at some empirical tests of the Heston model. Finally, Section 9.4 analyses empirical findings on the predictive power of Heston implied volatility as a volatility forecast.

9.1 THE HESTON STOCHASTIC VOLATILITY OPTION PRICING MODEL

Heston (1993) specifies the stock price and volatility processes as follows:

$$dS_t = \mu S_t\,dt + \sqrt{\nu_t}\,S_t\,dz_{s,t},$$

$$d\nu_t = \kappa\left[\theta - \nu_t\right]dt + \sigma_\nu\sqrt{\nu_t}\,dz_{\nu,t},$$

where $\nu_t$ is the instantaneous variance, κ is the speed of mean reversion, θ is the long-run level of volatility and $\sigma_\nu$ is the 'volatility of volatility'. The two Wiener processes, $dz_{s,t}$ and $dz_{\nu,t}$, have constant correlation ρ. The assumption that consumption growth has a constant correlation with spot-asset returns generates a risk premium proportional to $\nu_t$. Given the volatility risk premium, the risk-neutral volatility process can be written as

$$\begin{aligned}
d\nu_t &= \kappa\left[\theta - \nu_t\right]dt - \lambda\nu_t\,dt + \sigma_\nu\sqrt{\nu_t}\,dz^*_{\nu,t} \\
&= \kappa^*\left[\theta^* - \nu_t\right]dt + \sigma_\nu\sqrt{\nu_t}\,dz^*_{\nu,t},
\end{aligned}$$

where λ is the market price of (volatility) risk, $\kappa^* = \kappa + \lambda$ and $\theta^* = \kappa\theta/(\kappa + \lambda)$. Here $\kappa^*$ is the risk-neutral mean-reverting parameter and $\theta^*$ is the risk-neutral long-run level of volatility. The parameters $\sigma_\nu$ and ρ implicit in the risk-neutral process are the same as those in the real volatility process. Given the price and the volatility dynamics, the

Option Pricing with Stochastic Volatility 99

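Heston (1993) formula for pricing European calls is

$$c = S P_1 - K e^{-r(T-t)} P_2,$$

$$P_j = \frac{1}{2} + \frac{1}{\pi}\int_0^\infty \mathrm{Re}\left[\frac{e^{-i\phi\ln K} f_j}{i\phi}\right]d\phi, \quad \text{for } j = 1, 2,$$

$$f_j = \exp\left\{C(T-t;\phi) + D(T-t;\phi)\,\nu + i\phi x\right\},$$

where

$$x = \ln S, \qquad \tau = T - t,$$

$$C(\tau;\phi) = r\phi i\tau + \frac{a}{\sigma_\nu^2}\left\{\left(b_j - \rho\sigma_\nu\phi i + d\right)\tau - 2\ln\left[\frac{1 - g e^{d\tau}}{1 - g}\right]\right\},$$

$$D(\tau;\phi) = \frac{b_j - \rho\sigma_\nu\phi i + d}{\sigma_\nu^2}\left[\frac{1 - e^{d\tau}}{1 - g e^{d\tau}}\right],$$

$$g = \frac{b_j - \rho\sigma_\nu\phi i + d}{b_j - \rho\sigma_\nu\phi i - d},$$

$$d = \sqrt{\left(\rho\sigma_\nu\phi i - b_j\right)^2 - \sigma_\nu^2\left(2\mu_j\phi i - \phi^2\right)},$$

$$\mu_1 = \tfrac{1}{2}, \quad \mu_2 = -\tfrac{1}{2}, \quad a = \kappa\theta = \kappa^*\theta^*,$$

$$b_1 = \kappa + \lambda - \rho\sigma_\nu = \kappa^* - \rho\sigma_\nu, \qquad b_2 = \kappa + \lambda = \kappa^*.$$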
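The pricing formula above can be evaluated numerically. The following is a rough sketch only, using the Python standard library: the integral for each probability is approximated with a midpoint rule on a truncated range, and the names `heston_call`, `phi_max` and `n` are illustrative assumptions rather than anything specified in the text. It requires a strictly positive volatility of volatility (the formula divides by its square), and the truncation and step size would need checking for parameter values far from those used here.

```python
import cmath
import math


def heston_call(S, K, r, T, v0, kappa, theta, sigma_v, rho, lam=0.0,
                phi_max=200.0, n=20000):
    """European call under the Heston (1993) formula as written above.

    v0 is the current instantaneous variance; sigma_v must be > 0.
    phi_max and n (truncation point and number of midpoint nodes) are
    assumed numerical settings, not part of the model.
    """
    x = math.log(S)
    a = kappa * theta
    h = phi_max / n

    def prob(j):
        u = 0.5 if j == 1 else -0.5
        b = kappa + lam - rho * sigma_v if j == 1 else kappa + lam
        total = 0.0
        for k in range(n):
            phi = (k + 0.5) * h                      # midpoint node, avoids phi = 0
            rsp = rho * sigma_v * phi * 1j
            d = cmath.sqrt((rsp - b) ** 2 - sigma_v ** 2 * (2 * u * phi * 1j - phi ** 2))
            g = (b - rsp + d) / (b - rsp - d)
            edt = cmath.exp(d * T)
            C = r * phi * 1j * T + a / sigma_v ** 2 * (
                (b - rsp + d) * T - 2.0 * cmath.log((1.0 - g * edt) / (1.0 - g)))
            D = (b - rsp + d) / sigma_v ** 2 * (1.0 - edt) / (1.0 - g * edt)
            f = cmath.exp(C + D * v0 + 1j * phi * x)
            total += (cmath.exp(-1j * phi * math.log(K)) * f / (1j * phi)).real
        return 0.5 + total * h / math.pi

    return S * prob(1) - K * math.exp(-r * T) * prob(2)


# Example call with S = 100, r = 0, T = 1 and a 20% volatility level
# (v0 = theta = 0.04 is an assumed reading of "20%" as a variance of 0.2**2).
print(heston_call(100.0, 100.0, 0.0, 1.0, 0.04, 2.0, 0.04, 0.3, -0.5))
```

With a very small volatility of volatility and the initial variance equal to its long-run level, the variance is effectively constant and the result should be close to the corresponding Black–Scholes price, which gives a simple sanity check on the implementation.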

9.2 HESTON PRICE AND BLACK–SCHOLES IMPLIED

In this section, we analyse possible BS implied bias by simulating a series of Heston option prices with parameter values similar to those in Bakshi, Cao and Chen (1997), Nandi (1998), Das and Sundaram (1999), Bates (2000), Lin, Strong and Xu (2001), Fiorentini, Angel and Rubio (2002) and Andersen, Benzoni and Lund (2002). For the simulations, we set the asset price at 100, the interest rate at zero and the time to maturity at 1 year, with strike prices ranging from 50 to 150. In most simulations, and unless otherwise stated, the current 'instantaneous' volatility, $\sigma_t$, is set equal to the long-run level, θ, at 20%. There are five other parameters used in the Heston formula, namely, κ, the speed of mean reversion, θ, the long-run volatility level, λ, the market price of risk, $\sigma_\nu$, the volatility of volatility, and ρ, the correlation between the price and the volatility processes. If we set λ = 0, then the volatility process becomes risk-neutral, and κ and θ become $\kappa^*$ and $\theta^*$ respectively.

The first set of simulations presented in Figure 9.1(a) involves replicating the Black–Scholes prices as a special case. Here we set $\sigma_\nu = 0$.

Figure 9.1 Relationships between Heston option prices and Black–Scholes implied volatility

[Six panels, each plotting BS implied volatility against strike price K from 50 to 150.
(a) A Black–Scholes series (S = 100, r = 0, T = 1, λ = 0); skewness = 0, kurtosis = 3.
(b) Effect of volatility of volatility (S = 100, r = 0, T = 1, λ = 0, κ = 0.1); $\sigma_\nu$ = 0.1, 0.3, 0.6 and 1.0, with kurtosis 3.7, 9.53, 29.1 and 75.6 respectively.
(c) Effect of correlation ρ (S = 100, r = 0, T = 1, λ = 0); ρ = −0.95, −0.5, 0 and 0.6.
(d) Effect of κ (S = 100, r = 0, T = 1, λ = 0, $\sigma_\nu$ = 0.6, θ = 0.2); κ = 0.01 and κ = 3, each with $\sqrt{\nu_t}$ = 0.7 and $\sqrt{\nu_t}$ = 0.15; the value 0.1049 is marked on the plot.
(e) Effect of λ on negatively correlated processes (S = 100, r = 0, T = 1, ρ = −0.5); skewness = −0.1623, kurtosis = 3.7494; λ = −2, 0 and 2.
(f) Effect of λ on positively correlated processes (S = 100, r = 0, T = 1, ρ = 0.5); skewness = +0.1623, kurtosis = 3.7494; λ = −2, 0 and 2.]

Since there is no volatility risk, λ = 0. This is a special case where the Heston price and the Black–Scholes price are identical and the BS implied volatility is the same across strike prices. In this special case, BS implied volatility (at any strike price) is a perfect representation of true volatility.

In the second set of simulations presented in Figure 9.1(b), we alter $\sigma_\nu$, the volatility of volatility, and keep all the other parameters the same and constant. The effect of an increase in $\sigma_\nu$ is to increase the unconditional volatility and kurtosis of the risk-neutral price distribution. It is the risk-neutral distribution because λ, the market price of risk, is set equal to zero. As $\sigma_\nu$ increases, and without appropriate compensation for the volatility risk premium, ATM (at-the-money) implied volatility underestimates true volatility while OTM (out-of-the-money) implied volatility overestimates it. This is the same outcome as in Hull and White (1987), where the price and the volatility processes are not correlated and there is no risk premium for volatility risk. With appropriate adjustment for volatility (which will require a volatility risk premium input), the ATM implied volatility will be at the right level, but Black–Scholes will continue to underprice OTM options (and OTM BS implied will overestimate true volatility) because of the thin tails of the BS lognormal assumption. Assuming that ρ = 0, or at least is constant over time, and that $\sigma_\nu$ and λ are relatively stable, a time series regression of historical 'actual volatility' on historical 'implied volatility' at a particular strike will be sufficient to correct for these biases. This is basically the Ederington and Guan (1999) approach. We will show in the next section that ρ is not likely to be stable. When ρ is not constant, the analysis below and Figure 9.1(c) show that ATM implied volatility is least affected by changing ρ. This explains why ATM implied volatility is the most robust and popular choice of volatility forecast.
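The regression-based bias correction mentioned above can be sketched as an ordinary least squares fit of realised volatility on implied volatility at a fixed strike. The data below are purely hypothetical numbers chosen for illustration, and `ols_fit` is an assumed helper name; this is not a reproduction of the procedure in Ederington and Guan (1999).

```python
def ols_fit(x, y):
    """Least squares intercept and slope for y = alpha + beta * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta


# Hypothetical history: implied volatility at one strike and the volatility
# subsequently realised over the option's life (illustrative values only).
implied = [0.18, 0.22, 0.25, 0.20, 0.30, 0.27]
realised = [0.15, 0.19, 0.20, 0.17, 0.26, 0.22]

alpha, beta = ols_fit(implied, realised)

# Bias-adjusted forecast from a new implied volatility observation.
new_implied = 0.24
print(alpha + beta * new_implied)
```

The fitted intercept and slope absorb a stable bias in implied volatility; as the text notes, this only works if ρ, the volatility of volatility and λ are reasonably stable over the estimation window.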

In Figure 9.1(c), it is clear that changing the correlation coefficient alone has no impact on ATM implied volatility. Correlation has the greatest impact on the skewness of the price distribution and determines the shape of the volatility smile or skew. Its impact on kurtosis is less marked when compared with that of $\sigma_\nu$, the volatility of volatility.

Figure 9.1(d) highlights the impact of κ, the mean reversion parameter, which we have already briefly touched on in relation to the long memory of volatility in Chapter 5. The higher the rate of mean reversion, the more likely the return distribution will be normal, even when the volatility of volatility, $\sigma_\nu$, and the initial volatility, $\sqrt{\nu_t}$, are both high. When this is the case, there is no strike price bias in BS implied (i.e. there will not be a volatility smile). The problem starts when κ is low. A low κ corresponds to volatility persistence, where BS implied volatility will be sensitive to the current state of the volatility level. In a high volatility state, a high $\sqrt{\nu_t}$ compensates for the low κ and the strike price bias is less severe. The strike price effect, or the volatility smile, is most acute when the initial volatility level $\sqrt{\nu_t}$ is low. ATM options will be overpriced vis-à-vis OTM options.¹ (Note that we have set λ = 0 in this set of simulations.) When $\sigma_\nu$ = 0.6, θ = 0.2, $\sqrt{\nu_t}$ = 0.15 and κ = 0.01, the ATM BS implied is only 0.1049, much lower than any of the volatility parameters.

Figures 9.1(e) and 9.1(f) can be used to infer the impact on the parameter estimates above when the volatility risk premium λ is omitted. In the literature, we often read '. . . volatility risk premium is negative reflecting the negative correlation between the price and the volatility dynamics . . .' (Buraschi and Jackwerth, 2001; Bakshi and Kapadia, 2003). All series in Figure 9.1(e) have correlation ρ = −0.5 and all series in Figure 9.1(f) have correlation ρ = +0.5. A negative λ (volatility risk premium) produces a higher Heston price and a higher BS implied volatility. The impact is the same whether the correlation ρ is negative or positive. We will see in the next section that empirical evidence indicates that Figure 9.1(f) is just as likely a scenario as Figure 9.1(e). As $\kappa^* = \kappa + \lambda$ and $\theta^* = \kappa\theta/(\kappa + \lambda)$, a negative λ has the effect of reducing $\kappa^*$ (resulting in a smaller option price) and increasing $\theta^*$ (resulting in a bigger option price). Simulations, not reported here, show that the price impact of $\theta^*$ is much greater than that of $\kappa^*$, so the outcome will be a higher option price due to the negative λ. Hence, a 'negative risk premium' is to be expected whether the price and the volatility processes are positively or negatively correlated.² This also means that, without accounting for the volatility risk premium, the BS option price will be too low and the BS implied will always overstate true volatility. Both volatility and the volatility risk premium have a positive impact on the option price. The omission of the volatility risk premium will cause the volatility risk premium component to be 'translated' into higher BS implied volatility.
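The mapping from real-world to risk-neutral parameters used in this argument is simple enough to check numerically. The sketch below is illustrative only: the function name `risk_neutral_params` and the input values are assumptions, with the long-run level taken as 0.04 purely as an example.

```python
def risk_neutral_params(kappa: float, theta: float, lam: float):
    """Return (kappa*, theta*) with kappa* = kappa + lam and
    theta* = kappa * theta / (kappa + lam)."""
    kappa_star = kappa + lam
    if kappa_star <= 0.0:
        # Outside this range the risk-neutral process is no longer mean reverting.
        raise ValueError("kappa + lambda must be positive")
    theta_star = kappa * theta / kappa_star
    return kappa_star, theta_star


# Illustrative values: kappa = 3, theta = 0.04, and three choices of lambda.
for lam in (-2.0, 0.0, 2.0):
    k_star, t_star = risk_neutral_params(3.0, 0.04, lam)
    print(lam, k_star, t_star, k_star * t_star)
```

For these inputs a negative λ lowers $\kappa^*$ and raises $\theta^*$, while the product $\kappa^*\theta^*$ stays equal to κθ, consistent with $a = \kappa\theta = \kappa^*\theta^*$ in Section 9.1.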

9.3 MODEL ASSESSMENT

In this section, we evaluate the Heston model using simulations. In particular, we examine the skewness and kurtosis planes covered by a range of Heston parameter values. We have no information on the volatility risk premium. Hence, to avoid an additional dimension of complexity, we will evaluate the risk-neutral parameters $\kappa^*$ and $\theta^*$ instead of κ and θ for the true volatility process.

¹ When BS overprices options, the BS implied volatility will understate volatility because BS implied is inverted from the market price, which is lower than the BS price.

² This is really a misnomer: while the λ parameter is negative, it actually results in a higher option price. So strictly speaking the volatility risk premium is positive!

9.3.1 Zero correlation

We learn from the simulations in Section 9.2 and from Figure 9.1 that, according to the Heston model, skewness in the stock returns distribution and BS implied volatility asymmetry are determined completely by the correlation parameter, ρ. When the correlation parameter is equal to zero, we get zero skewness, and both the returns distribution and BS implied volatility will be symmetrical. Figure 9.2 presents the kurtosis values produced by different combinations of κ, θ and $\sigma_\nu$. One important pattern emerges that highlights the importance of the mean reversion parameter, κ. When κ is low we have high volatility persistence, and vice versa for a high value of κ.

At a high value of κ, kurtosis is close to 3, regardless of the values of θ and $\sigma_\nu$. This is, unfortunately, the less likely scenario for a financial market time series, which typically has high volatility persistence and a low value of κ. At a low value of κ, the kurtosis is highest at a low level of $\sigma_\nu$, the parameter for volatility of volatility. At a high level of $\sigma_\nu$, kurtosis drops to 3 very consistently, regardless of the values of the other parameters. At a low level of $\sigma_\nu$, the long-term level of volatility, θ, comes into effect. The higher the value of θ, the lower the kurtosis value, even though it is still much greater than 3.

When skewness is zero and kurtosis is low (i.e. relatively flat BS implied volatility), it will be difficult to tell whether this is due to a high κ, a high $\sigma_\nu$ or both. This also reflects the underlying property that a high κ, a high $\sigma_\nu$ or both make the stochastic volatility structure less important, and the BS model will be adequate in this case.
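One way to explore how the parameters map into skewness and kurtosis is to simulate the model directly. The sketch below is a rough Monte Carlo illustration under stated assumptions, not the procedure used to produce the figures in this chapter: it uses an Euler scheme with negative variances truncated at zero, a zero interest rate, risk-neutral parameters, and illustrative function names and default settings.

```python
import math
import random


def skew_kurt(xs):
    """Sample skewness and (non-excess) kurtosis from central moments."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


def simulate_heston_log_returns(v0, kappa, theta, sigma_v, rho,
                                T=1.0, steps=50, n_paths=5000, seed=0):
    """Euler simulation of log returns over [0, T] with r = 0.

    Assumed discretisation: the variance is floored at zero inside the
    square root and drift (a common fix, chosen here for simplicity).
    """
    rng = random.Random(seed)
    dt = T / steps
    out = []
    for _ in range(n_paths):
        v, x = v0, 0.0
        for _ in range(steps):
            z1 = rng.gauss(0.0, 1.0)
            z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
            vp = max(v, 0.0)
            x += -0.5 * vp * dt + math.sqrt(vp * dt) * z1
            v += kappa * (theta - vp) * dt + sigma_v * math.sqrt(vp * dt) * z2
        out.append(x)
    return out


# Compare the moments for two assumed parameter sets with rho = 0.
for kappa, sigma_v in ((1.0, 0.1), (0.1, 0.5)):
    rets = simulate_heston_log_returns(0.04, kappa, 0.04, sigma_v, 0.0)
    print(kappa, sigma_v, skew_kurt(rets))
```

Sample kurtosis is a noisy statistic, so any comparison across parameter sets should use many paths and several seeds before conclusions are drawn.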

9.3.2 Nonzero correlation

In Figures 9.3 and 9.4, we illustrate skewness and kurtosis, respectively, for the case when the correlation coefficient, ρ, is greater than 0. The case of ρ < 0 will not be discussed here as it is the mirror image of ρ > 0 (e.g. instead of positive skewness, we get negative skewness, etc.).

Figure 9.3 shows that skewness is first 'triggered' by a nonzero correlation coefficient, after which κ and $\sigma_\nu$ combine to drive skewness. High skewness occurs when $\sigma_\nu$ is high and κ is low (i.e. high volatility persistence). At relatively low skewness levels, there is a huge range of high κ, low $\sigma_\nu$ or both that produce similar values of skewness. A low θ and a high ρ produce high skewness, but at a low level of $\sigma_\nu$, skewness is much less sensitive to these two parameters.

Figure 9.2 Impact of Heston parameters on kurtosis for symmetrical distribution with zero correlation and zero skewness

[Two surface plots of kurtosis (vertical axis, 0 to 80) against θ and $\sigma_\nu$: (a) κ = 1, ρ = 0, skewness = 0; (b) κ = 0.1, ρ = 0, skewness = 0.]

Figure 9.4 gives a similar pattern for kurtosis. Except when $\sigma_\nu$ is very high and κ is low, the plane for kurtosis is very flat and not sensitive to θ or ρ.

Figure 9.3 Impact of Heston parameters on skewness

[Four surface plots of skewness (vertical axis, 0 to 3.5) against θ and ρ, for κ = 1, $\sigma_\nu$ = 0.5; κ = 0.1, $\sigma_\nu$ = 0.5; κ = 0.1, $\sigma_\nu$ = 0.1; and κ = 1, $\sigma_\nu$ = 0.1.]

9.4 VOLATILITY FORECAST USING THE HESTON MODEL

The thick tail and nonsymmetrical distribution found empirically could be a result of volatility being stochastic. The simulation results in the previous section suggest that $\sigma_\nu$, the volatility of volatility, is the main driving force for kurtosis and skewness (if correlation is not equal to zero). At high κ, volatility mean reversion will cancel out much of the $\sigma_\nu$ impact on kurtosis and some of that on skewness. Correlation between the price and the volatility processes, ρ, determines the sign of the skewness. But beyond that, its impact on the magnitude of skewness is much smaller than that of $\sigma_\nu$ and κ. Correlation has negligible impact on kurtosis. The long-run volatility level, θ, has very little impact on skewness and kurtosis, except when $\sigma_\nu$ is very high and κ is very low. So a stochastic volatility pricing model is useful and will outperform Black–Scholes only when volatility is truly stochastic (i.e. high $\sigma_\nu$) and volatility is persistent (i.e. low κ). The difficulty with the Heston model is that, once we move away from the high $\sigma_\nu$ and low κ region, a large combination of parameter values can produce similar skewness and kurtosis. This contributes to model parameter instability and convergence difficulty during estimation.

Figure 9.4 Impact of Heston parameters on kurtosis

[Four surface plots of kurtosis (vertical axis, 0 to 90) against θ and ρ, for κ = 1, $\sigma_\nu$ = 0.5; κ = 0.1, $\sigma_\nu$ = 0.5; κ = 0.1, $\sigma_\nu$ = 0.1; and κ = 1, $\sigma_\nu$ = 0.1.]

Through simulation results we can predict the degree of Black–Scholes pricing bias as a result of stochastic volatility. In the case where volatility is stochastic and ρ = 0, Black–Scholes overprices near-the-money (NTM) or at-the-money (ATM) options, and the degree of overpricing increases with maturity. On the other hand, Black–Scholes underprices both in- and out-of-the-money options. In terms of implied volatility, ATM implied volatility will be lower than actual volatility while implied volatility of far-from-the-money options (i.e. either very high or very low strikes) will be higher than actual volatility. The pattern of pricing bias will be much harder to predict if ρ is not zero, when there is a premium for bearing volatility risk, and if either or both values vary through time.

Some of the early work on option implied volatility focuses on finding an optimal weighting scheme to aggregate implied volatility of options across strikes. (See Bates (1996) for a comprehensive survey of these weighting schemes.) Since the plot of implied volatility against strikes can take many shapes, it is not likely that a single weighting scheme will remove all pricing errors consistently. For this reason, and together with the liquidity argument, ATM option implied volatility is often used for volatility forecasting, but not implied volatilities at other strikes.

9.5 APPENDIX: THE MARKET PRICE OF VOLATILITY RISK

9.5.1 Ito's lemma for two stochastic variables

Given two stochastic processes,³

$$dS_1 = \mu_1(S_1, S_2, t)\,dt + \sigma_1(S_1, S_2, t)\,dX_1,$$

$$dS_2 = \mu_2(S_1, S_2, t)\,dt + \sigma_2(S_1, S_2, t)\,dX_2,$$

$$E\{dX_1\,dX_2\} = \rho\,dt,$$

where $X_1$ and $X_2$ are two related Brownian motions.

From Ito's lemma, the derivative function $V(S_1, S_2, t)$ will have the following process:

$$dV = \left[V_t + \tfrac{1}{2}\sigma_1^2 V_{S_1S_1} + \rho\sigma_1\sigma_2 V_{S_1S_2} + \tfrac{1}{2}\sigma_2^2 V_{S_2S_2}\right]dt + V_{S_1}\,dS_1 + V_{S_2}\,dS_2,$$

where

$$V_t = \frac{\partial V}{\partial t}, \qquad V_{S_1S_1} = \frac{\partial^2 V}{\partial S_1^2} \qquad \text{and} \qquad V_{S_1S_2} = \frac{\partial}{\partial S_1}\frac{\partial V}{\partial S_2}.$$

9.5.2 The case of stochastic volatility

Here, we assume $S_1$ is the underlying asset and $S_2$ is the stochastic volatility σ as follows:

$$dS_1 = \mu_1(S_1, \sigma, t)\,dt + \sigma_1(S_1, \sigma, t)\,dX_1, \tag{9.1}$$

$$d\sigma = p(S_1, \sigma, t)\,dt + q(S_1, \sigma, t)\,dX_2,$$

$$E\{dX_1\,dX_2\} = \rho\,dt,$$

and from Ito's lemma, we get:

$$dV = \left[V_t + \tfrac{1}{2}\sigma_1^2 V_{S_1S_1} + \rho q\sigma_1 V_{S_1\sigma} + \tfrac{1}{2}q^2 V_{\sigma\sigma}\right]dt + V_{S_1}\,dS_1 + V_\sigma\,d\sigma. \tag{9.2}$$

³ I am grateful to Konstantinos Vonatsos for helping me with materials presented in this section.

In the following stochastic volatility derivation, $\mu_1 = \mu S_1$ is the mean drift of the stock price process. The volatility of $S_1$, $\sigma_1 = f(\sigma) S_1$, is stochastic and is level-dependent. The mean drift of the volatility process is more complex as volatility cannot become negative and should be stationary in the long run. Hence an OU (Ornstein–Uhlenbeck) process is usually recommended with $p(S_1, \sigma, t) = \alpha(m - \sigma)$ and $q(S_1, \sigma, t) = \beta$:

$$d\sigma = \alpha(m - \sigma)\,dt + \beta\,dX_2.$$

Here, $p = \alpha(m - \sigma)$ is the mean drift of the volatility process, m is the long-term mean level of the volatility, α is the speed at which volatility reverts to m, and β is the volatility of volatility.
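The mean-reverting behaviour of this process can be illustrated with a short simulation. The sketch below is a minimal example under stated assumptions: it uses a plain Euler discretisation, and the helper names `ou_mean` and `ou_euler_path` and the parameter values are illustrative. The closed-form conditional mean $m + (\sigma_0 - m)e^{-\alpha t}$ is a standard property of the OU process and is included for comparison.

```python
import math
import random


def ou_mean(sigma0: float, alpha: float, m: float, t: float) -> float:
    """Conditional mean of the OU process at time t, starting from sigma0."""
    return m + (sigma0 - m) * math.exp(-alpha * t)


def ou_euler_path(sigma0, alpha, m, beta, T, steps, seed=0):
    """Euler path of d(sigma) = alpha * (m - sigma) dt + beta dX."""
    rng = random.Random(seed)
    dt = T / steps
    path = [sigma0]
    for _ in range(steps):
        dX = rng.gauss(0.0, math.sqrt(dt))   # Brownian increment over dt
        path.append(path[-1] + alpha * (m - path[-1]) * dt + beta * dX)
    return path


# Illustrative values: start at 30%, revert towards m = 20% at speed alpha = 2.
path = ou_euler_path(0.30, 2.0, 0.20, 0.05, T=1.0, steps=250)
print(path[-1], ou_mean(0.30, 2.0, 0.20, 1.0))
```

A larger α pulls the path back towards m more quickly, while β controls how far the path wanders around the mean.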

9.5.3 Constructing the risk-free strategy

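To value an option $V(S_1, \sigma, t)$ we must form a risk-free portfolio, using the underlying asset to hedge the movement in $S_1$ and another option $\bar{V}(S_1, \sigma, t)$ to hedge the movement in σ. Let the risk-free portfolio be:

$$\Pi = V - \Delta_1\bar{V} - \Delta S_1.$$

Applying Ito's lemma from (9.2) to the risk-free portfolio Π,

$$\begin{aligned}
d\Pi ={}& \left[V_t + \tfrac{1}{2}\sigma_1^2 V_{S_1S_1} + \tfrac{1}{2}q^2 V_{\sigma\sigma} + \rho q\sigma_1 V_{S_1\sigma}\right]dt \\
&- \Delta_1\left[\bar{V}_t + \tfrac{1}{2}\sigma_1^2\bar{V}_{S_1S_1} + \tfrac{1}{2}q^2\bar{V}_{\sigma\sigma} + \rho q\sigma_1\bar{V}_{S_1\sigma}\right]dt \\
&+ \left(V_{S_1} - \Delta_1\bar{V}_{S_1} - \Delta\right)dS_1 + \left(V_\sigma - \Delta_1\bar{V}_\sigma\right)d\sigma.
\end{aligned}$$

To eliminate dσ, set

$$V_\sigma - \Delta_1\bar{V}_\sigma = 0, \qquad \Delta_1 = \frac{V_\sigma}{\bar{V}_\sigma},$$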

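and, to eliminate $dS_1$, set

$$V_{S_1} - \frac{V_\sigma}{\bar{V}_\sigma}\bar{V}_{S_1} - \Delta = 0, \qquad \Delta = V_{S_1} - \frac{V_\sigma}{\bar{V}_\sigma}\bar{V}_{S_1}.$$

This results in:

$$\begin{aligned}
d\Pi ={}& \left[V_t + \tfrac{1}{2}\sigma_1^2 V_{S_1S_1} + \tfrac{1}{2}q^2 V_{\sigma\sigma} + \rho q\sigma_1 V_{S_1\sigma}\right]dt \\
&- \frac{V_\sigma}{\bar{V}_\sigma}\left[\bar{V}_t + \tfrac{1}{2}\sigma_1^2\bar{V}_{S_1S_1} + \tfrac{1}{2}q^2\bar{V}_{\sigma\sigma} + \rho q\sigma_1\bar{V}_{S_1\sigma}\right]dt \\
={}& r\,\Pi\,dt \\
={}& r\left[V - \frac{V_\sigma}{\bar{V}_\sigma}\bar{V} - \left(V_{S_1} - \frac{V_\sigma}{\bar{V}_\sigma}\bar{V}_{S_1}\right)S_1\right]dt.
\end{aligned}$$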

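Dividing both sides by $V_\sigma$, we get:

$$\begin{aligned}
&\frac{1}{V_\sigma}\left[V_t + \tfrac{1}{2}\sigma_1^2 V_{S_1S_1} + \tfrac{1}{2}q^2 V_{\sigma\sigma} + \rho q\sigma_1 V_{S_1\sigma}\right] - \frac{1}{\bar{V}_\sigma}\left[\bar{V}_t + \tfrac{1}{2}\sigma_1^2\bar{V}_{S_1S_1} + \tfrac{1}{2}q^2\bar{V}_{\sigma\sigma} + \rho q\sigma_1\bar{V}_{S_1\sigma}\right] \\
&\qquad = r\frac{V}{V_\sigma} - r\frac{\bar{V}}{\bar{V}_\sigma} - r\frac{V_{S_1}S_1}{V_\sigma} + r\frac{\bar{V}_{S_1}S_1}{\bar{V}_\sigma}.
\end{aligned}$$

Now we separate the two options by moving $V$ to one side and $\bar{V}$ to the other:

$$\begin{aligned}
&\frac{1}{V_\sigma}\left[V_t + \tfrac{1}{2}\sigma_1^2 V_{S_1S_1} + \tfrac{1}{2}q^2 V_{\sigma\sigma} + \rho q\sigma_1 V_{S_1\sigma} - rV + rV_{S_1}S_1\right] \\
&\qquad = \frac{1}{\bar{V}_\sigma}\left[\bar{V}_t + \tfrac{1}{2}\sigma_1^2\bar{V}_{S_1S_1} + \tfrac{1}{2}q^2\bar{V}_{\sigma\sigma} + \rho q\sigma_1\bar{V}_{S_1\sigma} - r\bar{V} + r\bar{V}_{S_1}S_1\right].
\end{aligned}$$

Each side of the equation is a function of $S_1$, σ and t, and is independent of the other option. So we may write:

$$\frac{1}{V_\sigma}\left[V_t + \tfrac{1}{2}\sigma_1^2 V_{S_1S_1} + \tfrac{1}{2}q^2 V_{\sigma\sigma} + \rho q\sigma_1 V_{S_1\sigma} - rV + rV_{S_1}S_1\right] = f(S_1, \sigma, t),$$

$$V_t + \tfrac{1}{2}\sigma_1^2 V_{S_1S_1} + \tfrac{1}{2}q^2 V_{\sigma\sigma} + \rho q\sigma_1 V_{S_1\sigma} - rV + rV_{S_1}S_1 = f(S_1, \sigma, t)\,V_\sigma, \tag{9.3}$$

and similarly for the RHS. In order to solve the PDE, we need to understand the function $f(S_1, \sigma, t)$, which depends on whether or not $dS_1$ and dσ are correlated.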

9.5.4 Correlated processes

If two Brownian motions $dX_1$ and $dX_2$ are correlated with correlation coefficient ρ, then we may write:

$$dX_2 = \rho\,dX_1 + \sqrt{1 - \rho^2}\,d\varepsilon_t,$$

where $d\varepsilon_t$ is the part of $dX_2$ that is not related to $dX_1$.
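This decomposition is also the usual recipe for generating correlated Brownian increments in a simulation. The sketch below is illustrative only, with assumed function names: it draws the first increment and an independent noise term, combines them as in the equation above, and then checks the sample correlation.

```python
import math
import random


def correlated_increments(rho: float, n: int, dt: float = 1.0, seed: int = 0):
    """Return n pairs (dX1, dX2) with correlation rho.

    dX2 is built as rho * dX1 plus sqrt(1 - rho^2) times an independent
    increment, i.e. the part of dX2 unrelated to dX1.
    """
    rng = random.Random(seed)
    scale = math.sqrt(dt)
    pairs = []
    for _ in range(n):
        dx1 = rng.gauss(0.0, scale)
        d_eps = rng.gauss(0.0, scale)            # independent of dx1
        dx2 = rho * dx1 + math.sqrt(1.0 - rho ** 2) * d_eps
        pairs.append((dx1, dx2))
    return pairs


def sample_corr(pairs) -> float:
    """Sample correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(p[0] for p in pairs) / n
    my = sum(p[1] for p in pairs) / n
    sxx = sum((p[0] - mx) ** 2 for p in pairs)
    syy = sum((p[1] - my) ** 2 for p in pairs)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in pairs)
    return sxy / math.sqrt(sxx * syy)


print(sample_corr(correlated_increments(-0.5, 20000)))
```

With a large number of draws the sample correlation should be close to the chosen ρ, apart from sampling error.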

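Now consider the hedged portfolio,

$$\Pi = V - \Delta S_1, \tag{9.4}$$

where only the risk that is due to the underlying asset and its correlated volatility risk are hedged. Volatility risk orthogonal to $dS_1$ is not hedged. From Ito's lemma, we get:

$$\begin{aligned}
d\Pi &= dV - \Delta\,dS_1 \\
&= \left[V_t + \tfrac{1}{2}\sigma_1^2 V_{S_1S_1} + \rho q\sigma_1 V_{S_1\sigma} + \tfrac{1}{2}q^2 V_{\sigma\sigma}\right]dt + V_{S_1}\,dS_1 + V_\sigma\,d\sigma - \Delta\,dS_1.
\end{aligned} \tag{9.5}$$

Now write:

$$dS_1 = \mu_1\,dt + \sigma_1\,dX_1, \quad \text{and}$$

$$d\sigma = p\,dt + q\,dX_2 = p\,dt + q\left[\rho\,dX_1 + \sqrt{1 - \rho^2}\,d\varepsilon_t\right].$$

Substitute this result into (9.5) and get:

$$\begin{aligned}
d\Pi ={}& \left[V_t + \tfrac{1}{2}\sigma_1^2 V_{S_1S_1} + \rho q\sigma_1 V_{S_1\sigma} + \tfrac{1}{2}q^2 V_{\sigma\sigma}\right]dt \\
&+ V_{S_1}\left\{\mu_1\,dt + \sigma_1\,dX_1\right\} + V_\sigma\left\{p\,dt + q\left[\rho\,dX_1 + \sqrt{1 - \rho^2}\,d\varepsilon_t\right]\right\} - \Delta\left\{\mu_1\,dt + \sigma_1\,dX_1\right\} \\
={}& \left[V_t + \tfrac{1}{2}\sigma_1^2 V_{S_1S_1} + \rho q\sigma_1 V_{S_1\sigma} + \tfrac{1}{2}q^2 V_{\sigma\sigma}\right]dt \\
&+ \left(V_{S_1}\mu_1 + V_\sigma p - \Delta\mu_1\right)dt + \left(V_{S_1}\sigma_1 + V_\sigma q\rho - \Delta\sigma_1\right)dX_1 + V_\sigma q\sqrt{1 - \rho^2}\,d\varepsilon_t.
\end{aligned} \tag{9.6}$$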

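So to get rid of $dX_1$, the hedge ratio should be:

$$V_{S_1}\sigma_1 + V_\sigma q\rho - \Delta\sigma_1 = 0, \qquad \Delta = V_{S_1} + \frac{V_\sigma q\rho}{\sigma_1}.$$

With this hedge ratio, only the uncorrelated volatility risk, $V_\sigma q\sqrt{1 - \rho^2}\,d\varepsilon_t$, is left in the portfolio. If ρ = 1, the portfolio would be risk-free.

Now substitute the value of Δ into (9.6). We get:

$$\begin{aligned}
d\Pi ={}& \left[V_t + \tfrac{1}{2}\sigma_1^2 V_{S_1S_1} + \rho q\sigma_1 V_{S_1\sigma} + \tfrac{1}{2}q^2 V_{\sigma\sigma}\right]dt \\
&+ \left(V_{S_1}\mu_1 + V_\sigma p - V_{S_1}\mu_1 - \frac{V_\sigma q\rho\mu_1}{\sigma_1}\right)dt + V_\sigma q\sqrt{1 - \rho^2}\,d\varepsilon_t \\
={}& \left[V_t + \tfrac{1}{2}\sigma_1^2 V_{S_1S_1} + \rho q\sigma_1 V_{S_1\sigma} + \tfrac{1}{2}q^2 V_{\sigma\sigma}\right]dt \\
&+ \left(V_\sigma p - \frac{V_\sigma q\rho\mu_1}{\sigma_1}\right)dt + V_\sigma q\sqrt{1 - \rho^2}\,d\varepsilon_t.
\end{aligned} \tag{9.7}$$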

9.5.5 The market price of risk

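Next we make the assumption that the partially hedged portfolio in (9.4) will earn a risk-free return plus a premium for unhedged volatility risk, Φ, such that

$$\begin{aligned}
d\Pi &= r\,\Pi\,dt + \Phi \\
&= r\left(V - \Delta S_1\right)dt + \Phi \\
&= r\left(V - V_{S_1}S_1 - \frac{V_\sigma q\rho}{\sigma_1}S_1\right)dt + \Phi.
\end{aligned} \tag{9.8}$$

Substituting $d\Pi$ from (9.7), we get:

$$\begin{aligned}
&\left[V_t + \tfrac{1}{2}\sigma_1^2 V_{S_1S_1} + \rho q\sigma_1 V_{S_1\sigma} + \tfrac{1}{2}q^2 V_{\sigma\sigma}\right]dt + \left(V_\sigma p - \frac{V_\sigma q\rho\mu_1}{\sigma_1}\right)dt + V_\sigma q\sqrt{1 - \rho^2}\,d\varepsilon_t \\
&\qquad = rV\,dt - rV_{S_1}S_1\,dt - r\frac{V_\sigma q\rho}{\sigma_1}S_1\,dt + \Phi,
\end{aligned}$$

$$\begin{aligned}
\Phi ={}& \left\{V_t + \tfrac{1}{2}\sigma_1^2 V_{S_1S_1} + \rho q\sigma_1 V_{S_1\sigma} + \tfrac{1}{2}q^2 V_{\sigma\sigma} - rV + rV_{S_1}S_1\right\}dt \\
&+ V_\sigma\left[p + \frac{q\rho}{\sigma_1}\left(rS_1 - \mu_1\right)\right]dt + V_\sigma q\sqrt{1 - \rho^2}\,d\varepsilon_t.
\end{aligned}$$

Now replace the {} term with $f(S_1, \sigma, t)\,V_\sigma$ from (9.3); we get:

$$\begin{aligned}
\Phi &= fV_\sigma\,dt + V_\sigma\left[p + \frac{q\rho}{\sigma_1}\left(rS_1 - \mu_1\right)\right]dt + V_\sigma q\sqrt{1 - \rho^2}\,d\varepsilon_t \\
&= V_\sigma q\sqrt{1 - \rho^2}\left\{\frac{f + p + \frac{q\rho}{\sigma_1}\left(rS_1 - \mu_1\right)}{q\sqrt{1 - \rho^2}}\,dt + d\varepsilon_t\right\}.
\end{aligned}$$

Now define the 'market price of risk' γ:

$$\gamma = \frac{f + p + \frac{q\rho}{\sigma_1}\left(rS_1 - \mu_1\right)}{q\sqrt{1 - \rho^2}}, \tag{9.9}$$

where γ is the 'returns' associated with each unit of risk that is due to $d\varepsilon_t$ (i.e. the unhedged volatility risk), hence the denominator $q\sqrt{1 - \rho^2}$. From (9.9), we can get an expression for f:

$$f = -p + \frac{q\rho}{\sigma_1}\left(\mu_1 - rS_1\right) + \gamma q\sqrt{1 - \rho^2}.$$

Substituting $p = \alpha(m - \sigma)$, $q = \beta$, $\mu_1 = \mu S_1$ and $\sigma_1 = f(\sigma) S_1$:

$$\begin{aligned}
f(S_1, \sigma, t) &= -\alpha(m - \sigma) + \frac{\beta\rho}{f(\sigma)S_1}\left(\mu S_1 - rS_1\right) + \gamma\beta\sqrt{1 - \rho^2} \\
&= -\alpha(m - \sigma) + \frac{\beta\rho(\mu - r)}{f(\sigma)} + \gamma\beta\sqrt{1 - \rho^2}.
\end{aligned}$$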

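We can now price the option with stochastic volatility in (9.3) using the expression for f above and get:

$$\begin{aligned}
0 ={}& V_t + \tfrac{1}{2}f^2S_1^2V_{S_1S_1} + \tfrac{1}{2}\beta^2V_{\sigma\sigma} + \rho\beta fS_1V_{S_1\sigma} - rV + rV_{S_1}S_1 \\
&+ \left[\alpha(m - \sigma) - \beta\left(\frac{\rho(\mu - r)}{f(\sigma)} + \gamma\sqrt{1 - \rho^2}\right)\right]V_\sigma.
\end{aligned} \tag{9.10}$$

Now write

$$\Lambda(S_1, \sigma, t) = \frac{\rho(\mu - r)}{f(\sigma)} + \gamma\sqrt{1 - \rho^2}.$$

We write (9.10) as

$$\underbrace{V_t + \tfrac{1}{2}f^2S_1^2V_{S_1S_1} + r\left(V_{S_1}S_1 - V\right)}_{\text{Black–Scholes}} + \underbrace{\rho\beta fS_1V_{S_1\sigma}}_{\text{correlation}} + \underbrace{\tfrac{1}{2}\beta^2V_{\sigma\sigma} + \alpha(m - \sigma)V_\sigma}_{L_{ou}} - \underbrace{\beta\Lambda V_\sigma}_{\text{premium}} = 0 \tag{9.11}$$

or, on rearrangement,⁴

$$\underbrace{V_t + \tfrac{1}{2}f^2S_1^2V_{S_1S_1} + rV_{S_1}S_1}_{\text{Black–Scholes}} + \underbrace{\rho\beta fS_1V_{S_1\sigma}}_{\text{correlation}} + \underbrace{\tfrac{1}{2}\beta^2V_{\sigma\sigma} + \alpha(m - \sigma)V_\sigma}_{L_{ou}} = \underbrace{rV}_{\text{risk-free return as in BS}} + \underbrace{\beta\Lambda V_\sigma}_{\text{premium for volatility risk}}$$

This analysis shows that when volatility is stochastic in the form in (9.1), the option price will be higher. The additional risk premium is related to the correlation between the volatility and the stock price processes and to the mean-reverting dynamic of the volatility process.

⁴ This result is shown in Fouque, Papanicolaou and Sircar (2000).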

10

Option Forecasting Power

Option implied volatility has always been perceived as the market's expectation of future volatility, and hence it is a market-based volatility forecast. It makes use of a richer and more up-to-date information set, and arguably it should be superior to time series volatility forecasts. On the other hand, we showed in the previous two chapters that an option model-based forecast requires a number of assumptions to hold for the option theory to produce a useful volatility estimate. Moreover, option implied volatility also suffers from many market-driven pricing irregularities. Nevertheless, the volatility forecasting contests show overwhelmingly that option implied volatility has superior forecasting capability, outperforming many historical price volatility models and matching the performance of forecasts generated from time series models that use a large amount of high-frequency data.

10.1 USING OPTION IMPLIED STANDARD DEVIATION TO FORECAST VOLATILITY

Once an implied volatility estimate is obtained, it is usually scaled by $\sqrt{n}$ to get an n-day-ahead volatility forecast. In some cases, a regression model may be used to adjust for historical bias (e.g. Ederington and Guan, 2000b), or the implied volatility may be parameterized within a GARCH/ARFIMA model with or without its own persistence adjustment (e.g. Day and Lewis, 1992; Blair, Poon and Taylor, 2001; Hwang and Satchell, 1998).
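The square-root-of-time scaling can be written out explicitly. The sketch below is illustrative only: the helper names are assumptions, and the conversion from an annualised quote to a daily figure uses 252 trading days per year, which is a common convention rather than something stated in the text.

```python
import math


def annual_to_daily(annual_vol: float, days_per_year: int = 252) -> float:
    """Convert an annualised volatility to a one-day volatility.

    days_per_year = 252 is an assumed trading-day convention.
    """
    return annual_vol / math.sqrt(days_per_year)


def scale_volatility(daily_vol: float, n_days: int) -> float:
    """n-day-ahead volatility from a one-day volatility via sqrt(n) scaling."""
    return daily_vol * math.sqrt(n_days)


# A 20% annualised implied volatility expressed as a 10-day forecast.
daily = annual_to_daily(0.20)
print(scale_volatility(daily, 10))
```

The scaling rests on the assumption that variance accumulates linearly with the horizon, which is why the forecast grows with the square root of n rather than with n itself.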

Implied volatility, especially that of stock options, can be quite unstable across time. Beckers (1981) finds that taking a 5-day average improves the forecasting power of stock option implied volatility. Hamid (1998) finds that such intertemporal averaging is also useful for stock index options during very turbulent periods. On a slightly different note, Xu and Taylor (1995) find that implied volatility estimated from a sophisticated volatility term structure model produces forecasting performance similar to that of implied volatility from the shortest maturity option.
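Intertemporal averaging of this kind amounts to a trailing moving average of the daily implied volatility series. The sketch below is a minimal illustration with made-up numbers; `rolling_mean` is an assumed helper name and the equal-weight window is one possible choice, not necessarily the scheme used in the studies cited.

```python
def rolling_mean(values, window: int = 5):
    """Trailing equal-weight average; one output per complete window."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [sum(values[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(values))]


# Hypothetical daily implied volatilities for one stock option (illustrative).
daily_implied = [0.21, 0.26, 0.19, 0.24, 0.22, 0.30, 0.23]

# Each entry averages the current day and the four preceding days.
print(rolling_mean(daily_implied, 5))
```

Averaging smooths day-to-day noise in the implied series at the cost of reacting more slowly to genuine changes in the volatility level.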

In contrast to time series volatility forecasting models, the use of implied volatility as a volatility forecast involves some extra complexities. A test of the forecasting power of option implied standard deviation (ISD) is a joint test of option market efficiency and of a correct option pricing model. Since trading frictions differ across assets, some options are easier to replicate and hedge than others. It is therefore reasonable to expect different levels of efficiency and different forecasting power for options written on different assets.

While each historical price constitutes an observation in the sample used in calculating a volatility forecast, each option price constitutes a volatility forecast over the option maturity, and there can be many option prices at any one time. The problem of volatility smile and volatility skew means that options of different strike prices produce different Black–Scholes implied volatility estimates.

The issue of a correct option pricing model is more fundamental in finance. Option pricing has a long history and various extensions have been made since Black–Scholes to cope with dividend payments, early exercise and stochastic volatility. However, none of the option pricing models (except Heston (1993)) that have appeared in the volatility forecasting literature allows for a premium for bearing volatility risk. In the presence of a volatility risk premium, we expect the option price to be higher, which means that implied volatility derived using an option pricing model that assumes a zero volatility risk premium (such as the Black–Scholes model) will also be higher, and hence automatically more biased as a volatility forecast. Section 10.3 examines the issue of biasedness of ISD forecasts and evaluates the extent to which implied biasedness is due to the omission of the volatility risk premium.

10.2 AT-THE-MONEY OR WEIGHTED IMPLIED?

Since options of different strikes have been known to produce differ-

ent implied volatilities, a decision has to be made as to which of these

implied volatilities should be used, or which weighting scheme should

be adopted, in order to produce the best forecast. The most

common strategy is to choose the implied derived from an ATM op-

tion based on the argument that an ATM option is the most liquid and

hence ATM implied is least prone to measurement errors. The analysis

in Chapter 9 shows that, omitting volatility risk premium, ATM implied

is also least likely to be biased.

If ATM implied is not available, then an NTM (nearest-to-the-money)

option is used instead. Sometimes, to reduce measurement errors and

the effect of bid–ask bounce, an average is taken from a group of NTM

implied volatilities. Weighting schemes that also give greater weight

to ATM implied are vega (i.e. the partial derivative of option price

w.r.t. volatility) weighted or trading volume weighted, weighted least

squares (WLS) and some multiplicative versions of these three. The

WLS method, which first appeared in Whaley (1982), aims to minimize the

sum of squared errors between the market and the theoretical prices of a

group of options. Since the ATM option has the highest trading volume

and the ATM option price is the most sensitive to volatility input, all

three weighting schemes (and the combinations thereof) have the

effect of placing the greatest weight on ATM implied. Other less popular

weighting schemes include equally weighted, and weight based on the

elasticity of option price to volatility.
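
The vega weighting described above can be sketched in a few lines. This is only an illustration under stated assumptions: the option quotes are synthetic (generated from a hypothetical three-point smile), the options are European calls priced by Black–Scholes with no dividends, and the function names (bs_call, bs_vega, implied_vol) are made up for this example rather than taken from any study cited here.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def _d1(S, K, T, r, sigma):
    return (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))

def bs_call(S, K, T, r, sigma):
    """Black-Scholes price of a European call (no dividends)."""
    d1 = _d1(S, K, T, r, sigma)
    d2 = d1 - sigma * math.sqrt(T)
    return S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)

def bs_vega(S, K, T, r, sigma):
    """Partial derivative of the Black-Scholes price w.r.t. volatility."""
    return S * norm_pdf(_d1(S, K, T, r, sigma)) * math.sqrt(T)

def implied_vol(price, S, K, T, r, lo=1e-4, hi=5.0, tol=1e-10):
    """Invert bs_call by bisection; the call price is increasing in sigma."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, T, r, mid) > price:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Hypothetical market: spot 100, 3-month options, 2% interest rate,
# and a made-up smile used only to generate synthetic call prices.
S, T, r = 100.0, 0.25, 0.02
smile = {90.0: 0.25, 100.0: 0.20, 110.0: 0.22}
quotes = [(K, bs_call(S, K, T, r, vol)) for K, vol in smile.items()]

# Back out one implied volatility per strike, then weight each by its vega.
ivs = [implied_vol(p, S, K, T, r) for K, p in quotes]
vegas = [bs_vega(S, K, T, r, iv) for (K, _), iv in zip(quotes, ivs)]
composite = sum(v * iv for v, iv in zip(vegas, ivs)) / sum(vegas)

print("implied by strike:", [round(iv, 4) for iv in ivs])
print("vega-weighted implied:", round(composite, 4))
```

In this toy example the ATM strike has the largest vega, so the composite is pulled towards the ATM implied, which is the effect the text attributes to vega weighting.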

The forecasting power of individual and composite implied volatilities

has been tested in Ederington and Guan (2000b), Fung, Lie and Moreno

(1990), Gemmill (1986), Kroner, Kneafsey and Claessens (1995), Scott

and Tucker (1989) and Vasilellis and Meade (1996). The general

consensus is that among the weighted implied volatilities, those that favour

the ATM option such as the WLS and the vega weighted implied are

better. The worst performing ones are equally weighted and elasticity

weighted implied using options across all strikes. Different findings

emerged as to whether an individual implied volatility forecasts better

than a composite implied. Beckers (1981), Feinstein (1989b), Fung, Lie

and Moreno (1990) and Gemmill (1986) find evidence to support

individual implied, although they all prefer a different implied (viz. ATM,

Just-OTM, OTM and ITM respectively for the four studies). Kroner,

Kneafsey and Claessens (1995) find composite implied volatility forecasts

better than ATM implied. On the other hand, Scott and Tucker (1989)

conclude that when emphasis is placed on ATM implied, which weighting

scheme one chooses does not really matter.

A series of studies by Ederington and Guan have reported some

interesting findings. Ederington and Guan (1999) report that the information

content of implied volatility of S&P500 futures options exhibits a frown

shape across strikes, with options that are NTM and have moderately

high strike (i.e. OTM calls and ITM puts) possessing the largest

information content, with R² equal to 17% for calls and 36% for puts.

10.3 IMPLIED BIASEDNESS

Usually, forecast unbiasedness is not an overriding issue in any

forecasting exercise. Forecast bias can be estimated and corrected if the degree

of bias remains stable through time. Testing for biasedness is usually

carried out using the regression equation (2.3), where $\hat{X}_i = \hat{X}_t$ is the

implied forecast of period t volatility. For a forecast to be unbiased, one

would require α = 0 and β = 1. Implied forecast is upwardly biased if

α > 0 and β = 1, or α = 0 and β > 1. In the case where α > 0 and

β < 1, which is the most common scenario, implied underforecasts low

volatility and overforecasts high volatility.
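
As a rough illustration of this test, the sketch below regresses realized volatility on the implied forecast by ordinary least squares and reads off α and β. It assumes the regression takes the form realized = α + β × implied + error; the data are synthetic and noise-free, built so that α = 0.05 and β = 0.7, and the function name fit_alpha_beta is hypothetical.

```python
def fit_alpha_beta(implied, realized):
    """OLS fit of realized = alpha + beta * implied; returns (alpha, beta, R^2)."""
    n = len(implied)
    mx = sum(implied) / n
    my = sum(realized) / n
    sxx = sum((x - mx) ** 2 for x in implied)
    sxy = sum((x - mx) * (y - my) for x, y in zip(implied, realized))
    beta = sxy / sxx
    alpha = my - beta * mx
    ss_res = sum((y - alpha - beta * x) ** 2 for x, y in zip(implied, realized))
    ss_tot = sum((y - my) ** 2 for y in realized)
    return alpha, beta, 1.0 - ss_res / ss_tot

# Synthetic example (not real data): alpha = 0.05, beta = 0.7, no noise.
implied = [0.10, 0.15, 0.20, 0.30, 0.40]
realized = [0.05 + 0.7 * x for x in implied]

alpha, beta, r2 = fit_alpha_beta(implied, realized)

# With alpha > 0 and beta < 1 the fitted line crosses the 45-degree line at
# alpha / (1 - beta): below it implied is too low, above it implied is too high.
crossover = alpha / (1.0 - beta)
low_fit = alpha + beta * 0.10    # fitted realized volatility when implied = 0.10
high_fit = alpha + beta * 0.40   # fitted realized volatility when implied = 0.40
print(alpha, beta, r2, crossover, low_fit, high_fit)
```

A formal test of α = 0 and β = 1 would also need standard errors, which this sketch leaves out.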

It has been argued that implied bias will persist only if it is difficult

to perform arbitrage trades that are needed to remove the mispricing.

This is more likely in the case of stock index options and less likely

for futures options. Stocks and stock options are traded in different

markets. Since trading of a basket of stocks is cumbersome, arbitrage

trades in relation to a mispriced stock index option may have to be

done indirectly via index futures. On the other hand, futures and futures

options are traded alongside each other. Trading in these two contracts

is highly liquid. Despite these differences in trading friction, implied

biasedness is reported in both the S&P100 OEX market (Canina and

Figlewski, 1993; Christensen and Prabhala, 1998; Fleming, Ostdiek and

Whaley, 1995; Fleming, 1998) and the S&P500 futures options market

(Feinstein, 1989b; Ederington and Guan, 1999, 2002).

Biasedness is equally widespread among implied volatilities of

currency options (see Guo, 1996b; Jorion, 1995; Li, 2002; Scott and Tucker,

1989; Wei and Frankel, 1991). The only exception is Jorion (1996) who

cannot reject the null hypothesis that the one-day-ahead forecasts from

implied are unbiased. The five studies listed earlier use implied to

forecast exchange rate volatility over a much longer horizon ranging from

one to nine months.

Unbiasedness of implied forecast was not rejected in the Swedish mar-

ket (Frennberg and Hansson, 1996). Unbiasedness of implied forecast

was rejected for UK stock options (Gemmill, 1986), US stock options

(Lamoureux and Lastrapes, 1993), options and futures options across a

range of assets in Australia (Edey and Elliot, 1992) and for 35 futures

options contracts traded over nine markets ranging from interest rate

to livestock futures (Szakmary, Ors, Kim and Davidson, 2002). On the

other hand, Amin and Ng (1997) find the hypothesis that α = 0 and

β = 1 cannot be rejected for the Eurodollar futures options market.

Where unbiasedness was rejected, the bias in all but two cases was

due to α > 0 and β < 1. These two exceptions are Fleming (1998) who

reports α = 0 and β < 1 for S&P100 OEX options, and Day and Lewis

(1993) who find α > 0 and β = 1 for distant-term oil futures options

contracts.

Christensen and Prabhala (1998) argue that implied is biased because

of the errors-in-variables problem caused by measurement errors. Using last period

implied and last period historical volatility as instrumental variables to

correct for these measurement errors, Christensen and Prabhala (1998)

find unbiasedness cannot be rejected for implied volatility of the S&P100

OEX option. Ederington and Guan (1999, 2002) find bias in S&P500

futures options implied also disappeared when similar instrumental

variables were used.
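
The errors-in-variables argument can be illustrated with a small simulation. This is a simplified sketch, not the procedure of Christensen and Prabhala (1998): it uses a single instrument (last period's observed implied) rather than two, and every parameter below (persistence 0.9, noise levels, sample size, seed) is a hypothetical choice. In the simulated data the "true" implied forecast is unbiased (α = 0, β = 1), but it is observed with measurement error.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / n

def ols_slope(x, y):
    """OLS slope of y on x."""
    return cov(x, y) / cov(x, x)

def iv_slope(x, y, z):
    """Instrumental-variable slope of y on x, with z as the single instrument."""
    return cov(z, y) / cov(z, x)

rng = random.Random(0)
n = 5000

# Persistent "true" implied volatility, an AR(1) around 0.20.
true_iv = [0.2]
for _ in range(n):
    true_iv.append(0.2 + 0.9 * (true_iv[-1] - 0.2) + rng.gauss(0.0, 0.01))

# Observed implied = truth + measurement error; realized = truth + shock.
observed_iv = [v + rng.gauss(0.0, 0.02) for v in true_iv]
realized = [v + rng.gauss(0.0, 0.01) for v in true_iv]

x = observed_iv[1:]    # current observed implied (noisy regressor)
y = realized[1:]       # subsequent realized volatility
z = observed_iv[:-1]   # lagged observed implied, used as the instrument

ols_beta = ols_slope(x, y)
iv_beta = iv_slope(x, y, z)
print("OLS beta:", round(ols_beta, 3), "IV beta:", round(iv_beta, 3))
```

Measurement error pulls the OLS slope well below 1 even though the underlying forecast is unbiased. The lagged implied is correlated with the true signal but not with the current measurement error, so the IV slope comes out close to 1.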

10.4 VOLATILITY RISK PREMIUM

It has been suggested that implied biasedness could not have been caused

by model misspecification or measurement errors because this has

relatively small effects for ATM options, used in most of the studies that

report implied biasedness. In addition, the clientele effect cannot explain

the bias either because it only affects OTM options. The volatility risk

premium analysed in Chapter 9 is now often cited as an explanation.

Poteshman (2000) finds half of the bias in S&P500 futures options

implied was removed when actual volatility was estimated with a more

efficient volatility estimator based on intraday 5-minute returns. The

other half of the bias was almost completely removed when a more

sophisticated and less restrictive option pricing model, i.e. the Heston

(1993) model, was used. Further research on option volatility risk

premium is currently under way in Benzoni (2001) and Chernov (2001).

Chernov (2001) finds, similarly to Poteshman (2000), that when implied

volatility is discounted by a volatility risk premium and when the

errors-in-variables problems in historical and realized volatility are removed,

the unbiasedness of the S&P100 index option implied volatility cannot

be rejected over the sample period from 1986 to 2000. The volatility risk

premium debate continues as to whether we are able to predict the magnitude and

the variations of the volatility premium, and whether implied from an option

pricing model that permits a nonzero market price of risk will outperform

time series models when all forecasts (including forecasts of volatility

risk premium) are made in an ex ante manner.
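
For reference, a realized volatility estimate from intraday returns of the kind mentioned above can be sketched as the square root of the sum of squared intraday log returns. The example assumes a 6.5-hour trading day (78 five-minute intervals) and a synthetic price path; it is not Poteshman's exact estimator, and the function name realized_vol is hypothetical.

```python
import math

def realized_vol(prices):
    """Square root of the sum of squared log returns over the price path."""
    returns = [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]
    return math.sqrt(sum(ret * ret for ret in returns))

# Synthetic path: 79 prices, i.e. 78 five-minute returns of 0.1% each.
prices = [100.0 * math.exp(0.001 * i) for i in range(79)]

daily_rv = realized_vol(prices)
annualized_rv = daily_rv * math.sqrt(252)   # assuming 252 trading days a year
print(daily_rv, annualized_rv)
```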

Ederington and Guan (2000b) find that using regression coefficients

produced from in-sample regression of forecast against realized

volatility is very effective in correcting implied forecasting bias. They also

find that after such a bias correction, there is little to be gained from

averaging implied across strikes. This means that ATM implied together

with a bias correction scheme could be the simplest, and yet the best,

way forward.
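
A bias correction of this kind might look like the sketch below: estimate α and β in-sample, then replace each later implied forecast with α + β × implied. The numbers are synthetic and noise-free, the bias is assumed stable between the two samples, and the function names are hypothetical; this is not the exact procedure of Ederington and Guan (2000b).

```python
def fit_line(x, y):
    """OLS intercept and slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope

def mean_error(forecast, actual):
    """Average of forecast minus actual (positive means overforecasting)."""
    return sum(f - a for f, a in zip(forecast, actual)) / len(actual)

# In-sample: implied overstates realized (realized = 0.02 + 0.8 * implied).
implied_in = [0.15, 0.20, 0.25, 0.30, 0.35]
realized_in = [0.02 + 0.8 * v for v in implied_in]
alpha, beta = fit_line(implied_in, realized_in)

# Out-of-sample: the same bias is assumed to persist.
implied_out = [0.18, 0.28]
realized_out = [0.02 + 0.8 * v for v in implied_out]
corrected_out = [alpha + beta * v for v in implied_out]

raw_bias = mean_error(implied_out, realized_out)
corrected_bias = mean_error(corrected_out, realized_out)
print(raw_bias, corrected_bias)
```

The correction only helps if the in-sample α and β remain valid out of sample, which is the stability condition noted at the start of Section 10.3.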

11

Volatility Forecasting Records

11.1 WHICH VOLATILITY FORECASTING MODEL?

Our JEL survey has concentrated on two questions: is volatility fore-

castable? If it is, which method will provide the best forecasts? To con-

sider these questions, a number of basic methodological viewpoints need

to be discussed, mostly about the evaluation of forecasts. What exactly is

being forecast? Does the time interval (the observation interval) matter?

Are the results similar for different speculative markets? How does one

measure predictive performance?

Volatility forecasts are classified in this section as belonging in one

of the following four categories:

• HISVOL: for historical volatility, which includes random walk,

historical averages of squared returns, or absolute returns. Also included

in this category are time series models based on historical volatility

using moving averages, exponential weights, autoregressive models,

or even fractionally integrated autoregressive absolute returns, for

example. Note that HISVOL models can be highly sophisticated. The

multivariate VAR realized volatility model in Andersen, Bollerslev,

Diebold and Labys (2001) is classified here as a ‘HISVOL’ model. All

models in this group model volatility directly, omitting the goodness

of fit of the returns distribution or any other variables such as option

prices.

• GARCH: any member of the ARCH, GARCH, EGARCH and so forth

family is included.

• SV: for stochastic volatility model forecasts.

• ISD: for option implied standard deviation, based on the

Black–Scholes model and various generalizations.

The survey of papers includes 93 studies, but 25 of them did not

involve comparisons between methods from at least two of these groups,

and so were not helpful for comparison purposes.

Table 11.1 involves just pairwise comparisons. Of the 66 studies that

were relevant, some compared just one pair of forecasting techniques,

others compared several. For those involving both HISVOL and GARCH

models, 22 found HISVOL better at forecasting than GARCH (56% of

the total), and 17 found GARCH superior to HISVOL (44%).

Table 11.1 Pair-wise comparisons of forecasting performance of various volatility models

                     Number of studies    Studies percentage
HISVOL > GARCH              22                   56%
GARCH > HISVOL              17                   44%
HISVOL > ISD                 8                   24%
ISD > HISVOL                26                   76%
GARCH > ISD                  1                    6%
ISD > GARCH                 17                   94%
SV > HISVOL                  3
SV > GARCH                   3
GARCH > SV                   1
ISD > SV                     1

Note: ‘A > B’ means model A’s forecasting performance is better than that of model B.

The combination of forecasts presents a mixed picture. Two studies find it

to be helpful but another does not.

The overall ranking suggests that ISD provides the best forecasting

with HISVOL and GARCH roughly equal, although possibly HISVOL

does somewhat better in the comparisons. The success of the implied

volatility should not be surprising as these forecasts use a larger, and

more relevant, information set than the alternative methods as they use

option prices. They are also less practical, not being available for all

assets.

Among the 93 papers, 17 studies compared alternative versions of

GARCH. It is clear that GARCH dominates ARCH. In general,

models that incorporate volatility asymmetry, such as EGARCH and

GJR-GARCH, perform better than GARCH. But certain specialized

specifications, such as fractionally integrated GARCH (FIGARCH) and regime

switching GARCH (RSGARCH), do better in some studies. However,

it seems clear that one form of study that is included is conducted just

to support a viewpoint that a particular method is useful. It might not

have been submitted for publication if the required result had not been

reached. This is one of the obvious weaknesses of a comparison such as

this: the papers being reported have been prepared for different reasons

and use different data sets, many kinds of assets, various intervals and

a variety of evaluation techniques. Rarely discussed is whether one method

is significantly better than another. Thus, although a suggestion can be

made that a particular method of forecasting volatility is the best, no

statement is available about the cost–benefit from using it rather than

something simpler or how far ahead the benefits will occur.

Financial market volatility is clearly forecastable. The debate is on

how far ahead one can accurately forecast and to what extent volatility

changes can be predicted. This conclusion does not violate market

efficiency since accurate volatility forecast is not in conflict with underlying

asset and option prices being correct. The option implied volatility, being

a market-based volatility forecast, has been shown to contain most

information about future volatility. The supremacy among historical time

series models depends on the type of asset being modelled. But, as a

rule of thumb, historical volatility methods work equally well compared

with more sophisticated ARCH class and SV models. Better reward

could be gained by making sure that actual volatility is measured

accurately. These are broad-brush conclusions, omitting the fine details that

we outline in this book. Because of the complex issues involved and the

importance of volatility measure, volatility forecasting will continue to

remain a specialist subject and to be studied vigorously.

11.2 GETTING THE RIGHT CONDITIONAL VARIANCE AND FORECAST WITH THE ‘WRONG’ MODELS

Many of the time series volatility models, including the GARCH mod-

els, can be thought of as approximating a deeper time-varying volatility

construction, possibly involving several important economic explana-

tory variables. Since time series models involve only lagged returns it

seems likely that they will provide an adequate, possibly even a very

good, approximation to actuality for long periods but not at all times.

This means that they will forecast well on some occasions, but less well

on others, depending on fluctuations in the underlying driving variables.

Nelson (1992) proves that if the true process is a diffusion or

near-diffusion model with no jumps, then even when misspecified,

appropriately defined sequences of ARCH terms with a large number of lagged

residuals may still serve as consistent estimators for the volatility of

the true underlying diffusion, in the sense that the difference between

the true instantaneous volatility and the ARCH estimates converges to

zero in probability as the length of the sampling interval diminishes.

Nelson (1992) shows that such ARCH models may misspecify both the

conditional mean and the dynamic of the conditional variance; in fact, even when the

misspecification is so severe that the models make no sense as

data-generating processes, they could still produce consistent one-step-ahead

conditional variance estimates and short-term forecasts.
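
The flavour of this consistency result can be seen in a toy simulation. The sketch below is not Nelson's construction: as a stand-in for a misspecified ARCH filter it uses an equally weighted rolling average of squared returns, and the "true" volatility is a hypothetical smooth deterministic path with no jumps. The window holds about √n returns when there are n returns per unit of time, so it contains more observations but spans less calendar time as sampling becomes finer.

```python
import math
import random

def mean_abs_filter_error(steps_per_year, seed=1):
    """Average |estimated - true| volatility for a rolling squared-return filter."""
    rng = random.Random(seed)
    dt = 1.0 / steps_per_year
    window = int(math.sqrt(steps_per_year))   # lags grow, time span shrinks

    # Hypothetical true volatility path: smooth, strictly positive, no jumps.
    true_sigma = [0.2 + 0.1 * math.sin(2.0 * math.pi * i * dt)
                  for i in range(steps_per_year)]

    # Squared returns over each interval: (sigma * sqrt(dt) * N(0, 1))^2.
    sq_returns = [(s * math.sqrt(dt) * rng.gauss(0.0, 1.0)) ** 2
                  for s in true_sigma]

    # Cumulative sums let each rolling window be evaluated in O(1).
    csum = [0.0]
    for v in sq_returns:
        csum.append(csum[-1] + v)

    errors = []
    for i in range(window, steps_per_year):
        # Annualized variance estimate from the `window` returns before step i.
        var_hat = (csum[i] - csum[i - window]) / (window * dt)
        errors.append(abs(math.sqrt(var_hat) - true_sigma[i]))
    return sum(errors) / len(errors)

coarse_error = mean_abs_filter_error(1_000)     # coarse sampling
fine_error = mean_abs_filter_error(100_000)     # much finer sampling
print("coarse:", coarse_error, "fine:", fine_error)
```

In this setting the average filtering error falls as the sampling interval shrinks, even though the rolling filter is not the data-generating process.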

Nelson and Foster (1995) provide further conditions for such

misspecified ARCH models to produce consistent forecasts over the medium

and long term. They show that forecasts by these misspecified models

will converge in probability to the forecast generated by the true diffusion

or near-diffusion process, provided that all unobservable state variables

are consistently estimated and that the conditional mean and conditional

covariances of all state variables are correctly specified. An example

of a true diffusion process given by Nelson and Foster (1995) is the

stochastic volatility model described in Chapter 6.

These important theoretical results confirm our empirical

observations that under normal circumstances, i.e. no big jumps in prices, there

may be little practical difference in choosing between volatility

models, provided that the sampling interval is small and that whichever

model one has chosen contains sufficiently long lagged residuals.

This might be an explanation for the success of high-frequency and long

memory volatility models (e.g. Blair, Poon and Taylor, 2001; Andersen,

Bollerslev, Diebold and Labys, 2001).

11.3 PREDICTABILITY ACROSS DIFFERENT ASSETS

Early studies that test the forecasting power of option ISD are fraught

with many estimation deficiencies. Despite these complexities, option

ISD has been found empirically to contain a significant amount of

information about future volatility and it often beats volatility forecasts

produced by sophisticated time series models. Such a superior

performance appears to be common across assets.

11.3.1 Individual stocks

Latané and Rendleman (1976) were the first to discover the

forecasting capability of option ISD. They find actual volatilities of 24 stocks

calculated from in-sample period and extended partially into the future

are more closely related to implied than historical volatility. Chiras and

Manaster (1978) and Beckers (1981) find prediction from implied can

explain a large amount of the cross-sectional variations of individual

stock volatilities. Chiras and Manaster (1978) document an R² of

34–70% for a large sample of stock options traded on CBOE whereas

Beckers (1981) reports an R² of 13–50% for a sample that varies from

62 to 116 US stocks over the sample period. Gemmill (1986) produces

an R² of 12–40% for a sample of 13 UK stocks. Schmalensee and

Trippi (1978) find implied volatility rises when stock price falls and that

implied volatilities of different stocks tend to move together. From a

time series perspective, Lamoureux and Lastrapes (1993) and Vasilellis

and Meade (1996) find implied volatility could also predict time series

variations of equity volatility better than forecasts produced from time

series models.

The forecast horizons of this group of studies that forecast equity

volatility are usually quite long, ranging from 3 months to 3 years.

Studies that examine incremental information content of time series

forecasts find volatility historical average provides significant incremental

information in both cross-sectional (Beckers, 1981; Chiras and

Manaster, 1978; Gemmill, 1986) and time series settings (Lamoureux and

Lastrapes, 1993) and that combining GARCH and implied volatility

produces the best forecast (Vasilellis and Meade, 1996). These findings

have been interpreted as evidence of stock option market inefficiency

since option implied does not subsume all information. In general, stock

option implied volatility exhibits instability and suffers most from

measurement errors and bid–ask spread because of the lower liquidity.

11.3.2 Stock market index

There are 22 studies that use index option ISD to forecast stock index

volatility; seven of these forecast volatility of S&P100, ten forecast

volatility of S&P500 and the remaining five forecast index volatility

of smaller stock markets. The S&P100 and S&P500 forecasting results

make an interesting contrast as almost all studies that forecast S&P500

volatility use S&P500 futures options, which are more liquid and less

prone to measurement errors than the OEX stock index option written

on S&P100. We have dealt with the issue of measurement errors in the

discussion of biasedness in Section 10.3.

All but one study (viz. Canina and Figlewski, 1993) conclude that

implied volatility contains useful information about future volatility.

Blair, Poon and Taylor (2001) and Poteshman (2000) record the highest

R² for S&P100 and S&P500 respectively. About 50% of index volatility

is predictable up to a 4-week horizon when actual volatility is estimated

more accurately using very high-frequency intraday returns.

Similar, but less marked, forecasting performance emerged from the

smaller stock markets, which include the German, Australian, Canadian

and Swedish markets. For a small market such as the Swedish market,

Frennberg and Hansson (1996) find seasonality to be prominent and that

implied volatility forecast cannot beat simple historical models such as

the autoregressive model and random walk. Very erratic and unstable

forecasting results were reported in Brace and Hodgson (1991) for the

Australian market. Doidge and Wei (1998) find the Canadian Toronto

index is best forecast with GARCH and implied volatility combined,

whereas Bluhm and Yu (2000) find VDAX, the German version of VIX,

produces the best forecast for the German stock index volatility.

A range of forecast horizons were tested among this group of stud-

ies, though the most popular choice is 1 month. There is evidence that

the S&P implied contains more information after the 1987 crash (see

Christensen and Prabhala (1998) for S&P100 and Ederington and Guan

(2002) for S&P500). Some described this as the ‘awakening’ of the S&P

option markets.

About half of the papers in this group test if there is incremental

information contained in time series forecasts. Day and Lewis (1992),

Ederington and Guan (1999, 2004), and Martens and Zein (2004) find

ARCH class models and volatility historical average add a few

percentage points to the R², whereas Blair, Poon and Taylor (2001), Christensen

and Prabhala (1998), Fleming (1998), Fleming, Ostdiek and Whaley

(1995), Hol and Koopman (2001) and Szakmary, Ors, Kim and Davidson

(2002) all find option implied dominates time series forecasts.

11.3.3 Exchange rate

The strong forecasting power of implied volatility is again confirmed

in the currency markets. Sixteen papers study currency options for a

number of major currencies, the most popular of which are DM/US$

and ¥/US$. Most studies find implied volatility to contain information

about future volatility for a short horizon up to 3 months. Li (2002) and

Scott and Tucker (1989) find implied volatility forecasts well for up to a

6–9-month horizon. Both studies register the highest R² in the region of

40–50%.

A number of studies in this group find implied volatility beats time

series forecasts including volatility historical average (see Fung, Lie and

Moreno, 1990; Wei and Frankel, 1991) and ARCH class models (see