full employment is assumed, equilibrium is realised through price changes. Th
In this model (unlike that in Section 8.3) the exchange rate has a feedback e
domestic price level via equation (13.37). The steady state is defined as $P_t$ =
$Y_t = 1$ and $E_t S_{t+1} = S_t$. Hence from the UIP condition $r_t = 0$ and from (13.38
The solution for the model at time $t$ is found by substitution for $P_t$ from (13.37
and then solving for $(1 + r_t)$ and substitution in the UIP condition (13.39) to
where
x ,  M t y ta p  â€˜ l / â€˜ l + k â€ť
 t1
and $\theta_1$, $\theta_2$ are functions of the other structural parameters (and $0 < \theta_2 < 1$)
spot rate is determined by the fundamentals that comprise $X_t$. The term $E_t($
market's expectation and provides an important nonlinearity in the model. T
expectation is determined by a weighted average of the behaviour of NT an
Section 8.3).
$$E(S_{t+1}/S_{t-1}) = f(S_{t-1}, S_{t-2}, \ldots)^{m_t}\,(S^*_{t-1}/S_{t-1})^{(1-m_t)\alpha}$$
where the weight given to noise traders is
Equations (13.42), (13.43) and (13.40) yield the following nonlinear equat
exchange rate
S , = (Xf)â€™1 (Sf1)â€™1 (St2)Qâ€™ ( ˜ 1  3 ) â€ť (Sr4 144
The above representation assumes an exogenous money supply but we can al
model under the assumption of interest rate smoothing
where $\Psi$ (> 0) measures the intensity of interest rate smoothing. The simula
the exchange rate under interest rate smoothing for a given chaotic solutio
in Figure 13.1 and has the general random pattern that we associate with
exchange rate data. De Grauwe et al. then take this simulated data and test fo
walk (strictly, a unit root).
$$S_t = \alpha S_{t-1} + \varepsilon_t$$
and find that they cannot reject $\alpha = 1$ (for a wide variety of parameters of
which result in chaotic solution). Hence a pure deterministic 'fundamentals
mimic a stochastic process with a unit root.
Figure 13.1 Chaotic Exchange Rate Model with Money. Source: De Grauwe et al. (1
duced by permission of Blackwell Publishers
If covered interest parity holds, the forward premium $(F/S)_t$ is equal to
differential $(1 + r_t)/(1 + r^*_t)$. De Grauwe et al. use the simulated values of
differential as a measure of $(F/S)_t$ and regress the latter on the simulated
$(S_{t+1}/S_t)$:
$$(S_{t+1}/S_t) = a + b\,(F/S)_t$$
They find b < 0 which is also the case with actual real world data. Hence i
model the forward premium is a biased forecast of the change in the spot rate
we have risk neutrality at the level of the market (i.e. UIP holds). The above
resolved when we recognise the heterogeneity of expectations of the NT and SM
domestic interest rates $r_t$ leads to an immediate appreciation of the current spot
domestic currency (i.e. $S_t$ falls as in the Dornbusch model) and via covered int
a rise in $(F/S)_t$. However, if the market is dominated by NT they will extr
current appreciation which tends to lead to a further appreciation in the domest
next period (i.e. $E_t S_{t+1} - S_t$ falls). The domination by extrapolative NT, howe
that on average a rise in $(F/S)_t$ is accompanied by a fall in $E_t S_{t+1} - S_t$. If there
SM in the market then UIP indicates that the spot rate would depreciate (i.e. E
increases) in future periods after a rise in $r_t$ as in the Dornbusch model.
De Grauwe et al. also simulate the model when stochastic shocks are allow
ence the money supply (which is now assumed to be exogenous). The mone
assumed to follow a random walk and the error term therefore represents 'ne
long time period the simulated data for $S_t$ broadly moves with that for the mo
as our fundamentals model would indicate and a simple (OLS cointegrating)
confirms this:
$$S_t = 0.02 + 0.09\,M_t$$
(0.5) (32.8)
In the simulated data the variability in $S_t$ is much greater than that for $M_t$. Th
data is then split into several subperiods of 50 observations each and the
$S_t = \alpha + \beta M_t$ is run. De Grauwe et al. find that the parameters $\alpha$ and $\beta$ are high
data using 50 data points forecasts better than either the random walk or th
model. Thus, data from a chaotic system can be modelled and may yield
forecasts over short horizons. These results provide a prima facie case for th
success of error correction models over static models or purely AR models
world exhibits chaotic behaviour. The static structural regression $S_t = \alpha + \beta M_t$
the long-run fundamentals and the linear AR(3) model mimics the nonlinea
Of course, the (linear) error correction model is not the correct representation
nonlinear chaotic model but it may provide a reasonably useful approximat
finite set of data.
Chaotic models applied to economic phenomena are in their infancy so
expect definitive results at present. However, they do suggest that nonli
economic relationships might be important and the latter may yield chaotic
In addition one can always add stochastic shocks to any nonlinear system
tend to increase the noise in the system. Hence although much of economic
been founded on linear models or linear approximations to nonlinear mode
tractability and closed form solutions, it may be that such approximation
important nonlinearities in behavioural responses (see Pesaran and Potter (1
A key question not yet tackled is whether actual data on exchange rates exh
behaviour. The obvious difficulty in discriminating between a chaotic determini
and a purely linear stochastic process has already been noted. There are s
available to detect chaotic behaviour but they require large amounts of data (i.e
data points) to yield reasonably unambiguous and clear results. De Grauwe et a
several tests on daily exchange rate data but find only weak evidence of chaoti
for the yen/dollar, pound sterling/dollar daily exchange rate data and no evi
for chaotic behaviour in the DM/dollar exchange rate over the 1972-1990 p
may be due to insufficient data or because the presence of any stochastic 'n
the detection of chaos. They then test for the presence of nonlinearities in th
rate data. Using two complementary test statistics (Brock et al, 1987 and Hi
they find that for six major bilateral rates, they could not reject the existence o
structures for any of the bilateral rates using daily and weekly returns (i.e. the
change in the exchange rate) and for monthly data they only reject nonline
case, namely the pound sterling/yen rate. Of course, nonlinearity is necessary
behaviour but it is not sufficient: the presence or otherwise of chaotic behavio
on the precise parameterisation of the nonlinear relationship. Also the above
tell us the precise form of the nonlinearity in the dynamics of the exchange
that some form or other of nonlinearity appears to be present. We are still l
somewhat Herculean task of specifying and estimating a nonlinear structur
the exchange rate.
13.7 SUMMARY
It is important that the reader is aware of the attempts that have been made
movements in spot exchange rates in terms of economic fundamentals, not le
performance. At present it would appear to be the case that formal tests of
models lead one to reject them. Over short horizons, say up to about one yea
fundamentals generally do not help predict changes in the spot rate. Over long
of four years fundamentals do provide some predictive power for some curren
1995). The latter is consistent with the view that purchasing power parity and t
for) money-income nexus hold in the long run (i.e. the relevant variables are co
However, on balance it must be recognised that there exists a great deal of
regarding the underlying determinants of the spot exchange rate in industrialise
with moderate inflation. The concepts and ideas which underlie these models
UIP, relative money supplies) do still play a role in guiding policy maker
because they have little else to go on other than their 'hunch' about the
interest rate-exchange rate nexus to apply in particular circumstances. The
firm policy implications which arise because of the statistical inadequacy o
models has resulted in policy makers trying to mitigate the severity of wide sw
real exchange rate by coordinated Central Bank intervention, a move toward
zones and even the proposal to adopt a common currency.
The lack of success of 'pure fundamentals' models in explaining movements
exchange rate has recently led researchers to consider nonlinear models that
in chaotic behaviour. The underlying theory in these models usually involves b
traders and some form of noise-trader behaviour. They are not models that
agents being rational and maximising some well-defined objective function. N
their ad hoc assumptions are usually plausible. The main conclusions to emerg
literature as applied to exchange rates are:
• Deterministic nonlinear models are capable of generating apparently
  random and irregular time series which broadly resemble those in real w
• The addition of 'news' or random 'shocks' provides additional random im
• The empirical evidence on exchange rates is ambivalent on the presence
  (deterministic) chaos. However, it does quite strongly indicate the presen
  linearities in the data generation process.
• The challenge now is to produce coherent theoretical models that are
  generating chaotic or behavioural nonlinear models that are subject to rand
• Nonlinear structural (fundamental) models need not necessarily rule out
  expectations. However, models involving the interaction of NT and SM,
  not fully (Muth) rational, also seem worthy of further analysis.
• Nonlinear theoretical models will need to be tested against the data an
  provide further econometric challenges.
The problem with chaotic models and indeed nonlinear stochastic models is
small changes in parameter values can lead to radically different behaviou
casts for the exchange rate. Given that econometric parameter estimates from
models are frequently subject to some uncertainty, the range of possible fore
such models becomes potentially quite large. If such models are not firmly
FURTHER READING
Macroeconomics texts such as Cuthbertson and Taylor (1987) and Burda a
(1993) provide an overview of theories of the determination of spot exchan
an intermediate level. Other intermediate texts which deal exclusively wit
rates are Copeland (1994) who also has a useful chapter on chaos, and
(1988) who provides copious references to the empirical literature. MacDonald
(1992) and Taylor (1995) in their survey articles concentrate on what might
described as macroeconomic models of the exchange rate. Froot and Thaler (1
the anomalies literature in the FOREX market while De Grauwe et al. (199
accessible account of chaos theory applied to the FOREX market.
PART 5
Tests of the EMH using the VAR Methodology
Tests of the EMH in the FOREX market and for stocks and bonds outline
chapters have in part been concerned with the informational efficiency assumpt
by rational expectations. Empirical work centres on whether information at time
can be used to predict returns and hence enable investors to earn abnormal
variance bounds approach tests whether asset prices equal their fundamental
anomalies literature frequently seeks to test for the existence of profitable tr
in the market. All of these tests are, of course, conditional on a specific econo
(or ad hoc hypothesis) concerning the determination of equilibrium expected
This part of the book examines some relatively sophisticated tests of the EM
to stocks, bonds and the FOREX market which have recently appeared in th
The generic term for these tests is the 'VAR methodology'. Some readers will
aware that tests of the informational efficiency requirement of the EMH resul
tions on the parameters of the model under investigation (see Chapter 5). Th
this instance usually consists of a hypothesis about equilibrium returns plus
forecasting equation (or equations) which is used to mimic the predictions fo
servable rational expectations of agents. Early tests of these cross-equation
were undertaken by estimating the model both with and without the restrictio
and then seeing how far the 'fit' of the model deteriorated when the rest
imposed. A likelihood ratio test was often invoked to provide the actual metric
statistic. The VAR methodology tests these same RE restrictions but, as we sha
only estimate the unrestricted model and this is usually computationally much
In testing the EMH on asset prices in previous chapters we have used varia
inequalities. In the VAR methodology these variance inequalities are replaced
equalities. For example, consider the expectations hypothesis EH of the term
Using the VAR methodology we can obtain a best predictor of future chang
term interest rates $E_t \Delta r_{t+j}$: call this prediction $S'_t$. Under the EH + RE the
$S'_t$ should mimic movements in the actual spread between long and short rates
the variance of $S_t$ should equal the variance of the best forecast $S'_t$. Henc
methodology provides two 'metrics' for evaluating the PEH + RE, first a se
equation) restrictions on the estimated parameters of the VAR prediction equ
asset price levels.
So far we have concentrated on the econometrics of the tests using the VA
ology. However, it is important to note that the (cross-equation) parameter
referred to above do have an economic interpretation. Only if these restriction
the case that investors make zero abnormal profits and that the (RE) forecas
independent of the information set used by agents. The latter conditions are o
the heart of the EMH and informational efficiency.
Some readers may find the material in Chapter 14 more difficult than tha
chapters. However, it is my contention that the material is not analytica
although the algebra sometimes seems a little voluminous and on first read
be difficult to see the wood for the trees. The latter is in part due to a desi
the reader from simple specific cases to the more complex general case which
the burgeoning literature in this area. Your perseverance will have a high pay
understanding the literature in this area. To aid the exposition a simplified o
the main analytic issues is presented first.
Overview
To throw some light on where we are headed in relation to tests already d
earlier chapters consider, by way of example, our much loved RVF for stoc
and the expectations hypothesis for the long rate $R_t$ (using spot yields):
$$P_t = V_t = \sum_{i=1}^{\infty} \gamma^i E_t D_{t+i}, \qquad \gamma = 1/(1+k) \qquad (1)$$
$$R_t = TV_t = \frac{1}{n}\sum_{i=0}^{n-1} E_t r_{t+i} \qquad (2)$$
The 'fundamental value' $V_t$ denotes the DPV of expected future dividends and
rate $\gamma$ is assumed constant ($0 < \gamma < 1$). $TV_t$ is the expected terminal value of
rolled-over one-period investment in short bonds. Under the EMH the actual
equals $V_t$ and the long rate equals $TV_t$, otherwise profitable (risky) trades a
and abnormal profits can be made.
The regression-based tests and the variance bounds tests of the EMH
applying the RE assumption:
to (1) and (2), respectively, where $\eta_{t+1}$ and $v_{t+1}$ represent 'news' or 'surprises'
equations imply that stock prices only change because of the arrival of new
between $t$ and $t + 1$ hence stock returns (i.e. basically the change in stock
unforecastable using information available at time $t$ or earlier ($\Omega_t$). Similarly
(2) and (4) imply that the excess return from investing in the long bond ra
into the RHS of (1) so we have
$$P^*_t = P_t + \omega_t$$
where $P^*_t$ is the perfect foresight price calculated using ex-post actual valu
Since by RE, $P_t$ is uncorrelated with $\omega_t$, yet var($\omega_t$) must be greater than zer
the usual variance bounds inequality. A similar argument applies to the ter
equation (2). Hence the variance bounds tests do not seek to provide explicit
schemes for the unobservable expectations $E_t D_{t+i}$ and $E_t r_{t+i}$ but merely assum
are unbiased and that forecast errors are independent of $\Omega_t$. (This is the 'errors i
approach which will be familiar to econometricians.)
In contrast, the methodology in Chapter 14 seeks to provide explicit
equations for $D_{t+i}$ and $r_{t+i}$ based on regressions using a limited informat
These forecasting equations we may term weakly rational since they do not
use the full information set $\Omega_t$ as used by agents. However, given explicit
dividends and assuming we know $\gamma$ then we can calculate the RHS of (1) a
an explicit forecast for $V_t$ which we denote $V'_t$. For ease of exposition it is u
point to assume the econometrician has discovered the 'true' forecasting mod
used by agents, namely an AR(1) process:
By the chain rule of forecasting:
$$E_t D_{t+j} = \alpha^j D_t$$
and the best forecast of the DPV of future dividends using the true informatio
Knowing $\gamma$ and having an estimate of $\alpha$ from the regression (6) a time series
be constructed. (As we shall see $P'_t$ is referred to as the 'theoretical price': it
estimate of the DPV of 'fundamentals' given by our theoretical valuation mo
RVF and (6) are 'true' then we expect (i) $P_t$ and $P'_t$ to move together over
be highly correlated, (ii) the variance ratio, defined as:
$$VR = \mathrm{var}(P_t)/\mathrm{var}(P'_t)$$
to equal unity and (iii) in the regression
$$P_t = \beta_0 + \beta_1 P'_t + \varepsilon_t$$
we expect $\beta_0 = 0$, $\beta_1 = 1$. By positing the 'true' expectations scheme for D
mating this relationship, we have been able to move from a variance bounds in
a relationship between $P_t$ and $P'_t$ based on a variance equality. The relationsh
$P_t$ and $P'_t$ allows the three 'metrics' in (i) to (iii) to be used to assess the val
EMH based on the RVF plus RE.
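The three 'metrics' can be illustrated numerically. The sketch below is not taken from the text: it assumes dividends follow a zero-mean AR(1) process with hypothetical values $\alpha = 0.5$ and $\gamma = 0.9$, sets the actual price equal to the DPV of dividends evaluated at the true $\alpha$, and builds the theoretical price from the OLS estimate of $\alpha$. It uses the closed form $\sum_{i \ge 1} \gamma^i \alpha^i D_t = [\gamma\alpha/(1 - \gamma\alpha)] D_t$, which follows from (1) and the chain rule of forecasting. All function names are illustrative.

```python
import random

def simulate_ar1(alpha, n, seed=0):
    """Simulate D_{t+1} = alpha * D_t + eps_{t+1} with standard normal shocks."""
    rng = random.Random(seed)
    d = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        d.append(alpha * d[-1] + rng.gauss(0.0, 1.0))
    return d

def estimate_alpha(d):
    """OLS slope (no constant) from regressing D_{t+1} on D_t."""
    num = sum(x1 * x0 for x0, x1 in zip(d[:-1], d[1:]))
    den = sum(x0 * x0 for x0 in d[:-1])
    return num / den

def dpv_coefficient(alpha, gamma):
    """Sum over i >= 1 of (gamma * alpha)^i, i.e. the DPV per unit of D_t."""
    return gamma * alpha / (1.0 - gamma * alpha)

def variance(x):
    m = sum(x) / len(x)
    return sum((v - m) ** 2 for v in x) / len(x)

def correlation(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (variance(x) * variance(y)) ** 0.5

# Hypothetical parameter values chosen only for illustration.
gamma, alpha_true = 0.9, 0.5
dividends = simulate_ar1(alpha_true, n=5000, seed=1)

# 'Actual' price: the RVF holds with the true alpha.
actual_price = [dpv_coefficient(alpha_true, gamma) * d for d in dividends]

# 'Theoretical' price: the econometrician uses the estimated alpha.
alpha_hat = estimate_alpha(dividends)
theoretical_price = [dpv_coefficient(alpha_hat, gamma) * d for d in dividends]

vr = variance(actual_price) / variance(theoretical_price)
rho = correlation(actual_price, theoretical_price)
print(f"alpha_hat={alpha_hat:.3f}  VR={vr:.3f}  corr={rho:.3f}")
```

In this stylised setting both prices are proportional to $D_t$, so the correlation is unity by construction and the variance ratio departs from unity only through sampling error in the estimate of $\alpha$.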
dends and (6) will be an approximation. In this case we do not expect $P_t$ t
forecast $P'_t$ exactly. The reason is that the equation chosen by the econom
forecast $D_{t+i}$ may be based on a limited information set ($\Lambda_t \subset \Omega_t$) and th
econometrician's forecasting equation will not equal the true (rational) expect
cast, as formulated by investors. Clearly the closer (6) is to the 'true' equatio
we expect the conditions (i)-(iii) to hold. Also, even if the econometricia
form of the true model to forecast $D_{t+j}$ his estimates would be subject to sam
and hence the conditions (i)-(iii) would hold, for any given sample, only with
statistical confidence limits.
It was stated above that conditions (i)-(iii) will hold even with a limited
set. In fact, this is only true if the forecasting equation for $D_{t+j}$ depends on P
term structure model if forecasts of $r_{t+j}$ depend on $R_t$. The key element in ob
results (i)-(iii) with a limited information set is that the LHS variable in the
efficient markets relationship (e.g. $P_t$ or $R_t$) is used in the forecasting equat
RHS variables (i.e. $D_{t+j}$ or $r_{t+j}$). This then constitutes a VAR system wh
analysed in much of the rest of this section of the book.
From AR to VAR
Let us return to our story of trying to predict $r_{t+i}$ and assume that $r_{t+1}$ de
and $R_t$:
$$r_{t+1} = \beta_1 r_t + \beta_2 R_t + \varepsilon_{t+1} \qquad (11)$$
Having estimated (11) we now wish to use it to forecast $E_t r_{t+2}$ so that we ca
this value in the RHS of (2) in order to calculate $TV_t$. We see that
The values of $(r_t, R_t)$ are known at time $t$; however we do not have a value
and hence we cannot as yet obtain an explicit forecast for $E_t r_{t+2}$. What w
forecasting equation for $R_{t+1}$ and so assume (somewhat arbitrarily) this is o
general form as that for $r_{t+1}$ in (11), that is:
Now, having estimated (13) we can obtain an expression for $E_t R_{t+1}$ in terms
known at time $t$, namely:
$$E_t R_{t+1} = \psi_1 r_t + \psi_2 R_t \qquad (14)$$
Equations (11) and (14) taken together are known as a vector autoregression
order 1) and they allow one to calculate all future values of $E_t r_{t+i}$ to input
econometrician uses only a limited information set $\Lambda_t$ ($\subset \Omega_t$). The reason is
depends on $R_t$ (as well as $r_t$). Hence all future values of $r_{t+j}$ depend on $R_t$ (
is $TV'_t = f_1(\beta, \psi)R_t + f_2(\beta, \psi)r_t$ where $f_1$ and $f_2$ depend on the estimated
of the VAR. The expectations hypothesis then implies $R_t = TV'_t$ which can
if $f_1(\beta, \psi) = 1$ (and $f_2(\beta, \psi) = 0$). But if $f_1 = 1$ (and $f_2 = 0$) then $TV'_t$
naturally must move one-for-one with actual $R_t$. Hence even with a limited info
we expect the correlation between $R_t$ and $TV'_t$ to equal unity and VR = var($R_t$
to also equal unity. This basic insight is elaborated below.
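The weights $f_1$ and $f_2$ can be computed mechanically from the VAR coefficients. The following is a minimal sketch, assuming the first-order VAR in $(r_t, R_t)$ described above and the averaging form of the expectations hypothesis, $TV'_t = (1/n)\sum_{i=0}^{n-1} E_t r_{t+i}$. The function names and the numerical coefficient values are hypothetical and chosen only so that the restrictions hold for $n = 2$.

```python
def mat_mul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def terminal_value_weights(beta1, beta2, psi1, psi2, n):
    """Return (f1, f2) such that TV'_t = f1 * R_t + f2 * r_t.

    The state vector is ordered (r_t, R_t), so E_t (r, R)_{t+i} = B^i (r, R)_t
    and the first row of B^i gives the weights of E_t r_{t+i} on (r_t, R_t).
    """
    b = [[beta1, beta2], [psi1, psi2]]
    power = [[1.0, 0.0], [0.0, 1.0]]      # B^0 is the identity matrix
    w_r, w_R = 0.0, 0.0
    for _ in range(n):
        w_r += power[0][0] / n            # weight on r_t from E_t r_{t+i}
        w_R += power[0][1] / n            # weight on R_t from E_t r_{t+i}
        power = mat_mul(power, b)
    return w_R, w_r

# Hypothetical coefficients: with n = 2, r_{t+1} = -r_t + 2 R_t is exactly the
# forecasting rule implied by R_t = (r_t + E_t r_{t+1}) / 2.
f1, f2 = terminal_value_weights(beta1=-1.0, beta2=2.0, psi1=0.3, psi2=0.6, n=2)
print(f1, f2)
```

With these illustrative values $f_1 = 1$ and $f_2 = 0$, so $TV'_t$ moves one-for-one with $R_t$ whatever the values of $\psi_1$ and $\psi_2$; for other coefficient values the weights generally differ from (1, 0) and the restrictions fail.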
Turning briefly to the RVF for stock prices, note that if we allow both
discount factor $\gamma_t = 1/(1 + k_t)$ to vary over time then a VAR in $D_t$ and $k_t$ is
forecast all future values of $E_t D_{t+i}$ and $E_t k_{t+i}$ so that we can calculate our fo
the RHS of (1). There is an additional problem, namely the RHS of (1) is n
$k_t$ and $D_t$. As we shall see in Chapter 16, linearisation of the RVF provides a
this technical problem.
There is another way that the VAR methodology can be used to test the EM
involves so-called cross-equation restrictions between the parameters of each
equations (11) and (14), respectively. However, a brief explanation of these
possible here and they are discussed below.
If the reader thought at the end of the last chapter that he had probably a
given a near exhaustive (or exhausting!) set of tests of the various versions o
then it must be apparent from the above that he is sadly mistaken. These 'new'
on the VAR methodology have some advantages over the variance bounds tes
based on the predictability of returns. But the VAR methodology also has som
drawbacks. It does not explicitly deal with time varying risk premia (this is
of Chapter 17). It relies on explicit (VAR) equations to generate expectations
linear with constant parameters and which are assumed to provide a good app
to the forecasts actually used by investors. In contrast, all the variance b
require is the somewhat weaker assumption that whatever forecasting schem
forecast errors are independent of $\Omega_t$. Hence to implement the variance bound
econometrician/researcher does not have to know the explicit forecasting mode
used by investors.
We discuss the VAR methodology first with respect to the bond market in
since concepts and algebra are simpler than for FOREX and stock markets
discussed in the following chapters. At the outset the reader should note tha
the above examples of the VAR methodology have been conducted in terms o
of the stock price and dividends and the levels of the long rate and short ra
EH, in actual empirical work, transformations of these variables are used in o
and ensure stationarity of the variables. In the case of bonds we have alread
the EH equation (2) can be rewritten in terms of the long-short spread $S_t$ =
changes in the short rates $\Delta r_{t+j}$. For stocks the RVF is nonlinear in dividen
time varying discount factor and therefore the transformation is a little mor
involving the (logarithm of the) dividend price ratio and the (logarithm of the
dividends. These issues are dealt with at the appropriate point in each chapter
The Term Structure and the Bond Market
This chapter outlines a series of procedures used in testing the EMH in the b
using the VAR methodology. For the bond market the VAR algebra is some
than for the stock market but some readers may still find the material difficu
be made of the simplest cases to illustrate points of interest: extending the ana
general case is usually straightforward but it involves even more tedious an
algebra. In the rest of the chapter we discuss:
• How cross-equation parameter restrictions arise from the EMH and RE.
• The relationship between the likelihood ratio test and the Wald test of
  using the VAR methodology.
• How the parameter restrictions ensure that (i) investors do not systemati
  abnormal profits and (ii) that investors (RE) forecast errors are independe
  mation at time t used by agents in making the forecasts.
• Illustrative empirical results at the short and long end of the maturity sp
  government bonds, using the VAR methodology are presented.
Beginning with the simplest model of the EH of the term structure, the forecast
for the short rate is assumed to be univariate. The analysis is repeated fo
autoregressive system (VAR) in terms of the spread $S_t = R_t - r_t$ and the chan
rates $\Delta r_t$. It is then shown how the cross-equation parameter restrictions can be
in matrix notation, so that any VAR equation can be used to provide forec
theoretical spread $S'_t$ and hence allow the comparison between $S'_t$ and the ac
$S_t$ using variance ratios and correlation coefficients. Finally, a brief survey o
work in this area is examined.
14.1 CROSS-EQUATION RESTRICTIONS AND
INFORMATIONAL EFFICIENCY
The pure expectations hypothesis (PEH) implies that the two-period interes
given by:
$$R_t = (r_t + r^e_{t+1})/2$$
AR Forecasting Scheme
Using (14.2) a test of the PEH + RE is possible if we assume a weakly ratio
tations generating equation for $\Delta r_{t+1}$ which depends only on its own past
example:
$$\Delta r_{t+1} = a_1 \Delta r_t + a_2 \Delta r_{t-1} + \omega_{t+1} \qquad (14.3)$$
(where we exclude a constant term to simplify the algebra). Agents are assu
the limited information set $\Lambda_t = (\Delta r_t, \Delta r_{t-1})$ to forecast future changes in in
$$\Delta r^e_{t+1} = a_1 \Delta r_t + a_2 \Delta r_{t-1} \qquad (14.4)$$
and the forecast error $\omega_{t+1} = \Delta r_{t+1} - \Delta r^e_{t+1}$ is independent of $\Lambda_t$ under RE. W
that under the null that the PEH + RE is true then equations (14.2) and (14
cross-equation restriction. Substituting (14.4) in (14.2)
If the PEH is true then (14.3) and (14.5) are true and it can be seen that the
in these two equations are not independent. 'By eye' one can see, for ex
the ratio of the coefficients on $\Delta r_t$ and $\Delta r_{t-1}$ in each equation are a_i/(a_i/
i = 1, 2). Consider the joint estimation of (14.3) and (14.5) without any rest
the parameters:
$$S_t = \pi_1 \Delta r_t + \pi_2 \Delta r_{t-1} + v_{t+1}$$
If the PEH + RE is true then (14.3) and (14.5) hold and hence we expect
$$\pi_1 = a_1/2, \qquad \pi_2 = a_2/2$$
An error term has been added to the unrestricted equation (14.6) for the spread.
either because $S_t$ might be measured with error or given our limited informatio
$\Lambda_t$), the error term picks up the difference between forecasts by the econometr
on $\Lambda_t = (r_t, r_{t-1})$ and the true forecasts which use the complete information
The econometrician obtains estimates of four coefficients $\pi_1$ to $\pi_4$ but the
two underlying parameters, $a_1$, $a_2$, in the model. Hence there are two implicit
in the model and from (14.8) these are easily seen to be:
$$\pi_3/\pi_1 = 2 = \pi_4/\pi_2$$
These restrictions can be tested by comparing the log-likelihood values from
of the unrestricted equations (14.6) and (14.7) in which the coefficients $\pi_i$
while the form of equation (14.6) is unchanged:
$$S_t = \pi_1 \Delta r_t + \pi_2 \Delta r_{t-1} + v_{t+1}$$
The likelihood ratio test compares the 'fit' of the unrestricted two-equation s
that of the restricted system. The variance-covariance matrix of the unrestric
(assuming each error term is white noise but they may be a contemporaneous
i.e. $\sigma_{\omega v} \neq 0$) is:
$$\Sigma = \begin{bmatrix} \sigma^2_{\omega} & \sigma_{\omega v} \\ \sigma_{\omega v} & \sigma^2_{v} \end{bmatrix} \qquad (14.12)$$
The variances and covariances of the error terms are calculated from the res
each equation (e.g. $\hat\sigma^2_{\omega} = \sum \hat\omega^2_t/n$, $\hat\sigma_{\omega v} = \sum \hat\omega_t \hat v_t/n$). The restricted system has
matrix $\Sigma_r$ of the same form as (14.12) but the variances and covariances are
using the residuals from the restricted regressions (14.10) and (14.11), that is
respectively.
The likelihood ratio test is computed as
$$LR = n \ln[(\det \Sigma_r)/(\det \Sigma_u)]$$
where $n$ = number of observations and 'det' indicates the determinant of
ance matrix
$$\det \Sigma = \sigma^2_{\omega}\sigma^2_{v} - (\sigma_{\omega v})^2$$
If the restrictions hold in the data then we do not expect much change in th
and hence $\det[\Sigma_r] \approx \det[\Sigma_u]$ so that $LR \approx 0$. Conversely, if the data do not c
the restrictions we expect the 'fit' to be worse and for the restricted residuals
(on average) than their equivalent unrestricted counterparts. Hence $[\sigma^2]_r > [\sigma^2]_u$,
$$\det[\Sigma_r] > \det[\Sigma_u]$$
and LR will be large. It may be shown that LR is distributed with a (c
squared distribution ($\chi^2$) under the null, with $q$ degrees of freedom (where $q$ =
parameter restrictions). Thus we reject the null if $LR > \chi^2_c(q)$ where $\chi^2_c$ is the cri
(For a formal derivation of the likelihood ratio test see Harvey (1981) and C
et al. (1992)).
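Once the two sets of residuals are available, the statistic itself is a short calculation. The sketch below assumes the residuals from the unrestricted and restricted systems have already been obtained (estimating the restricted system is not shown), computes each covariance matrix as sums of squares and cross-products divided by $n$ as in the text, and compares LR with 5.991, the 5 percent critical value of $\chi^2$ with two degrees of freedom. The residual values and function names are hypothetical.

```python
import math

def residual_covariance(v, w):
    """2x2 covariance matrix of two residual series, using divisor n."""
    n = len(v)
    s_vv = sum(a * a for a in v) / n
    s_ww = sum(b * b for b in w) / n
    s_vw = sum(a * b for a, b in zip(v, w)) / n
    return [[s_vv, s_vw], [s_vw, s_ww]]

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def likelihood_ratio(unrestricted, restricted):
    """LR = n * ln(det Sigma_r / det Sigma_u).

    Each argument is a pair (spread-equation residuals, rate-equation residuals).
    """
    n = len(unrestricted[0])
    det_u = det2(residual_covariance(*unrestricted))
    det_r = det2(residual_covariance(*restricted))
    return n * math.log(det_r / det_u)

CHI2_5PCT_2DF = 5.991  # 5% critical value, chi-squared with q = 2 restrictions

# Hypothetical residuals, for illustration only.
v_u = [0.10, -0.20, 0.05, 0.15, -0.10]
w_u = [0.30, -0.10, -0.20, 0.10, -0.10]
# Pretend that imposing the restrictions inflates every residual by 10%.
v_r = [1.1 * x for x in v_u]
w_r = [1.1 * x for x in w_u]

lr = likelihood_ratio((v_u, w_u), (v_r, w_r))
print(f"LR = {lr:.3f}, reject at 5%: {lr > CHI2_5PCT_2DF}")
```

Because the determinant scales with the fourth power of a common inflation factor in a two-equation system, even a small deterioration in fit produces a large LR once $n$ is of realistic size.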
Interpretation of the RE Restrictions
The two estimated equations (14.10) and (14.11) with the restrictions im
easily seen to be consistent with the zero abnormal profit condition. Unde
the (abnormal) profit AP from investing long rather than short is:
For the expected abnormal profit to be zero we require:
= $2E(S_t \mid \Lambda_t) - E(\Delta r_{t+1} \mid \Lambda_t)$
the expected profit, conditional on $\Lambda_t$ is zero. Using the restricted equation
is also easy to see that $S_t = \Delta r^e_{t+1}/2$ regardless of the particular values of t
another way, the restrictions on the $\pi_i$s are such that the PEH always holds. Of
we used the unrestricted equation (14.6) to forecast $\Delta r_{t+1}$ then since the latt
on $\pi_3$ and $\pi_4$, the above relationships would not hold.
The above cross-equation restrictions may be given further intuitive appea
that they also imply that the (RE) forecast error is independent of the limited i
set assumed, that is they enforce the error orthogonality property of RE. T
error is:
$$\Delta r_{t+1} - E(\Delta r_{t+1} \mid \Lambda_t) = (\pi_3 \Delta r_t + \pi_4 \Delta r_{t-1} + \omega_{t+1}) - 2S_t$$
where we have assumed the PEH hypothesis (14.2) holds. Substituting from (1
$$\Delta r_{t+1} - \Delta r^e_{t+1} = (\pi_3 - 2\pi_1)\Delta r_t + (\pi_4 - 2\pi_2)\Delta r_{t-1} + (\omega_{t+1} - 2v_{t+1})$$
Hence the expected value of the forecast error will not be independent of inf
time t or earlier unless:
$$\pi_3 - 2\pi_1 = \pi_4 - 2\pi_2 = 0$$
But the above are just a simple rearrangement of the cross-equation restricti
In fact this example is so simple algebraically that the restrictions in (14.9)
equation' but they are not nonlinear.
Cross-Equation Restrictions: Addition of the Spread
A dispassionate commentator might remark that the AR forecasting schem
is very simple and that investors might use many more variables than this
future interest rates. The Campbell-Shiller VAR methodology recognises this
to the most obvious additional variable that investors might use. Given that in
supposed to believe in the PEH, then from (14.2) a variable that should b
predicting $\Delta r_{t+1}$ should be the spread $S_t$ itself. Hence, we now augment (14
and obtain a new expectations generating equation:
where $\pi_3 = a_1$, $\pi_4 = a_2$, $\pi_5 = b_1$. Proceeding as before and substituting the f
$\Delta r_{t+1}$ in the PEH equation (14.2), we obtain:
Wald Test
Campbell and Shiller note a very straightforward way of estimating the restrictio
in the model. Since $S_t$ appears on both sides of (14.22), then the equation can
(for all values of the variables) if
$$1 = b_1/2, \quad 0 = a_1/2, \quad 0 = a_2/2$$
(14.21), namely $b_1 = 2$, $a_1 = a_2 = 0$ (or $\pi_5 = 2$, $\pi_3 = \pi_4 = 0$). In this simp
we have linear restrictions so the Wald test is the same as a t test on a se
cients. The benefit in using the Wald test is that one doesn't have to run an
regression on the restricted equation system as we did when we performed the
ratio test(3).
Essentially the Wald test implies that only $S_t$ is useful in forecasting $\Delta r$
coefficient in (14.21) should equal 2. This is because the PEH (14.2) impli
optimal forecast of $\Delta r_{t+1}$. Hence any additional variables in the expectations
equation (14.21), namely $\Delta r_t$, $\Delta r_{t-1}$, should be zero as suggested by the Wal
It is also straightforward to show that the restrictions imply that the fo
$\Delta r_{t+1} - \Delta r^e_{t+1}$ is independent of all information at time t or earlier (use equa
and (14.21)) and that expected (abnormal) profits $E_t(AP) = E(\Delta r_{t+1}) - 2$
(equation (14.21) says it all).
Summary: Cross-Equation Restrictions
(i) By assuming an expectations generating equation for $\Delta r_{t+1}$ one can te
hypothesis of the PEH + RE by running both the unrestricted and (t
restricted equations and applying a likelihood ratio test. The disadvan
procedure (in more complex cases) is that the restricted equation has to
lated algebraically and then estimated: sometimes this can be rather
implement.
(ii) The Campbell-Shiller methodology uses a Wald test which only requires
mate the unrestricted parameter estimates in the expectations generatio
(iii) Whichever test is used the (nonlinear) restrictions on the $\pi_i$ ensure that (
(abnormal) profits are zero, (b) the PEH holds, that is $S_t = \Delta r^e_{t+1}/2$ at al
(c) that the forecast error for $\Delta r_{t+1}$ is independent of the variables in the
set at time t or earlier.
(iv) In the above tests one has to posit an explicit expectations generating e
$\Delta r_{t+1}$, and if the latter is incorrectly specified then tests of the PEH + RE
fail not because the PEH is incorrect, but because agents use a different
scheme for $\Delta r_{t+1}$, resulting in biased parameter estimates.
It is interesting to compare the above tests involving crossequation restri
a direct test of the PEH which we discussed in Chapter 10 which only i
unbiasedness and orthogonality properties of RE
+ qr+l
rr+l = E,rr+l
Substituting in the term structure relationship (14.2) and adding 52, gives
+ cS2, +
S = a + bS,
: qf+l
where S = Art+1/2 is the â€˜perfect foresight spreadâ€™. Under the null of PEH
T
expect a = c = 0 and b = 1. The interesting contrast between these two type
priate form for the equation for Arr+1 a direct test based on (14.25) may
a positive feature in testing the PEH. However, using only a single equation may result in some loss of 'statistical efficiency' compared with estimating a two-equation system and using the cross-equation tests or the Wald test approaches. These issues are explored further in Section 14.3 when we discuss conflicting results from empirical work in this area.
14.2 THE VAR APPROACH
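In the previous example we only had to make a one-period ahead forecast of $\Delta r_{t+1}$. When multiperiod forecasts are required we need an equation to forecast future values of the spread $S_t$. The latter can be done by using the Campbell-Shiller vector autoregression (VAR) approach which involves matrix notation. As before, this approach is illustrated using a simple example. The PEH applied to a three-period horizon gives:

$$R_t = \tfrac{1}{3}(r_t + r^e_{t+1} + r^e_{t+2})$$

which may be reparameterised to give:

$$S_t = \tfrac{2}{3}\Delta r^e_{t+1} + \tfrac{1}{3}\Delta r^e_{t+2} \qquad (14.27)$$

where $S_t = R_t - r_t$ is the long-short spread, $\Delta r^e_{t+1} = (r^e_{t+1} - r_t)$ and $\Delta r^e_{t+2} = (r^e_{t+2} - r^e_{t+1})$. Now assume that both $S_t$ and $\Delta r_t$ may be represented as a bivariate vector autoregression of order one (for simplicity):

$$S_{t+1} = a_{11}S_t + a_{12}\Delta r_t + w_{1t+1} \qquad (14.28)$$
$$\Delta r_{t+1} = a_{21}S_t + a_{22}\Delta r_t + w_{2t+1} \qquad (14.29)$$

or in vector notation:

$$z_{t+1} = Az_t + w_{t+1} \qquad (14.30)$$

where $z_{t+1} = (S_{t+1}, \Delta r_{t+1})'$, A is the (2 x 2) matrix of coefficients $a_{ij}$, and $w_{t+1} = (w_{1t+1}, w_{2t+1})'$. From (14.30) the optimal prediction of future z's using the chain rule for forecasting is:

$$E_t z_{t+j} = A^j z_t$$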
Now let $e1' = (1, 0)$ and $e2' = (0, 1)$ be 2 x 1 selection vectors. It follows that:

$$S_t = e1'z_t$$
$$E_t\Delta r_{t+1} = e2'E_t z_{t+1} = e2'Az_t$$
$$E_t\Delta r_{t+2} = e2'E_t z_{t+2} = e2'A^2 z_t$$

Substituting the above in the PEH equation (14.27):

$$e1'z_t = e2'\left(\tfrac{2}{3}A + \tfrac{1}{3}A^2\right)z_t$$

and since this must hold for all $z_t$:

$$f(a) = e1' - e2'\left(\tfrac{2}{3}A + \tfrac{1}{3}A^2\right) = 0 \qquad (14.37)$$

where f(a) has been defined as the set of restrictions. Hence a test of the PEH plus the forecasting scheme represented by the VAR simply requires one to estimate the unrestricted VAR equations and apply a Wald test based on the restrictions in (14.37).
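As a rough illustration of the restriction vector just described, the sketch below evaluates $f(a) = e1' - e2'(\tfrac{2}{3}A + \tfrac{1}{3}A^2)$ for a 2 x 2 coefficient matrix. The function name and the numerical values are hypothetical, and the ordering $z_t = (S_t, \Delta r_t)'$ is assumed; the first matrix is deliberately constructed so that the restrictions hold, the second is arbitrary.

```python
import numpy as np

def peh_restrictions_3period(A):
    """Return f(a) = e1' - e2'((2/3)A + (1/3)A^2) for a 2x2 VAR(1) matrix A.

    Assumed ordering: z_t = (S_t, dr_t)', so e1 selects the spread and
    e2 selects the change in the short rate.
    """
    A = np.asarray(A, dtype=float)
    e1 = np.array([1.0, 0.0])
    e2 = np.array([0.0, 1.0])
    weights = (2.0 / 3.0) * A + (1.0 / 3.0) * (A @ A)
    return e1 - e2 @ weights

# Hypothetical coefficients: pick a21, a22 freely, then solve the two
# restrictions for a11 and a12 so that f(a) = 0 by construction.
a21, a22 = 1.5, 0.2
a11 = 3.0 / a21 - 2.0 - a22
a12 = -(2.0 * a22 + a22**2) / a21
A_restricted = np.array([[a11, a12], [a21, a22]])

f_restricted = peh_restrictions_3period(A_restricted)

# An arbitrary (hypothetical) matrix that does not satisfy the restrictions.
f_unrestricted = peh_restrictions_3period(np.array([[0.5, 0.1], [0.3, 0.2]]))
```

In practice `A` would hold the OLS estimates from the unrestricted VAR, and the distance of `f` from zero would be judged with a Wald statistic rather than by inspection.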
Wald Test
It is worth giving a brief account of the form of the Wald test at this point. After estimating our 2 x 2 VAR we have an estimate of the variance-covariance matrix of the coefficients of the VAR system, which we denote $\Sigma$. The variance-covariance matrix of the nonlinear function f(a) in (14.37) is

$$\text{var}[f(a)] = f_a(a)'\,\Sigma\, f_a(a)$$

where $f_a(a)$ is the first derivative of the restrictions with respect to the $a_{ij}$ parameters. The Wald statistic is:

$$W = f(a)\,\{\text{var}[f(a)]\}^{-1} f(a)'$$
There is little intuitive insight one can obtain from the general form of the Wald test (but see Buse 1982). However, the larger is the variance of f(a) the smaller is the value of W. Hence the more imprecise the estimates of the A matrix the smaller is W and the more likely one is 'to pass' the Wald test (i.e. not reject the null). If the restrictions hold exactly then $f(a) \approx 0$ and $W \approx 0$. It may be shown that under standard conditions for the error terms (i.e. no serial correlation or heteroscedasticity etc.) then W is distributed as central $\chi^2$ under the null with r degrees of freedom, where r = number of restrictions. If W is less than the critical value $\chi^2_c$ then we do not reject the null f(a) = 0. The VAR-Wald test procedure is very general. It can be applied to complex term structure relationships and can be implemented with high order lags in the
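VAR. Campbell and Shiller show that under the PEH, in general, the spread between n-period and m-period bond yields (n > m), denoted $S_t^{(n,m)}$, may be represented as:

$$S_t^{(n,m)} = E_t \sum_{i=1}^{k-1} (1 - i/k)\,\Delta^m r_{t+im}$$

where $\Delta^m r_t = r_t - r_{t-m}$ and k = n/m (an integer). For example, for n = 4, m = 1, $S_t^{(4,1)} = R_t^{(4)} - r_t^{(1)}$ and:

$$S_t^{(4,1)} = E_t\left[\tfrac{3}{4}\Delta r_{t+1} + \tfrac{1}{2}\Delta r_{t+2} + \tfrac{1}{4}\Delta r_{t+3}\right]$$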
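In a VAR with higher-order lags, the equation for $\Delta r_{t+1}$ is:

$$\Delta r_{t+1} = [a_{21}S_t + a_{22}\Delta r_t] + [a_{23}S_{t-1} + a_{24}\Delta r_{t-1}] + [a_{25}S_{t-2} + a_{26}\Delta r_{t-2}] + \cdots + w_{2t+1}$$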
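However, having obtained estimates of the $a_{ij}$ in the usual way, we can rewrite the above 'high lag' system as a first order system. For example, suppose we have a VAR of order p = 2, then in matrix notation this is equivalent to:

$$\begin{bmatrix} S_{t+1} \\ \Delta r_{t+1} \\ S_t \\ \Delta r_t \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} S_t \\ \Delta r_t \\ S_{t-1} \\ \Delta r_{t-1} \end{bmatrix} + \begin{bmatrix} w_{1t+1} \\ w_{2t+1} \\ 0 \\ 0 \end{bmatrix}$$

Equation (14.50) is known as the companion form of the VAR and may be compactly written:

$$Z_{t+1} = AZ_t + w_{t+1}$$

where $Z_{t+1}' = [S_{t+1}, \Delta r_{t+1}, S_t, \Delta r_t]$. Given the (2p x 1) selection vectors $e1' = [1, 0, 0, 0]$, $e2' = [0, 1, 0, 0]$ we have:

$$S_t = e1'Z_t$$
$$E_t\Delta r_{t+j} = e2'A^j Z_t$$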
where in our example n = 4, m = 1, p = 2. If (14.44) are substituted into the general PEH equation (14.40), Campbell and Shiller demonstrate that the VAR nonlinear restrictions are given by:

$$f(a) = e1' - e2'A[I - (m/n)(I - A^n)(I - A^m)^{-1}](I - A)^{-1} = 0$$

which for our example gives

$$f^*(a) = e1' - e2'A[I - (1/4)(I - A^4)(I - A)^{-1}](I - A)^{-1} = 0$$

and this restriction can be tested using the Wald statistic outlined above.
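The companion form and the general restriction lend themselves to a few lines of code. The following is a minimal sketch, assuming the ordering $(S_t, \Delta r_t)$ within each lag block; the helper names `companion` and `peh_restrictions` and all coefficient values are hypothetical. It also checks numerically that for n = 3, m = 1 the general formula collapses to the weights $\tfrac{2}{3}A + \tfrac{1}{3}A^2$ of the three-period example.

```python
import numpy as np

def companion(coef_blocks):
    """Stack VAR(p) coefficient blocks [A1, ..., Ap] (each k x k) into the
    (k*p x k*p) companion matrix."""
    p = len(coef_blocks)
    k = coef_blocks[0].shape[0]
    top = np.hstack(coef_blocks)
    if p == 1:
        return top
    # Identity rows that simply carry lagged values forward one period.
    shift = np.hstack([np.eye(k * (p - 1)), np.zeros((k * (p - 1), k))])
    return np.vstack([top, shift])

def peh_restrictions(A, n, m=1):
    """f(a) = e1' - e2' A [I - (m/n)(I - A^n)(I - A^m)^{-1}] (I - A)^{-1}."""
    dim = A.shape[0]
    eye = np.eye(dim)
    e1 = np.zeros(dim)
    e1[0] = 1.0          # selects the spread
    e2 = np.zeros(dim)
    e2[1] = 1.0          # selects the change in the short rate
    A_n = np.linalg.matrix_power(A, n)
    A_m = np.linalg.matrix_power(A, m)
    inner = eye - (m / n) * (eye - A_n) @ np.linalg.inv(eye - A_m)
    return e1 - e2 @ A @ inner @ np.linalg.inv(eye - A)

# Hypothetical lag matrices for a VAR(2) in (S_t, dr_t).
A1 = np.array([[0.5, 0.1], [0.3, 0.2]])
A2 = np.array([[0.1, 0.0], [0.05, 0.1]])
A_comp = companion([A1, A2])

f_three = peh_restrictions(A1, n=3)       # VAR(1), three-period horizon
f_four = peh_restrictions(A_comp, n=4)    # VAR(2), four-period horizon
```

The same two functions cover any lag length p and any horizon with k = n/m an integer, which is the practical attraction of the matrix form.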
Interpretation
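Let us return to the three-period horizon PEH to see if we can gain some insight into how the nonlinear restrictions in (14.36) arise. We proceed as before and derive the optimal forecasts of $\Delta r_{t+1}$ and $\Delta r_{t+2}$ from the VAR. With $z_t = (S_t, \Delta r_t)'$, so that $a_{i1}$ multiplies $S_t$ and $a_{i2}$ multiplies $\Delta r_t$, we have from (14.29) and (14.28):

$$E_t\Delta r_{t+1} = a_{21}S_t + a_{22}\Delta r_t$$
$$E_t\Delta r_{t+2} = a_{21}E_tS_{t+1} + a_{22}E_t\Delta r_{t+1} = a_{21}(a_{11} + a_{22})S_t + (a_{22}^2 + a_{21}a_{12})\Delta r_t$$

Using these forecasts in the PEH equation (14.27):

$$S_t = \tfrac{2}{3}(a_{21}S_t + a_{22}\Delta r_t) + \tfrac{1}{3}[a_{21}(a_{11} + a_{22})S_t + (a_{22}^2 + a_{21}a_{12})\Delta r_t]$$
$$S_t = [\tfrac{2}{3}a_{22} + \tfrac{1}{3}(a_{22}^2 + a_{21}a_{12})]\Delta r_t + [\tfrac{2}{3}a_{21} + \tfrac{1}{3}a_{21}(a_{11} + a_{22})]S_t$$
$$S_t = f_1(a)\Delta r_t + f_2(a)S_t$$

Equating coefficients on both sides, the nonlinear restrictions are:

$$0 = f_1(a) = \tfrac{2}{3}a_{22} + \tfrac{1}{3}(a_{22}^2 + a_{21}a_{12})$$
$$1 = f_2(a) = \tfrac{2}{3}a_{21} + \tfrac{1}{3}a_{21}(a_{11} + a_{22}) \qquad (14.51)$$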
It has been rather tedious to derive these conditions by the long-hand method of substitution and it is far easier to do so in matrix form as derived earlier since we also obtain a general form for the restrictions (suitable for programming for any values of n and m). The matrix restrictions in (14.37) must be equivalent to those in (14.51). Clearly the nonlinear element comes from the $A^2$ term while the left-hand side of (14.51) corresponds to the vector e1. It is left as a simple exercise for the reader to show that for

$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$$

the restrictions in (14.51) are equivalent to those in (14.37). As before, the
cross-equation restrictions (14.51) ensure that
(i) Expected (abnormal) profits based on information $\Omega_t$ in the VAR are zero.
(ii) The PEH equation (14.27) holds for all values of the variables in the VAR.
(iii) The error in forecasting $\Delta r_{t+1}$, $\Delta r_{t+2}$ using the VAR is independent of information at time t or earlier (i.e. of $S_{t-j}$, $\Delta r_{t-j}$ for $j \ge 0$). The latter is the orthogonality property of RE.
It is now straightforward to demonstrate how difficult it can become to formulate and estimate the restricted model and hence perform a likelihood ratio test of the restrictions, even for the three-period horizon case. The unrestricted VAR consists of (14.28) + (14.29). To obtain the restricted VAR we have to use (14.51) to obtain $a_{21}$ and $a_{22}$ in terms of the other $a_{ij}$s and then substitute these (two) expressions in (14.29) which together with the (unchanged) equation (14.28) constitute the restricted model. The algebraic manipulations required become horrendous as either the horizon in the PEH or the lag length in the VAR is increased.
The Advantages of the VAR Approach
(i) To test the PEH + RE restrictions we need only estimate the unrestricted equations in the VAR. The Wald test on the parameters of the VAR can be formulated for the general case of any n and m (for which k = n/m is an integer).
The disadvantages are:
(i) An explicit forecasting scheme for $(S_t, \Delta r_t)$ is required which may be misspecified and hence statistical results are biased.
(ii) The Wald test may have poor small sample properties and it is not invariant to the precise way the nonlinear restrictions are formed (e.g. Gregory and Veall). Hence the Wald test may reject the null hypothesis of the PEH + RE because of 'slight deviations' in the data from the null hypothesis. For example, if the estimate of $f_2(a)$ in (14.51) differed only slightly from unity but the standard error on $f_2(a)$ was 0.003 one would reject the null (on a t test) but an economist would still say that the data largely supported the PEH. Campbell and Shiller (1992) recognise that the Wald test restrictions may be rejected and yet the PEH + RE may provide a 'reasonable model' of the behaviour of interest rates.
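The mechanics of the Wald test on nonlinear restrictions can be sketched with the delta method. In the block below everything numerical is hypothetical: `theta_hat` stands in for the stacked VAR estimates $(a_{11}, a_{12}, a_{21}, a_{22})$, `cov_theta` for their estimated covariance matrix, and the Jacobian is obtained by central differences rather than analytically. The 5% critical value of 5.99 is the standard $\chi^2$ value for two restrictions.

```python
import numpy as np

def restrictions(theta):
    """Three-period PEH restrictions f(a) for theta = (a11, a12, a21, a22)."""
    A = np.asarray(theta, dtype=float).reshape(2, 2)
    e1 = np.array([1.0, 0.0])
    e2 = np.array([0.0, 1.0])
    return e1 - e2 @ ((2.0 / 3.0) * A + (1.0 / 3.0) * (A @ A))

def numerical_jacobian(func, theta, eps=1e-6):
    """Central-difference Jacobian of func at theta (rows: restrictions)."""
    theta = np.asarray(theta, dtype=float)
    f0 = np.atleast_1d(func(theta))
    jac = np.zeros((f0.size, theta.size))
    for j in range(theta.size):
        step = np.zeros_like(theta)
        step[j] = eps
        jac[:, j] = (func(theta + step) - func(theta - step)) / (2.0 * eps)
    return jac

def wald_statistic(func, theta_hat, cov_theta):
    """W = f' [J cov J']^{-1} f, with J the Jacobian of the restrictions."""
    f = np.atleast_1d(func(theta_hat))
    jac = numerical_jacobian(func, theta_hat)
    var_f = jac @ cov_theta @ jac.T
    return float(f @ np.linalg.solve(var_f, f))

theta_hat = np.array([0.5, 0.1, 0.3, 0.2])   # hypothetical VAR estimates
cov_theta = 0.01 * np.eye(4)                  # hypothetical covariance matrix

W = wald_statistic(restrictions, theta_hat, cov_theta)
# Less precise estimates (a larger covariance matrix) shrink W proportionally.
W_imprecise = wald_statistic(restrictions, theta_hat, 4.0 * cov_theta)

CHI2_5PCT_2DF = 5.99
reject_null = W > CHI2_5PCT_2DF
```

The comparison of `W` with `W_imprecise` illustrates the point made in the text: the same point estimates 'pass' the test more easily when they are estimated imprecisely.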
Further Testable Implications of the PEH Using the VAR Methodology - The Theoretical Spread $S_t'$
Campbell and Shiller (1992) suggest some additional 'metrics' for measuring the success of the PEH and these are outlined for the three-period horizon model (i.e. n = 3, m = 1):

$$S_t = \tfrac{2}{3}\Delta r^e_{t+1} + \tfrac{1}{3}\Delta r^e_{t+2} \qquad (14.52)$$

We have seen that the RHS of (14.52) may be represented as a linear prediction from the estimated VAR. If we denote the RHS as the theoretical spread $S_t'$ then:

$$S_t' = e2'\left(\tfrac{2}{3}A + \tfrac{1}{3}A^2\right)z_t = f(A)z_t = f_1(a)\Delta r_t + f_2(a)S_t$$
The theoretical spread is the econometrician's 'best shot' at what the true (RE) forecast of (the weighted average of) future changes in short-term interest rates will be. If the PEH (14.52) is correct then $f_2(a) = 1$ and $f_1(a) = 0$ and hence $S_t' = S_t$ and the actual spread $S_t$ should be highly correlated with the theoretical spread. In the data the latter restrictions will (usually) not hold exactly and hence we expect $S_t'$ to broadly move with the actual spread. Under the null hypothesis of the PEH the following 'statistics' provide useful metrics against which we can measure the success of the PEH.
(i) The correlation coefficient $\text{corr}(S_t, S_t')$ between $S_t$ and $S_t'$ should be close to unity and in a regression

$$S_t' = \alpha + \beta S_t + v_t$$

we expect $\alpha = 0$ and $\beta = 1$.
(ii) Because the VAR contains the spread then either the variance ratio or the ratio of the standard deviations:

$$VR = \text{var}(S_t)/\text{var}(S_t')$$

should equal unity.
(iii) It follows that in a graph of $S_t$ and $S_t'$ against time, the two series should broadly move in unison.
(iv) The PEH equation (14.52) implies that $S_t$ is a sufficient statistic for future changes in interest rates and hence $S_t$ should 'Granger cause' changes in interest rates. The latter implies that in the VAR equation (14.29) explaining $\Delta r_{t+1}$, the spread and its own lagged values should, as a group, contribute in part to the explanation of $\Delta r_{t+1}$ (so-called block exogeneity tests can be used here).
(v) Suppose $R_t$ and $r_t$ are I(1) variables. Then $\Delta r_{t+j}$ is I(0) and if the PEH is correct then from (14.27) the spread $S_t = R_t - r_t$ must also be I(0). Hence $R_t$ and $r_t$ must be cointegrated, with a cointegration parameter of unity. That is, the series $R_t$ and $r_t$ should broadly move together.
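The first three metrics above can be illustrated on simulated data. This is only a sketch under stated assumptions: the 'true' coefficient matrix `A_true` is a hypothetical one chosen to satisfy the three-period restrictions, the shocks are Gaussian, the VAR is estimated by OLS without a constant, and the ordering is $z_t = (S_t, \Delta r_t)'$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical VAR(1) matrix for z_t = (S_t, dr_t)' that satisfies the
# three-period PEH restrictions (an assumption made for the simulation).
A_true = np.array([[-0.2, -0.44 / 1.5], [1.5, 0.2]])

T = 2000
z = np.zeros((T, 2))
for t in range(1, T):
    z[t] = A_true @ z[t - 1] + rng.normal(scale=0.1, size=2)

# OLS estimate of the unrestricted VAR(1): z_{t+1} = A z_t + w_{t+1}.
X, Y = z[:-1], z[1:]
B, *_ = np.linalg.lstsq(X, Y, rcond=None)   # Y is approximately X @ B
A_hat = B.T

# Theoretical spread S'_t = e2'((2/3)A + (1/3)A^2) z_t.
e2 = np.array([0.0, 1.0])
weights = e2 @ ((2.0 / 3.0) * A_hat + (1.0 / 3.0) * (A_hat @ A_hat))
S_actual = z[:, 0]
S_theory = z @ weights

corr = np.corrcoef(S_actual, S_theory)[0, 1]
variance_ratio = S_actual.var() / S_theory.var()
# Regression of S'_t on S_t: slope should be near 1, intercept near 0.
slope, intercept = np.polyfit(S_actual, S_theory, 1)
```

With real data the interesting cases are those where `corr` is high but `variance_ratio` is well away from unity, which is the pattern discussed in the empirical section.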
It is worth noting that if the econometrician had the 'true' RE forecasting scheme used by investors then $S_t$ and $S_t'$ would be equal in all time periods. The latter statement does not imply that rational agents do not make forecasting errors, they do. However, the spread is determined in the market with reference to the expected value of future interest rates. Therefore if we have an equation that predicts the expected values actually used by agents in the market then $S_t = S_t'$ for all t.
Perfect Foresight Spread
It is convenient here to remind the reader of a test of the PEH + RE based on the perfect foresight spread $S_t^*$, although it must be stressed that this test has nothing to do with the VAR methodology. The logic of this test using $S_t$ and $S_t^*$ is set out in Chapter 10 and summarised here for the three-period case. The test does not use an explicit forecasting equation for $E_t\Delta r_{t+j}$ but merely invokes the unbiasedness and orthogonality assumptions of RE:

$$\Delta r_{t+j} = E_t(\Delta r_{t+j}|\Omega_t) + \eta_{t+j}$$

Substituting in equation (14.27) and rearranging:

$$[\tfrac{2}{3}\Delta r_{t+1} + \tfrac{1}{3}\Delta r_{t+2}] = S_t + [\tfrac{2}{3}\eta_{t+1} + \tfrac{1}{3}\eta_{t+2}] \qquad (14.57)$$

Now define the LHS of (14.57) as the perfect foresight spread $S_t^*$ and note that, because the RE forecast errors are uncorrelated with $S_t$, $\text{var}(S_t) \le \text{var}(S_t^*)$. Also, in the single equation regression:

$$S_t^* = a + bS_t + c\Omega_t + \eta^*_{t+1} \qquad (14.58)$$

under the null hypothesis of PEH + RE, we expect $H_0: a = c = 0, b = 1$. The RHS variables in (14.58) are dated at t or earlier and are independent of the RE forecast errors. Hence OLS on (14.58) provides unbiased estimates. However, $\eta^*_{t+1}$ is a moving average error and may also be heteroscedastic, so the standard errors from OLS are invalid but correct standard
errors are available using a GMM correction to the covariance matrix (see C
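A minimal sketch of the perfect foresight regression follows, again on simulated data. All names and numbers are hypothetical: `A_true` is an assumed VAR matrix under which the PEH holds, the extra regressor standing in for the information set is simply $\Delta r_t$, and the covariance correction is a hand-written Newey-West estimator with Bartlett (declining) weights and an assumed lag length of 2. It is meant to show the shape of the calculation, not to reproduce any published result.

```python
import numpy as np

def ols_newey_west(y, X, lags):
    """OLS coefficients with Newey-West (Bartlett kernel) standard errors."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    xu = X * resid[:, None]            # moment contributions x_t * u_t
    meat = xu.T @ xu                   # lag-0 term
    for j in range(1, lags + 1):
        weight = 1.0 - j / (lags + 1.0)    # declining Bartlett weights
        gamma = xu[j:].T @ xu[:-j]
        meat += weight * (gamma + gamma.T)
    cov = xtx_inv @ meat @ xtx_inv
    return beta, np.sqrt(np.diag(cov))

rng = np.random.default_rng(1)

# Hypothetical VAR(1) for z_t = (S_t, dr_t)' under which the PEH holds.
A_true = np.array([[-0.2, -0.44 / 1.5], [1.5, 0.2]])
T = 2000
z = np.zeros((T, 2))
for t in range(1, T):
    z[t] = A_true @ z[t - 1] + rng.normal(scale=0.1, size=2)

S, dr = z[:, 0], z[:, 1]
# Perfect foresight spread: S*_t = (2/3) dr_{t+1} + (1/3) dr_{t+2}.
S_star = (2.0 / 3.0) * dr[1:-1] + (1.0 / 3.0) * dr[2:]
X = np.column_stack([np.ones(T - 2), S[:-2], dr[:-2]])

beta, se = ols_newey_west(S_star, X, lags=2)
a_hat, b_hat, c_hat = beta
```

Under the null one would compare `b_hat` with unity and `a_hat`, `c_hat` with zero using the corrected standard errors in `se` rather than the usual OLS ones.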
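One word of warning. Do not confuse the perfect foresight spread $S_t^*$ with the theoretical spread $S_t'$ used in the VAR methodology. The perfect foresight spread is constructed from actual (ex post) changes in interest rates in the future, whereas the theoretical spread is a forecast from the VAR.
At this point the reader will no doubt like to refresh his memory concerning the various concepts presented for evaluating the EH before moving on to illustrative empirical results in this area.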
14.3 EMPIRICAL EVIDENCE
In his study using the VAR methodology, Taylor (1992) uses weekly data (3 pm rates) on three-month Treasury bills and yields to maturity on 10, 15 and 20 year UK government bonds (over the period January 1985-November 1989) and finds strongly against the EH under rational expectations. Using the VAR methodology he finds (i) spreads do not Granger cause changes in interest rates (ii) the variance ratios are in excess of 1.5 for all maturities and they are (statistically) in excess of unity (iii) the VAR cross-equation restrictions are strongly violated.
MacDonald and Speight (1991) use quarterly data for 1964-1986 on a representative single government 'long bond' (i.e. over 15 years to maturity) for five OECD countries. For the UK, the VAR restrictions, Granger causality and variance ratio tests all indicate rejection of the PEH (although the correlation coefficient between $S_t$ and $S_t'$ is high for the UK at 0.87). For other countries, Belgium, Canada, the USA and West Germany, the results are mixed but in general the Wald test is rejected and the variance ratios are in excess of 1.5 (except for the UK where it is found to be 1.29). However, no standard errors are given for the variance ratios and so formal statistical tests cannot be undertaken. (See also Mills (1991) who undertakes similar tests on UK data over a long sample period, namely 1871-1988.)
Campbell and Shiller (1992) use monthly data on US government bonds fo
of up to five years including maturities for 1, 2, 3, 4, 6, 9, 12, 14, 36,
months for the period 19461987. Their data are therefore towards the s
the maturity spectrum for bonds. Generally speaking they find little or no support for the EH at maturities of less than one year, from the regressions of the perfect foresight spread $S_t^*$ on the actual spread $S_t$, their $\beta$ values being in the region 0-0.5, rather than close to unity. Similarly the values of $\text{corr}(S_t, S_t')$ are relatively low being in the range 0-0.7 and the values of VR are in the range 2-10 for maturities of less than one year. At maturities of four and five years Campbell and Shiller (1992) find more support for the EH since the variance ratio (VR) and the correlation between $S_t$ and $S_t'$ are close to unity. However, Campbell and Shiller do not directly test the VAR cross-equation restrictions but this has been done subsequently by Shea (1992) who in general finds they are rejected.
Cuthbertson (1996) considers the PEH of the term structure at the very shor
maturity spectrum and is therefore able to use spot rates. (See also Cuthbertson
for results using data on German spot rates.) His data consist of London Inter
rates for maturities of 1, 4, 13, 26 and 52 weeks. The complete data set is sam
(Thursdays, 4 pm rates) beginning on the second Thursday in January 1981
on the second Thursday of February 1992 giving a total of 580 data points. The one-week and 52-week yields are graphed in Figure 14.1: these rates move closely together in the long run (i.e. appear to be cointegrated) but there are also substantial movements in the spread $S_t^{(n,m)}$.

Figure 14.1 One-week and 52-week Interest Rates (series plotted against time: 1-week interest rate, 52-week interest rate).
The regressions of the perfect foresight spread $S_t^{(n,m)*}$ on the actual spread $S_t^{(n,m)}$ and the limited information set $\Lambda_t$ (consisting of five lags of $S_t^{(n,m)}$ and $\Delta R_t^{(m)}$) are given in Table 14.1. In all cases we do not reject the null that information available at time t or earlier does not incrementally add to the predictions of future interest rates, thus supporting the PEH + RE. In all cases, except that for $S_t^{(4,1)}$ (the four-week/one-week spread), we do not reject the null $H_0: \beta = 1$, thus in general providing strong support for the PEH + RE. The rejection of the null for $S_t^{(4,1)}$ may be due to the short investment period and a misalignment of investment horizons, since four one-week investments may not fall on a Thursday four weeks hence.
Cuthbertson's results from the VAR models for $S_t^{(n,m)}$ and $\Delta R_t^{(m)}$ indicate that $S_t^{(n,m)}$ Granger causes $\Delta R_t^{(m)}$: a weak test of the PEH. (There is also Granger causality from $\Delta R_t^{(m)}$ to $S_t^{(n,m)}$ indicating substantial feedback in the VAR regressions.) Cuthbertson finds that for all maturities there is a strong correlation (Table 14.2) between the actual spread $S_t$ and the predicted or theoretical spread $S_t'$ from the forecasting VAR. The variance ratio $VR = \text{var}(S_t)/\text{var}(S_t')$ yields point estimates within two standard deviations of unity in 5 out of 8 cases. Hence on the basis of these two statistics we can broadly accept the PEH under weakly rational expectations.
0.033 (0.14) 88
0.97 (0.23) 83
47
0.018 (0.13) 1.32 (0.44) 74
1.22 (0.30) 46
0.019 (0.10) 73
1.17 (0.21) 42
0.064 (0.25) 58
(0.01
0.069 (0.02) 0.73 (0.06) <0.01
0.98 (0.07) 2.0
0.166 (0.06) 82
0.133 (0.12) 1.02 (0.10) 40
0.86
0.164 (0.25) 1.09 (0.15) 52
54
The regression coefficients reported in columns 2 and 3 are from regressions with $\gamma = 0$ imposed. The sample period is from the second Thursday of January 1981 to the second Thursday of February 1992. Allowing for leads and lags this yields 540 observations (when $\gamma = 0$ is imposed) and 524 (when $\gamma \ne 0$). The method of estimation is GMM with a correction for heteroscedasticity and moving average errors, using Newey and West (1987) declining weights to guarantee positive semi-definiteness. The last columns are marginal significance levels for the null hypothesis stated. For $H_2: \gamma = 0$ the reported results are for an information set which includes five lags of the change in short rates and of the spread (longer lags gave qualitatively similar results).
Table 14.2 Tests of the PEH using Weakly Rational Expectations

                               (2)                     (3)
Wald Statistic, W(.)           var(S_t)/var(S_t')      R2
[.] = critical value (5%)      (.) = std. error
W(6) = 26.3 [12.6]             0.84 (0.44)             0.
W(6) = 10.3 [12.6]             0.37 (0.20)             0.
W(4) = 6.3 [9.5]               0.50 (0.20)             0.
W(4) = 7.3 [9.5]               0.61 (0.18)             0.
W(8) = 29.9 [15.5]             1.82 (0.42)             0.
W(8) = 27.3 [15.5]             1.18 (0.26)             0.
W(8) = 16.5 [15.5]             1.00 (0.25)             0.
W(8) = 10.2 [15.5]             0.86 (0.23)             0.

Wald statistics and standard errors are heteroscedastic robust.
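The figures in Table 14.2 can be tallied with a few lines of arithmetic. The sketch below simply copies the Wald statistics, 5% critical values, variance ratios and standard errors as they appear in the table (the row labels are not recoverable here, so the order is that of the extracted text) and counts how many Wald tests reject and how many variance ratios lie within two standard errors of unity.

```python
# (Wald statistic, 5% critical value) pairs as printed in Table 14.2.
wald = [(26.3, 12.6), (10.3, 12.6), (6.3, 9.5), (7.3, 9.5),
        (29.9, 15.5), (27.3, 15.5), (16.5, 15.5), (10.2, 15.5)]

# (variance ratio, standard error) pairs as printed in Table 14.2.
variance_ratios = [(0.84, 0.44), (0.37, 0.20), (0.50, 0.20), (0.61, 0.18),
                   (1.82, 0.42), (1.18, 0.26), (1.00, 0.25), (0.86, 0.23)]

# The null is rejected when the Wald statistic exceeds its critical value.
wald_rejections = sum(w > crit for w, crit in wald)

# A variance ratio is 'close to unity' here if it is within two standard errors.
vr_within_two_se = sum(abs(vr - 1.0) <= 2.0 * se for vr, se in variance_ratios)
```

The count of variance ratios within two standard errors of unity agrees with the '5 out of 8 cases' quoted in the text.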
By way of illustration the graph of $S_t$ and $S_t'$ from Cuthbertson (1996) is shown for (n, m) = (4, 1) in Figure 14.2. The $R^2$ of 0.98 indicates that the lagged spread predicts the direction of change in future interest rates but the point estimate of the variance ratio (= 1.8) suggests that the quantitative impact of $(S_t, \Delta R_t^{(m)})$ on future changes in interest rates is too small relative to that required by the PEH under rational expectations. The Wald test of the restrictions of the VAR coefficients is rejected at short horizons. In contrast, Hurn et al (1993) using monthly LIBOR rates (1975-1991) find that the Wald tests are not rejected. Shea (1992) notes that, particularly when using overlapping data, the Wald test rejects too often when the EH is true and this may, in part, account for the different results in these two studies.
Given that long rates $R_t$ and short rates $r_t$ are found in all of the above studies to be I(1), then a weak test of the PEH + RE is that $R_t$ and $r_t$ are cointegrated with a cointegration parameter of unity. This is always found to be the case in the studies cited so that the spread $S_t = R_t - r_t$ is I(0) for all maturities (m and n).
Figure 14.2 Spread and Theoretical Spread (n,m) = (4, 1).
While it is often found to be the case that, taken as a pair, any two interest rates are cointegrated and each spread $S_t^{(n,m)}$ is stationary, this cointegration procedure can be undertaken in a more comprehensive fashion. If we have q interest rates which are I(1) then the EH implies (see equation (14.40)) that (q - 1) linearly independent spreads are cointegrated. We can arbitrarily normalise on $R_t^{(1)}$, so that the cointegrating vectors are $S_t^{(2,1)} = R_t^{(2)} - R_t^{(1)}$, $S_t^{(3,1)} = R_t^{(3)} - R_t^{(1)}$, etc. The so-called Johansen (1988) procedure allows one to estimate all of these cointegrating vectors simultaneously in a system of the form (see Chapter 20):
where $X_t = (R^{(1)}, R^{(2)}, \ldots, R^{(q)})_t$. One can then test to see if the number of cointegrating vectors in the system equals q - 1 and then test the joint null $H_0: \beta_1 = \beta_2 = \cdots = 1$. Both are tests of the PEH. Shea (1992) and Hall et al (1992) find that although one cannot reject the presence of q - 1 cointegrating vectors on the US data, nevertheless it is frequently the case that not all the $\beta_i$ are found to be unity. Putting subtle statistical issues aside, a key consideration in interpreting these results is whether the VAR is an adequate representation of the data generation process for interest rates (e.g. are the
parameters constant over the whole sample period). If the VAR is acceptable, the results may be summarised as follows:
(i) For the most part, long and short rates are cointegrated, with a cointegration parameter close to or equal to unity.
(ii) Spreads do tend to Granger cause future changes in interest rates for most maturities.
(iii) The perfect foresight spread $S_t^*$ is correlated with the actual spread $S_t$ with a coefficient which is often close to unity and is independent of other information at time t or earlier.
(iv) The theoretical spread $S_t'$ (i.e. the predictions from the unrestricted VAR equations) is usually highly correlated with the actual spread although the variance ratio is quite often not equal to unity.
(v) The cross-equation restrictions on the VAR parameters are often rejected.
In broad terms, we can probably have less confidence in the PEH + RE as we move from the results of test (i) to test (v). Note, however, that the failure of the Wald test does not necessarily imply a severe rejection of the EH, if the model is supported by the other tests.
Why Such Divergent Results?
Can we account for those divergent results as far as the PEH is concerned? In general stronger support for the PEH is given by the perfect foresight regressions in comparison with those from the VAR approach (particularly the rejection of the VAR cross-equation restrictions). One reason for this is that the perfect foresight regressions implicitly allow potential future events (known to agents but not to the econometrician) to influence expectations, whereas the VAR approach requires an explicit information set known both to agents and the econometrician at time t or earlier. The market in short-term instruments is often heavily influenced by the government's monetary policy stance and in 'second guessing' the timing of interest rate changes by central banks. In periods of government intervention (influence) any purely 'backward-looking' regressions might be thought to provide poor predictors of future changes in interest rates. However, the rational expectations assumption $r_{t+j} = E_t r_{t+j} + u_{t+j}$ only requires unbiasedness and may suffer less from this effect. Hence on this count one might expect the perfect foresight regressions to perform better than the VAR approach and to yield relatively greater support for the PEH hypothesis (if it is indeed true).
The above institutional detail might also explain why we find a high $R^2$ between the actual spread $S_t$ and the predicted theoretical spread $S_t'$ using the VAR but the point estimate of the variance ratio $VR = \text{var}(S_t)/\text{var}(S_t')$ is often in excess of unity at short horizons. The high $R^2$ implies that for most time periods the correlation between $S_t$ and $S_t'$ is high and this may be due to relatively 'long periods' when there is little or no government intervention in the market. However, on the (relatively few) occasions when government pronouncements and actions are expected to impinge upon the market in the near future, one might expect the VAR to underpredict future changes in short rates and hence $\text{var}(S_t)$ to be greater than $\text{var}(S_t')$. (Such points would then be represented by large 'spikes' in Figure 14.2.)
future changes in interest rates via the chain rule of forecasting is less th
by the PEH. But in contrast to (i), single equation perfect foresight regres
reject the null that H, influences future changes in interest rates. Hence r
the VAR restrictions is probably due to the â€˜low weightâ€™ given to S ,  j in
regressions for ARjmâ€™.Why might the latter occur? Two reasons are suggest
agents use alternative (nonregression) forecasting schemes (e.g. chartists, see
Taylor (1989)) the VAR methodology breaks down. Second, if agents actu
the VAR regression methodology for forecasting in financial markets, one w
them to utilise almost minutebyminute observations (S,, AR,): hence forecas
(even) weekly data seem unlikely adequately to mimic such behaviour.
The frequency of the data collection, the extent to which rates are recorded contemporaneously (i.e. are they recorded at the same time of day?) and any approximations used in calculating yields might explain the conflicting results in each of the above studies. For example, Campbell and Shiller (1991) use monthly data and for maturities of less than one year their results are less supportive of the PEH + RE than Cuthbertson's. One might conjecture that this may be due to (i) the weekly data used in Cuthbertson, which means that the information set used in the VAR more closely approximates that used by agents and (ii) Cuthbertson's use of data on pure discount instruments, as opposed to Campbell and Shiller's data which uses McCulloch's (1987) approximation for yields on pure discount bonds based on interpolation, using cubic splines. (If the latter approximations are less severe the longer the term to maturity, this may account for Campbell and Shiller's relatively more favourable results for the EH at longer maturities.)
Holding Period Yield
Taylor (1992) also tests the various term structure theories using excess hol
yields (over 13 weeks) (Hl:{3  r,) and in particular he tests two variants. Fi
against the proposition that the excess holding period yield (HPY) depends
varying term premium (which is modelled using a GARCH process, see C
Second, he tests the market segmentation hypothesis by running the regressio
$$(H^{(n)}_{t+13} - r_t) = \alpha + \beta(PD)_t + \varepsilon_{t+13}$$

where PD is the amount of debt of maturity n outstanding as a proportion of total government debt. Taylor finds $\beta > 0$ for maturities n = 10, 15 and 20 years, which supports the market segmentation hypothesis at the relatively long end of the maturity spectrum.
Taylor's (1992) results using UK weekly data for long maturities support the market segmentation hypothesis. However, the results in Cuthbertson (1996) and Hurn et al (1993) suggest that for maturities of less than one year the PEH plus an assumption of 'rational' or 'weakly rational' expectations broadly characterises behaviour in the UK interbank market. The above seemingly contradictory results are not
inconsistent. In the UK specialist agents (basically the corporate treasury dep
large companies and money brokers) deal almost exclusively at the short ma
large changes in portfolio composition may not be required to equalise expecte
alter at the margin in response to changes in relative yields (Friedman and R
and Barr and Cuthbertson, 1991). The latter are brought about when the author
the composition of government debt across the maturity spectrum.
14.4 SUMMARY
Throughout this chapter interim summaries have been provided, since for some readers the material may appear, on first reading, somewhat complex. Therefore a brief summary is all that is required here.
• The VAR methodology provides a series of statistical tests based on a comparison of the actual spread $S_t$ and the 'best' forecast of future changes in interest rates given by the theoretical spread $S_t'$.
• The 'metrics' provided by the VAR approach include a variance ratio test and a test of cross-equation restrictions on the parameters of the VAR. Only if the restrictions hold will the zero expected profit condition of the EH (with zero transaction costs) and the orthogonality conditions of the RE hypothesis hold.
• If the spread and the change in interest rates are stationary then standard test procedures can be used.
• Empirical results from the VAR methodology must be viewed as complementary to those based on variance bounds inequalities discussed in Chapter 10. The results on the validity of the EH, using the VAR methodology, are somewhat mixed as is often the case in applied work.
• On balance the VAR approach suggests that for the UK, the EH (with a time invariant term premium) provides a useful model of the term structure at the short end (i.e. maturities of less than one year) but not at the long end of the maturity spectrum, and vice versa for US data.
• All of the tests in this chapter assume a time invariant risk premium but models which relax this assumption are the subject of Chapter 17.
ENDNOTES
1. The issues discussed here are quite subtle. Consider the following identity
forecast using all relevant information, R, :
where &,+I is the revision to expectations as more information becomes a
under RE has a zero mean value and is independent of the limited info
A,. Therefore, the econometricianâ€™s forecast of V, differs from the agen
Hence we expect P: and P, to have some correlation and for in (10)
but for VR > 1.
2. How the error term arises in (14.6) can be demonstrated by using the identity in footnote 1, namely:

$$E(\Delta r_{t+1}|\Omega_t) = E(\Delta r_{t+1}|\Lambda_t) + \varepsilon_{t+1}$$

Substituting in the PEH, $S_t = E(\Delta r_{t+1}|\Omega_t)/2$ we have:

$$S_t = E(\Delta r_{t+1}|\Lambda_t)/2 + u_{t+1}$$

where $u_{t+1} = (\varepsilon_{t+1}/2)$ and $E(u_{t+1}|\Lambda_t) = 0$ and $u_{t+1}$ is independent of $(\Delta r_t, \Delta r_{t-1})$.
3. The restricted equation in this simple case is trivial and is given by the regression of $(\Delta r_{t+1} - 2S_t)$ on a constant. The log-likelihood from this restricted regression is then compared with that from the unrestricted equation (14.21) with a constant.
4. Note that in this simple example the two types of test become identical, if the additional variables in (14.25) 'contain' $\Delta r_t$ and $\Delta r_{t-1}$ only. However, in general the two approaches differ.
15
The FOREX Market
The previous chapter dealt with the VAR methodology in some detail. Broadly speaking, the VAR approach can be applied to any theoretical model involving multiperiod forecasts and which is linear in the variables. Not surprisingly, therefore, it can also be applied to test efficiency in the forward and spot markets using the uncovered interest parity (UIP) and forward rate unbiasedness (FRU) conditions outlined in Chapter 12. This chapter outlines how the VAR methodology can be applied in these cases and
• illustrates the VAR methodology applied to FRU and UIP
• discusses some practical matters in choosing the appropriate VAR
• presents some recent empirical tests of FRU and UIP and provides an example of tests of a forward-looking version of the flex-price monetary model of the spot rate, using the VAR methodology
15.1 EFFICIENCY IN THE FOREX MARKET
The Forward Market
Chapter 12 analysed tests of forward market unbiasedness, $E_t s_{t+1} = f_t$, based on single equation tests for any one pair of currencies (or a set of single equations estimated to increase the statistical efficiency of the tests using a ZSURE estimator). Our tests will now be extended, in a statistical sense at least, by considering a VAR system used to forecast over multiperiod horizons. The first equation is:

$$\Delta s_{t+1} = a_{11}\Delta s_t + a_{12}fp_t + w_{1t+1} \qquad (15.1)$$

where $w_{1t+1}$ is white noise and independent of the RHS variables and $fp_t = f_t - s_t$ is the forward premium. In the one-period case when the forward rate refers to delivery at time t + 1, then the test of FRU is very simple. Under the null of risk neutrality and RE we expect

$$H_0: a_{11} = 0, \quad a_{12} = 1$$

The above equation then reduces to the unbiasedness condition for the forward rate

$$E_t s_{t+1} = f_t$$

and the forecast error

$$\Delta s_{t+1} - E_t\Delta s_{t+1} = a_{11}\Delta s_t + (a_{12} - 1)(f - s)_t + w_{1t+1}$$

is independent of the limited information set $\Lambda_t = (\Delta s_t, fp_t)$. In the one step ahead case we require only equation (15.1) to test FRU. However, we now consider a two-step ahead prediction (which is easily generalised to the case of m step ahead predictions).
Two-Period Case
Suppose we have quarterly data but are considering forward rates $f_t$ for six months hence. FRU is:

$$E_t s_{t+2} - s_t = E_t\Delta_2 s_{t+2} = fp_t$$

To forecast two periods ahead we use the identity

$$\Delta_2 s_{t+2} = \Delta s_{t+2} + \Delta s_{t+1}$$

Leading (15.1) one period forward, to forecast $E_t\Delta s_{t+2}$ we require a forecast of $fp_{t+1}$. Hence we require an equation to determine fp which can be taken to be

$$fp_{t+1} = a_{21}\Delta s_t + a_{22}fp_t + w_{2t+1} \qquad (15.8)$$

Equations (15.1) and (15.8) are a simple bivariate vector autoregression (VAR). If $(s_t, f_t)$ are I(1) variables but $(s_t, f_t)$ have a cointegrating parameter (1, -1) then $fp_t = f_t - s_t$ is I(0) and all the variables in the VAR are stationary. Such stationary variables may be represented by a unique infinite moving average (vector) process which may be inverted to yield an autoregressive process (see Hannan (1970) and Chapter 20). A VAR system consisting of equations (15.1) and (15.8) can be used as an illustration.
It can be shown that the FRU hypothesis (15.4) implies a set of nonlinear cross-equation restrictions among the parameters $a_{ij}$ and these nonlinear restrictions ensure that the two-period forecast error implicit in (15.6) is independent of information $\Lambda_t = (\Delta s_t, fp_t)$. For illustrative purposes these restrictions are derived by 'substitution' and it is then shown how they appear in matrix form and how we can easily incorporate a VAR of high order and use it to forecast over any horizon. Using (15.7), (15.1) and (15.8) it is easy to see that

$$E_t\Delta_2 s_{t+2} = E_t\Delta s_{t+2} + E_t\Delta s_{t+1}$$
$$= a_{11}(a_{11}\Delta s_t + a_{12}fp_t) + a_{12}(a_{21}\Delta s_t + a_{22}fp_t) + (a_{11}\Delta s_t + a_{12}fp_t)$$

Collecting terms and equating the resulting expression for $E_t\Delta_2 s_{t+2}$ with $fp_t$:

$$E_t\Delta_2 s_{t+2} = \theta_1\Delta s_t + \theta_2 fp_t = fp_t$$

where

$$\theta_1 = a_{11}^2 + a_{12}a_{21} + a_{11}$$
$$\theta_2 = a_{11}a_{12} + a_{12}a_{22} + a_{12}$$
The two-period forecast error is

$$\Delta_2 s_{t+2} - E_t\Delta_2 s_{t+2}$$

which given (15.6) and the VAR (15.10) is given by:

$$\theta_1\Delta s_t + \theta_2 fp_t - fp_t + \eta_{t+1}$$

where $\eta_{t+1}$ depends on $w_{it+1}$ (i = 1, 2). If $\theta_1$ and $\theta_2$ are unrestricted then the expected value of the forecast error will in general depend on $(\Delta s_t, fp_t)$, that is information at time t. It is only if $\theta_1 = 0$ and $\theta_2 = 1$ that the orthogonality property of RE holds. In the previous chapter we noted that these restrictions can be tested using a likelihood ratio test of the restricted versus unrestricted VAR. From (15.11) we see that the restrictions can be rearranged to give

$$a_{21} = -a_{11}(1 + a_{11})/a_{12}$$
$$a_{22} = [1 - a_{12}(1 + a_{11})]/a_{12}$$

Hence the restricted VAR is

$$\Delta s_{t+1} = a_{11}\Delta s_t + a_{12}fp_t + w_{1t+1} \qquad (15.16)$$
$$fp_{t+1} = -[a_{11}(1 + a_{11})/a_{12}]\Delta s_t + \{[1 - a_{12}(1 + a_{11})]/a_{12}\}fp_t + w_{2t+1} \qquad (15.17)$$

The individual coefficients ($a_{11}$, $a_{12}$) in equation (15.17) are constrained to take the same value in (15.16). The log-likelihood value from the restricted system (15.16) + (15.17) can be compared with that from the unrestricted system (15.1) + (15.8). As noted in the previous chapter if the difference in log-likelihoods is large ('small') then the restrictions are rejected (not rejected).
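The algebra of the two-period FRU restrictions can be checked numerically. In the sketch below the values of $a_{11}$ and $a_{12}$ are hypothetical; the second row of the VAR is then filled in from the rearranged restrictions, and both the 'substitution' form ($\theta_1$, $\theta_2$) and the matrix form $e2' - e1'(A + A^2)$ are evaluated, with $z_t = (\Delta s_t, fp_t)'$ assumed.

```python
import numpy as np

def fru_restricted_row(a11, a12):
    """Second-row VAR coefficients implied by theta1 = 0 and theta2 = 1."""
    a21 = -a11 * (1.0 + a11) / a12
    a22 = (1.0 - a12 * (1.0 + a11)) / a12
    return a21, a22

def thetas(A):
    """Coefficients on ds_t and fp_t in the two-period forecast."""
    (a11, a12), (a21, a22) = A
    theta1 = a11**2 + a12 * a21 + a11
    theta2 = a11 * a12 + a12 * a22 + a12
    return theta1, theta2

a11, a12 = 0.3, 0.8                     # hypothetical first-row estimates
a21, a22 = fru_restricted_row(a11, a12)
A = np.array([[a11, a12], [a21, a22]])

theta1, theta2 = thetas(A)

# Matrix form of the same restrictions.
e1 = np.array([1.0, 0.0])
e2 = np.array([0.0, 1.0])
matrix_form = e2 - e1 @ (A + A @ A)
```

Both routes give the same answer, which is the sense in which the matrix restrictions reproduce those obtained by substitution.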
The problem with the above is that the restrictions have to be worked out and then
programmed into the VAR equations, which are estimated using nonlinear
techniques. Both these tasks can become 'complex' when the VAR has a large number
of lags or the forecast horizon in the FRU equation is large. As we have already noted,
a computationally simpler procedure is to estimate the unrestricted (linear in parameters)
VAR and undertake a Wald test directly on the unrestricted $a_{ij}$ coefficient estimates. It
can now be quickly demonstrated how the above problem (with lag length one) can
be represented in matrix form and how it can be generalised. The matrix form of the
unrestricted VAR is:
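$$z_{t+1} = A z_t + w_{t+1}$$
where
$$z_{t+1} = \begin{pmatrix} \Delta s_{t+1} \\ fp_{t+1} \end{pmatrix}, \quad A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}, \quad w_{t+1} = \begin{pmatrix} w_{1,t+1} \\ w_{2,t+1} \end{pmatrix}$$
It follows that
$$E_t z_{t+1} = A z_t, \qquad E_t z_{t+2} = A^2 z_t, \qquad E_t \Delta_2 s_{t+2} = e_1'(A + A^2) z_t$$
Hence the FRU hypothesis (15.6) implies:
$$e_1'(A + A^2) z_t = e_2' z_t$$
$$e_2' - e_1'(A + A^2) = 0$$
where
$$e_1' = (1, 0), \qquad e_2' = (0, 1)$$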
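The link between the matrix restrictions and the substitution results can be checked numerically. The sketch below forms $e_2' - e_1'(A + A^2)$ for a 2 x 2 coefficient matrix and compares it with $(-\theta_1, 1 - \theta_2)$ computed from the scalar formulas. The helper names and the numerical values of A are hypothetical and chosen only for illustration.

```python
def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def fru_restriction_2period(a):
    """Return e2' - e1'(A + A^2) for a 2 x 2 coefficient matrix A."""
    a_sq = matmul(a, a)
    first_row = [a[0][j] + a_sq[0][j] for j in range(2)]  # e1'(A + A^2)
    e2 = [0.0, 1.0]
    return [e2[j] - first_row[j] for j in range(2)]


# Hypothetical unrestricted coefficient matrix.
A_unres = [[0.3, 0.5], [0.1, 0.2]]
a11, a12 = A_unres[0]
a21, a22 = A_unres[1]
theta1 = a11 ** 2 + a12 * a21 + a11
theta2 = a11 * a12 + a12 * a22 + a12
r_unres = fru_restriction_2period(A_unres)

# The same first row with the second row set by the substitution restrictions.
A_res = [[a11, a12],
         [-a11 * (1.0 + a11) / a12, (1.0 - a12 * (1.0 + a11)) / a12]]
r_res = fru_restriction_2period(A_res)

print(r_unres, [-theta1, 1.0 - theta2])  # the two vectors agree
print(r_res)                             # approximately [0, 0]
```

The first row of $A + A^2$ is exactly $(\theta_1, \theta_2)$, so the two ways of writing the restrictions coincide.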
It is easy to see that (15.23) are the same restrictions as were worked out earlier by
'substitution'. A Wald test of (15.23) is easily constructed. The procedure can be generalised
to include any lag length VAR since a pth order VAR can always be written, with
the A matrix in companion form, as a first order system. A forward prediction
for any horizon m is given by
$$E_t z_{t+m} = A^m z_t$$
and hence FRU for an m-period forward premium $fp_t^{(m)}$ is
$$E_t \Delta_m s_{t+m} = \sum_{i=1}^{m} e_1' A^i z_t = fp_t^{(m)}$$
The FRU restrictions for an m-period horizon are:
$$f(A) = e_2' - e_1' \sum_{i=1}^{m} A^i = 0$$
which can be tested using the Wald procedure. We can also use the VAR predictions to
yield a series for the theoretical forward premium for m periods,
$$fp_t^{(m)\prime} = \sum_{i=1}^{m} e_1' A^i z_t$$
(see equations (15.10) and (15.26)) which can then be compared to the actual forward
premium (using graphs, variance ratios and correlation coefficients).
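The m-period restriction vector, the theoretical forward premium and a Wald statistic can be sketched in a few lines. This is a minimal illustration under stated assumptions: the VAR is bivariate and first order, the Jacobian of $f(A)$ is obtained by central differences, and the coefficient matrix `A_hat` and covariance matrix `V_hat` are invented numbers rather than estimates from any data set. Under the null the statistic is conventionally compared with a chi-squared distribution whose degrees of freedom equal the number of restrictions (two here).

```python
def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def first_row_power_sum(a, m):
    """Return e1' * (A + A^2 + ... + A^m) as a list."""
    n = len(a)
    power = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
    total = [0.0] * n
    for _ in range(m):
        power = matmul(power, a)
        total = [total[j] + power[0][j] for j in range(n)]
    return total


def fru_restriction(a, m):
    """f(A) = e2' - e1' * sum_{i=1..m} A^i, which is zero under m-period FRU."""
    row = first_row_power_sum(a, m)
    return [float(j == 1) - row[j] for j in range(len(a))]


def theoretical_premium(a, m, z):
    """Theoretical m-period forward premium e1' * sum_{i=1..m} A^i * z_t."""
    row = first_row_power_sum(a, m)
    return sum(row[j] * z[j] for j in range(len(z)))


def wald_statistic(a, m, v, h=1e-6):
    """Wald statistic f' (J V J')^{-1} f for a bivariate first-order VAR.

    v is the 4 x 4 covariance matrix of (a11, a12, a21, a22); J is the
    Jacobian of f with respect to those elements, by central differences.
    """
    f = fru_restriction(a, m)
    jac = [[0.0] * 4 for _ in range(2)]
    for k in range(4):
        up = [row[:] for row in a]
        down = [row[:] for row in a]
        up[k // 2][k % 2] += h
        down[k // 2][k % 2] -= h
        f_up, f_down = fru_restriction(up, m), fru_restriction(down, m)
        for r in range(2):
            jac[r][k] = (f_up[r] - f_down[r]) / (2.0 * h)
    jac_t = [list(col) for col in zip(*jac)]
    s = matmul(matmul(jac, v), jac_t)  # 2 x 2 matrix J V J'
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    return (f[0] ** 2 * s[1][1] - f[0] * f[1] * (s[0][1] + s[1][0])
            + f[1] ** 2 * s[0][0]) / det


# Hypothetical estimates: coefficient matrix and a diagonal covariance matrix.
A_hat = [[0.3, 0.5], [0.1, 0.2]]
V_hat = [[0.01 * float(i == j) for j in range(4)] for i in range(4)]
f_hat = fru_restriction(A_hat, 2)
W = wald_statistic(A_hat, 2, V_hat)
premium = theoretical_premium(A_hat, 2, [0.01, 0.02])
print(f_hat, W, premium)
```

For a higher-order VAR the companion matrix could be passed to `first_row_power_sum` unchanged, but `wald_statistic` as written handles only the 2 x 2 case and would need a general matrix inverse and a Jacobian restricted to the free coefficients.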
The Spot Market and Forward Market
The uncovered interest parity (UIP) condition can be applied over a multiperiod horizon
and the VAR approach used in exactly the same way as described above. The interest
differential for that horizon should equal the expected change in the exchange rate over the subsequent four periods
(i.e. m = 4). This UIP condition is similar to the FRU equation (15.26) except that we
have $(r_t - r_t^*)$ on the RHS and not $fp_t^{(m)}$. However, it should be obvious that the analysis
for a VAR in $\Delta s_t$ and $(r_t - r_t^*)$ goes through in exactly the same fashion.
If the cross-equation restrictions on the parameters of the VAR are not rejected then we
do not reject the multiperiod UIP condition. What about testing FME and UIP together?
This is easy too. Consider the trivariate VAR where $z_{t+1}' = (\Delta s_{t+1}, fp_{t+1}, r_{t+1} - r_{t+1}^*)$. Having
obtained the unrestricted estimates of A (and put them in companion form) we can test
two sets of restrictions of the form
$$\sum_{i=1}^{m} e_1' A^i - e_2' = 0 \quad \text{(FRU)}$$
$$\sum_{i=1}^{m} e_1' A^i - e_3' = 0 \quad \text{(UIP)}$$
where the vectors $e_J$ have unity in the Jth element and zeros elsewhere (J = 1, 2, 3).
We can test each restriction separately, thus testing FRU or UIP only, or test them
together (i.e. a joint test of FRU + UIP). Of course, if we accept that covered interest
parity holds then rejection of either one of FRU or UIP should imply rejection of the
other.
How Big Can a VAR Get?
We noted in Chapter 14 that if we have unbiased estimates of the true parameters of the
VAR then adding additional variables will not, in principle, rescue tests of the restrictions if
they have failed the Wald test on a limited information set. The key assumption here
is unbiased parameter estimates. We noted that with a finite sample, 'bias' is possible and
hence additional variables may make a difference to the Wald test of the VAR restrictions.
It is convenient at this point to elaborate on these arguments surrounding the
'size' of the VAR since it is a key element in evaluating empirical work.
The reader will have noticed that FRU can be tested in a VAR that contains three
variables $(\Delta s_t, fp_t, r_t - r_t^*)$ or two variables $(\Delta s_t, fp_t)$. Which is better? In fact we could
greatly expand the number of variables in the information set (e.g. add st
long-term interest rates, etc.) even when testing just the FRU.
The first thing to note is that a VAR of m variables can always be reduced to one of
q variables, where q < m, and in fact we can even reduce the system to q = 1, that is
an autoregressive equation. To illustrate this consider the 3 x 3 VAR system
above, with lag length p = 1 for ease of exposition.
Substituting (15.34) for $d_t$ in the RHS of the first two equations (beginning with (15.31)), which have
$\Delta s_t$ and $fp_t$ as dependent variables, we obtain two equations for $\Delta s_t$ and $fp_t$ which depend
only on their own lagged values. Similarly we could now repeat the procedure,
obtaining $fp_t = g(\Delta s_{t-j})$ and substituting out for this in the equation with $\Delta s_t$ as the
dependent variable, giving an ARMA model for $\Delta s_t$ (see Chapter 20).
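The last step of this reduction can be illustrated for the simpler bivariate case. For a first-order VAR $z_t = A z_{t-1} + w_t$ in $(\Delta s_t, fp_t)$, substituting out $fp_t$ gives $\Delta s_t - \mathrm{tr}(A)\,\Delta s_{t-1} + \det(A)\,\Delta s_{t-2} = w_{1,t} - a_{22} w_{1,t-1} + a_{12} w_{2,t-1}$, an ARMA(2,1) form. The sketch below simulates the VAR and checks this identity period by period. The coefficient values, random seed and function name are assumptions made for illustration; the 3 x 3 case in the text works the same way with one more round of substitution.

```python
import random


def simulate_var1(a, shocks, z0):
    """Simulate z_t = A z_{t-1} + w_t for a bivariate system."""
    path = [z0]
    for w in shocks:
        prev = path[-1]
        path.append([a[0][0] * prev[0] + a[0][1] * prev[1] + w[0],
                     a[1][0] * prev[0] + a[1][1] * prev[1] + w[1]])
    return path


random.seed(0)
A = [[0.3, 0.5], [0.1, 0.2]]  # hypothetical VAR coefficients
shocks = [[random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)] for _ in range(50)]
path = simulate_var1(A, shocks, [0.0, 0.0])

trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]

# path[t] was generated with shocks[t - 1], so w_t = shocks[t - 1].
max_gap = 0.0
for t in range(2, len(path)):
    ar_side = path[t][0] - trace * path[t - 1][0] + det * path[t - 2][0]
    ma_side = (shocks[t - 1][0] - A[1][1] * shocks[t - 2][0]
               + A[0][1] * shocks[t - 2][1])
    max_gap = max(max_gap, abs(ar_side - ma_side))

print(max_gap)  # tiny: the univariate ARMA form reproduces the VAR path
```

The first variable alone therefore carries the same dynamics as the VAR, at the cost of a moving average error that mixes both of the original shocks.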
Statistically, the choice between a univariate model and a multivariate VAR depends on a
trade-off between the statistical requirements of lack of serial correlation and heteroscedasticity
in the errors, the overall 'fit' of the equations and parsimony (i.e. the best
explanation with the smallest number of parameters). Test statistics are available to choose
among these various criteria (e.g. Akaike and Schwarz criteria) but judgement is still
required even when using purely statistical tests. This is because the statistical criteria
are likely to conflict. For example, maximising a parsimony criterion like an
information criterion might result in serially correlated errors.
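To make the potential conflict between criteria concrete, the sketch below evaluates two common per-observation forms, AIC $= \ln\hat\sigma^2 + 2k/T$ and SBC $= \ln\hat\sigma^2 + k\ln T/T$, where lower values are preferred in this convention. These particular formulas, the residual variances and the parameter counts are assumptions chosen for illustration and are not taken from the text.

```python
import math


def aic(resid_var, k, n_obs):
    """Akaike criterion in per-observation form (lower is preferred)."""
    return math.log(resid_var) + 2.0 * k / n_obs


def sbc(resid_var, k, n_obs):
    """Schwarz criterion in per-observation form (lower is preferred)."""
    return math.log(resid_var) + k * math.log(n_obs) / n_obs


n_obs = 100
# Hypothetical models: a small one and a larger one that fits slightly better.
aic_small, sbc_small = aic(1.00, 2, n_obs), sbc(1.00, 2, n_obs)
aic_large, sbc_large = aic(0.90, 6, n_obs), sbc(0.90, 6, n_obs)

print(aic_large < aic_small)  # True: AIC prefers the larger model here
print(sbc_large < sbc_small)  # False: SBC prefers the smaller model here
```

With these numbers the two criteria rank the models differently, and neither says anything about whether the chosen model has serially uncorrelated errors.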
In addition to the above statistical criteria, an economist might have prior views (based in part
on gut instinct) about what are the key variables to include in a VAR. However, since
a VAR is a reduced form of the structural equations of the whole economy, there is
always likely to be a rather disparate set of alternative variables one might include in the
VAR. Hence empirical results based on any particular VAR are always open to criticism
on the grounds that the VAR may not be 'the best' possible representation of the data. In
particular, the stability of the parameters of the VAR is crucial in interpreting