

pattern. This is an example of what a well-designed filter is meant to do.

After you have developed your entries, filtered them, and finished testing them, you can develop your exits.

Exits are much more complex than entries. In general, there are two different types: (1) an exit used to protect a profit or limit a loss, and (2) an exit used after an event or a target occurs.

Let's first discuss the protective-type exits. They exit the market when the market moves against a trader's position to a given level. The classic exit of this type is the N-day low for buys and the N-day high for sells. Sometimes, depending on the entry, we cannot use this type of exit, at least early in the trade. For example, if we enter a trade on the long side, based on intermarket divergence, we will often be entering at a five-day low. We will then either want to wait a few days before we use this exit or will want to use a longer-period low and tighten it as the trade progresses.

Money management stops are another type of protective exit. These stops exit a trade at a fixed level of loss, for example, when the open position loss exceeds $750.00. A modification of this type of exit is a trailing stop. You exit once you have lost more than $750.00 of the maximum open position profit for the trade.

Another classic method that works well in trend-following applications is a trailing stop that tightens the longer you are in a trade. One example of this method is the parabolic indicator developed by Welles Wilder. A simpler example is to set initial stops at yesterday's low minus three times yesterday's range for long trades, and yesterday's high plus three times yesterday's range for short trades. As the trade progresses, this multiplier for the range and reference for yesterday's high and low is reduced. After 20 or more days in the trade, the stop is positioned a few ticks away from yesterday's high or low. This allows us to elegantly protect our profits. How the multiplier is adjusted can be based on many different methods, for example, using the current dominant cycle versus the current number of days in the trade. This type of exit is very powerful and useful in developing exits for classic trend-following systems.

There are many event-type exits. For example, you might exit on the first profitable opening, a classic exit used by Larry Williams. Or, you might exit on a target profit. Another example that works well is to exit a trade based on the action of a given technical indicator; for example, we can exit a position once we see divergence between price and an

After you have designed the entries and exits for your system, you need to select the proper parameters to use in each rule. For example, if you exit at an N-day low on a long position, what should be the proper value for N? This process is normally done using optimization. Optimization is often misunderstood. If used correctly, it can actually help you judge whether a given set of rules and parameters should continue to work in the future. Let's now overview the steps of properly using optimization.

Use your development set and optimize across a large range of parameters. When you have the results of the optimizations, begin to analyze them. When analyzing parameters based on an optimization, you are looking for traits that will produce a robust system. On the basis of all of my research, I have found that reliable sets of parameters have given traits. You should almost never select the most profitable set of parameters. The rare exception to this rule would be if the most profitable pair is surrounded by almost equally profitable pairs. Next, you want to have the highest possible number of neighboring parameters with similar performance. You also want reasonable drawdowns, and you do not want too much of the total profit produced by only one or two trades. It is also important to have the profits evenly distributed and to have a system that has an upward sloping equity curve.

Once you have found a set of parameters and a system that, based on these criteria, works well, you are ready to test the system on the testing set. A system that performs well on the testing set is worth testing
Developing a Trading System 219
218 Trading System Development and Testing

further. The complete process of system testing will be covered in the
next chapter.
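
The advice to prefer a parameter surrounded by similarly profitable neighbors, rather than the single most profitable one, can be sketched in a few lines of Python. This is only an illustration: the scoring rule (average net profit over a parameter and its immediate neighbors), the function names, and the optimization results are all assumptions made up for the example, not the author's procedure or data.

```python
from statistics import mean

def neighborhood_score(results, n, radius=1):
    """Average net profit of parameter n and its neighbors within `radius` steps."""
    keys = sorted(results)
    i = keys.index(n)
    window = keys[max(0, i - radius): i + radius + 1]
    return mean(results[k] for k in window)

def pick_robust_parameter(results, radius=1):
    """Return the parameter whose neighborhood has the highest average profit."""
    return max(sorted(results), key=lambda n: neighborhood_score(results, n, radius))

# Hypothetical optimization output: exit length N -> net profit (made-up numbers).
results = {10: 41000, 12: 43000, 14: 42500, 16: 44000,
           18: 20000, 20: 61000, 22: 19000}

best_single = max(results, key=results.get)   # the isolated profit spike
best_robust = pick_robust_parameter(results)  # the stable plateau

print("most profitable N:", best_single)
print("most robust N:", best_robust)
```

With these made-up numbers the most profitable value sits between two poor neighbors, while the neighborhood score picks a value from the flat, consistently profitable region. A fuller version would also screen on drawdown and on how much of the profit comes from one or two trades, as the text recommends.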
DESIGNING AN ACTUAL SYSTEM

Let's now use what we have learned to build an actual trading system. Our first step is to select the market(s) and time frame(s) we want to trade. In this case, we want to trade three major currencies (the D-Mark, the Yen, and the Swiss Franc) using daily data. Second, we need to develop a premise. Our premise is simple: the currency markets trend. Because trend-following systems are based on a single series of price data, we will need to collect continuous contract data for the D-Mark, the Yen, and the Swiss Franc. We will use data for the period from 1/1/80 to 7/26/96. The reason we select a starting point of 1/1/80 is that the currency markets had just started trading in 1975, and the early years do not represent how these markets currently trade. We will break our data into three sets:

1. The period from 1/1/80 to 12/31/91 = the development set.
2. The period from 1/1/92 to 12/31/95 = the test set.
3. The period from 1/1/96 to 7/26/96 = the sample set.

Let's now select the technology to use in developing our system based on our premise. Because our premise is based on trend following, we will use the best trend-following method we have: adaptive channel breakouts, as described in Chapter 7. Briefly, this method simply buys and sells at the highest high and lowest low of the past dominant cycle days.

Now that we have selected our method, let's develop our entries. We start with the basic entries used by the adaptive channel breakout system. These are as follows:

Buy at Highest(High, DCycle) stop;
Sell at Lowest(Low, DCycle) stop;

where DCycle is the current dominant cycle.

This basic system is simply a stop and reverse. The results of this system for all three markets during the development period are shown in Table 15.4 (with $50.00 deducted for slippage and commissions).

TABLE 15.4

Market        Net Profit    Trades  Win%  Average Trade  Drawdown
D-Mark        $ 69,137.50   76      41%   $  909.70      -$11,312.50
Yen             95,512.50   74      39     1,304.22       -6,100.00
Swiss Franc    102,462.50   78      38     1,313.62      -12,412.50

What happens when we test our simple entries by holding a position for N days? We tested N from 10 to 30 days in steps of 2. Our results showed that our simple entries were very robust across the complete set of holding periods. When trading the D-Mark, every holding period from 10 to 30 days produced at least 75 percent of the profits of the standard reversal system. Some of the holding periods even outperformed the original system; for example, using a 26-day holding period produced $71,112.50 with 53 percent winning trades and a -$10,187.50 drawdown. Using a fixed holding period also improved the winning percentage. Every test we ran on the D-Mark won 50 percent or more of its trades. We also increased the number of trades to a range from 117 to 194 (versus 76, using the standard reversal system). This concept of holding for 26 days also worked well on the Yen, but not as well as the standard system. On the Yen, a 26-day holding period produced only a few hundred dollars less in profit but increased the drawdown to -$9,425.00 as compared to the original system. The winning percentage also increased to 51 percent. On the Swiss Franc, a 28-day holding period worked best, making a little over $89,000.00. The 26-day holding period finished third at $83,000.00. Both of these holding periods won more than 50 percent of their trades, but they increased the drawdown to -$14,350.00 as compared to the original system.

Next, we tested various target profits. Our target profits ranged from $750.00 to $2,000.00 in steps of $250.00. Once we entered a position, we could exit either on a target profit or when we got a signal in the opposite direction. Once again, our entries worked well for the complete range of target profits. For example, on the Swiss Franc, our target profits produced between $83,000.00 and almost $93,000.00. The net profit increased as we increased the size of our target profit. We also had a very high percentage of winning trades; for example, a $750.00 target won 82 percent of its trades, and a $2,000.00 target won 63 percent of its

trades. The drawdown increased slightly, ranging from -$13,500.00 to -$14,812.50. The number of trades ranged from 193 for a $2,000.00 target to 392 for a $750.00 target. The results were similar for the other two currencies. For example, the Yen profits ranged from $77,700.00 for a $750.00 target to $99,162.50 for a $2,000.00 target, and had between 63 percent and 79 percent winning trades, respectively.

After testing the fixed holding period and the target profit, we tested these entries by lagging the trade by 1 to 5 days. Once a signal was generated, we entered on the first open after the lag period. When testing these small lags, we were surprised at their minimal effect on net profit. The results, using various lags on the Yen, are shown in Table 15.5 (with $50.00 deducted for slippage and commissions).

TABLE 15.5

Lag   Net Profit    Trades  Win%  Average Trade  Drawdown
4     $95,312.50    73      41%   $1,305.65      -$8,925.00
5      93,125.00    72      38     1,293.40       -9,650.00
1      89,637.50    72      39     1,244.97       -8,600.00
2      85,550.00    72      40     1,188.19       -9,212.50
3      85,175.00    72      42     1,182.99       -9,262.50

You can see in Table 15.5 that the lag has little effect on the Yen. The stability in profit, using small lags, also shows how robust our entry rules are. We can enter up to five days late and still produce about the same profit as the original signals. The results were similar for the other currencies; the lag did not cut the profits significantly. In the case of the Swiss Franc, a one-day lag produced the best (about $87,000.00) profits and a five-day lag produced the worst (about $73,000.00).

These three tests supply valuable information during the system development process. First, they can be used to develop better entries and exits. Second, the fact that our entries produced very stable results across all three currencies, over the complete sets of tests, shows the robustness of the adaptive channel breakout method.

Now that we know that our basic entries are robust, let's try to find filters that will improve the performance of our system. The first filter we will use is the trend filter shown in Chapter 7. The indicator shown in Table 7.1 defined a trend based on how long prices stay above a full-period moving average. We used the same window size of 30, with 6 poles, to calculate the length of the dominant cycle for the breakout. Because a channel breakout system works better in trending markets, we will use this indicator to filter our entries. For the development period, using this filter for our entries and exiting on a dominant cycle day high or low produced the results shown in Table 15.6 (with $50.00 deducted for slippage and commissions).

TABLE 15.6

Market        Net Profit   Trades  Win%  Average Trade  Drawdown
D-Mark        $49,987.50   56      48%   $  892.63      -$6,412.50
Yen            89,537.50   53      45     1,689.39       -5,100.00
Swiss Franc    87,462.50   51      47     1,714.95       -7,562.50

This filter improved the winning percentage and drawdown across all three currencies, but did reduce the profits slightly.

The next filter used a simple 10-day ADX above 30 in order to take a trade. The results using the ADX filter are shown in Table 15.7 (with $50.00 deducted for slippage and commissions).

TABLE 15.7 WITH ADX TREND FILTERS.

Market        Net Profit   Trades  Win%  Average Trade  Drawdown
D-Mark        $49,162.50   56      39%   $  877.90      -$ 7,337.50
Yen            75,712.50   33      67     2,294.32       -5,537.50
Swiss Franc    96,337.50   53      47     1,817.69      -15,537.50

Comparing these two filters, only the trend filter based on the dominant cycle improved the performance across all three currencies. If we were trying to develop a system for trading only one market, we could have used the ADX on the Yen, for example, since it produced amazing results. Because we want a system that works well across all three currencies, we will use the filter originally shown in Table 7.1 to filter our entries.
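
The filter comparison above can be checked mechanically. The Python sketch below uses the winning percentages and drawdowns reported in Tables 15.4, 15.6, and 15.7; reading "improved the performance" as "higher win percentage and smaller drawdown in every market" is an assumption made for this example, as is the function name.

```python
# (win %, drawdown) per market for the unfiltered system (Table 15.4),
# the dominant-cycle trend filter (Table 15.6), and the ADX filter (Table 15.7).
base = {"D-Mark": (41, -11312.50), "Yen": (39, -6100.00), "Swiss Franc": (38, -12412.50)}
trend = {"D-Mark": (48, -6412.50), "Yen": (45, -5100.00), "Swiss Franc": (47, -7562.50)}
adx = {"D-Mark": (39, -7337.50), "Yen": (67, -5537.50), "Swiss Franc": (47, -15537.50)}

def improves_everywhere(filtered, baseline):
    """True if win % rises and drawdown is less negative in every market.

    This criterion is an assumed reading of "improved the performance".
    """
    return all(
        filtered[m][0] > baseline[m][0] and filtered[m][1] > baseline[m][1]
        for m in baseline
    )

print("trend filter improves all markets:", improves_everywhere(trend, base))
print("ADX filter improves all markets:  ", improves_everywhere(adx, base))
```

Under this reading the ADX filter fails on the D-Mark (lower win percentage) and on the Swiss Franc (deeper drawdown), even though it does very well on the Yen alone.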

Having developed our entries and filters, we need to develop our exits. Many breakouts fail early in a trade, and these failures often lead to large losing trades. We can help solve this problem by using a tighter stop early in the trade. In the currency system example, we used this tighter stop for 10 days. We selected 10 days because exiting a trade blindly after 10 days did not cause a large drop in profits. Using this concept, we designed the following exit:

If MarketPosition = -1 and BarsSinceEntry < 10 then ExitShort at Highest(High, CycleLen/2) stop;
If MarketPosition = 1 and BarsSinceEntry < 10 then ExitLong at Lowest(Low, CycleLen/2) stop;

This simple exit was then combined with our original exits to improve the performance of the system. The concept of using a special exit within the first few days of a trade is very powerful: for example, if we are trading divergence using stochastics, we want that trade to be profitable within a few days or else we exit it.

Let's now put all the pieces together. The code for our currency trading system is shown in Table 15.8.

TABLE 15.8

Vars: CycleLen(0);
{shortest cycle detectable is 6 and longest is 50. We are using 30 bars of data and six poles}
CycleLen=Round(RSMemCycle1.2(close,6,50,30,6,0),0);
{value2 is not assigned in this listing; it is presumably the trend-filter flag from the Table 7.1 indicator}
If value2=1 then Buy at Highest(High,CycleLen) stop;
If value2=1 then Sell at Lowest(Low,CycleLen) stop;
{tighter stop during the first 10 bars of a trade}
If MarketPosition=-1 and BarsSinceEntry<10 then ExitShort at Highest(High,.5*CycleLen) stop;
If MarketPosition=1 and BarsSinceEntry<10 then ExitLong at Lowest(Low,.5*CycleLen) stop;
{standard full-cycle exits}
ExitShort at Highest(High,CycleLen) stop;
ExitLong at Lowest(Low,CycleLen) stop;

How did this system perform on the development set for all three currencies? These results are shown in Table 15.9 (with $50.00 deducted for slippage and commissions).

TABLE 15.9 RESULTS OF MODIFIED ADAPTIVE CHANNEL

Market        Net Profit   Trades  Win%  Average Trade  Drawdown
D-Mark        $50,400.00   62      45%   $  812.90      -$6,575.00
Yen            89,662.50   54      44     1,664.12       -4,762.50
Swiss Franc    91,125.00   53      51     1,734.43

The stop reduced the drawdown slightly, compared to the version using only the trend filter. The stop also slightly increased the profits.

Now that we have selected our system, let's test it on the test set, the data set for the period from 1/1/92 to 12/31/95. The results during these four years, as well as the average yearly returns for the development set, are shown in Table 15.10 (with $50.00 deducted for slippage and commissions).

TABLE 15.10 MODIFIED ADAPTIVE CHANNEL BREAKOUT SYSTEM.

                                                         Average yearly return
Market        Net Profit   Trades  Win%  Drawdown     Development Set  Test Set
D-Mark        $ 7,337.50   18      39%   -$8,450.00   $4,200.00        $1,834.37
Yen            63,487.50   18      50     -4,187.50    7,471.87
Swiss Franc    27,400.00   22      50     -7,950.00    7,593.75         6,850.00

Average Returns for Development Set per Year   $19,265.62
Average Returns for Test Set per Year          $24,556.25

When evaluating the performance of a trading system, it is important that the system's performance is similar in both the development set and the test set. A system can be said to have similar performance when the

results between the two data sets are within the standard error. The standard error is calculated as follows:

\[ \text{Standard error} = \frac{1}{\sqrt{N}} \]

where N is the number of trades.

Table 15.10 shows that this system has averaged over $15,000.00 per year on the Yen during this period. This profit is the result of one large winning trade that produced over $31,000.00 in seven months. If we remove that trade, we would have made only $31,712.50 on the Yen over 3.41 years, or $9,299.85 per year. If we use these numbers, then our average yearly returns are $17,984.20, or within 7.12 percent of the numbers generated during the development set period. This one Yen trade is a major outlier that should be excluded to get a more realistic view of the system's performance.

The standard error for our system is \(1/\sqrt{58}\) because we have a total of 58 trades. This means that the standard error is 13.1 percent.

Because the difference of 7.12 percent between the average annual returns of the development set and the test set is within the standard error of 13.1 percent, we can conclude that this system is most likely robust and is worth testing further.

Testing, Evaluating, and Trading a Mechanical Trading System

The previous chapter showed you how to design a trading system. This chapter shows you how to properly test and evaluate a mechanical trading system. Testing and evaluating a trading system involves three major issues:

1. How well can the system be expected to perform, and what are the
2. How likely is it that the system will continue to work in the future?
3. Can we predict when and if the system's performance will either
temporarily degrade or fail completely?

Addressing these issues requires a deep understanding of the trading system being evaluated: not only how the system generates its trading signals but also a statistical analysis of the system via the development set, the testing set, and an out-of-sample set. Prediction of future system performance during live trading requires a statistical analysis that compares the system performance for both live trading and historical test periods.
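
The standard-error check applied to the currency system at the end of the previous chapter is one concrete instance of such a comparison. The Python sketch below assumes the 1/sqrt(N) form of the standard error, which reproduces the 13.1 percent quoted there for 58 trades, and measures the gap between the two yearly averages relative to the test-set figure. The function names are illustrative only.

```python
import math

def standard_error(num_trades):
    """Standard error as used in the text: 1 / sqrt(number of trades)."""
    return 1.0 / math.sqrt(num_trades)

def relative_difference(dev_per_year, test_per_year):
    """Gap between the two yearly averages, relative to the test-set figure."""
    return abs(dev_per_year - test_per_year) / test_per_year

# Figures from the previous chapter: 18 + 18 + 22 = 58 test-set trades,
# $19,265.62 per year on the development set, and $17,984.20 per year on
# the test set after removing the one outlier Yen trade.
se = standard_error(58)
diff = relative_difference(19265.62, 17984.20)

print(f"standard error: {se:.1%}")
print(f"difference:     {diff:.1%}")
print("within standard error:", diff <= se)
```

A difference smaller than the standard error is the condition the text uses to call the development and test results similar.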

Table 16.1 gives a general outline for both evaluating and trading a mechanical trading system. This section discusses each of these steps in more detail.

TABLE 16.1 STEPS IN TESTING A TRADING SYSTEM.

1. Historical system performance and data collection.
2. Historical system evaluation and analysis.
3. System trading.
4. Live system performance data collection.
5. Live system evaluation and analysis.
6. Repeat steps 1 to 5.

Historical System Performance and Data Collection

In this stage of our analysis, we need to collect statistics about system performance. Data are collected on the system over the development, testing, and out-of-sample periods. Data are collected for the complete periods as well as for a moving window of these periods. Data collection is important for both historical and live system evaluation and analysis.

Collecting statistical performance data on a trading system is the most important step in analyzing a system. The data should be collected over various time frames, as well as for long, short, and combined trading performance, for the development, testing, and out-of-sample sets. If the system is being traded, data since the system went live should also be collected. Useful ways to collect these data are: on a yearly basis, or using a moving window with a one-year time frame.

Table 16.2 shows some of the statistics that can be collected using TradeStation's backtester.

TABLE 16.2

Number of periods in test
Total net profit, gross profit, and gross loss
Total number of trades, percent profitable, number of winning trades, number of losing trades
Largest winning trade, largest losing trade
Average winning trade, average losing trade
Ratio of average wins/average losses
Average trade (wins and losses)
Maximum consecutive winners, maximum consecutive losers
Average number of bars in winners, average number of bars in losers
Maximum intraday drawdown
Profit factor
Trade-by-trade history

A lot of other valuable statistics can be collected, for example, a simple equity curve and a moving average of the equity curve. Scatter and bar charts can also tell you a lot about a trading system. Some of the more valuable types of charts that I use are listed in Table 16.3.

TABLE 16.3

1. Distribution chart of trade profits (all trades over various data-collection periods).
2. Distribution chart of trade drawdowns (all trades over various data-collection periods).
3. Scatter chart of trade profit versus volatility.
4. Scatter chart of average N period profit versus next M period profit.
5. Scatter chart of current values of a given technical indicator versus future trading profits.

Other statistical tests can be run on your system. Using the statistics collected over the development, testing, and out-of-sample periods, you might try to show that the results are similar over all three periods. Or, you might compare the live trading results to the preliminary periods: the more similar these periods' statistical traits are, the more likely the system is robust. You might also run statistics that try to estimate the true average trade of the system within a given confidence level.

Historical System Evaluation and Analysis

By analyzing the data we have collected, we can address some issues relating to system performance, reliability, and tradability, as shown in Table 16.4.
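
Several of the statistics listed in Table 16.2 can be recomputed from a trade-by-trade profit list, which is useful when the same numbers are needed for a yearly or moving-window subset. The following Python sketch is a minimal illustration under stated assumptions: the function name and dictionary keys are made up, break-even trades are counted as losers, and the sample trades are invented. It is not a reproduction of TradeStation's own calculations.

```python
def trade_statistics(profits):
    """Summarize a trade-by-trade profit list (one number per closed trade)."""
    wins = [p for p in profits if p > 0]
    losses = [p for p in profits if p <= 0]  # assumption: break-even counts as a loss
    gross_profit, gross_loss = sum(wins), sum(losses)

    def longest_run(is_member):
        # Length of the longest unbroken streak of trades satisfying is_member.
        best = run = 0
        for p in profits:
            run = run + 1 if is_member(p) else 0
            best = max(best, run)
        return best

    avg_win = gross_profit / len(wins) if wins else 0.0
    avg_loss = gross_loss / len(losses) if losses else 0.0
    return {
        "net_profit": gross_profit + gross_loss,
        "trades": len(profits),
        "percent_profitable": 100.0 * len(wins) / len(profits),
        "average_trade": (gross_profit + gross_loss) / len(profits),
        "win_loss_ratio": avg_win / abs(avg_loss) if avg_loss else None,
        "profit_factor": gross_profit / abs(gross_loss) if gross_loss else None,
        "max_consecutive_winners": longest_run(lambda p: p > 0),
        "max_consecutive_losers": longest_run(lambda p: p <= 0),
    }

# Invented sample of eight closed trades, in dollars.
sample = [500.0, -200.0, -100.0, 1200.0, -300.0, -150.0, -50.0, 800.0]
stats = trade_statistics(sample)
for name, value in stats.items():
    print(f"{name}: {value}")
```

Running the same function on the development, testing, and out-of-sample trade lists gives directly comparable numbers for each period.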

TABLE 16.4 IMPORTANT SYSTEM ISSUES.

1. How profitable is the system on a risk-adjusted basis?
2. When a trader is using the system's own risk/reward criteria, will he or she be able to have the discipline to follow all of the system's signals?
3. What are the system's statistical traits? Our goal is to build a collection of systems with statistical characteristics that can be used like a DNA system.
4. On the basis of our analysis of the historical test results of this system, how likely is it to continue to work into the future?

Let's discuss evaluating a system on a risk-adjusted basis. Trading results are defined by three things: (1) risk, (2) margin, and (3) profits. The most popular measure of risk is maximum drawdown. A simple measure of risk-adjusted performance is net profit/maximum drawdown. Another measure relating to risk is called the Sharpe ratio, which is defined as:

\[ \text{Sharpe} = \frac{R_A - R_F}{S} \]

where \(R_A\) is average returns, \(R_F\) is risk-free returns, and \(S\) is standard deviation of returns.

The higher the Sharpe ratio, the more stable the returns. If two systems have the same returns, the one with the higher Sharpe ratio has less risk.

These are just a few ideas about measuring risk-adjusted performance. There are many other methods that are beyond the scope of this chapter.

When using a mechanical trading system, it is important to trade all of the system's signals. Profitable mechanical trading systems make money over time either by winning a high percentage of their trades or by having the average winner much greater than the average loser. If you do not take trades from the system that turn out to be very profitable, you have changed the statistical basis of your expected returns and, in fact, could even lose money on a profitable system. For this reason, the trader should spend time analyzing the system so that he or she understands its risks and rewards. Only then will the trader be able to trade the system correctly, using the correct required capital.

When we collect statistical data on a trading system, our analysis can give us an internal look at the system's traits, a kind of system DNA test. Let's look at a few system traits that we can capture. The first is a distribution chart of trade profits. Most trend-following systems will only win less than 50 percent of their trades. They make money by having a nonstandard distribution in which there are many more large winning trades than in a standard distribution. An example of this type of distribution chart, for a standard channel breakout system on the Yen, is shown in Figure 16.1. This is a distribution chart for a simple trend-following system. Another type of chart might show trading profits versus volatility. Trend-following systems usually make more money with higher volatility. (An example of this type of chart has been shown as Figure 10.1.)

FIGURE 16.1 The distribution of trade profits for a standard channel breakout system on the Yen.

These are only a few ideas of how to capture a view of a system's personality. Almost any comparison of system performance that makes sense can be used. Once this information is collected, it can be used like a system DNA test.

Statistical analysis is very useful to tell whether a system will continue to work in the future. First, if the system was optimized, you should make sure that the parameters chosen were stable. If they were not, the system could fail badly in the future. Second, you should collect system traits over the development, testing, and out-of-sample periods. For all three periods, these traits should remain constant. They can change slightly, but radical changes would be a warning sign that the system is not
robust. Third, you should analyze both an equity curve and a one-year moving window of equity. The equity curve should be stable and upward sloping. The one-year moving window of equity should be above 0 at least 70 percent of the time. It should spend more time above zero than below it. A short and choppy moving window of equity is a danger sign for a trading system. Also, the moving window of equity should slope upward. Another measure I use is a ratio of system profits to standard deviation for the testing set versus the development set and out-of-sample set test periods. These ratios should always be above .5 and, for a reliable system, above .7.

System Trading

System trading can be either real trading or paper trading. What is important is that the data used are collected after the development of the system is finished. Two trading records should be kept when a system is trading with real money: (1) the results based on simulated trading performance, and (2) the true profits and losses achieved by the trader using the system. This information will be valuable. If all the trading signals of the system were followed, the result of these two records should be the same except for errors in estimating slippage. When the records match, we can use these results to improve our slippage estimates. When they are different, it means that the trader did not follow all of the system's trading signals. We must discover the gaps because this practice makes all of the profitability analysis for the system invalid. The cause could be either the trader's lack of discipline or an underestimation of the perceived risk of trading the system.

Live System Performance Data Collection

This data collection process is identical to the one used for historical data except that it uses only data issued since the release date of the system. These data, along with the historical data collection, will be used to predict future system performance.

Live System Evaluation and Analysis

Live system evaluation and analysis consist of not only evaluating current system profitability but also trying to predict future system performance. This is done by comparing the statistical characteristics of the system over both the development and the live trading periods. The more similar their performance, the more likely the system will perform well in the future. Let's look at an example. Suppose we have a trend-following system that made an average of $12,000.00 per year trading the currencies during the development period, and it still made $11,500.00 over the current twelve months of live trading. The system could still be dangerous to trade if the statistical characteristics of the system have changed. If we compare charts of system trading profit distributions during the development period and the live trading period, and they are dissimilar, the assumptions we made about the system and the trending nature of the currencies during the system's development may have changed. The system could still make money by winning a higher percentage of its trades, even if the original assumption of a nonstandard distribution of larger winning trades must be discarded. This would mean that the system's backtested results are no longer valid and the system is not safe to trade. We can also look for simple relationships in the live data, such as only trading the system when the equity curve is upward sloping. These are only a few examples of this type of analysis.

If the analysis shows the system is still safe to trade, you can continue to trade the system and repeat steps 3, 4, and 5 in Table 16.1.

Now that you have an overview of how to test, evaluate, and trade a mechanical trading system, we will walk through an example of using these ideas for the system development described in the previous chapter.

Historical System Performance and Data Collection

Let's start by collecting statistics for our currency trading system. The first set of statistics we will collect has been outputted by TradeStation's backtester for the development, testing, and out-of-sample sets. The list of statistics was shown earlier, in Table 16.2.

We will now examine the statistics for our system over three different sets of data: (1) the development set, (2) the testing set, and (3) a combined set. We are using a combined set because the blind set has

too few trades to be significant. Data comparing these numbers for three currencies (the Yen, the D-Mark, and the Swiss Franc) are shown in Table 16.5.

TABLE 16.5 TESTING RESULTS FOR A

                        Development Set  Testing Set  Combined Set
Yen
Net profit              $89,862.50       $63,487.50   $63,187.50
Trades                  54               18           20
Win%                    44               50           50
Win/Loss                4.84             7.00         6.08
Max. consec. winners    4                4            4
Max. consec. losers     8                3            3
Ave. bars, winners      81               79           73
Ave. bars, losers       15               10           11
Profit factor           3.87             7.00         6.08
Drawdown                -$4,537.50       -$4,187.50   -$4,187.50
Annual ROA*             157.25%          379%         329.62%

D-Mark
Net profit              $50,400.00       $7,337.50    $9,250.00
Trades                  62               18           19
Win%                    45               39           42
Win/Loss                2.86             2.22         2.09
Max. consec. winners    4                2            2
Max. consec. losers     8                3            3
Ave. bars, winners      64               72           69
Ave. bars, losers       15               21           21
Profit factor           2.35             1.41         1.52
Drawdown                -$6,575.00       -$8,450.00   -$8,450.00
Annual ROA              63.91%           21.75%       23.59%

Swiss Franc
Net profit              $91,925.00       $27,400.00   $31,700.00
Trades                  53               22           25
Win%                    51               50           48
Win/Loss                3.45             2.43         2.63
Max. consec. winners    7                6            6
Max. consec. losers     4                4            4
Ave. bars, winners      67               53           58
Ave. bars, losers       16               16           15
Profit factor           3.58             2.43         2.43
Drawdown                -$7,950.00       -$7,950.00   -$7,950.00
Annual ROA              96.33%           86.25%       86.27%

*ROA (return on account) is just returns based on net profit and drawdown with no margin.

Next, we need to develop the series of charts discussed earlier, to create a system "footprint" or DNA. Three of these charts will be compiled for all three data sets in Table 16.5:

1. A distribution chart of trading profits.
2. An equity curve.
3. A one-year moving window of equity.

The first step in collecting our needed data is to modify our system slightly by adding the following line of code:

Print(file("d:\articles\sysval2\equity.txt"),Date,",",NetProfit);

This will trigger TradeStation, when the system is run, to create a text file that can be used to analyze the system's equity curve.

After adding this line of code, we then run the system using the development, testing, and combined sets. This approach is valid because the testing set was not used as part of the optimization process. Analysis of the optimization results is very important; it helps us to predict future system performance. Next, each data set is run through the system, and the summary and trade-by-trade results are saved to a spreadsheet-compatible file. (This process is covered in the TradeStation manual.) This process must be completed for the development, testing, and combined data sets.

Having collected a data profile, we can show graphically the footprint of the system, using the charts in Table 16.5.

A distribution chart of trade profits is developed by using the trade-by-trade history that TradeStation saves for each data set in a spreadsheet format. A distribution bar chart can be developed, using the trade profit column of the spreadsheet. Other charts can be developed using the file created by the extra line of code we added to the system, which entered the date and the net profit to a file. To develop a simple equity curve, we just plot the date and net profit. To develop a one-year moving window of equity, we add one more column to the spreadsheet. The new column

contains a formula that subtracts the net profit 255 trading days ago from
the current net profit. When we plot the date versus this column, we have
created a one-year moving window of equity.
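
The moving-window column can also be computed outside the spreadsheet. The
following Python sketch is only an illustration: it assumes the cumulative
net profit values have already been read from the text file into a list, and
the function name and the toy numbers are hypothetical, not part of the
original system.

```python
def moving_window_equity(net_profit, window=255):
    """Change in cumulative net profit over the last `window` bars.

    The first `window` entries are None because no earlier value exists yet.
    """
    out = []
    for i, value in enumerate(net_profit):
        if i < window:
            out.append(None)
        else:
            # Current net profit minus the net profit `window` bars ago.
            out.append(value - net_profit[i - window])
    return out


# Toy cumulative net profit series (illustrative numbers only).
equity = [0.0, 100.0, 250.0, 200.0, 400.0, 350.0]
print(moving_window_equity(equity, window=3))
```

With daily data, the default window of 255 matches the one-year window
described in the text; the short window in the usage line is used only to
keep the toy example small.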
Let's look at examples of some of these charts for our currency trading.
Figure 16.2 shows our system's profit distribution chart for the
development period for the Yen.
As in the classic channel breakout system, Figure 16.2 has more large
winning trades than a standard distribution. Another interesting feature
of this chart is the very large number of small winning trades between
$0.00 and $1,000.00. These small trades are caused by the time-based
exit method we added to our system. Without this exit, many of these
small trades would have become losing trades.
Figure 16.3 shows a profit distribution chart for the combined testing
and out-of-sample periods for the Yen. Notice that both Figure 16.2 and
Figure 16.3 have large positive tails and the same high number of small
winning trades. It is very important that these two charts are similar. Even
if the system has similar profitability during both periods, it would be
risky to trade if these charts were not similar, because the distribution of
trades would have changed. Most of my research has shown that the
distribution of trades changes prior to a system's failure. This change will
often occur in the distribution of trades of a profitable system before a
system actually starts losing money.

FIGURE 16.3 The distribution of trade profits for the adaptive channel
breakout system, using the combined testing and out-of-sample sets in
Chapter 15.

Figure 16.4 shows a one-year moving window of equity for the devel-
opment set for the Yen. The one-year moving window of equity is almost
always above zero and, in general, has an upward slope.
Let's now briefly analyze this system. First, we learned that it has a
very stable performance when it is viewed as trading a basket of
currencies. For example, the system has not performed well on the D-Mark over
the past few years but has done better than expected on the Yen. When
we view the results as a basket, they have remained similar over the
development, testing, and combined sets. (We showed this for the development
and testing sets in Chapter 15.) We also see that the distribution of
trades has remained similar for the Yen during both the development and
the combined sets, as shown in Figures 16.2 and 16.3. The results for the
other two currencies are also similar enough to give us confidence in this
system. Many factors, such as the average length of winning and losing
trades, have also been relatively constant on the basis of a basket of
currencies. The one-year moving window of equity for the Yen (Figure 16.4)
is above zero most of the time and has a general upward bias on the
development set.
On the basis of our analysis, we can conclude that this system has a
good probability of continuing to work well for this basket of three
currencies for some time to come. Now that we think we have a reliable
system, we need to discuss actually trading it.

FIGURE 16.2 The distribution of trade profits for the adaptive channel
breakout system, using the development set in Chapter 15.

FIGURE 16.4 The one-year moving average of equity for the channel
breakout system on the Yen, using the development set.

System Trading

To trade our system on all three currencies, we would need a minimum
of $60,000.00. This amount would give good returns and limit the
maximum drawdown to about 33 percent, with returns on the account of about
31 percent per year. Because winning trades last about four months and
losing trades last about three weeks, it will take some discipline to trade
this model. The problem is that if we don't follow the system exactly, we
could lose money even if the system continues to work. Most mechanical
trading systems make money based on the results of a few large winning
trades. This makes it very important to follow each and every trade that
the system generates. If we know we are not disciplined enough to trade
the model, we can still trade it by giving it to a broker who is equipped to
trade the system. The broker can be given a limited power of attorney
and would be legally obligated to follow all of the system's signals. We
will use the results of the system's live performance to adjust our slippage
estimates and collect live trading performance data.

Live System Performance Data Collection

The live data collection process is the same as the historical data
collection process, except that it is based on results recorded since the
system has gone on line.
During the process of trading the system, we collect data on the
system just as we did during the historical testing period. We use these data
to develop the same characteristic benchmarks we had during the
development periods. We compare these to make sure that the live trading
period is still similar to our previous results. This process is the same as
comparing the development set to the testing set or the out-of-sample set.
If they are similar within the standard error of the samples, then the
system is still tradable. Any change in the profile of the system's
performance must be explained, even an increased percentage of winning
trades.

Live System Evaluation

Let's look at some danger signs for a system. One bad sign would be a
150 percent increase in maximum drawdown since the development
period. Another would be a 150 percent increase in the number of
maximum consecutive losing trades.
If the system performance degrades, you should stop trading the
system. If the system is performing well, you should continue collecting
the live data (using the data collection process) and analyzing the data
at regular intervals. In this way, if a problem occurs, you can stop
trading the system while minimizing losses. If the system is performing
well, your analysis will give you confidence to trade the system and
maximize your profits.
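
The two danger signs named under Live System Evaluation, a jump in maximum
drawdown and a jump in the longest run of losing trades, can be checked
mechanically. The Python sketch below is an illustration only. It assumes
that a "150 percent increase" means the live figure is at least 2.5 times
the development figure; if the intended meaning is 150 percent of the
development figure, pass ratio_limit=1.5 instead. All function names and
trade values are hypothetical.

```python
from itertools import accumulate


def max_drawdown(trade_profits):
    """Largest peak-to-valley drop in cumulative profit, as a positive number."""
    peak = worst = 0.0
    for equity in accumulate(trade_profits):
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst


def max_consecutive_losers(trade_profits):
    """Longest run of losing trades (profit below zero)."""
    longest = run = 0
    for profit in trade_profits:
        run = run + 1 if profit < 0 else 0
        longest = max(longest, run)
    return longest


def danger_signs(dev_trades, live_trades, ratio_limit=2.5):
    """Names of benchmarks whose live value is >= ratio_limit x development value."""
    checks = {
        "max_drawdown": max_drawdown,
        "max_consecutive_losers": max_consecutive_losers,
    }
    flagged = []
    for name, measure in checks.items():
        dev, live = measure(dev_trades), measure(live_trades)
        if dev > 0 and live >= ratio_limit * dev:
            flagged.append(name)
    return flagged


# Illustrative per-trade profits (not taken from the book's tables).
dev = [500, -200, 300, -100, -150, 800]
live = [-300, -250, -200, -150, -100, -50, 400]
print(danger_signs(dev, live))
```

An empty list means neither danger sign is present under the chosen ratio;
any other change in the profile still has to be explained by hand.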
Part Five
Data Preprocessing
and Postprocessing
Preprocessing and postprocessing data involve transforming data to
make relationships more obvious or to extract information from raw
data. Preprocessing refers to transforming raw data into a form that
makes it easier for a modeling method, such as a neural network or ma-
chine induction, to find hidden relationships in the data, which can be
used for forecasting. Postprocessing is the act of processing results from
a model in order to extract all of the knowledge the model has learned.
To illustrate, many neural-network-based models require postprocess-
ing to make the forecasts useful. For example, we might find that the
model is more accurate when the output of the neural network is above
a given threshold.


There are many steps in developing good preprocessing. These steps are
shown in Table 17.1.
Now that we have overviewed the steps involved in preprocessing, let
us discuss them in more detail.

Using Advanced Technologies to Develop Trading Strategies

TABLE 17.1

1. Select the modeling method you are going to use.
2. Decide on the half-life of the model you want to build. For example, you
   might want to develop a model that you must retrain every 20 days (or every
   3 or 6 months, or once a year). This issue is not as easy to resolve as
   picking the longest period, because the longer the life you want your model
   to have, the more data you need to develop it. In general, the more data you
   use, the better your raw data selection and preprocessing must be.
3. Select what your model is going to predict. This is important for any type of
   model, whether based on a neural network, machine learning, or a genetic
   algorithm, because your preprocessing should be designed to be predictive
   of your desired target.
4. Select and collect raw data that will be predictive of your desired output.
   These raw inputs can be different data series, such as the S&P500 or
   T-Bonds, or non-price-based data like the COT report or the traders' sentiment
   numbers (a report of what percentage of a given group of traders are bullish
   or bearish).
5. Select data transforms that can be tested to see how predictive they are of
   your target.
6. Using statistical analysis, evaluate which transforms work best in predicting
   your desired target. These transforms can be based on engineering methods,
   technical indicators, or standard statistical analysis.
7. When you find transforms that are predictive, you must decide how to
   sample them. Sampling is the process of selecting from a series data points
   that allow a modeling method to develop a reliable model based on the least
   possible data. This is important because using too many inputs will lower
   the reliability of the model.
8. After you have selected your inputs and sampling, you need to break your
   data into development, testing, and out-of-sample sets.
9. Develop your model and then start the cycle of eliminating inputs and
   rebuilding the model. If the model improves or performs the same after
   eliminating one of the inputs, remove that input from your preprocessing.
   Continue this process until you have developed the best model with the
   fewest inputs.

The first step in building a model is to decide what analysis method to
use. Neural networks are very powerful but will not allow you to see the
rules used. In fact, a neural network is the best modeling method if you
need to predict a continuous output. Neural networks are not easy to use
with discrete variables such as day of week. To use a discrete variable,
you need to convert it into a binary variable. Machine induction methods
like C4.5 or rough sets will show you the rules but do not work well on
continuous data. When using a machine induction method, you need to
break continuous data into bins, so that it becomes a series of discrete or
symbolic values for both the inputs and the output(s).
When developing preprocessing, I have found that it is easier to convert
continuous-data-type preprocessing (the kind you would use in a neural
network) into a discrete type that can be used in a method like C4.5 or
rough sets than to convert discrete preprocessing into preprocessing that
would work well in a neural network. This issue is important because,
often, applying different methods to the same data will produce different
models. One of my methods is to use the same data for both a neural
network and a rough set model. I have found that when they both agree, the
best trades are made.

How long you want your model to work before retraining also has a big
effect on your preprocessing. In general, the shorter the life of the model,
the easier it is to develop the preprocessing. This is because a model with
a shorter-term life span does not require development of preprocessing
to detect general market conditions such as bear and bull markets. For
example, it is easier to develop a neural network that uses 1 year of
history (to train it, test it for 6 weeks, and trade it for 6 weeks) than it is
to develop a neural network trained on 10 years of data, which can be
tested on 1.5 years and traded for 1.5 years before retraining. Believe it
or not, this extra level of complexity in preprocessing can increase the
required development time by an order of magnitude. On the other hand,
you don't want to build a model with a very short life span because, often,
this type of model will stop working unexpectedly. This happens because
it has learned short-term patterns that may exist for only several months.
Because the testing period is short for these types of models, you will
not have enough statistical data to see a failure coming, before it results
in large losses.
Another factor that affects the life span is what you are trying to predict.
In general, neural networks that predict smoother targets and technical
indicators, or are based on intermarket analysis, will have a longer life than
models based on price-only data that predict price changes directly.
The modeling tool you use is also important. In general,
machine-induction-based models, in which you can select rules based on
simplicity, will have a longer life than neural networks, even when developed on
the same data. The reason for using a neural network, however, is that
these rules might cover only 20 percent of the total cases, whereas a robust
neural network can produce a reliable forecast 60 to 80 percent of the time.

DEVELOPING TARGET OUTPUT(S) FOR
A NEURAL NETWORK

The decision on what target to use for a neural network should be based
on several factors: (1) the desired trade frequency; (2) the risk/reward
criteria; and (3) the expertise available for building market-timing neural
network models. We can predict three major classes of outputs. The
easiest to predict is forward-shifted technical indicators. Many modeling
methods perform better when predicting a smooth target. Next, we can
make a direct price forecast, i.e., the percent of change five days into
the future, or whether today's bar is a top or a bottom. These types of
indicators are harder to predict, because of the effect of noise in financial
data. The final type of target we will discuss is non-price-based
forecasts. These include targets that predict the error in one forecasting
method, switch between systems, or make consensus forecasts. These
are very powerful targets that have been used in many successful
trading applications.

Technical Indicator Prediction

There are two classes of indicators that we can predict: (1) classic
technical indicators such as Moving Average Convergence/Divergence
(MACD) or even a simple moving-average crossover, and (2) designer
indicators that have been developed for the purpose of being predictable
by a modeling method such as a neural network. We will look at three
examples of designer indicators:

1. Forward percent K.
2. Forward oscillators.
3. Forward momentum (of various types).

The forward percent K indicator is calculated as follows:

Forward K = (Highest(High[+N], N) - Close)/(Highest(High[+N], N)
            - Lowest(Low[+N], N))

where [+N] marks a price N days into the future, and N is the number of
days into the future.
When this indicator is high, it is a good time to buy. When it is low, it
is a good time to sell. We can also take a short-term moving average of
this indicator, that is, a three-period average so that we can remove some
of the noise.*
Another designer indicator is a forward oscillator. An example of this,
a forward price relative to the current moving average, is calculated as
follows:

Close[+N] - Average(Close, X)

where +N is days into the future, and X is the length of the moving
average.
If this indicator is positive, then it is bullish. If it is negative, then it is
bearish. This type of forward oscillator was discussed in Paul Refenes's
book, Neural Networks in the Capital Markets (John Wiley & Sons, Inc.,
1995).

*This indicator was presented as a target for a neural network at the
Artificial Intelligence Application on Wall Street Conference, April 1993,
by Gia Shuh Jang and Feipei Lai.
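
The forward percent K and the forward oscillator defined above can be
computed on historical data once the future bars are known. The Python
sketch below is illustrative only. It assumes that the forward window for
bar i covers the next N bars (i+1 through i+N); the text does not state the
exact alignment, so treat that as an assumption. The function names and the
sample prices are hypothetical.

```python
def forward_percent_k(high, low, close, n):
    """(Highest future high - Close) / (Highest future high - Lowest future low).

    Bars that do not have n future bars, or whose future range is zero,
    get None.
    """
    out = []
    for i in range(len(close)):
        future_high = high[i + 1:i + 1 + n]
        future_low = low[i + 1:i + 1 + n]
        if len(future_high) < n:
            out.append(None)
            continue
        hh, ll = max(future_high), min(future_low)
        out.append((hh - close[i]) / (hh - ll) if hh != ll else None)
    return out


def forward_oscillator(close, n, x):
    """Close n bars ahead minus the current x-bar simple average of closes."""
    out = []
    for i in range(len(close)):
        if i + 1 < x or i + n >= len(close):
            out.append(None)  # not enough history or not enough future bars
        else:
            average = sum(close[i - x + 1:i + 1]) / x
            out.append(close[i + n] - average)
    return out


# Illustrative prices only.
high = [11.0, 12.0, 13.0, 12.0, 14.0, 15.0]
low = [9.0, 10.0, 11.0, 10.0, 12.0, 13.0]
close = [10.0, 11.0, 12.0, 11.0, 13.0, 14.0]
print(forward_percent_k(high, low, close, n=2))
print(forward_oscillator(close, n=2, x=3))
```

Because both targets look into the future, the last N bars of any data set
have no target value and must be dropped before training.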
Another forward oscillator I have used in several applications follows
the general uptrends and downtrends of a given market but is resistant to
short periods of adverse movement. It is defined as follows:

Average((Close[+5] - Lowest(Close[+5], 5))/(Highest(Close[+5], 5)
        - Lowest(Close[+5], 5)), 5)

This output was used in a T-Bond neural network that I discussed in
Futures Magazine, May and June 1995. I showed how this target has a 0.77
correlation with a five-day percent change five days into the future over
the training and testing sets, and is much smoother than predicting raw
price change. Predicting this target with reasonable accuracy will lead to
developing a very profitable model.
Another example of a designer output uses a percent change based on
a smooth low-lag version of price. This type of output could be
developed using classic exponential moving averages (EMAs), but more often
we could develop it using a Kalman filter to smooth the price data
before taking the momentum. A Kalman filter is a special moving average
that uses feedback to make a prediction of the next values in order to
remove some of the lag.
One of the outputs of this type that I have used in many different
projects is based on Mark Jurik's adaptive moving average. When I use this
adaptive filter, I use a very short period smoothing constant (e.g., 3),
which induces about one bar of lag. I then have been able to predict this
curve successfully 3 to 5 bars into the future.
Predicting a forward Pearson's correlation between intermarkets is
also a good target for a neural network or other modeling method. We
can find a predictive correlation between the intermarket and the market
we are trading. This curve is relatively smooth and, if we know it, we can
then use current changes in an intermarket to trade with high accuracy the
market we are interested in. Considering the results using intermarket
analysis without correlation analysis, and how using standard prediction
correlation can improve results, just imagine what predictive correlation
can do. We can also predict other indicators such as volatility, which can
be used to trade options.
Now that we have discussed predicting various technical indicators,
let us examine some targets that are based on raw price. Predicting
raw-price-based outputs is relatively easy on weekly or monthly data but
much harder on daily data.

Price-Based Targets

The classic price-based target is to predict percent of change over the next
N days. When predicting this type of target, it is easier to predict a
five-day change than a one-day change, because noise has a smaller effect on
the five-day change. When using this type of target, you can predict the
raw change or just the sign of the change. If you want to predict short-term
moves or day trades, you can predict the difference between tomorrow's
open and close. This type of target is a good tool for developing day
trading systems that do not even require intraday data.
Another often used target is predicting tops and bottoms. Normally,
for these predictions, you will have one model for predicting tops and
another for predicting bottoms. You will also filter cases so that you will
develop your models based only on cases that can be possible tops or
bottoms. For example, if the price is higher than 3 days ago, today cannot
be a bottom. One way to define a top or bottom is to use a swing high or
low of a given number of days. You can either make your output a 1 at the
turning points or make it a scaled value based on how far off the turning
point the market moved during the definition period. For example, if we
identify a 5-day swing high, we can output a 0.5 if we fall less than 0.25
percent, a 0.75 if we fall more than 0.50 percent, and a 1 if we fall more than
1.0 percent. Another type of target that can be used to predict tops or
bottoms with a neural network outputs the number of days to the next top or
bottom. This type of target has been discussed by Casey Klimasauskas.

Model-Based Targets

Model-based outputs are based on the output of another system or
modeling method. There are several categories of these types of outputs, as
listed in Table 17.2.
Let us now discuss some examples of these targets, beginning with
developing targets based on system performance. One of the simplest
system-performance-based targets is either predicting a simple change
in the equity curve or predicting a moving-average crossover of equity.
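
The price-based targets described earlier (the percent change over the next
N days, the sign of that change, and tomorrow's open-to-close difference)
are simple to build from a price history. The Python sketch below is only an
illustration; the function names and the sample prices are hypothetical, and
bars whose future is not yet known are marked None.

```python
def forward_percent_change(close, n):
    """Percent change from each close to the close n bars later."""
    return [
        100.0 * (close[i + n] - close[i]) / close[i] if i + n < len(close) else None
        for i in range(len(close))
    ]


def sign_target(changes):
    """+1 for a rise, -1 for a fall, 0 for no change; None is passed through."""
    return [None if c is None else (c > 0) - (c < 0) for c in changes]


def open_to_close_target(open_, close):
    """Tomorrow's close minus tomorrow's open, aligned to today's bar."""
    return [
        close[i + 1] - open_[i + 1] if i + 1 < len(close) else None
        for i in range(len(close))
    ]


# Illustrative prices only.
open_ = [99.0, 101.0, 102.5, 100.0, 104.0]
close = [100.0, 102.0, 101.0, 105.0, 103.0]
changes = forward_percent_change(close, n=2)
print(changes)
print(sign_target(changes))
print(open_to_close_target(open_, close))
```

Predicting only the sign discards the size of the move, which is the
trade-off the text describes between the raw change and the sign of the
change.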

TABLE 17.2

1. Predicting system performance.
2. Selecting between models.
3. Developing a consensus based on many different models.
4. Predicting error correction values for a given model.
5. Predicting non-price-based indicators such as the current dominant cycle.

These types of targets can be used as filters for an existing trading
system or model.
Another application is to develop a model using a neural network or
genetic algorithm that has multiple outputs and encodes which system or
systems should be traded.
The concept of developing a consensus system is another example of
this technology. Your output would be the same as a simple model, but
your inputs would be the result of multiple models plus other data. The
goal of a consensus-type system is to take many models that are 60
percent or so accurate, and create a consensus model that is better than any
of the individual models.

The next major step is to decide what raw input data to use in the model.
This decision is based on two factors: (1) what influences the price of
the market you want to trade, and (2) data availability.
First, you need to look at the factors that affect the prices of what you
want to trade. You can start with the actual open-close, high-low of the
market you are trading. Next, you should think about any related
markets. For example, if you are developing preprocessing for T-Bonds, you
would use data for markets like Eurodollars, CRB, XAU, and so on. If we
were predicting the S&P500, we would use T-Bonds but we could also
use factors like an advance-decline line or a put/call ratio. Commitment
of traders (COT) data can also be valuable.
In general, any data series that logically has an effect on the market
you are predicting is worth collecting and analyzing for use in
developing inputs to your model.
Data availability is also an important issue. For example, many data
series have only 3 to 5 years of daily data available. Even if they have been
very predictive over this period of time, they do not give us enough data
to work with. In general, we would like a minimum of 8 to 10 years of
data in order to use a series in developing a model. Another issue
relating to data availability is that when you use fundamental data, the data
are often revised and the question becomes: How do you deal with the
revised data? For example, do you replace the original data or do you deal
with the revision in some other way?

There is an almost infinite number of different types of transforms you
can use to preprocess data. Let's discuss some of the general types of
transforms, shown in Table 17.3.
Let's now discuss each of these in detail.

TABLE 17.3 TYPES OF DATA TRANSFORMS.

1. Standard technical indicators, as well as components used to calculate these
   indicators.
2. Data normalization.
3. Percent or raw differences and log transforms.
4. Percent or raw differences and log transforms relative to moving average.
5. Multibit encoding.
6. Prefiltering raw data before further processing.
7. Trading system signals.
8. Correlation between price action and indicators or other markets.
9. Outputs of various other modeling methods.

Standard Technical Indicators

Standard technical indicators and proprietary indicators used by market
analysts are great sources for data transforms for preprocessing. The most
popular indicators to use in a neural network are MACD, stochastics, and

ADX. Usually, these three are used together because stochastics work in
trading range markets and MACD works in trending ones. ADX is used
to identify when the market is trending versus not trending.
Another use of technical indicators as data transforms is to further
transform the output of one or more indicators to extract what an
indicator is trying to reveal about the market. (Several examples of this were
given in Chapter 4.)
When using technical indicators in a neural network, I have found that
the intermediate calculations used to make the indicators are often
powerful inputs for a model. As an example, let's look at RSI, which is
calculated as follows:

RSI = 100 - (100/(1 + RS))

where RS = average of net up closes for a selected number of days/average
of net down closes for a selected number of days. We can use each of
these averages as input to a neural network.

Data Normalization

Data normalization is a very important data transform in developing
preprocessing for almost every modeling method. Let's now examine two
classic methods of normalization:

1. Normalize the data from 0 to 1 and use the formula as follows:

   X = (Value - Lowest(Value, n))/(Highest(Value, n) - Lowest(Value, n))

   If you want to scale between -1 and 1, subtract 0.5 and then
   multiply by 2.
2. Normalize relative to the mean and standard deviation. An example
   of this calculation is as follows:

   X = (Value - x̄)/σ

   where x̄ = mean of the data set, and σ = standard deviation.

Percent or Raw Differences

One of the most common transforms used in developing any predictive
model is the difference or momentum type of transform. There are
several widely used transforms of this type, as follows:

X = Value - Value[n]
X = Log(Value/Value[n])
X = (Value - Value[n])/Value

Value[n] is the value of the series n bars ago.

Percent or Raw Differences Relative to the Mean

Percent or raw differences from the mean are also popular data
transforms. There are many different variations on these transforms, and
many different types of moving averages can even be used. Some of the
basic variations for this type of transform are shown in Table 17.4. The
moving average (MA) can be of any type or length.
Another variation on this theme is the difference between price and a
block moving average. An example of this is as follows:

X = Value - MA[centered n]

where MA[centered n] is a moving average centered n days ago.
We can also have a log transform or percent transform on this theme:

X = Log(Value/MA[centered n])
X = (Value - MA[centered n])/MA[centered n]

TABLE 17.4

X = Value - MA
X = Log(Value/MA)
X = MAShort - MALong
X = Log(MAShort/MALong)
X = (Value - MA)/Value
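
The normalization and difference transforms in this section translate
directly into code. The Python sketch below is illustrative only. It uses
the population standard deviation (statistics.pstdev) for the
mean-and-deviation method, which is an assumption because the text does not
say which form of standard deviation is intended. The function names and
sample values are hypothetical.

```python
import math
from statistics import mean, pstdev


def minmax(values, n):
    """Scale the latest value into the range 0 to 1 using the last n values."""
    window = values[-n:]
    lowest, highest = min(window), max(window)
    if highest == lowest:
        return 0.5  # flat window: no range to scale against
    return (values[-1] - lowest) / (highest - lowest)


def to_signed(x):
    """Rescale a 0-to-1 value to -1-to-1: subtract 0.5, then multiply by 2."""
    return (x - 0.5) * 2


def zscore(values):
    """Latest value relative to the mean and standard deviation of the set."""
    deviation = pstdev(values)
    return (values[-1] - mean(values)) / deviation if deviation else 0.0


def differences(values, n):
    """Raw, log, and percent differences against the value n bars ago."""
    now, then = values[-1], values[-1 - n]
    return {
        "raw": now - then,
        "log": math.log(now / then),
        "percent": (now - then) / now,  # divided by the current value, as in the text
    }


# Illustrative series only.
series = [10.0, 12.0, 11.0, 14.0, 13.0]
print(minmax(series, 5), to_signed(minmax(series, 5)))
print(zscore(series))
print(differences(series, 2))
```

The log difference requires strictly positive values, so it suits prices but
not series that can cross zero, such as oscillators.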
Multibit Encoding

The next type of data transform we will discuss is multibit encoding, a
type of encoding that is valuable in many different types of transforms.
One good use of it is for encoding day of week or month of year. When
developing a model, you should not code days of the week by using a
single number. For instance, you should not use 2 for Tuesday because a 3
for Wednesday might be considered (incorrectly) a higher value for
output. The effects of the day of week coding are not based on the actual
day's values. Instead, these values need to be encoded into discrete
values, as shown here:

0 1 0 0 0

This encoding would be for a Tuesday.
Another type of encoding uses a thermometer-type scale. Here is how
we would encode ADX into a thermometer-type encoding:

>10  >20  >30  >40
 1    1    1    0

This encoding would represent an ADX value between 30 and 40.
This type of encoding works well when the critical levels of raw input
are known. This encoding also makes a good output transform because if
the encoding output from a model is not correct, we know that the
forecast may not be reliable. For example, if bit one was a 0, we could not be
sure that ADX is really over 30 (it does not follow our encoding). The
reliability of a forecast can often be judged by designing a multiple bit
output, for example, two outputs, one of which is the opposite of the
other. We would take the predictions only when they agree. This method
is frequently used in neural network applications.

Prefiltering Raw Data before Further Processing

One of my best methods, when I am predicting short-term time frames all
the way from daily to intraday data, is to first process the data using a
low-lag filter, before applying other data transforms. Low-lag filters are
a special type of adaptive moving average. For example, a Kalman filter
is a moving average with a predictive component added to remove most
of the lag.
In my research in this area, I use a moving average called the Jurik
AMA, developed by Jurik Research. I apply it to the raw data with a very
fast smoothing constant of three. This removes a lot of the noise and
induces only about one bar of lag. After applying the Jurik AMA, I then
transform the data normally. When developing a target for these models,
I use the smooth data.

Trading System Signals

Another data transform you can use is trading signals from simple
trading systems. For example, you can use the signals generated by some of
the systems you have seen in this book as inputs to a neural network. To
illustrate, if your system is long, output a 1; if short, output a -1. You
could also output a 2 for the initial long signal, a -2 for the initial short
signal, and just a 1 or -1 for staying long or short. If you don't use the
signal, the components that generate the signals are often very good
transforms for your models.

Correlation Analysis

Intermarket relationships are a very powerful source of preprocessing
for your models. As we have discussed earlier in this book, intermarket
relationships do not always work, but by using Pearson's correlation we
can judge how strong the relationship currently is and how much weight
to put on it. One of the classic ways to do this is to take the correlation
between another transform and your target shifted back in time, and use
that correlation as an input to your neural network.

Outputs of Various Other Modeling Methods

Another powerful method is to use the output produced by various other
modeling methods. For example, you can use the dominant cycle,
prediction, or phase produced from MEM, and then apply data transforms to
those data. One valuable transform is to take a simple rate of change
in the dominant cycle. This transform is valuable because, without it,
when the dominant cycle is getting longer, you will be too early in
predicting turning points, and when it is getting shorter, you will be too late.
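
The two multibit encodings described under Multibit Encoding, one bit per
day of the week and the thermometer scale for ADX, can be sketched in a few
lines of Python. This is an illustration only; the function names, the
five-day trading week, and the default thermometer levels of 10, 20, 30, and
40 (taken from the ADX example) are assumptions made for the sketch.

```python
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri"]


def encode_day(day):
    """One bit per trading day; only the bit for the given day is set."""
    return [1 if name == day else 0 for name in DAYS]


def thermometer(value, levels=(10, 20, 30, 40)):
    """One bit per level; a bit is 1 when the value is above that level."""
    return [1 if value > level else 0 for level in levels]


def is_valid_thermometer(bits):
    """A well-formed thermometer code is a run of 1s followed by a run of 0s."""
    return all(a >= b for a, b in zip(bits, bits[1:]))


print(encode_day("Tue"))
print(thermometer(35))
print(is_valid_thermometer([0, 1, 1, 0]))
```

The validity check reflects the point made in the text: when a model's
output bits do not follow the encoding, for example a 0 in the first bit
with later bits set, the forecast should not be trusted.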


Having discussed many different data transforms, how do we decide
which ones to use? The general rule is: Select transforms performed on
data that are predictive of your desired output. The transforms you select
can be used either as inputs into your model or to split the data set into
multiple models. For example, we would split the data records into two
sets, one for when the market is trending and the other for nontrending
conditions. Another alternative is to split the data into a possible top set
and possible bottom set by splitting it based on whether the market is
FIGURE 17.1 A scatter chart of CR6 Y˜WJS gold-an example of a
lower or higher than N days ago.
linear relationship.
Subdividing the data sets based on variables that are very predictive
is a powerful tool in developing reliable high-performance models. Let™s
now see how to judge the value of a given input set in predicting our de-
sired output. Kc

Scatter Charts
One of the simplest methods for evaluating a data transform is to use scat-
ter charts, plotting the output on the Y axis and the data transform on
the X axis. When analyzing these charts, we look for either linear shapes
or nonlinear patterns. If we have a cloud around the axis, there is no
relationship between the transform and the desired output. Let's look at an
example of a linear relationship between an input variable and an output
variable. Figure 17.1 shows the CRB index on the X axis and gold fu-
tures on the Y axis. In general, these have a linear relationship when gold
is below $600.00 an ounce.
Figure 17.2 shows the nonlinear relationship between T-Bond prices
and gold. In general, T-Bonds and gold futures are negatively correlated
until the T-Bond price rises above 90. When the price of T-Bonds rises
above 90, the relationship between T-Bonds and gold becomes positively
correlated until the T-Bond price rises to 110. At 110, the correlation
once again becomes negative.

FIGURE 17.2 A scatter chart of T-Bond versus gold - an example of a
nonlinear relationship.


Figure 17.3 shows a typical pattern when there is no relationship
between two variables. In this example, we are using the one-day change 10
days ago on the X axis, and the current one-day change on the Y axis for
gold futures. This chart shows that the one-day change in gold 10 days
ago is not predictive of the price change tomorrow.

When analyzing these charts, you can use the current value of your
transforms or past values. Many times, a variable's relationship with a
multiperiod lag is not as clear as when using the current value.

Scatter charts are a great tool, but simple numerical methods for
evaluating how predictive a transform will be are also available. Let's now
discuss some of these methods.

The first method is to take a simple Pearson correlation between the
data transforms and the desired output. This method will detect only
linear correlations, and, often, even correlations that are known to exist
and are used to build trading systems will be missed during simple
correlation analysis. However, if you have a strong correlation, you should
use that input in your model.

Both scatter charts and correlation analysis can be used to help you
select inputs, but, besides these methods, you can use a modeling method
such as C4.5 or rough sets to select input transforms. Using one of the
machine induction methods, you would use all of your transforms to build a
model. You would then select the inputs that, according to these methods,
made the biggest contribution to predicting your output.

When developing preprocessing, you do not merely want to use all of the
past values of each transform in your models because using all of the data
will result in a less robust model. On the other hand, because the model
method often needs to "see" patterns in the transform to use them
effectively, you cannot use only the current value. You can solve this
problem by sampling the data. When I sample data, I normally use the
following scheme: 0,1,2,3,5,7,10,15,20,30,50. This approach was discussed
by Mark Jurik (Futures Magazine, October 1992). This sequence allows
detection of short-, medium-, and long-term patterns without sampling every
bar.
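
The sampling scheme and the Pearson check can be combined in a few lines.
The following Python sketch is only an illustration: the function names
(sample_lags, pearson, lag_correlations) are hypothetical and do not come
from the text, and it assumes the transform and the target are
equal-length lists aligned bar by bar. It samples a transform at the lags
listed above and reports the linear correlation of each lagged value with
the target.

```python
import math

# Lag offsets from the text: 0 is the current bar, 50 is fifty bars back.
LAGS = [0, 1, 2, 3, 5, 7, 10, 15, 20, 30, 50]


def sample_lags(series, index, lags=LAGS):
    """Return the values of `series` at index, index-1, index-2, ... per `lags`."""
    if index < max(lags):
        raise ValueError("not enough history for the longest lag")
    return [series[index - lag] for lag in lags]


def pearson(xs, ys):
    """Plain Pearson correlation; it measures linear association only."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no linear information
    return cov / math.sqrt(var_x * var_y)


def lag_correlations(transform, target, lags=LAGS):
    """Correlate target[t] with transform[t - lag] for each sampled lag."""
    if len(transform) != len(target):
        raise ValueError("transform and target must be the same length")
    start = max(lags)  # first bar with enough history for every lag
    ys = [target[t] for t in range(start, len(target))]
    result = {}
    for lag in lags:
        xs = [transform[t - lag] for t in range(start, len(transform))]
        result[lag] = pearson(xs, ys)
    return result
```

A lag whose correlation is near zero is not necessarily useless, because,
as noted above, a Pearson correlation detects only linear relationships.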


After you have selected transforms and sampling that are predictive of
your desired output, you must develop your data sets. You can develop
either one development/testing/out-of-sample unit based on the complete
data set, or multiple sets, in which each set contains cases that meet given
criteria. For example, suppose the market is trending or the last Fed move
was a tightening. Subdividing the data sets can often produce good results
if the criteria used for subdividing are truly predictive. When developing
your data sets, your development set should contain at least 5 times the de-
sired life of your model. If you have 5 years of data in your development
set, you can expect the model to predict for about a year (at most) after the
end of the development set. This means you can test for 6 months and trade
for 6 months. If you want to develop models using 10 or more years of
data, subdividing the data based on very predictive relationships can help.

FIGURE 17.3 An example of a classic scatter that shows no
relationship between the variables.

When developing your models, you should have at least 30 cases per
input. The higher this ratio, the more robust your model will be. The goal
in developing your data sets is to have as few inputs as possible while
producing the best possible results.
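
These two rules of thumb are easy to turn into quick checks. The sketch
below is a minimal illustration with hypothetical helper names; it simply
restates the 5-to-1 ratio of development data to model life and the
minimum of 30 cases per input as arithmetic.

```python
def max_model_life_years(development_years):
    """Development data should span at least 5 times the model's desired life."""
    return development_years / 5.0


def max_inputs(num_cases, min_cases_per_input=30):
    """Largest number of inputs that still leaves 30 or more cases per input."""
    return num_cases // min_cases_per_input
```

For instance, 5 years of development data suggests a model life of about
1 year, and a data set of 1,250 cases would support at most 41 inputs.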

An iterative process of developing a model uses a neural network or
other methods. This process deletes one variable at a time and then
retrains the model. You should try to reduce the number of connections if
you are training a neural network. If the performance of the model
improves or stays the same, you can remove that variable or hidden node.
This process is repeated until the best model is found.

Often overlooked is the process of examining when the model is correct
and when it is wrong. If you can find a pattern in these cases, it may
reveal a need for a new input. Adding such inputs will often greatly
improve your results.

Postprocessing uses the output from one or more models and applies data
transforms to extract knowledge contained in the model(s). Let's now
discuss some of the most common methods. Postprocessing is often used for
neural-network-based models. One example of this type of postprocessing
uses the signals from the neural network only when they are above or
below a given threshold. For example, if our neural network predicted the
5-day percent change, we would go long only if the network outputted
a value greater than 1 percent and would go short only if it outputted a
value below -1 percent. This is because neural network outputs that are
too close to zero are within the error of the model and are not reliable.

Another form of postprocessing for a neural network model takes an
average of the output of several models trained on the same data. This
method will often produce better results than any of the networks alone.

Postprocessing can also be used to help you decide when a neural
network needs retraining. You must run a correlation between the output of
the neural network and the actual target. If this correlation drops below
a given threshold based on the standard deviation of past correlations,
you can turn the network off.

Preprocessing and postprocessing are the most important steps in
developing models using advanced technologies. We will use what we
learned here to build a neural network for the S&P500 in Chapter 18.

Developing a Neural Network Based on Standard Rule-Based Systems

One of the most powerful but easiest approaches for building a
neural-network-based model begins with an existing tradable rule-based
system. In this approach, we break a rule-based model into its components,
which can be used in developing a neural network. This process allows us to
develop a tradable neural network model because our goal is to improve the
existing system, not to develop a new one from scratch.

Let's investigate the process of developing a neural network based on an
existing trading system. The steps involved are overviewed in Table 18.1.
Table 18.2 lists some of the applications of neural networks that are
developed using existing trading systems. As you can see, the range of
applications is broad.


TABLE 18.1

1. Develop a good rule-based system first, using as few parameters as
   possible. Ideally, use fewer than four parameters in the system. Keep
   the system simple, without any fancy filters or exit rules. It will be
   the neural network's job to use the extra information you are supplying
   to improve your results. These extra inputs can include any information
   that you would have used to filter the original system.
2. Analyze the results of your rule-based system. Examine where your
   entries and exits occur, and take into account the premise of the
   system. Analyze the trade-by-trade results to try to discover how a
   discretionary trader might have used the system as an indicator and
   outperformed the system.
3. Use your analysis to develop your target output.
4. After selecting your output, develop your inputs based on the original
   indicators used in your rule-based system, plus any filters you would
   have used. Add inputs based on how a human expert trader would have
   used this system as part of his or her discretionary trading.
5. Develop your data sets, using the first 80 percent to train your model.
   The remaining 20 percent will be used for the testing set and the
   out-of-sample set. Normally, 15 percent of the data is used for the
   testing set and the remaining 5 percent is used for the out-of-sample
   set. These numbers are not set in stone; they are only guides.
6. Train your model, then test it on the testing set. Repeat this process
   three to five times, using different initial weights. Neural networks
   that perform well on both the training set and the testing set, and
   perform similarly across multiple trainings, are more likely to
   continue their performance into the future.
7. After you have developed a good neural network that performs well and
   is stable, analyze it to see whether it can be improved. One method I
   use is to analyze the training period to see whether there are any
   periods in which it has performed badly for an extended time. Next,
   compare these results to the original system. If both the network and
   the system performed badly during the same period, add new indicators
   that can filter out the times when the premise the system is based on
   performs badly. If the network performs badly while the original system
   performs well, change the transforms used on the original data.
8. When you have developed an acceptable model, start a process of
   eliminating inputs and retraining and testing the model to produce the
   best possible model with the smallest number of inputs and hidden
   nodes.
9. After you have finished developing your neural network, analyze the
   model so that you can develop the best possible trading strategy based
   on this model. Often, patterns in the output of neural networks can
   tell you when a given forecast may be right or wrong. One of the
   simplest and most common relationships is that the accuracy of a
   forecast is often higher when the absolute value of the output of the
   neural network is above a given level.

TABLE 18.2

1. Breakout-type systems.
2. Moving-average crossover systems.
3. Oscillator-type countertrend systems.
4. Intermarket divergence-type systems.

We can use a neural network to supercharge systems based on the
concepts in Table 18.2, as well as many other types of systems. Let's first
overview how we can develop neural networks based on the applications
in Table 18.2. Later in the chapter, we will explore a real example using
an intermarket divergence system for the S&P500.

Neural Networks and Breakout Systems

Neural networks can be used to improve a classic channel breakout
system or the adaptive channel breakout systems shown in Chapter 7.
Because a standard channel breakout system does not trade very often, we
must increase the number of trades if we are going to use neural networks
to improve our model. We will accomplish this increase by exiting our
trades when a target profit is reached, or exiting after a fixed number of
days if the target is not reached. We tested this basic concept using
adaptive channel breakout on the Swiss Franc over the period from 1/1/80 to
10/1/96. The target profit was $1,000.00, and we had a 10-day holding
period. How this model performed relative to the original adaptive
channel breakout (with $50.00 allowed for slippage and commissions) is
shown in Table 18.3.

The net profit dropped 39 percent, and the number of trades increased
by almost 5 to 1. The winning percentage went up to 69 percent, and
drawdown was only slightly higher at -$14,950.00. The problem with this
system is the high drawdown, which resulted from several large losing
trades. If we can develop a neural network that will at least filter out
these large losing trades and will possibly select the most profitable
breakouts, we can have a great short- to medium-term currency trading
system with a high winning percentage and low drawdown.
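
The target-or-time exit described above can be sketched as follows. This
is an illustrative Python function, not the code used for the test
results: the function name, the use of closing prices to check the
target, and the point_value parameter (dollars per one full point of
price) are all assumptions made for the example, and slippage and
commissions are ignored.

```python
def target_or_time_exit(entry_price, closes, direction,
                        target_profit=1000.0, max_hold=10, point_value=1.0):
    """
    Exit when open profit reaches `target_profit`, else after `max_hold` bars.

    closes      -- closing prices of the bars after entry (closes[0] is bar 1)
    direction   -- +1 for a long trade, -1 for a short trade
    point_value -- dollars per one point of price movement (assumed input)

    Returns (bars_held, profit_in_dollars, reason).
    """
    if direction not in (1, -1):
        raise ValueError("direction must be +1 (long) or -1 (short)")
    if not closes:
        raise ValueError("need at least one bar after entry")

    # If the data ends early, the trade is closed on the last available bar.
    bars = min(max_hold, len(closes))
    profit = 0.0
    for day in range(bars):
        profit = direction * (closes[day] - entry_price) * point_value
        if profit >= target_profit:
            return day + 1, profit, "target"
    return bars, profit, "time"
```

With the settings from the text, target_profit would be 1000.0 and
max_hold would be 10; each breakout entry then produces one short trade,
which is what raises the trade count.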

TABLE 18.3

                        Original Channel Breakout   Target + Hold
Net profit              $129,750.00
Trades                  107
Win%                    44                          69
Largest losing trade
Drawdown                -$12,412.50                 -$14,950.00

We can develop a neural network for this application. Our target is 1
when the market is $1,000.00 higher within the 10 days following a
breakout, or else it is 0. Our analysis should include only cases that are
breakouts.

One of the most important issues in developing this type of model is
the problem of having only about 500 cases for training a network. We can
solve this problem by normalizing our inputs so that we can use data from
three major currencies - the D-Mark, the Yen, and the Swiss Franc - in
developing our model. This would give us about 1,500 cases: 1,250 for
training, 200 for testing, and 50 for out-of-sample data. The first step in
developing inputs for such a neural network would be to use standard
indicators, such as ADX, as part of the preprocessing. In this type of
application, we would use both the raw ADX value and simple differences
of ADX. Next, we should use many of the outputs from the maximum
entropy method (MEM) - for example, the dominant cycle, phase, and
MEM predictions. For the dominant cycle, we should use both the value
and the simple differences over the past several bars. We should also
preprocess the price series data by using standardized price differences -
for example, log(close/close_n) × 100. These types of inputs, when sampled,
will allow the network to learn patterns that produce good channel
breakouts. For example, when an upside breakout is preceded by the market's
rallying for 3 or more days in a row, the breakout will normally fail. These
are the types of patterns that the network can learn and implement to
improve the performance of a channel breakout system.

This method offers promise; in fact, one of my clients claims to have
produced $25,000.00 a year trading a 1 lot of the Swiss Franc using this
type of method.

Neural Network Moving-Average Crossover Systems

One of the most often used trading systems is a simple moving-average
crossover. Neural networks can be used to supercharge this classic system
by predicting a moving-average crossover a few days into the future. For
example, suppose we want to predict the difference between a 20-day and
a 40-day moving average, 2 days into the future. This approach works well
in many different markets and will even work on individual stocks. It is
a good application because it is easy to predict a moving-average
crossover, even with only a basic understanding of neural networks and
the markets.

The key is that the base moving-average pair must be profitable and
reliable before you try to predict it. Predicting a crossover enhances the
original system but will not make a bad set of parameters a good one. On
average, a well-designed neural network can improve the net profit and
drawdown of a moving-average crossover system by 30 percent to as much
as 300 percent. Even though this method works well, it is not always
possible to find a reliable moving-average crossover system. In cases where
we cannot find a robust moving-average crossover combination, we can
use other predictive indicators.

Using Neural Networks to Enhance Oscillator-Type Systems

Oscillator-type indicators, such as stochastics and RSI, are often used to
develop trading systems that try to enter near tops or bottoms. They are
usually based on divergence between the price and the oscillator. We
learned in Chapter 4 that, in order to trade bounded oscillators like
stochastics and RSI effectively, their period must be set to 50 percent of
the current dominant cycle. This type of system works well in a trading
range market, but it works badly in a trending one. We can use this
relationship to help a neural network improve standard oscillator systems.
In developing this network, we would first include the current and sampled
versions of the oscillator, as well as its rate of change. Next, we would
include normalized price differences. Because these systems work best when
the market is not trending, we should include ADX and simple differences of
ADX, to allow the network to know when the oscillator system is likely
to work well. Cycle information - the current dominant cycle, the rate
of change of the dominant cycle, and the phase angle calculated using
MEM - could also help a neural network. We should develop two separate
networks, one for tops and another for bottoms. Our target for this
neural network could be tops and bottoms identified using divergence
based on a human expert trader. This type of application may not always
give enough cases to train a neural network on daily bars, so we want to
either (1) standardize the data so we can train it using data from
multiple markets, or (2) use it on intraday applications where we will have
enough cases.

Using Intermarket Divergence to Develop a Neural Network

Intermarket divergence systems are countertrend trading systems that try
to predict turning points based on the divergence between two
fundamentally linked markets. In general, these systems are early, are on
time, or totally miss a major move. They usually produce a high percentage
of winning trades but will sometimes produce large losing trades. These
losing trades occur because either the markets become so correlated that
divergences are not generated, or they decouple for long periods of time
and the market being traded becomes controlled by other technical or
fundamental forces. We can combine the concept of intermarket
divergence with neural networks to develop powerful predictive
neural-network-based models. One target I often predict, using this
application, is an N-day percentage change smoothed with a Y-day moving
average. Most often, I use a 5-day percentage change with a 3-day moving
average. Another target I use is a forward stochastic, as discussed in
Chapter 17. After choosing our target, we can add other information - for
example, correlation and predictive correlation - to help the model.
Finally, we should add technical indicators such as ADX and stochastics.
They allow the neural network to trade the market technically when it has
detected, based on both correlation and predictive correlation, that
intermarket analysis is currently not working.

During my research in developing intermarket divergence-based neural
networks, I have found that they will trade much more often than the
original intermarket-based system, sometimes yielding three to five times
as many trades. This is very desirable because filtering an intermarket
divergence system often leads to too few trades to make the system fit the
needs of most shorter-term traders. Besides increasing the number of
trades, you can often increase profits and reduce drawdown. Let's now
look at an example of developing an intermarket divergence neural
network for the S&P500, using T-Bonds as the intermarket. In this example,
we will discuss the theory behind developing the preprocessing for this
type of application. Let's now develop our S&P500 neural network.

Developing an S&P500 Intermarket Neural Network

We begin our development of our neural network by revisiting trading the
S&P500 based on intermarket divergence, using T-Bonds as our
intermarket. Our development set used in optimization is the period from
4/21/82 to 12/31/94. Our combined testing and out-of-sample set uses
data for the period from 1/1/95 to 8/30/96. On the basis of our analysis
using intermarket divergence, we found that the following rules produced
good results:

1. If S&P500 < Average (S&P500, 12) and T-Bonds > Average (T-Bonds, 26),
   then buy at open.
2. If S&P500 > Average (S&P500, 12) and T-Bonds < Average (T-Bonds, 26),
   then sell at open.

During the development period (with $50.00 deducted for slippage and
commissions), these basic rules produced the results shown in Table 18.4.

TABLE 18.4 DEVELOPMENT SET RESULTS FOR SIMPLE INTERMARKET DIVERGENCE
OF THE S&P500 AND T-BONDS FOR TRADING THE S&P500.

Net profit              $321,275.00
Trades                  118
Average trade           $2,722.67
Largest losing trade    -$17,800
Drawdown                -$26,125.00
Profit factor           3.92
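
As a rough illustration, the two divergence rules above can be written in
Python as shown below. The helper names are hypothetical, and the sketch
assumes that Average(X, N) means a simple moving average of the last N
closes including the current bar; a signal on one bar would be acted on
at the next bar's open.

```python
def sma(values, length, index):
    """Simple average of the `length` values ending at `index` (inclusive)."""
    if index + 1 < length:
        return None  # not enough history yet
    window = values[index + 1 - length: index + 1]
    return sum(window) / length


def divergence_signals(sp_close, bond_close, sp_len=12, bond_len=26):
    """
    Return one value per bar: +1 (buy at next open), -1 (sell at next open),
    or 0 (no new signal), following the two intermarket divergence rules.
    """
    if len(sp_close) != len(bond_close):
        raise ValueError("both series must cover the same bars")
    signals = []
    for i in range(len(sp_close)):
        sp_avg = sma(sp_close, sp_len, i)
        bond_avg = sma(bond_close, bond_len, i)
        if sp_avg is None or bond_avg is None:
            signals.append(0)
        elif sp_close[i] < sp_avg and bond_close[i] > bond_avg:
            signals.append(1)   # stocks weak while bonds are strong
        elif sp_close[i] > sp_avg and bond_close[i] < bond_avg:
            signals.append(-1)  # stocks strong while bonds are weak
        else:
            signals.append(0)
    return signals
```

Because the rules reverse the position only on an opposite divergence, a
trade stays open through every bar that returns 0.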

We then tested this method on our combined testing and out-of-sample
set (again allowing for $50.00 slippage and commissions) and produced
the results shown in Table 18.5.

[TABLE 18.5 Combined testing and out-of-sample set results for simple
intermarket divergence trading the S&P500: net profit, average trade,
largest losing trade, drawdown, and profit factor.]

This system produced great results during the development period, but
only $12,200.00 during the combined testing period. One of the main
reasons for the substandard performance was a -$20,750.00 losing trade
that occurred during the period from 1/23/96 to 6/7/96. There was also a
-$6,100.00 losing trade during the period from 7/15/96 to 7/29/96. The
first trade was a short signal given during this year's correction in the
T-Bond market. During this period, T-Bonds and the S&P500 decoupled.
The second trade was a buy signal during a period when bonds rallied as
a safe haven to the correction going on in the stock market. In general,
intermarket systems for the S&P500 underperform the market during bull
markets and outperform it during bear markets. This system is based
on a sound concept and, over a long period of time, should continue to be
profitable and will profit greatly when the bear market finally occurs.

Intermarket Divergence - A Closer Look

In general, T-Bonds are predictive of the S&P500. Our goal in using a
neural network is to extract from the data more information than can be
extracted using the simple system. In order to accomplish this, we need
to take a closer look at intermarket divergence. In Table 18.6, we show
some of the strengths and weaknesses of the intermarket divergence
approach.

TABLE 18.6

Strengths of Intermarket Divergence
  The percentage of winning trades is high (often 60% to 80%).
  Good profits per year are realized.
  The basic premise is sound, if correct intermarkets are used.
  This approach can pick tops and bottoms.

Weaknesses of an Intermarket Divergence System
  Occasionally, large losing trades occur.
  Sometimes, drawdowns will make the system untradable, even though they
  are still only 10%-15% of net profit.
  A long period of flat equity occurs when intermarket relationships
  decouple.
  Many systems have long average holding periods.

We can help to correct some of these weak points by using correlation
and predictive correlation analysis, as discussed in Chapter 8.
Intermarket correlation analysis can be used to filter out the occasional
large losing trades and cut the drawdown of these systems. The problem is
that using these filters significantly cuts the number of trades and
increases the holding period, but the resulting systems are more tradable.
Intermarket divergence is a powerful concept that can produce very
profitable trading systems, but the simple rules we are using cannot
capture many of the inefficiencies that can be exploited when using these
intermarket relationships. To understand these systems better, we need to
study how they work and where they enter and exit the market.

Anatomy of an Intermarket Divergence System

Intermarket divergence systems are countertrend trading systems that try
to predict turning points. In general, these systems are either early or on
time. They produce a high percentage of winning trades but sometimes,
when correlated intermarkets do not diverge, they will miss a major
market move. For example, a period in which the S&P500 and T-Bonds move
together with a very high correlation may fail to generate a divergence
signal for the S&P500 in the direction of bonds. Without that divergence,
we would be trapped in the wrong direction and the result would be a

very large losing trade. Another problem is that sometimes markets will
decouple and a standard intermarket divergence-based system will lose
money unless a correlation filter is applied.
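
One simple form such a filter could take is a rolling correlation gate.
The sketch below is an assumption-laden illustration rather than the
filter from Chapter 8: the window length, the threshold, the use of
closing prices, and the function names are all hypothetical. It passes a
divergence signal through only when the recent correlation between the
two markets is at or above a chosen level.

```python
import math


def rolling_correlation(xs, ys, window, index):
    """Pearson correlation of the last `window` values ending at `index`."""
    if index + 1 < window:
        return None  # not enough history yet
    a = xs[index + 1 - window: index + 1]
    b = ys[index + 1 - window: index + 1]
    mean_a = sum(a) / window
    mean_b = sum(b) / window
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return None  # a flat window has no defined correlation
    return cov / math.sqrt(var_a * var_b)


def filter_signal(signal, correlation, min_correlation=0.2):
    """Keep the signal only when the two markets look sufficiently coupled."""
    if correlation is None or correlation < min_correlation:
        return 0  # markets appear decoupled, so stand aside
    return signal
```

The threshold of 0.2 is only a placeholder; in practice it would have to
be chosen on the development set like any other parameter.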
Let's now look at some of the trades from our S&P500, T-Bond
intermarket divergence system.
Figure 18.1 shows the system for the period from mid-December 1993
to mid-September 1994. The short signals that we entered on 2/23/94 pro-
duced a large correction, which at one time accounted for 35 points of
open position profit. The problem was that the S&P500 and T-Bonds
failed to have a bullish divergence in order to exit the trade. This caused
the system to give back most of the profit in late August of 1994. Luck-
ily, the market corrected again just before Thanksgiving, and this time it
generated a bullish divergence that allowed us to exit this trade with a
$13,750.00 profit on November 25, 1994. Even though this trade was a big
winner, the inefficiency of our simple intermarket divergence rules was

FIGURE 18.1 Trading signals from a simple intermarket divergence
system, mid-December 1993 to mid-September 1994.

Figure 18.2 shows the period from December 1995 to August 1996. During
this period, several trading signals were generated from our divergence
model. For example, a buy signal in late January 1996 produced a
$1,750.00 winning trade. Unfortunately, when T-Bonds collapsed in
February 1996, our system went short and remained short until June 7, 1996.
This intermarket divergence trade lost over $20,000.00. It was caught on
the wrong side of an explosive move in the S&P500 - almost 60 points in
about 6 weeks. A human expert trader would have gone long, based on
momentum, well before the move ended. For example, one logical place
to go long would have been at 642.00. This was one tick above the
previous record high of $641.95, set on January 4, 1996. This would have
limited the loss to only $8,150.00, and the reversal signal would have
produced profit of over $15,000.00 if we had exited our trade based on the
next short signal from our system. This change in the trading sequence
would have produced about a $7,000.00 profit instead of a loss of over
$20,000.00. This type of expert interpretation, combining intermarket
analysis with classic technical analysis, can be a powerful concept that is
not easy to express using simple rules, but it can greatly improve a
standard intermarket divergence system.

FIGURE 18.2 Trading signals from a simple intermarket divergence
system, December 1995 to August 1996.

Another method that can be used to improve an intermarket-based
system is correlation analysis, which was discussed in Chapter 8.
Correlation analysis has many uses. For example, it can be used to turn the
system on and off, or to detect when a market will have a sustainable
trend. The problem is to integrate all of these concepts into one system
without curve fitting. A neural network can be used to integrate multiple
trading methods into one model. In addition, a neural network can perform
both pattern matching and modeling in the same model and can make use of
all of the information that we can extract from intermarket analysis. Our
goal now is to develop an S&P500 model based on the relationship
between the S&P500 and T-Bonds, as well as technical analysis to forecast
a price-based target.

In Chapter 8, we discussed using correlation analysis to improve the
performance of trading systems based on intermarket analysis. We used
two very powerful methods: (1) looking at the level of the actual
correlation, and (2) the concept of predictive correlation. The level of
the actual correlation can give insight about how a given intermarket-based
system will work currently and about whether the market being traded is
about to have a major trend. For example, when fundamentally linked
markets like the S&P500 and T-Bonds currently have a strong correlation
that is well above historical averages, the dependent market (in this case,
the S&P500) will often trend. The sign of the correlation can tell us a lot
about the fundamental forces in place. If we look at the relationship be-

