
elements and connections.

FIGURE 11.2 A simple three-layer neural network.

why each neural network product has its own proprietary version of a
backpropagation-like algorithm.
When you develop a solution to a problem using neural networks, you
must preprocess your data before showing it to the neural network.
Preprocessing is a method of applying to the data transforms that make
the relationships more obvious to the neural network. An example of this
process would be using the difference between historical prices and the
moving average over the past 30 days. The goal is to allow the neural
network to easily see the relationships that a human expert would see
when solving the problem.
We will be discussing preprocessing and how to use neural networks as
part of market timing systems in the next few chapters.
Let's now discuss how you can start using neural networks successfully.
The first topic is the place for neural networks in developing market
timing solutions. The second is the methodology required to use neural
networks successfully in these applications.

Neural networks are not magic. They should be viewed as a tool for
developing a new class of powerful leading indicators that can integrate
many different forms of analysis. Neural networks work best when used
as part of a larger solution.
A neural network can be used to predict an indicator, such as a percent
change, N bars into the future; for example, the percent change of the
S&P500 5 weeks into the future. The values predicted by the neural

C4.5

In the late 1940s, Claude Shannon developed a concept called
“information theory,” which allows us to measure the information content
of data by determining the amount of confusion or “entropy” in the data.
Information theory has allowed us to develop a class of
learning-by-example algorithms that produce decision trees, which
minimize entropy. One of these is C4.5. C4.5 and its predecessor, ID3,
were developed by J. Ross Quinlan. Using a decision tree, they both
classify objects based on a list of attributes. Decision trees can be
expressed in the form of rules. Figure 11.3 shows an example of a
decision tree.

FIGURE 11.3 A simple decision tree.
Statistically Based Market Prediction 155
An Overview of Advanced Technologies
154
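To make the entropy idea concrete, here is a minimal Python sketch of Shannon entropy and of the entropy reduction obtained by splitting on one attribute. The record layout, attribute names, and toy data are assumptions made for illustration, and plain information gain is shown; C4.5 itself uses a normalized variant of this measure.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy reduction obtained by splitting `rows` on `attribute`."""
    before = entropy([r[target] for r in rows])
    after = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        after += len(subset) / len(rows) * entropy(subset)
    return before - after


# Hypothetical toy records (not taken from the text).
rows = [
    {"trend": "up", "volume": "high", "outcome": "buy"},
    {"trend": "up", "volume": "low", "outcome": "buy"},
    {"trend": "down", "volume": "high", "outcome": "sell"},
    {"trend": "down", "volume": "low", "outcome": "sell"},
]

# "trend" separates the outcomes perfectly; "volume" tells us nothing.
print(information_gain(rows, "trend", "outcome"))
print(information_gain(rows, "volume", "outcome"))
```

An attribute with a larger gain is a better candidate for a split near the top of the tree.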



Let's take a closer look at the binary version of C4.5. It creates a
two-way branch at every split in the tree. Attributes are selected for
splitting based on the information content of each attribute in terms of
classifying the outcome groups. The attributes containing the most
information are at the top of the tree. Information content decreases as
we move toward the bottom level of the tree, through the “leaves.”
For discrete attributes, the values are split between branches so as to
maximize information content. Numerical data are broken into bins or
ranges. These ranges are developed based on a numeric threshold derived
to maximize the information content of the attribute. The output classes
or objects must be represented by discrete variables. This requires
numerical output classes to be manually split into ranges based on
domain expertise.
Both C4.5 and ID3 handle noise by performing significance testing at
each node. The attributes must both reduce entropy and pass a
significance test in order to split a branch. C4.5 and ID3 use the
Chi-square test for significance. Several parameters can be set to help
C4.5 develop rule sets that will generalize well. The first parameter is
the lower branch limit: the number of data records below which the
induction process will terminate the branch and develop a leaf. A good
starting point for this parameter is about 2 percent of the number of
records in the database. After the decision tree is induced, a process
called “pruning” can improve generalization. Noise causes excessive
branches to form near the leaves of the tree. Pruning allows us to
remove these branches and reduce the effects of noise. There are two
types of automatic pruning: (1) error reduction and (2) statistical.
Error reduction pruning is based on a complexity/accuracy criterion.
Branches that fail the test are pruned.
The statistical pruning algorithm is particularly suited to situations
where the noise in the data is caused by not having all the relevant
attributes to classify the outcome and by the presence of irrelevant
attributes. This is true of the financial markets as well as many other
real-world problems.
The statistical pruning algorithm works backward from the leaves to
remove all attribute branches of the induced tree that are not
statistically significant (using the Chi-square test).
Another type of pruning is based on domain expertise. A domain expert
could examine the rules generated and delete any of them that don't
make sense in the real-world application. Rules can also be generalized
by dropping conditions and then retesting them on unseen data. A domain
expert could also specialize a rule by adding a condition to it. When
developing machine-induced rules, you don't want to use all the rules
that were generated. You only want to use “strong rules,” those with
enough supporting cases. For this reason, when using C4.5, you need a
product that offers statistical information about each of the leaves on
the tree. An example of a product that has this feature is XpertRule(TM)
by Attar Software.

Rough Sets

Rough sets is a mathematical technique for working with the
imperfections of real-world data. Rough sets theory, proposed by Pawlak
in 1982, can be used to discover dependencies in data while ignoring
superfluous data. The product of rough sets theory is a set of
equivalence classifications that can handle inconsistent data. Rough
sets methodology facilitates data analysis, pattern discovery, and
predictive modeling in one step. It does not require additional testing
of significance, cross-correlation of variables, or pruning of rules.
Let's now try to understand how rough sets work. We will assume that
real-world information is given in the form of an information table.
Table 11.1 is an example of an information table.
The rows in this table are called examples. Each example is composed
of attributes and a decision variable. In Table 11.1, headache, muscle
pain, and temperature are attributes, and flu is the decision variable.
Rough sets theory uses this type of table to develop rules from data.
Rough sets theory is an extension of standard set theory, in which the
definition of the set is integrated with knowledge.

TABLE 11.1 ROUGH SETS EXAMPLE 1.

Row  Headache  Muscle Pain  Temperature  Flu
1    Yes       Yes          Normal       No
2    Yes       Yes          High         Yes
3    Yes       Yes          Very high    Yes
4    No        Yes          Normal       No
5    No        No           High         No
6    No        No           Very high    Yes
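As an illustration, the information table of Table 11.1 can be held in a small Python structure and its rows grouped by the attribute values they share. This is only a sketch: the dictionary layout and the helper names `group_rows` and `is_consistent` are assumptions made for this example, not part of any rough sets product.

```python
from collections import defaultdict

# Table 11.1: row name -> (attribute values, decision on flu).
TABLE_11_1 = {
    "R1": ({"headache": "yes", "muscle_pain": "yes", "temperature": "normal"}, "no"),
    "R2": ({"headache": "yes", "muscle_pain": "yes", "temperature": "high"}, "yes"),
    "R3": ({"headache": "yes", "muscle_pain": "yes", "temperature": "very high"}, "yes"),
    "R4": ({"headache": "no", "muscle_pain": "yes", "temperature": "normal"}, "no"),
    "R5": ({"headache": "no", "muscle_pain": "no", "temperature": "high"}, "no"),
    "R6": ({"headache": "no", "muscle_pain": "no", "temperature": "very high"}, "yes"),
}


def group_rows(table, attributes):
    """Group row names that share the same values for `attributes`."""
    groups = defaultdict(list)
    for row, (attrs, _decision) in table.items():
        groups[tuple(attrs[a] for a in attributes)].append(row)
    return dict(groups)


def is_consistent(table, attributes):
    """True if rows with identical attribute values always share a decision."""
    return all(len({table[r][1] for r in rows}) == 1
               for rows in group_rows(table, attributes).values())


print(group_rows(TABLE_11_1, ["headache", "muscle_pain"]))
print(is_consistent(TABLE_11_1, ["headache", "temperature"]))
```

Headache and temperature together separate every row, so they are enough to decide flu in this table; headache and muscle pain alone are not.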


To make this explanation easier to understand, let's review some of the
basics of set theory.

Subsets

Subsets are made up of only elements contained in a larger set. A
superset is the inverse of that makeup. A proper subset is a subset that
is not identical to the set it is being compared to. Let's now look at
some examples of subsets and supersets.

Let A = {R1,R2,R3,R4,R5,R6,R7,R8,R9}
Let B = {R4,R5,R8}

In this example, B is a subset of A; this is expressed as $B \subset A$.
We can also say that A is a superset of B: $A \supset B$.
The union of two sets forms a set containing all of the elements in both
sets. For example, let's suppose we have two sets, A and B, as follows:

A = {R1,R2,R3,R4,R5,R6}
B = {R7,R8,R9}

The union of these sets yields the set {R1,R2,R3,R4,R5,R6,R7,R8,R9}.
This is expressed as $X = A \cup B$.
Let's now calculate the intersection of the following two sets:

A = {R1,R2,R3,R4,R5,R6,R7}
B = {R2,R4,R6,R8,R10}

These two sets intersect to form the set {R2,R4,R6}. The intersection
is expressed as $X = A \cap B$. Another set function is called the
cardinality, the size of a given set.
With this overview of set theory, let's now use these basics to see how
rough sets work.

The Basics of Rough Sets

The main concept behind rough sets is collections of rows that have the
same values for one or more attributes. These sets, called elementary
sets, are said to be indiscernible. In the example shown in Table 11.1,
the attributes headache and muscle pain can be used to produce two
different subsets. These are formed by the rows {R1,R2,R3} and {R5,R6}.
These two subsets make up two different elementary sets. (Row R4, which
has no headache but does have muscle pain, forms a third, single-element
set.)
Any union of elementary sets is called a definable set. The concept of
indiscernibility relation allows us to define redundant attributes
easily. Using Table 11.1, let's define two sets. The first is based on
the attributes headache and temperature. The second will add muscle pain
and use all three attributes. Using either set of attributes produces
the same elementary sets. These are the sets formed by the single
elements {R1},{R2},{R3},{R4},{R5},{R6}. Because these sets of attributes
form the same sets, we can say that the attribute muscle pain is
redundant: it did not change the definition of the sets by addition or
deletion. Sets of attributes with no redundancies are called
independents. Sets that contain the same elements as other sets but
possess the fewest attributes are called reducts.
In Table 11.2, muscle pain has been removed because it did not add
any information.

TABLE 11.2 ROUGH SETS EXAMPLE 2.

Row  Headache  Temperature  Flu
1    Yes       Normal       No
2    Yes       High         Yes
3    Yes       Very high    Yes
4    No        Normal       No
5    No        High         No
6    No        Very high    Yes

Let's now develop elementary sets based on the decisions in Table 11.2.
An elementary set that is defined based on a decision variable, which in
our case would be yes or no, is called a “concept.” For Tables 11.1 and
11.2, these are {R1,R4,R5} and {R2,R3,R6}. These are defined by the
sets in which the decision (flu) is no for {R1,R4,R5} and yes for
{R2,R3,R6}.
What elementary sets can be formed from the attributes headache and
temperature together? These are the single-element sets {R1},{R2},
{R3},{R4},{R5},{R6}. Because each of these sets is a subset of one of
our decision-based elementary sets, we can use these relationships to
produce the rules shown in Table 11.3.

TABLE 11.3 RULES FOR EXAMPLE 2.

(Temperature, Normal) = (Flu, No)
(Headache, No) and (Temperature, High) = (Flu, No)
(Headache, Yes) and (Temperature, High) = (Flu, Yes)
(Temperature, Very High) = (Flu, Yes)

Let's now add the two examples (R7,R8) shown in Table 11.4.

TABLE 11.4 ROUGH SETS EXAMPLE 3.

Row  Headache  Temperature  Flu
1    Yes       Normal       No
2    Yes       High         Yes
3    Yes       Very high    Yes
4    No        Normal       No
5    No        High         No
6    No        Very high    Yes
7    No        High         Yes
8    No        Very high    No

Having added these two examples, let's redefine our elementary sets
of indiscernibility relationships for the attributes headache and
temperature. These sets are: {R1},{R2},{R3},{R4},{R5,R7},{R6,R8}. Our
elementary sets, based on our decision variable, are:

For Flu = No, {R1,R4,R5,R8}
For Flu = Yes, {R2,R3,R6,R7}

As shown in Table 11.4, the decision on flu does not depend on the
attributes headache and temperature because neither of the elementary
sets, {R5,R7} and {R6,R8}, is a subset of any concept. We say that
Table 11.4 is inconsistent because the outcomes of {R5} and {R7} are
conflicting. For the same attribute values, we have a different outcome.
The heart of rough sets theory is that it can deal with these types of
inconsistencies. The method of dealing with them is simple. For each
concept X, we define both the greatest definable and least definable
sets. The greatest definable set contains all cases in which we have no
conflicts and is called the lower approximation. The least definable
sets are ones in which we may have conflicts. They are called the upper
approximation.
As in our earlier example, when there are no conflicts, we simply
create a series of sets of attributes and a set for each decision
variable. If the attribute set is a subset of the decision set, we can
translate that relationship into a rule. When there are conflicts, there
is no relationship and we need to use a different method. We solve this
problem by defining two different boundaries that are collections of
sets. These are called the upper and lower approximations. The lower
approximation consists of all of the objects that surely belong to the
concept. The upper approximation consists of any object that possibly
belongs to the concept.
Let's now see how this translates into set theory. Let I = an
elementary set of attributes, and X = a concept. The lower approximation
is defined as:

$$\mathrm{Lower} = \{x \in U : I(x) \subseteq X\}$$

In words, this formula says that the lower approximation is all of the
elementary sets that are subsets of the concept X. In fact, the U
means the universe, which is a fancy way of saying all.
The upper approximation is defined as:

$$\mathrm{Upper} = \{x \in U : I(x) \cap X \neq \emptyset\}$$

This simply means that the upper approximation is all of the
elementary sets that produce a nonempty intersection with one or more
concepts.
The boundary region is the difference between the upper and lower
approximations.
Rough sets theory implements a concept called vagueness. In fact, this
concept causes rough sets to sometimes be confused with fuzzy logic.
The rough sets membership function is defined as follows:

$$\mu_X(x) = \frac{|X \cap I(x)|}{|I(x)|}$$

This simple formula defines roughness as the cardinality of the
intersection of (1) the subset that forms the concept and (2) an
elementary set
of attributes, divided by the cardinality of the elementary set. As
noted earlier, cardinality is just the number of elements. Let's see how
this concept would work, using the set in Table 11.4 in which headache
is no and temperature is high, {R5,R7}. If we want to compare the rough
set membership of this set to the decision class Flu = Yes, we apply our
formula to the attribute set and the Flu = Yes membership set
{R2,R3,R6,R7}. The intersection of these two sets has just one element,
R7. Our Headache = No and Temperature = High set has two elements, so
the rough set membership for this elementary set of attributes and the
Flu = Yes concept is 1/2 = 0.5.
This roughness calculation is used to determine the precision of rules
produced by rough sets. For example, we can convert this into the
following possible rule:

If Headache = No and Temperature = High, then Flu = Yes (0.50).

Rough sets technology is very valuable in developing market timing
systems. First, rough sets do not make any assumption about the
distribution of the data. This is important because financial markets
are not based on a Gaussian distribution. Second, this technology not
only handles noise well, but also eliminates irrelevant factors.

GENETIC ALGORITHMS: AN OVERVIEW

Genetic algorithms were invented by John Holland during the mid-1970s
to solve hard optimization problems. This method uses natural selection,
“survival of the fittest,” to solve optimization problems using computer
software.
There are three main components to a genetic algorithm:

1. A way of describing the problem in terms of a genetic code, like a
   DNA chromosome.
2. A way to simulate evolution by creating offspring of the
   chromosomes, each being slightly different than its parents.
3. A method to evaluate the goodness of each of the offspring.

This process is shown in Table 11.5, which gives an overview of the
steps involved in a genetic solution.

TABLE 11.5 STEPS USING GENETIC ALGORITHM.

1. Encode the problem into chromosomes.
2. Using the encoding, develop a fitness function for use in evaluating
   each chromosome's value in solving a given problem.
3. Initialize a population of chromosomes.
4. Evaluate each chromosome in the population.
5. Create new chromosomes by mating two chromosomes. (This is done by
   mutating and recombining two parents to form two children. We select
   parents randomly but biased by their fitness.)
6. Evaluate the new chromosome.
7. Delete a member of the population that is less fit than the new
   chromosome, and insert the new chromosome into the population.
8. If a stopping number of generations is reached, or time is up, then
   return the best chromosome(s); otherwise, go to step 4.

Genetic algorithms are a simple but powerful tool for finding the best
combination of elements to make a good trading system or indicator. We
can evolve rules to create artificial traders. The traders can then be
used to select input parameters for neural networks, or to develop
portfolio and asset class models or composite indexes. Composite indexes
are a specially weighted group of assets that can be used to predict
other assets or can be traded themselves as a group. Genetic algorithms
are also useful for developing rules for integrating multiple system
components and indicators. These are only a few of the possibilities.
Let's now discuss each component and step of a genetic algorithm in more
detail.

DEVELOPING THE CHROMOSOMES

Let's first review some of the biology-based terminology we will use.
The initial step in solving a problem using genetic algorithms is to
encode the problem into a string of numbers called a “chromosome.” These
numbers can be binary, real numbers, or discrete values. Each element on
the chromosome is called a “gene.” The value of each gene is called an
“allele.” The position of the gene on the chromosome is called the
“locus.” The string of numbers created must contain all of the encoded
information needed to solve the problem.
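The steps of Table 11.5 can be sketched as a short steady-state loop in Python. This is a minimal illustration under stated assumptions: the toy fitness function `one_max`, the binary encoding, the parameter defaults, and the choice to produce one child per mating (the table describes two) are all simplifications introduced here, not the author's implementation.

```python
import random


def one_max(chromosome):
    """Toy fitness (an assumption for illustration): the number of 1 genes."""
    return sum(chromosome)


def select_parent(population, fitness, rng):
    """Pick a parent at random, biased by fitness."""
    # The small constant keeps the weights valid if every fitness is zero.
    weights = [fitness(c) + 1e-9 for c in population]
    return rng.choices(population, weights=weights, k=1)[0]


def mate(mother, father, rng, mutation_rate=0.02):
    """Recombine two parents at one cut point, then mutate the child."""
    cut = rng.randrange(1, len(mother))
    child = mother[:cut] + father[cut:]
    return [1 - g if rng.random() < mutation_rate else g for g in child]


def evolve(fitness, n_genes=20, pop_size=50, generations=2000, seed=0):
    rng = random.Random(seed)
    # Steps 1-3: binary encoding and a random initial population.
    population = [[rng.randint(0, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    for _ in range(generations):  # step 8: stop after a fixed number of rounds
        # Steps 4-5: fitness-biased parent choice, then mating.
        child = mate(select_parent(population, fitness, rng),
                     select_parent(population, fitness, rng), rng)
        child_fit = fitness(child)  # step 6
        # Step 7: replace a randomly chosen member that is less fit.
        weaker = [i for i, c in enumerate(population) if fitness(c) < child_fit]
        if weaker:
            population[rng.choice(weaker)] = child
    return max(population, key=fitness)


best = evolve(one_max)
print(best, one_max(best))
```

Because a member is only ever replaced by a fitter child, the best fitness in the population never decreases from one round to the next.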


As an example of how we can translate a problem to a chromosome,
let's suppose we would like to develop trading rules using genetic
algorithms. We first need to develop a general form for our rules, for
example:

If Indicator (Length) > Trigger and Indicator (Length)[1] < Trigger,
then Place order to open and exit N days later.

Note: Items in bold are being encoded into chromosomes.

We could also have encoded into the chromosomes the > and < operators,
as well as the conjunctive operator “AND” used in this rule template.
Let's see how we can encode a rule of this form. We can assign an
integer number to each technical indicator we would like to use. For
example: RSI = 1, SlowK = 2, and so on. Trigger would be a simple real
number. Place order could be 1 for a buy and -1 for a sell. N is the
number of days to hold the position.
Let's see how the following rule could be encoded.

If RSI(9) > 30 and RSI(9)[1] < 30, then Buy at open and Exit 5 days
later.

The chromosome for the above rule would be: 1,9,30,1,9,30,1,5.
Having explained the encoding of a chromosome, I now discuss how to
develop a fitness function.

EVALUATING FITNESS

A fitness function evaluates chromosomes for their ability or fitness
for solving a given problem. Let's discuss what would be required to
develop a fitness function for the chromosome in the above example. The
first step is to pass the values of the chromosome's genes to a function
that can use these values to evaluate the rule represented by the
chromosome. We will evaluate this rule for each record in our training
data. We will then collect statistics for the rule and evaluate those
statistics using a formula that can return a single value representing
how fit the chromosome is for solving the problem at hand. For example,
we can collect Net Profit, Drawdown, and Winning percentage on each rule
and then evaluate their fitness using a simple formula:

Fitness = (Net Profit/Drawdown)*Winning percentage

The goal of the genetic algorithm in this case would be to maximize
this function.

INITIALIZING THE POPULATION

Next, we need to initialize the population by creating a number of
chromosomes using random values for the allele of each gene. Each
numerical value for the chromosomes is randomly selected using valid
values for each gene. For example, gene one of our example chromosome
would contain only integer values. We must also limit these values to
integers that have been assigned to a given indicator. Most of the time,
these populations contain at least 50 and sometimes hundreds of members.

THE EVOLUTION

Reproduction is the heart of the genetic algorithm. The reproductive
process involves two major steps: (1) the selection of a pair of
chromosomes to use as parents for the next set of children, and (2) the
process of combining these genes into two children. Let's examine each
of the steps in more detail.
The first issue in reproduction is parent selection. A popular method
of parent selection is the roulette wheel method,* shown in Table 11.6.
We will now have the two parents produce children. Two major
operations are involved in mating: (1) crossover and (2) mutation.
(Mating is not the only way to produce members for the next generation.
Some genetic algorithms will occasionally clone fit members to produce
children. This is called “Elitism.”)

*Using this method, we will select two parents who will mate and produce
children.
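The encoding, the fitness formula, and the random initialization can be sketched together in Python. The indicator codes and the eight-gene layout come from the example rule; the gene value ranges, the zero-drawdown handling, and the function names are assumptions added for illustration.

```python
import random

INDICATORS = {1: "RSI", 2: "SlowK"}   # integer codes used in the example
ORDERS = {1: "Buy", -1: "Sell"}


def decode(chromosome):
    """Turn an 8-gene chromosome back into the rule text it encodes."""
    ind1, len1, trig1, ind2, len2, trig2, order, hold = chromosome
    return (f"If {INDICATORS[ind1]}({len1}) > {trig1} and "
            f"{INDICATORS[ind2]}({len2})[1] < {trig2}, "
            f"then {ORDERS[order]} at open and Exit {hold} days later.")


def fitness(net_profit, drawdown, winning_pct):
    """Fitness = (Net Profit / Drawdown) * Winning percentage."""
    if drawdown <= 0:
        return 0.0  # assumption: a rule with no usable drawdown scores zero
    return net_profit / drawdown * winning_pct


def random_chromosome(rng):
    """Draw valid random alleles for each gene (all ranges are assumed)."""
    indicator = rng.choice(list(INDICATORS))
    length = rng.randint(2, 30)
    trigger = rng.randint(0, 100)
    return [indicator, length, trigger, indicator, length, trigger,
            rng.choice([1, -1]), rng.randint(1, 20)]


print(decode([1, 9, 30, 1, 9, 30, 1, 5]))

rng = random.Random(42)
population = [random_chromosome(rng) for _ in range(50)]
```

In a real system the three statistics passed to `fitness` would come from running the decoded rule over the training data; they are plain arguments here so the formula can be checked in isolation.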


TABLE 11.6 PARENT SELECTION.

1. Sum the fitness of all the population members, and call that sum X.
2. Generate a random number between 0 and X.
3. Return the first population member whose fitness, when added to the
   fitness of the preceding population members, is greater than or equal
   to the random number from step 2.

There are three popular crossover methods or types: (1) one-point
(single-point), (2) two-point, and (3) uniform. All of these methods
have their own strengths and weaknesses. Let's now take a closer look at
how the various crossover methods work.

The One-Point Crossover

The one-point crossover randomly selects two adjacent genes on the
chromosome of a parent and severs the link between the pair so as to cut
the chromosome into two parts. We do this to both parents. We then
create one child using the left-hand side of parent 1 and the right-hand
side of parent 2. The second child will be just the reverse. Figure 11.4
shows how a one-point crossover works.

FIGURE 11.4 An example of a one-point crossover.

The Two-Point Crossover

A two-point crossover is similar to the one-point method except that
two cuts are made in the parents, and the genes between those cuts are
exchanged to produce children. See Figure 11.5 for an example of a
two-point crossover.

FIGURE 11.5 An example of a two-point crossover.

The Uniform Crossover

In the uniform crossover method, we randomly exchange genes between
the two parents, based on some crossover probability. An example of a
uniform crossover appears in Figure 11.6.
All three of our examples showed crossovers using binary operators.
You might wonder how to perform crossovers when the genes of the
chromosomes are real numbers or discrete values. The basics of each of
the crossover methods are the same. The difference is that, once we have
selected the genes that will be affected by the crossover, we develop
other operators to combine them instead of just switching them. For
example, when using real-number genes, we can use weighted averages of
the two parents to produce a child. We can use one set of weighting for
child 1 and another for child 2. For processing discrete values, we can
just randomly select one of the other classes.
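The roulette wheel steps of Table 11.6 and the three crossover methods can be sketched in a few lines of Python. The function names and the explicit cut-point arguments are assumptions made so that each operator is easy to check by hand; a working system would draw the cut points at random.

```python
import random


def roulette_select(population, fitnesses, rng):
    """Table 11.6: fitness-proportional ("roulette wheel") selection."""
    total = sum(fitnesses)            # step 1
    spin = rng.uniform(0, total)      # step 2
    running = 0.0
    for member, fit in zip(population, fitnesses):   # step 3
        running += fit
        if running >= spin:
            return member
    return population[-1]  # guard against floating-point round-off


def one_point(p1, p2, cut):
    """Swap everything to the right of a single cut."""
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]


def two_point(p1, p2, a, b):
    """Swap only the genes between cuts a and b."""
    return p1[:a] + p2[a:b] + p1[b:], p2[:a] + p1[a:b] + p2[b:]


def uniform(p1, p2, rng, swap_prob=0.5):
    """Swap each gene independently with probability swap_prob."""
    c1, c2 = list(p1), list(p2)
    for i in range(len(c1)):
        if rng.random() < swap_prob:
            c1[i], c2[i] = c2[i], c1[i]
    return c1, c2


rng = random.Random(1)
ones, zeros = [1] * 5, [0] * 5
print(one_point(ones, zeros, 2))     # ([1, 1, 0, 0, 0], [0, 0, 1, 1, 1])
print(two_point(ones, zeros, 1, 3))  # ([1, 0, 0, 1, 1], [0, 1, 1, 0, 0])
print(uniform(ones, zeros, rng))
```

Using all-ones and all-zeros parents makes it easy to see which genes each method exchanged.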


FIGURE 11.6 An example of a uniform crossover.

Mutation

Mutation is a random changing of a gene on a chromosome. Mutations
occur with a low probability because, if we mutated all the time, then
the evolutionary process would be reduced to a random search.
Figure 11.7 shows an example of a mutation on a binary chromosome.
If we were working with real-number chromosomes, we could add a
small random number (within ±10 percent of the average for that gene)
to its value to produce a mutation.

FIGURE 11.7 An example of mutation (panels: before mutation, after
mutation).

Several concepts are important to genetic algorithms. We will overview
these concepts without covering the mathematics behind them.
The first concept we must understand is the concept of similar
templates of chromosomes, called schemata. If we are working with binary
genes, defining schemata requires the symbols 0, 1, and *, where 0 and 1
are just binary digits, and * means don't care. Figure 11.8 examines a
chromosome and two different schemata.

FIGURE 11.8 An example of a schema.

Schema 1 is a template that requires genes 1, 2, 3, and 6 to be 1.
Schema 2 requires a 0 in gene 4 and a 1 in gene 5. Our sample chromosome
fits both schemata, but this is not always the case. Let's say that, in
a population of 100 chromosomes, 30 fit schema 1 and 20 fit schema 2.
One of the major concepts of genetic algorithms then applies.
Let's suppose that the average fitness of the chromosomes belonging
to schema 1 is 0.75, and the average fitness of those in schema 2 is
0.50. The average fitness of the whole population is 0.375. In this
case, schema 1 will have an exponentially increased priority in
subsequent generations of reproduction, when compared to schema 2.
Schemata also affect crossovers. The longer a schema, the more easily
it can get disrupted by crossover. The length of a schema is measured
as the length between the innermost and outermost 0 or 1 for a binary
chromosome. This distance is called the “defining length.” In Figure
11.8, schema 1 has a longer defining length.
Different crossovers also have different properties that affect
combining the schemata. For example, some schemata cannot be combined
without disrupting them using either a single-point or a two-point
crossover. On the other hand, single-point and two-point crossovers are
good at not disrupting paired genes used to express a single feature.
The uniform crossover method can combine any two schemata that differ by
one or more genes, but has a higher probability of disrupting schemata
that require paired genes to express a feature. This point must be kept
in mind when selecting encoding or crossover methods for solving
problems.

UPDATING A POPULATION

After the genetic algorithm has produced one or more children, we apply
the fitness function to each child produced, to judge how well the child
solves the problem it was designed for. We compare the fitness of the
new children to that of the existing population, and delete a randomly
selected member whose fitness is less than that of a child we have
evaluated. We then add this child to the population. We repeat this
process for each child produced, until we reach a stopping number of
generations or a time limit.
Genetic algorithms are an exciting technology to use in developing
trading-related applications. To use them effectively, it is important
to understand the basic theory and to study case material that offers
other applications in your domain or in similar domains. You don't need
to understand the mathematics behind the theory, just the concepts.

CHAOS THEORY

Chaos theory is an area of analysis that describes complex modes in
which not all of the variables or initial conditions are known. One
example is weather forecasting; predictions are made using an incomplete
series of equations. Chaos theory is not about randomness; it's about
how real-world problems require understanding not only the model but
also the initial conditions. Even small numerical errors due to round
off can lead to a large error in prediction over very short periods of
time. When studying these types of problems, standard geometry does not
work.
For example, suppose we want to measure the sea shore. If we measure
the shore line using a yardstick, we get one distance. If we measure it
using a flexible tape measure, we get a longer distance; the length
depends on how and with what tools we make the measurement.
Benoit Mandelbrot tried to solve this problem by creating fractal
geometry. The fractal dimension is a measure of how “squiggly” a given
line is. This number can take values of 1 or higher; the higher the
number, the more the measured length curves. It can also be a fractional
number, such as 1.85. The fractal dimension of the data is important
because systems with similar fractal dimensions have been found to have
similar properties. The market will change modes when the fractal
dimension changes. A method called rescaled range analysis can give both
an indicator that measures whether the market is trending or not
(similar to random walk), and the fractal dimension on which the
financial data is calculated.
Thanks to Einstein, we know that in particle physics the distance that
a random particle covers increases with the square root of the time it
has been traveling. In equation form, if we denote by R the distance
covered and let T be a time index, we see that:

$$R = \text{constant} \times T^{0.5}$$

Let's begin with a time series of length M. We will first convert this
into a time series of length N = M - 1 of logarithmic ratios:

$$N_i = \log\left(\frac{M_{i+1}}{M_i}\right), \quad i = 1, 2, 3, \ldots, (M-1)$$

We now need to divide our time period of length N into A contiguous
subperiods of length n so that $A \times n = N$. We then label each of
these subperiods $I_a$, with a = 1,2,3,...,A. We then label each of the
elements in $I_a$ as $N_{k,a}$ such that k = 1,2,3,...,n. We then take
the mean value $e_a$ of each subperiod and form the time series of
accumulated departures ($X_{k,a}$) from the mean value $e_a$ in the
following form:

$$X_{k,a} = \sum_{i=1}^{k} (N_{i,a} - e_a)$$

summing over i = 1 to k, where k = 1,2,3,...,n. The range is defined as
the maximum minus the minimum value of $X_{k,a}$ for each subperiod
$I_a$:

$$R_{I_a} = \max(X_{k,a}) - \min(X_{k,a})$$

where $1 \le k \le n$. This adjusted range is the distance that the
underlying system travels for time index M. We then calculate the
standard deviation of the sample for each subperiod $I_a$:


where 1 ≤ k < n. This standard deviation is used to normalize the range
R. Hurst generalized Einstein's relation to a time series whose
distribution is unknown by dividing the adjusted range by the standard
deviation, showing that:

Now we can calculate H using the relationship

R/S = Constant × n^H

where n is a time index and H is a power called the Hurst exponent, which
can lie anywhere between 0 and 1. We can calculate the (R/S) equation
for subperiod n and create a moving Hurst exponent. This can now be
used like any other indicator to develop trading rules.
This is a simplified version of Peters's published methodology, yet still
gives a good estimate of H.* The normal process for calculating H
requires you to do a least squares on all (R/S)n. We can skip this step and
use the raw H value to develop indicators which are of value in trading
systems. This does make H noisier but removes much of the lag. Peters
suggests that the Hausdorff dimension can be approximated by the
following relationship:

DH = 2 - H

where DH is the fractal dimension and H is the Hurst exponent.
The Hurst exponent H is of interest to traders since a value of 0.5 is
simply a random time series. If H is above 0.5, then the series has a
memory; in traders' terms this is called “trending.” If H is less than 0.5
the market is an antipersistent time series, one in which less distance is
covered than in a completely random time series. In trading terms, this
is called a “trading range.”
We know that the Hurst exponent can be used to tell if the market is
trending or if it is in a trading range. The question is: Can changes in the
Hurst exponent be used to predict changes in the correlation between
markets or in the technical nature of the market?

*See Peters, Edgar E. (1994). Fractal Market Analysis. (New York: John
Wiley & Sons).

STATISTICAL PATTERN RECOGNITION

Statistical pattern recognition uses statistical methods to analyze and
classify data. Statistical pattern recognition is not just one method; it is
a class of methods for analyzing data.
One example of statistical pattern recognition is called case-based
reasoning (CBR). CBR compares a library of cases to the current case. It
then reports a list of similar cases. This idea is used by traders such as
Paul Tudor Jones and even by Moore Research in their monthly
publication. This process requires developing an index for the cases, using
methods such as C4.5 or various statistical measures. One of the most
common methods for developing these indexes is “nearest neighbor
matching.” Let's see how this matching is done.
If the database depends on numerical data, calculate the mean and the
standard deviation of all fields stored in the database. For each record in
the database, store how far each field is from the mean for that field in
terms of standard deviation. For example, if the mean is 30 with a
standard deviation of 3, an attribute with a value of 27 would be -1.0
standard deviation from the mean. Make these calculations for each
attribute and case in the database. When a new case is given, develop a
similarity score. First, convert the new case in terms of standard deviation
from the mean and standard deviation used to build the index. Next,
compare each of the attributes in the new case to the standardized index
values, and select the cases that are the nearest match. An example of a
closeness function is shown in Table 11.7.
Apply this function to each record in the database, and then report the
lower scoring cases as the most similar. Use these methods to find
similar patterns for automatic pattern recognition.
Similarity analysis can also be done using Pearson's correlation or
another type of correlation called “Spearman ranked correlation.”
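The nearest neighbor matching steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (standardize, closeness, nearest_cases) are hypothetical, the population standard deviation is used, every field is assumed to have a nonzero standard deviation, and the absolute value of each difference is taken so that positive and negative differences cannot cancel (Table 11.7 does not state this explicitly).

```python
from statistics import mean, pstdev

def standardize(cases):
    """Return per-field (mean, sd) pairs and every case as z-scores."""
    stats = [(mean(col), pstdev(col)) for col in zip(*cases)]
    z_cases = [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in cases]
    return stats, z_cases

def closeness(new_z, stored_z, weights):
    """Weighted average distance between two standardized cases.

    Assumption: absolute differences are used; lower means more similar.
    """
    total = sum(weights)
    return sum(abs(a - b) * w for a, b, w in zip(new_z, stored_z, weights)) / total

def nearest_cases(cases, new_case, weights, k=3):
    """Return the k (score, case index) pairs with the lowest closeness score."""
    stats, z_cases = standardize(cases)
    # Convert the new case with the SAME mean and sd used to build the index.
    new_z = [(v - m) / s for v, (m, s) in zip(new_case, stats)]
    scored = sorted((closeness(new_z, row, weights), i)
                    for i, row in enumerate(z_cases))
    return scored[:k]

# Two stored cases; the first field has mean 30 and standard deviation 3,
# so the value 27 standardizes to -1.0, as in the text.
library = [[27.0, 1.0], [33.0, 3.0]]
stats, z_library = standardize(library)
best = nearest_cases(library, [27.0, 1.0], weights=[1.0, 1.0], k=1)
```

The weights let some attributes count more than others; equal weights reduce the score to a plain average of the standardized differences.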
TABLE 11.7 CLOSENESS FUNCTION.

Closeness = Σ [(New case attribute - Stored case attribute) × Weight] / Total weights

New case attribute is the normalized value of a given attribute for new cases.
Stored case attribute is the normalized value of a given attribute for the
current database case being measured.
Total weights is the sum of all of the weighting factors.

Statistical pattern recognition can also be used to develop subgroups
of similar data. For example, we can subclassify data based on some
statistical measure and then develop a different trading system for each
class.
Statistical pattern recognition is a broad area of advanced methods,
and this brief explanation only touches the surface. I showed the nearest
neighbor matching method because it is simple and useful in developing
analogs of current market conditions to previous years.

FUZZY LOGIC

Fuzzy logic is a powerful technology that allows us to solve problems that
require dealing with vague concepts such as tall or short. For example, a
person who is 6 feet tall might be considered tall compared to the
general population, but short if a member of a basketball team. Another issue
is: How would we describe a person who is 5 feet 11 inches, if 6 feet is
considered tall? Fuzzy logic can allow us to solve both of these problems.
Fuzzy logic operators are made of three parts: (1) membership
function(s), (2) fuzzy rule logic, and (3) defuzzifier(s). The membership
function shows how relevant data are to the premise of each rule. Fuzzy rule
logic performs the reasoning within fuzzy rules. The defuzzifier maps
the fuzzy information back into real-world answers.
Let's see how fuzzy logic works, using a simple height example. We
want to develop fuzzy rules that predict a one-year-old male child's height
in adulthood, based on the height of his mother and father. The first step
in developing a fuzzy logic application is to develop fuzzy membership
functions for each variable's attributes. We need to develop fuzzy
membership functions for the height attributes of the mother, father, and child.
For our example, these attributes are tall, normal, and short. We have
defined generic membership functions for these height attributes as follows
(SD = standard deviation):

Tall = maximum(0, min(1, (X - Average Height)/(SD of height))).
Short = maximum(0, min(1, (Average Height - X)/(SD of height))).
Normal = maximum(0, (1 - (abs(X - Average Height)/(SD of height)))).

When using these membership functions, substitute the following
values for average height and standard deviation for the mother, father, and
child.

Mother: average height 65 inches, SD 3 inches.
Father: average height 69 inches, SD 4 inches.
Child: average height (12 months) 30 inches, SD 2 inches.

Having developed the membership functions, we can now develop
our fuzzy rules. These rules and their supporting facts are shown in
Table 11.8.
Using the above facts and our fuzzy membership functions for both
the mother and father, we calculate the following output values for each
membership function:

Mother's height   short .66   normal .33   tall 0
Father's height   short .5    normal .5    tall 0

TABLE 11.8 RULES FOR CHILD'S HEIGHT.

These two fuzzy rules are in our expert system:
1. If Mother_Short and Father_Short, then Child_Short
2. If Mother_Short and Father_Normal, then Child_Normal

We also have the following facts:
Mother is 63 inches tall.
Father is 67 inches tall.
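The three membership functions translate directly into code. The following Python sketch is only an illustration; the function and variable names are hypothetical, while the averages, standard deviations, and heights are the ones given in the text and in Table 11.8.

```python
def tall(x, average, sd):
    """Degree (0 to 1) to which height x counts as tall."""
    return max(0.0, min(1.0, (x - average) / sd))

def short(x, average, sd):
    """Degree (0 to 1) to which height x counts as short."""
    return max(0.0, min(1.0, (average - x) / sd))

def normal(x, average, sd):
    """Degree (0 to 1) to which height x counts as normal."""
    return max(0.0, 1.0 - abs(x - average) / sd)

def memberships(x, average, sd):
    """Evaluate all three membership functions for one person."""
    return {
        "short": short(x, average, sd),
        "normal": normal(x, average, sd),
        "tall": tall(x, average, sd),
    }

# Facts from Table 11.8: mother is 63 inches, father is 67 inches.
mother = memberships(63, average=65, sd=3)
father = memberships(67, average=69, sd=4)
```

The mother's values come out as 2/3 and 1/3, which the text shows truncated to .66 and .33.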


Let's see what happens if we rerun the fuzzy rules using these facts.

1. If Mother_Short (.66) and Father_Short (.5), then Child_Short (.5).
2. If Mother_Short (.66) and Father_Normal (.5), then Child_Normal (.5).

Using the rules of fuzzy logic, we take the minimum of the values
associated with the conditions when they are joined by an “and.” If they
are joined by an “or,” we take the maximum.
As you can see, the child is both short and normal. We will now use
something called defuzzification to convert the results of these rules back
to a real height. First, find the center point of each of the membership
functions that apply to the height of the child. In our case, that is 28 for
short, 30 for normal, and 32 for tall. Next, multiply the output from the
rules associated with each membership function by these center point
values. Divide the result by the sum of the outputs from the membership
functions. This will convert the fuzzy output back into a real height for
our one-year-old male child:

(.5 x 28 + .5 x 30 + 0 x 32)/(.5 + .5) = 29 inches tall

To see how these membership functions interact for the height of our
one-year-old child, look at Figure 11.9.
This chapter has given an overview of different advanced technologies
that are valuable to traders. We will use these technologies in many
examples in the remaining chapters. Now that we have domain expertise in
both analyzing the markets and using many different advanced
technologies, we are ready to design state-of-the-art trading applications.

Defuzzification converts fuzzy rule output into numerical values.
FIGURE 11.9 An example of a simple defuzzification function for height.
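The rule evaluation and defuzzification worked through in the fuzzy logic example can be sketched in Python as follows. This is a minimal illustration, not a full fuzzy inference engine; the function names are hypothetical, and the membership values and center points are the ones used in the text.

```python
def fuzzy_and(*degrees):
    """Conditions joined by "and" take the minimum membership value."""
    return min(degrees)

def fuzzy_or(*degrees):
    """Conditions joined by "or" take the maximum membership value."""
    return max(degrees)

def defuzzify(outputs, centers):
    """Weighted average of center points; returns None if no rule fired."""
    total = sum(outputs.values())
    if total == 0:
        return None
    return sum(outputs[name] * centers[name] for name in outputs) / total

# Membership values for the parents, as computed in the text.
mother_short = 2 / 3
father_short = 0.5
father_normal = 0.5

# The two rules of Table 11.8; no rule concludes "tall", so it stays 0.
outputs = {
    "short": fuzzy_and(mother_short, father_short),
    "normal": fuzzy_and(mother_short, father_normal),
    "tall": 0.0,
}

# Center points of the child's membership functions, in inches.
centers = {"short": 28.0, "normal": 30.0, "tall": 32.0}
child_height = defuzzify(outputs, centers)
```

With both rules firing at .5, the weighted average lands halfway between the short and normal center points.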
Part Three
MAKING SUBJECTIVE METHODS MECHANICAL

12
How to Make Subjective Methods Mechanical


Ways of making subjective forms of analysis mechanical form one of the
hottest areas of research in trading systems development. There are two
key reasons for this concentrated activity. First, many people use sub-
jective methods and would like to automate them. Second, and more im-
portant, we can finally backtest these methods and figure out which ones
are predictive and which are just hype.
Based on both my research and the research of others such as Tom
Joseph, I have developed a general methodology for making subjective
trading methods mechanical. This chapter gives an overview of the pro-
cess. The next two chapters will show you how to make Elliott Wave and
candlestick recognition mechanical, using Omega TradeStation. Let's
now discuss the general methodology I use to make subjective methods
mechanical.
The first step is to select the subjective method we wish to make me-
chanical. After we have selected the method, we need to classify it, based
on the following categories:

1. Totally visual patterns recognition.
2. Subjective methods definition using fuzzy logic.

3. Human-aided semimechanical methods.
4. Mechanically definable methods.

A subjective form of analysis will belong to one or more of these
categories. Let's now get an overview of each one.

TOTALLY VISUAL PATTERNS RECOGNITION

This class of subjective methods includes general chart patterns such as
triangles, head and shoulders, and so on. These are the hardest types of
subjective methods to make mechanical, and some chart patterns cannot
be made totally automated. When designing a mechanical method for this
class of pattern, we can develop rules that either will identify a large
percentage of that pattern but with many false identifications, or will
identify a small percentage of the pattern with a high percentage of accuracy.
In most cases, either approach can work, but developing the perfect
definition may be impossible.

SUBJECTIVE METHODS DEFINITION USING FUZZY LOGIC

Subjective methods that can be defined using fuzzy logic are much
easier than methods that develop a purely visual type of pattern.
Candlestick recognition is the best example of this type of subjective method.
Candlestick recognition is a combination of fuzzy-logic-based attributes
and attributes that can be defined 100 percent mechanically. Once you
have developed the fuzzy definitions for the size of the candlesticks, it is
very easy to develop codes to identify different candlestick patterns.

HUMAN-AIDED SEMIMECHANICAL METHODS

A human-aided semimechanical method is one in which the analyst is
using general rules based on observations and is actually performing the
analysis of the chart. There are many classic examples of this method.
The first one that comes to mind is divergence between price and an
oscillator. This type of pattern is often drawn on a chart by a human, using
rules that are understood but may not be easily defined. My work has
shown that, in these types of subjective methods, the better approach is
to identify only 15 percent to 40 percent of all cases, making sure that
each has been defined correctly. The reason is that the eye can identify
many different patterns at once.
For example, if we are trying to mechanize divergence between price
and an oscillator, we need to define a window of time in which a
divergence, once set up, must occur. We also need to define the types of
divergence we are looking for. The human eye can pick up many types of
divergences, that is, divergences based on swing highs and lows or on the
angle between the swing high and the swing low.

FIGURE 12.1 Several different types of divergence can be picked up by
the human eye. A product called DivergEngine™ is able to identify simple
divergences automatically.

Figure 12.1 shows several different types of divergence that can be
picked up by the human eye. It also shows how a product called
DivergEngine™, by Inside Edge Systems, was able to identify several
different divergences during late 1994, using a five-period SlowK. One
example is a divergence buy signal set up in late September and early
October of 1994. (Divergences are shown by circles above the bars in
Figure 12.1.) In early November 1994, we had a sell signal divergence.
This divergence led to a 30-point drop in the S&P500 in less than one
month.
Another type of analysis that falls into this class is the method of
drawing trend lines. When a human expert draws a trend line, he or she is
connecting a line between lows (or between highs). Important trend lines
often involve more than two points. In these cases, an expert's drawn
trend line may not touch all three (or more) points. The subjective part of
drawing trend lines involves which points to connect and how close is
close enough when the points do not touch the trend line. Figure 12.2
shows an example of a hand-drawn major trend line for the S&P500
during the period from July to October 1994. Notice that not all of the lows
touch the trend line. After the market gapped below this trend line, it
collapsed 20 points in about three weeks.

FIGURE 12.2 An example of an S&P500 trend line, drawn between July
and October 1994.

MECHANICALLY DEFINABLE METHODS

Mechanically definable methods allow us to develop a mathematical
formula for the patterns we are trying to define. One example of these types
of patterns is the swing highs and lows that are used to define pivot-point
trades. Another example would be any gap pattern. There are many
examples of this class of methods, and any method that can be defined by a
statement or formula falls into this class.

MECHANIZING SUBJECTIVE METHODS

Once you have classified the category that your method belongs to, you
need to start developing your mechanical rules. You must begin by
identifying your pattern or patterns on many different charts, even charts
using different markets.
After you have identified your subjective methods on your charts, you
are ready to develop attributes that define your patterns. For example, in
candlestick charts, the size and color of the candlestick are the key
attributes. With the attributes defined, you can develop a mathematical
definition or equivalent for each attribute. Definitions may use fuzzy
concepts, such as tall or short, or may be based on how different
technical indicators act when the pattern exists. Next, you should test each of
your attribute definitions for correctness. This step is very important
because if these building blocks do not work, you will not be able to develop
an accurate definition for your patterns. After you have developed your
building blocks, you can combine them to try to detect your pattern.
When using your building blocks' attributes to develop your patterns for
making your subjective method mechanical, it is usually better to have
many different definitions of your pattern, with each one identifying only
10 percent of the cases but with 90 percent correctness.
Making subjective methods mechanical is not easy and should continue
to be a hot area of research for the next 5 to 10 years. Given this outline
of how to make a subjective method mechanical, I will now show you two
examples: (1) Elliott Wave analysis and (2) candlestick charts. These will
be shown in the next two chapters, respectively.
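To make the mechanically definable class concrete, here is a minimal Python sketch of the two examples named in this chapter, swing highs and gaps. The definitions are assumptions chosen for illustration, not definitions from the text: a swing high is taken to be a bar whose high is strictly above the highs of `strength` bars on each side, and a gap up is a bar whose low is above the previous bar's high. The function names are hypothetical.

```python
def swing_highs(highs, strength=2):
    """Indexes of bars whose high exceeds `strength` bars on both sides.

    Assumed definition; a swing low is the mirror image using lows.
    """
    found = []
    for i in range(strength, len(highs) - strength):
        neighbors = highs[i - strength:i] + highs[i + 1:i + strength + 1]
        if all(highs[i] > h for h in neighbors):
            found.append(i)
    return found

def gap_ups(highs, lows):
    """Indexes of bars whose low is above the prior bar's high (assumed definition)."""
    return [i for i in range(1, len(lows)) if lows[i] > highs[i - 1]]

# Small made-up series, only to exercise the two rules.
pivots = swing_highs([1, 2, 5, 3, 2, 4, 1], strength=2)
gaps = gap_ups(highs=[10, 11, 12], lows=[9, 12, 11])
```

Each rule is a single statement about prices, which is what places these patterns in the mechanically definable class; the `strength` parameter is the kind of attribute that would be tested for correctness before being combined into larger patterns.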
13
Building the Wave

Elliott Wave analysis is based on the work of R. N. Elliott during the
1930s. Elliott believed that the movements of the markets follow given
patterns and relationships based on human psychology. Elliott Wave
analysis is a complex subject and has been discussed in detail in many
books and articles. Here, we will not go into it in detail but will provide
an overview so that you can understand (1) why I think Elliott Wave
analysis is predictive, and (2) how to make it mechanical so that it can be
used to predict the markets.

AN OVERVIEW OF ELLIOTT WAVE ANALYSIS

Elliott Wave theory is based on the premise that markets will move in
ratios and patterns that reflect human nature. The classic Elliott Wave
pattern consists of two different types of waves:

1. A five-wave sequence called an impulse wave.
2. A three-wave sequence called a corrective wave.

The classic five-wave patterns and the three-wave corrective wave are
shown in Figure 13.1. Normally, but not always, the market will move in
a corrective wave after a five-wave move in the other direction.

Panels: Normal five-wave sequence; Double top five-wave sequence; Failed
breakout. Historically, these two patterns occur 70% of the time.
FIGURE 13.1 Three possible five-wave Elliott Wave patterns.

Let's analyze a classic five-wave sequence to the upside. Wave one is
usually a weak rally with only a few traders participating. When wave one
is over, the market sells off, creating wave two. Wave two ends when the
market fails to make new lows and retraces at least 50 percent, but less
than 100 percent, of wave one. Wave two is often identified on a chart by
a double-top or head-and-shoulders pattern. After this correction, the
market will begin to rally again, slowly at first, but then accelerating as
it takes out the top of wave one. This is the start of wave three. As another
sign of wave three, the market will gap in the direction of the trend.
Commercial traders begin building their long position when the market fails
to make new lows. They continue to build this position during wave three
as the market continues to accelerate. One of the Elliott Wave rules is
that wave three cannot be the shortest wave and is, in fact, normally at
least 1.618 times longer than wave one. This 1.618 number was not
selected out of thin air. It is the ratio that successive Fibonacci numbers
approach, and the Fibonacci sequence is a numerical sequence that occurs
often in nature. In fact, many of the rules of Elliott Wave relate to
Fibonacci numbers.
At some point, profit taking will set in and the market will sell off.
This is called wave four. There are two types of wave four: (1) simple
and (2) complex. The type of wave four to expect is related to the type of
wave two that occurred. If wave two was simple, wave four will be
complex. If wave two was complex, wave four will be simple. After the
wave-four correction, the market rallies and usually makes new highs, but the
rally is fueled by small traders and lacks the momentum of a wave-three
rally. This lack of momentum, as prices rally to new highs or fall to new
lows, creates divergence using classic technical indicators. After the five
waves are finished, the market should change trend. This trend change
will be either corrective or the start of a new five-wave pattern. The
mirror image of this pattern exists for a five-wave move to the downside.
Elliott Wave patterns exist on each time frame, and the waves relate to
each other the same way. For example, a five-wave pattern can be found
on a monthly, weekly, daily, or intraday chart. You must be in the same
wave sequence in each time frame. For example, in a five-wave
downward pattern, you would be in a wave four in a monthly or weekly time
frame, and in a wave three to the upside on a daily or intraday time frame.
When you study an Elliott Wave pattern closely, you will see that each
wave is made up of similar patterns. Many times, in a five-wave pattern,
wave one, or three, or five will break down into additional five-wave
patterns. This is called an extension.
Elliott Wave analysis has many critics because it is usually a
subjective form of analysis. This chapter will show you how to make the most
important part of Elliott Wave analysis (the pattern of waves three, four,
and five) objective and totally mechanical.

TYPES OF FIVE-WAVE PATTERNS

The three possible five-wave patterns have been shown in Figure 13.1.
The first two are the classic five-wave sequence and the double-top
market. The mirror image of these patterns exists on the downside and,
according to Tom Joseph, these two patterns account for 70 percent of all
possible historical cases. Finally, when the market fails to hold its trend
and the trend reverses, we have a failed breakout sequence pattern. The
first two five-wave patterns consist of a large rally; then consolidation
occurs, followed by a rally that tests the old highs or sets new ones. The
failed breakout pattern occurs 30 percent of the time and is unpredictable.
The classic five-wave pattern can be traded as shown in Table 13.1.

TABLE 13.1 TRADING THE ELLIOTT WAVE.

We can trade the basic five-wave pattern as follows:
1. Enter wave three in the direction of the trend.
2. Stay out of the market during wave four.
3. Enter the wave-five rally in the direction of the trend.
4. Take a countertrend trade at the top of wave five.

Trading the five-wave pattern sounds easy, but the problem is that
the current wave count depends on the time frame being analyzed. For
example, if we identify a wave three on both the weekly and daily charts,
we have a low-risk, high-profit trading opportunity. If we are in a
five-wave downward sequence on a weekly chart but a wave-three upward
pattern on a daily chart, the trade would be a high-risk trade that may not be
worth taking. When trading Elliott Waves, it is important to view the
count on multiple time frames.

USING THE ELLIOTT WAVE OSCILLATOR TO IDENTIFY
THE WAVE COUNT

Let's now learn how to objectively identify the classic five-wave pattern.
In 1987, Tom Joseph, of Trading Techniques, Inc., discovered that using
a five-period moving average minus a thirty-five-period moving average
of the (High + Low)/2 produced an oscillator that is useful in counting
Elliott Waves. He called this discovery the Elliott Wave oscillator. Using
this oscillator and an expert system containing the rules for Elliott Wave,
he produced software called Advanced GET, published by Trading
Techniques, Inc. Advanced GET™ also has many Gann methods for trading,
and seasonality and pattern matching are built into the package. GET
does a good job of objectively analyzing Elliott Waves. It is available for
MS-DOS, Windows, and for TradeStation. Tom agreed to share some of
his research with us so that we can begin to develop our own
TradeStation utility for Elliott Wave analysis.
The Elliott Wave oscillator produces a general pattern that correlates
to where you are in the Elliott Wave count. Based on the research of
Tom Joseph, we can explain this pattern by identifying a five-wave
sequence to the upside. We start this sequence by first detecting the end
of a five-wave sequence to the downside. The first rally that occurs after
the market makes new lows but the Elliott Wave oscillator does not is
called wave one. After the wave-one rally, the market will have a
correction but will fail to set new lows. This is wave two, which can be one
of two types. The first is simple; it may last for a few bars and have
little effect on the oscillator. The second, less common type is a complex
wave two. It will usually last longer, and the oscillator will pull back
significantly. There is a relationship between wave two and wave four.
If wave two is simple, wave four will be complex. If wave two is
complex, wave four will be simple. After wave two is finished, both the
market and the oscillator will begin to rise. This is the start of wave three.
This move will accelerate as the market takes out the top of wave one.
A characteristic of wave three is that both the market and the Elliott
Wave oscillator reach new highs. After wave three, there is a
profit-taking decline, wave four. After wave four, the market will begin to
rally and will either create a double top or set new highs, but the Elliott
Wave oscillator will fail to make new highs. This divergence is the
classic sign of a wave five. The oscillator and prices could also make new
highs after what looks like a wave four. At this point, we have to
relabel our wave five a wave three. Another important point is that wave
five can extend for a long time in a slow uptrend. For this reason, we
cannot be sure the trend has changed until the Elliott Wave oscillator
has retraced more than 138 percent of its wave-five peak.

TRADESTATION TOOLS FOR COUNTING ELLIOTT WAVES

The first step in developing our Elliott Wave analysis software is to
develop the Elliott Wave oscillator. The code for this oscillator and a user
function to tell us whether we are in a five-wave sequence to the upside is
shown in Table 13.2, coded in TradeStation EasyLanguage.

TABLE 13.2 USER FUNCTIONS FOR ELLIOTT WAVE TOOL.

Copyright © 1996 Ruggiero Associates. This code for the Elliott Wave oscillator
is only for personal use and is not to be used to create any commercial product.

Inputs: DataSet(Numeric);
Vars: Osc535(0),Price(0);
{ Midpoint of the bar on the selected data series }
Price=(H of Data(DataSet)+L of Data(DataSet))/2;
{ 5-bar average minus 35-bar average of the midpoint }
If Average(Price,35)<>0 then begin
Osc535=Average(Price,5)-Average(Price,35);
end;
ElliottWaveOsc=Osc535;

Copyright © 1996 Ruggiero Associates. This code for the Elliott trend indicator is
only for personal use and is not to be used to create any commercial product.

Inputs: DataSet(Numeric),Len(Numeric),Trigger(Numeric);
Vars: Trend(0),Osc(0);
Osc=ElliottWaveOsc(DataSet);
{ Start the trend on the first Len-bar high or low of the oscillator }
If Osc=Highest(Osc,Len) and Trend=0 then Trend=1;
If Osc=Lowest(Osc,Len) and Trend=0 then Trend=-1;
{ Flip the trend when the oscillator retraces Trigger times the Len-bar extreme }
If Lowest(Osc,Len)<0 and Trend=-1 and Osc>-1*Trigger*Lowest(Osc,Len) then
Trend=1;
If Highest(Osc,Len)>0 and Trend=1 and Osc<-1*Trigger*Highest(Osc,Len) then
Trend=-1;
ElliottTrend=Trend;

The user function in Table 13.2 starts with the trend set to zero. We
initiated the trend based on which occurs first, the oscillator making a
“Len” bar high or making it low. If the trend is up, it remains up until the
Elliott Wave oscillator retraces the “Trigger” percent of the Len bar high
and that high was greater than 0. The inverse is also true if the current
trend is down. It will remain down until the market retraces the Trigger
percent of the Len bar low as long as the low was less than 0.
This trend indicator normally will change trend at the top of wave one
or when wave three takes out the top of one. For this reason, it is not a
stand-alone system; it gives up too much of its trading profit on each
trade before reversing. Even with this problem, it is still predictive and
is profitable as a stand-alone system on many markets. Let's now use
this Elliott Trend indicator to build a series of functions that can be used
to count the classic 3,4,5 wave sequence. The code for the functions that
count a five-wave sequence to the upside is shown in Table 13.3, stated
in TradeStation's EasyLanguage.
The code in Table 13.3 has five inputs. The first is the data series we
are applying the function to. For example, we could count the wave
patterns on both an intraday and a daily time frame by simply calling this
function twice, using different data series. Next, not wanting to call these
functions many times because they are computationally expensive, we
pass in both the Elliott Wave oscillator and the Elliott Trend indicator.
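For readers who want to experiment outside TradeStation, the oscillator and trend logic of Table 13.2 can be approximated in Python. This is a sketch under stated assumptions rather than a port: the function names are hypothetical, the oscillator is returned only from the first bar that has a full 35 bars of history (TradeStation handles the warm-up period differently), and the lookback window in the trend function is simply shortened near the start of the series.

```python
def elliott_wave_osc(highs, lows, fast=5, slow=35):
    """Fast minus slow simple moving average of the bar midpoint (H+L)/2.

    The result starts at bar slow-1, so it is slow-1 items shorter than the input.
    """
    price = [(h + l) / 2 for h, l in zip(highs, lows)]
    osc = []
    for i in range(slow - 1, len(price)):
        fast_avg = sum(price[i - fast + 1:i + 1]) / fast
        slow_avg = sum(price[i - slow + 1:i + 1]) / slow
        osc.append(fast_avg - slow_avg)
    return osc

def elliott_trend(osc, length, trigger):
    """Trend series of 1 (up), -1 (down), or 0 (not yet set) for each bar."""
    trend, out = 0, []
    for i in range(len(osc)):
        window = osc[max(0, i - length + 1):i + 1]  # includes current bar
        hi, lo = max(window), min(window)
        # Start the trend on the first window high or low.
        if osc[i] == hi and trend == 0:
            trend = 1
        if osc[i] == lo and trend == 0:
            trend = -1
        # Flip when the oscillator retraces `trigger` times the extreme.
        if lo < 0 and trend == -1 and osc[i] > -trigger * lo:
            trend = 1
        if hi > 0 and trend == 1 and osc[i] < -trigger * hi:
            trend = -1
        out.append(trend)
    return out

# A flat market gives a zero oscillator; 40 bars leave 6 usable values.
flat = elliott_wave_osc([10.0] * 40, [10.0] * 40)
# A made-up oscillator path that rises, collapses, and recovers.
trend = elliott_trend([1, 2, 3, 2, -1, -3, -2, 1, 3], length=3, trigger=0.7)
```

The four tests are kept as separate sequential checks, in the same order as the EasyLanguage statements, so that a trend started on a bar can still be flipped on that same bar, as in the original.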
TABLE 13.3 SIMPLE ELLIOTT WAVE COUNTER FOR 3,4,5 UP.

Copyright © 1996 Ruggiero Associates. This code to count five waves up is only
for personal use and is not to be used to create any commercial product.

Inputs: DataSet(Numeric),Osc(NumericSeries),ET(NumericSeries),Len(Numeric),
Trig(Numeric);
Vars: Price(0),Wave(0),HiOsc(-999),HiOsc2(-999),HiPrice(-999),HiPrice2(-999);
Price=(High of Data(DataSet)+Low of Data(DataSet))/2;
{ Is current wave sequence up or down }
{ When we change from down to up label it a wave 3 }
{ and save current high osc and price }
If ET=1 and ET[1]=-1 and Osc>0 then begin
HiOsc=Osc;
HiPrice=Price;
Wave=3;
end;
{ If wave 3 and oscillator makes new high save it }
if Wave=3 and HiOsc<Osc then HiOsc=Osc;
{ If wave 3 and price makes new high save it }
if Wave=3 and HiPrice<Price then HiPrice=Price;
{ If you're in a wave 3 and the oscillator pulls back to zero
label it a wave 4 }
if Wave=3 and Osc<=0 and ET=1 then Wave=4;
{ If you're in a wave 4 and the oscillator pulls back above zero and prices
break out then label it a wave 5 and set up second set of high oscillator and
price }
if Wave=4 and Price=Highest(Price,5) and Osc>=0 then begin
Wave=5;
HiOsc2=Osc;
HiPrice2=Price;
end;
if Wave=5 and HiOsc2<Osc then HiOsc2=Osc;
if Wave=5 and HiPrice2<Price then HiPrice2=Price;
{ If oscillator sets a new high relabel this a wave 3 and reset wave 5 levels }
If HiOsc2>HiOsc and HiPrice2>HiPrice and Wave=5 and ET=1 then begin
Wave=3;
HiOsc=HiOsc2;
HiPrice=HiPrice2;
HiOsc2=-999;
HiPrice2=-999;
end;
{ If the trend changes in a wave 5 label this a -3 or a wave three down }
{ and reset all variables }
If ET=-1 then begin
Wave=-3;
HiOsc=-999;
HiPrice=-999;
HiOsc2=-999;
HiPrice2=-999;
end;
Wave345Up=Wave;

Our final two arguments are (1) the Len used for the window to identify
the wave counts and (2) the retracement level required to change the trend.
Let's now use these functions to create the Elliott Wave counter. This
code is shown in Table 13.4.

TABLE 13.4 SIMPLE ELLIOTT WAVE COUNTER USER FUNCTION
FOR THE UP WAVE SEQUENCE.

Copyright © 1996 Ruggiero Associates. The code for this Elliott Wave Counter is
only for personal use and is not to be used to create any commercial product.

Inputs: DataSet(Numeric),Len(Numeric),Trig(Numeric);
Vars: WavCount(0);
WavCount=Wave345Up(DataSet,ElliottWaveOsc(DataSet),ElliottTrend(DataSet,
Len,Trig),Len,Trig);
Elliott345=WavCount;

The code in Table 13.3 sets the wave value to a three when the trend
changes from -1 to 1. After that, it starts saving both the highest
oscillator and price values. It continues to call this a wave three until the
oscillator retraces to zero and the trend is still up. At this point, it will
label it a wave four. If we are currently in a wave four and the
oscillator pulls above zero and the (High + Low)/2 makes a five-day high, we
label this a wave five. We then set a second set of peak oscillator
values. If the second peak is greater than the first, we change the count
back to a wave three. Otherwise, it stays a wave five until the trend
indicator flips to -1.
Let's see how to use our functions to develop a simple Elliott Wave
trading system. The code for this system is shown in Table 13.5.

TABLE 13.5 CODE FOR ELLIOTT WAVE
COUNTER TRADING SYSTEM.

Inputs: Len(80),Trig(.7);
Vars: WavCount(0),Osc(0);
Osc=ElliottWaveOsc(1);
WavCount=Elliott345(1,Len,Trig);
{ Enter long when the count first becomes a wave 3 }
If WavCount=3 and WavCount[1]<=0 then buy at open;
{ Reenter when wave 4 becomes wave 5 or wave 5 is relabeled wave 3 }
If WavCount=5 and WavCount[1]=4 then buy at open;
If WavCount=3 and WavCount[1]=5 then buy at open;
{ Exit when the oscillator falls below zero }
If Osc<0 then exitlong at open;

Our Elliott Wave system generates a buy signal when the wave count
changes to a wave three. We reenter a long position when we move from
a wave four to a wave five. Finally, we reenter a long position if the wave
count changes from wave five back to wave three. Our exit is the same
for all three entries, when the Elliott Wave oscillator retraces to zero.
The entries of this system are relatively good, but if this were a real
trading system, we would have developed better exits. We tested this
system on the D-Mark, using 67/99 type continuous contracts in the period
from 2/13/75 to 3/18/96, and it performed well. Because our goal is to
evaluate Elliott Wave analysis as a trading tool, we optimized the system
across the complete data set in order to see whether the system was
robust. We optimized across a large set of parameters (ranging from 20
to 180 for length, and from .5 to 1.0 for trig) and found that a broad range
of parameters performed very well. The set of parameters using a length
of 20 and a trigger of .66 produced the results shown in Table 13.6 for the
period from 2/13/75 to 3/18/96 (with $50.00 deducted for slippage and
commissions).

TABLE 13.6 ELLIOTT WAVE COUNTER
SYSTEM RESULTS D-MARK.

Net profit            $35,350.00
Trades                57
Percent profitable    51%
Average trade         $690.35
Drawdown              -$10,237.50
Profit factor         2.10

This was not just an isolated case: over 80 percent of the cases we
tested in the above range produced profitable results.
After developing these parameters on the D-Mark, we tested them on
the Yen. Once again, we used type 67/99 continuous contracts supplied
by Genesis Financial Data Services. We used data for the period from
8/U76 to 3/18/96. The amazing results (with $50.00 deducted for slippage
and commissions) are shown in Table 13.7.

TABLE 13.7 THE ELLIOTT WAVE COUNTER
SYSTEM RESULTS ON THE YEN.

Net profit            $89,800.00
Trades                51
Percent profitable    51%
Average trade         $1,760.70
Drawdown              -$5,975.00
Profit factor         4.16

This same set of parameters did not work only on the D-Mark and Yen;
it also worked on crude oil and coffee as well as many other commodities.
These results show that Elliott Wave analysis is a powerful tool for use in
developing trading systems. The work done in this chapter is only a
starting point for developing mechanical trading systems based on Elliott
Waves. Our wave counter needs logic added to detect the wave one
and wave two sequence as well as adding ratio analysis of the length of
each wave. Our system does not detect the top of wave three and wave
five. If we can add that feature to the existing code and do even a fair job
of detecting the end of both wave three and wave five, we may
significantly improve our performance. We could also trade the short side of
the market. Even with these issues, our basic mechanical Elliott Wave
194 Buildine the Wave 19s
Making Subjective Methods Mechanical



system shows that Elliott Wave analysis does have predictive value and
can be used to develop filter trading systems that work when applied to
various commodities.
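The three entry rules and the single exit rule of Table 13.5 can be sketched outside TradeStation as follows. This is a minimal Python illustration, not the EasyLanguage system itself: it assumes the wave counts and oscillator values have already been computed by some other routine, the name `wave_signals` is hypothetical, and treating the missing prior count on the first bar as 0 is an assumption.

```python
from typing import List, Tuple


def wave_signals(wave_count: List[int], osc: List[float]) -> List[Tuple[bool, bool]]:
    """For each bar, return (enter_long, exit_long) flags.

    wave_count: wave count per bar (3, 4, 5, or <= 0 when no count is active).
    osc: Elliott Wave oscillator value per bar.
    Both inputs are assumed to be precomputed elsewhere.
    """
    signals = []
    for i in range(len(wave_count)):
        # Assumption: with no prior bar, the previous count is taken as 0.
        prev = wave_count[i - 1] if i > 0 else 0
        cur = wave_count[i]
        enter_long = (
            (cur == 3 and prev <= 0)      # count changes to wave three
            or (cur == 5 and prev == 4)   # wave four moves to wave five
            or (cur == 3 and prev == 5)   # wave five changes back to wave three
        )
        exit_long = osc[i] < 0            # oscillator has fallen through zero
        signals.append((enter_long, exit_long))
    return signals


counts = [0, 3, 3, 4, 5, 3, 3]
osc = [1.0, 2.0, 1.5, 0.5, 1.2, 2.5, -0.3]
flags = wave_signals(counts, osc)
```

The sketch returns both flags per bar rather than deciding which takes priority, because order handling (here, orders placed at the next open) belongs to the trading platform.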


EXAMPLES OF ELLIOTT WAVE SEQUENCES USING
ADVANCED GET

We will discuss some examples using charts and Elliott Wave counts
generated from Tom Joseph's Advanced GET software. Elliott Wave
analysis can be applied to both the futures markets and individual stocks.
In the first example (Figure 13.2), the March 1997 British Pound is
shown. From mid-September 1996 through November 1996, the British
Pound traded in a very strong Wave Three rally. Then the market entered
a profit-taking stage followed by new highs into January 1997. However,
the new high in prices failed to generate a new high in Tom Joseph's
Elliott Oscillator, indicating the end of a Five Wave sequence. Once a
Five Wave sequence is completed, the market changes its trend.

FIGURE 13.2 British Pound March 1997.

The daily chart of Boise Cascade in Figure 13.3 shows the stock trading
in a Five Wave decline. The new lows in Wave Five do not generate a new
low in Tom Joseph's Elliott Oscillator, indicating the end of a Five Wave
sequence. Once a Five Wave sequence is completed, the market changes
its trend.

FIGURE 13.3 Boise Cascade--Daily Stock Chart.

FIGURE 13.4 British Pound with Profit Taking Index (PTI).
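The signal used in both examples is a price extreme that the oscillator fails to confirm. The following Python sketch shows one generic way to test for the bullish-side case (new price high, no new oscillator high). It is not Advanced GET's logic, which is not described here; the function name and the `lookback` window are assumptions made for illustration.

```python
from typing import Sequence


def fails_to_confirm_high(prices: Sequence[float], osc: Sequence[float], lookback: int) -> bool:
    """True when the latest price exceeds every price in the prior `lookback`
    bars while the latest oscillator value does not exceed the oscillator's
    highest value over those same bars."""
    if len(prices) != len(osc) or len(prices) <= lookback:
        raise ValueError("need equal-length series longer than lookback")
    prior_prices = prices[-lookback - 1:-1]   # the `lookback` bars before the last
    prior_osc = osc[-lookback - 1:-1]
    new_price_high = prices[-1] > max(prior_prices)
    osc_confirms = osc[-1] > max(prior_osc)
    return new_price_high and not osc_confirms


prices = [100.0, 104.0, 110.0, 107.0, 112.0]
weak_osc = [1.0, 3.0, 5.0, 2.0, 3.5]     # oscillator peak was on the earlier high
strong_osc = [1.0, 3.0, 5.0, 2.0, 6.0]   # oscillator also makes a new high
divergence = fails_to_confirm_high(prices, weak_osc, 4)
confirmed = fails_to_confirm_high(prices, strong_osc, 4)
```

The mirror-image test for a Five Wave decline (new price low, no new oscillator low) would swap `max` for `min` and reverse the comparisons.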
Using the Profit-Taking Index (PTI)

When a Wave Four is complete, the major question confronting the trader
is whether the market will make a new high in Wave Five. Tom Joseph and
his staff at Trading Techniques Inc. have devised a model that predicts
the potential for a new high. This model is called the Profit Taking Index
(PTI). The PTI is calculated by measuring the area under Wave Three
and comparing it with the area under Wave Four. If the PTI is greater
than 35, a new high is expected (Figure 13.4).
If the PTI is less than 35, the market usually fails to make a new high,
resulting in a failed Fifth Wave or Double Top (Figure 13.5).

FIGURE 13.5 Weekly Boise Cascade Stock. Double Top.

Trading Techniques Inc. provides free information on mechanically counting Elliott
Waves and other studies. They can be contacted at (330) 645.0077, or the information
can be downloaded from their web site www.tradingtech.com.


14
Mechanically
Identifying and Testing
Candlestick Patterns

Candlestick chart analysis is a subjective form of analysis. The analyst
must first identify the patterns and then judge their significance. For
example, a white hammer pattern is more bullish after a major downtrend.
Several software vendors have developed software to automatically
identify candlestick patterns. Some of these products also generate
mechanical trading signals. Generally, these packages do well at
identifying the patterns, but their mechanical trading signals produce
mixed results.
In this chapter, we will use fuzzy logic to identify several candlestick
patterns using TradeStation. We will also show you how to integrate other
forms of technical analysis with candlesticks to develop mechanical
trading signals.


HOW FUZZY LOGIC JUMPS OVER THE CANDLESTICK

Let's now see how fuzzy logic can be used to analyze candlestick charts.
In our height example, we saw that the first step in developing a fuzzy
logic application is to list the variables involved and then develop a list of


attributes for each variable. For a single candlestick, the attributes are as
shown in Table 14.1.

TABLE 14.1 A CANDLE'S ATTRIBUTES.

Color                White or black
Shape                Long, small, or about equal
Upper Shadow Size    Long, small, or about none
Lower Shadow Size    Long, small, or about none

Not all variables require fuzzy logic. In our list, color does not,
because color is simply the sign of the close minus the open. We will now
develop a membership function for each of these variables. The "shape"
candlestick variable is represented graphically in Figure 14.1.

FIGURE 14.1 A fuzzy logic function that identifies tall candlesticks.
(Vertical axis: membership in the long function, from .00 to 1.00.
Horizontal axis: candle size measured against the average, from the zero
trigger through the average to the one trigger. "Tall" is two times the
average height.)


FUZZY PRIMITIVES FOR CANDLESTICKS

A single candlestick has the following characteristics: color, shape, upper
shadow size, and lower shadow size. Not all characteristics require fuzzy
logic. As noted above, color does not require fuzzy logic. Let's look at an
example of a fuzzy logic function that identifies a candle with a long
shape. The code for this function in TradeStation's EasyLanguage is
shown in Table 14.2.

TABLE 14.2 CODE FOR FUZZY LONG FUNCTION.

Inputs: OPrice(NUMERICSERIES),CPrice(NUMERICSERIES),LBack(NUMERIC),
OneCof(NUMERIC),ZeroCof(NUMERIC);
Vars: PrevLong(0),CRange(0),AveRange(0),ZTrig(0),OneTrig(0),Tall(0),Scale(0);
{ Calculate the range for the candle }
CRange=absvalue(OPrice-CPrice);
{ Calculate what level represents a 0 }
ZTrig=Average(CRange,LBack)*ZeroCof;
{ Calculate what level represents a 1 }
OneTrig=Average(CRange,LBack)*OneCof;
{ Calculate the difference between the zero and one level }
Scale=OneTrig-ZTrig;
{ If One Level and Zero Level are the same set to 99.99 so it can be a large bar }
if Scale=0 then Scale=99.99;
{ Calculate the fuzzy membership to tall }
Tall=maxlist(0,minlist(1,(CRange-OneTrig)/(Scale)));
{ If previous bar is big relax requirements }
if Tall[1]=1 and CRange[1]-ZTrig<>0 then Tall=maxlist(0,minlist(1,(CRange-
CRange[1])/(CRange[1]-ZTrig)));
FuzzyLong=Tall;

The function in Table 14.2 will return a 1 when the current candle size
is greater than or equal to OneCof times the average candle size over the
last lookback days, and a zero when it is less than ZeroCof times the
average size. When the candle size is between these range limits, it returns
a scaled value between 0 and 1. This function can also handle a case
where the previous candle was very long and the next candle should also
be long, but, using the rule based on the average size, the candle would
not have been identified correctly. We also handle divide-by-zero
conditions that occur when the open, high, low, and close are all the same.
To identify most of the common candlestick patterns, we need
functions that can classify all of the attributes associated with a candlestick.
The shape of a candlestick can be long, small, or doji. The upper and
lower wick can be large, small, or none. We also need to be able to
identify whether there are gaps or whether one candle engulfs another. After
we have developed functions to identify these attributes, we can start to
identify more complex patterns.


DEVELOPING A CANDLESTICK RECOGNITION
UTILITY STEP-BY-STEP

The first step in developing a candlestick recognition tool is to decide
what patterns we want to identify. In this chapter, we will identify the
following patterns: dark cloud, bullish engulf, and evening star. Next, we
need to develop a profile of each of these patterns. The plates for the
patterns have been illustrated by Steve Nison in his first book, Japanese
Candlestick Charting Techniques, published by John Wiley & Sons, Inc., 1990.
Let's now describe each of these three patterns, beginning with the
dark cloud cover. The dark cloud cover consists of two candlesticks:
(1) a white candle with a significant body and (2) a black candle that
opens above the high of the white candle but closes below the midpoint
of the white candle. This is a bearish pattern in an uptrend.
The bullish engulfing pattern also consists of two candlesticks. The
first is a black candle. The second is a white candle that engulfs the black
candle. This is a bullish sign in a downtrend.
Our final pattern is an evening star. This pattern is a little more
complex. It consists of three candles: (1) a significant white candle, (2) a
small candle of either color, and (3) a black candle. The middle candle
gaps above both the white and black candlesticks. The black candle opens
higher than the close of the white but then closes below the midpoint of
the white. This is a bearish pattern in an uptrend.
Let's now see how we can translate the candlestick definitions into
TradeStation's EasyLanguage code.
The primitive functions that we will use in identifying the dark cloud,
bullish engulf, and evening star are shown in Table 14.3.

TABLE 14.3 CANDLESTICK PRIMITIVE FUNCTIONS.

Candlestick Color
CandleColor(Open,Close)
Candlestick Shape
FuzzyLong(Open,Close,LookBack,OneTrigger,ZeroTrigger)
FuzzySmall(Open,Close,LookBack,OneTrigger,ZeroTrigger)
Miscellaneous Functions
EnGulfing(Open,Close,RefBar)
WindowDown(Open,High,Low,Close,LookBack)
WindowUp(Open,High,Low,Close,LookBack)

Let's now discuss some of the inputs to these functions, beginning with
the parameter LookBack. This is the period used to calculate a moving
average of the body size of each candlestick. The moving average is used
as a reference point to compare how small or large the current candle is,
relative to recent candles.
The OneTrigger is the percentage of the average candle size that will
cause the function to output a 1, and the ZeroTrigger is the percentage of
the average candle size for outputting a zero. The RefBar parameter is
used by the engulfing function to reference which candlestick the current
candlestick needs to engulf.
Another important point when using these functions is that the
OneTrigger is smaller than the ZeroTrigger for functions that identify
small or doji candles. When using the long candle size function, the
OneTrigger is larger than the ZeroTrigger.
The engulfing function returns a 1 if the current candle engulfs the
RefBar candle. The window-up and window-down functions return a
number greater than zero when there is a gap in the proper direction. The
exact return value from these functions is based on the size of the gap
relative to the average candle size over the past LookBack days.
Let's now see how to combine these functions to identify the three
candlestick formations discussed earlier in the chapter. We will start with
the dark cloud.
The dark cloud is a bearish formation. Many times, it signals a top
in the market or at least the end of a trend and the start of a period of
consolidation. The EasyLanguage code for the dark cloud is shown in
Table 14.4.
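The OneTrigger/ZeroTrigger convention described above (one trigger above the zero trigger for long candles, below it for small candles) can be sketched with a single ramp function. This Python sketch follows the prose description of the membership function: 0 at the zero trigger, 1 at the one trigger, scaled in between. It omits the previous-bar relaxation, takes the average body size as an argument rather than computing it, and all names (`candle_color`, `ramp_membership`) are hypothetical. Returning 0 for a candle whose open equals its close is an assumption.

```python
def candle_color(open_price: float, close_price: float) -> int:
    """Color is the sign of close minus open: 1 = white, -1 = black.
    Assumption: 0 is returned when open equals close."""
    diff = close_price - open_price
    return (diff > 0) - (diff < 0)


def ramp_membership(body: float, avg_body: float, one_cof: float, zero_cof: float) -> float:
    """Fuzzy membership that is 1 at avg_body*one_cof and 0 at avg_body*zero_cof.

    one_cof > zero_cof gives a "long" membership; one_cof < zero_cof gives
    a "small" membership, because the ramp then slopes the other way.
    """
    one_trig = avg_body * one_cof
    zero_trig = avg_body * zero_cof
    scale = one_trig - zero_trig
    if scale == 0:
        # Same guard as the listing: avoid dividing by zero when both
        # trigger levels coincide (for example, when avg_body is 0).
        scale = 99.99
    return max(0.0, min(1.0, (body - zero_trig) / scale))


avg = 10.0
long_half = ramp_membership(15.0, avg, one_cof=2.0, zero_cof=1.0)    # halfway up the long ramp
long_full = ramp_membership(25.0, avg, one_cof=2.0, zero_cof=1.0)    # beyond the one trigger
small_full = ramp_membership(2.0, avg, one_cof=0.3, zero_cof=1.0)    # body under 30% of average
small_none = ramp_membership(12.0, avg, one_cof=0.3, zero_cof=1.0)   # body above the average
```

Using one ramp for both classes keeps the "inverse" memberships used later simple: one minus the small membership measures how far a candle is from being small.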


TABLE 14.4 CODE FOR DARK CLOUD FORMATION.

Inputs: LookBack(Numeric),OneCof(Numeric),ZeroCof(Numeric);
Vars: Color(0),SBody(0);
Vars: FuzzyRange(0),Return(0);
Color=CandleColor(O,C);
{ Fuzzy Small has the following arguments
FuzzySmall(Lookback,OneCof,ZeroCof) }
{ We reversed OneCof and ZeroCof so that we can test for Not Small as input to
the dark cloud function }
SBody=FuzzySmall(O,C,LookBack,ZeroCof*.3,OneCof*1);
Return=0;
FuzzyRange=Close-(Open[1]+Close[1])/2;
if Color=-1 and Color[1]=1 and open>High[1] and FuzzyRange<0 then begin
Return=1-SBody[1];
end;
DarkCloud = Return;

Let's walk through the code in Table 14.4. First, we save the color of
each candlestick. Next, we save the membership of each candlestick to
the Fuzzy Small set. Notice that we inverted the OneCof and ZeroCof
arguments. (The dark cloud requires the first white candle to have a
significant body.) We did this by inverting the small membership function.
If we had used the long membership function, we would have missed
many dark clouds because the first candle was significant but not quite
long. Next, we calculate whether the second candlestick is black and falls
below the midpoint of the first candle that is white. The second candle
must also open above the high of the first.
If the candle qualifies as a dark cloud, we return the fuzzy inverse
membership of the first candle to the class Fuzzy Small as the value of
the fuzzy dark cloud.
How do we identify a bullish engulfing pattern? The EasyLanguage
code for this pattern is shown in Table 14.5.

TABLE 14.5 CODE FOR BULLISH ENGULF PATTERN.

Inputs: LookBack(Numeric),OneCof(Numeric),ZeroCof(Numeric);
Vars: Color(0),SBody(0),LBody(0);
Color=CandleColor(O,C);
SBody=FuzzySmall(O,C,LookBack,OneCof*.3,ZeroCof*1);
LBody=FuzzyLong(O,C,LookBack,OneCof*2,ZeroCof*1);
if EnGulfing(O,C,1)=1 and Color=1 and Color[1]=-1 then BullEngulf=
minlist(SBody[1],LBody)
else
BullEngulf=0;

When identifying a bullish engulf, the first thing we do is to evaluate
the color and size of each candle. If the current candle is white and
engulfs the first candle that is black, we have a possible bullish engulf. The
significance of the bullish engulf pattern is measured by using our fuzzy
logic functions for candle size. We take a fuzzy "AND" between the
membership of the previous candle in the Small class and the membership
of the current candle in the Large class. This value measures the
importance of the engulfing pattern. If the pattern does not qualify as a
bullish engulfing pattern, we return a 0.
Table 14.6 shows the code of an evening star. The code first tests the
color of each candle as well as its membership in the Small class. Next,
we test to see where the close of the current candle falls in the range of
the first candle in the formation.

TABLE 14.6 CODE FOR THE EVENING STAR PATTERN.

Inputs: LookBack(Numeric),OneCof(Numeric),ZeroCof(Numeric);
Vars: Color(0),SBody(0);
Vars: FuzzyRange(0),Return(0);
Color=CandleColor(O,C);
SBody=FuzzySmall(O,C,LookBack,OneCof*.3,ZeroCof*1);
Return=0;
FuzzyRange=Close-(Close[2]+Open[2])/2;
if Color=-1 and Color[2]=1 and WindowUp(O,H,L,C,1)[1]>0 and
open>open[1] and FuzzyRange<0 then begin
Return=minlist(SBody[1],1-SBody[2]);
end;
EveningStar=Return;
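The dark cloud and bullish engulf logic walked through for Tables 14.4 and 14.5 can also be sketched in Python. This is an illustration under stated assumptions rather than a port of the EasyLanguage functions: the fuzzy memberships are passed in as already-computed numbers, the `Candle` type and function names are hypothetical, and "engulfs" is assumed to mean that the current body spans the previous body, since the source does not list the engulfing function's code.

```python
from typing import NamedTuple


class Candle(NamedTuple):
    open: float
    high: float
    low: float
    close: float


def color(c: Candle) -> int:
    """1 for a white candle, -1 for a black candle, 0 if open equals close."""
    return (c.close > c.open) - (c.close < c.open)


def dark_cloud(prev: Candle, cur: Candle, prev_small: float) -> float:
    """White candle, then a black candle that opens above the white candle's
    high and closes below the midpoint of its body. The value returned is
    the inverse of the first candle's membership in the Small class."""
    midpoint = (prev.open + prev.close) / 2
    qualifies = (
        color(cur) == -1 and color(prev) == 1
        and cur.open > prev.high and cur.close < midpoint
    )
    return 1.0 - prev_small if qualifies else 0.0


def engulfs(prev: Candle, cur: Candle) -> bool:
    """Assumption: the current body covers the previous body."""
    return (max(cur.open, cur.close) >= max(prev.open, prev.close)
            and min(cur.open, cur.close) <= min(prev.open, prev.close))


def bullish_engulf(prev: Candle, cur: Candle, prev_small: float, cur_long: float) -> float:
    """Black candle engulfed by a white candle; the fuzzy AND (minimum) of
    'previous is small' and 'current is long' measures its significance."""
    if color(cur) == 1 and color(prev) == -1 and engulfs(prev, cur):
        return min(prev_small, cur_long)
    return 0.0


dc = dark_cloud(Candle(100, 106, 99, 105), Candle(107, 108, 101, 102), prev_small=0.25)
be = bullish_engulf(Candle(105, 106, 101, 102), Candle(101, 108, 100, 107),
                    prev_small=0.6, cur_long=0.9)
```

Taking the minimum as the fuzzy "AND" means a pattern is only as significant as its weakest component.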


For an evening star, we need a black current candle and a white
candle two candlesticks ago. Next, we need the second candle in the
formation to have gapped higher. Finally, the current candle must open higher
than the middle candle but must close at or below the middle of the first
candle. All of these requirements must be met in order for the formation
to qualify as an evening star. We then return the fuzzy "AND" of one
candle ago to the class Small and the "AND" of two candles ago to the
inverse of the Small class. If the formation does not qualify as an evening
star, the function returns a 0.
We can repeat this process to identify any candlestick patterns we
wish. Once we have written the code, we need to test it. To test the code
given here, we used the plates from Nison's book Japanese Candlestick
Charting Techniques, and tested our routines on the same charts. If you
are not identifying the patterns you want, you can adjust the LookBack
period as well as the scaling coefficients. In general, mechanical
identification will miss some patterns that can be detected by the human eye.
After you have developed routines to identify your candlestick patterns,
you can use them to develop or improve various trading systems.
How can we use candlesticks to develop mechanical trading
strategies? We will test use of the dark cloud cover pattern on Comex gold
during the period from S/1/86 to 12/26/95. We will go short on the next open
when we identify a dark cloud cover pattern and have a 10-day RSI
greater than 50. We will then exit at a five-day high.
Table 14.7 shows the code and results, without slippage and commissions.

TABLE 14.7 CODE AND RESULTS FOR
SIMPLE DARK CLOUD SYSTEM.

Vars: DC(0);
DC=DarkCloud(15,1,1);
If DC>.5 and RSI(close,10)>50 then sell at open;
exitshort at highest(high,5) stop;

Net profit       $150.00
Trades           13
Wins             4
Losses           9
Win%             31
Average trade    $11.54

These results are horrible and cannot even cover slippage and
commissions. Let's now see how combining the correlation analysis between
the CRB and gold can improve the performance of the dark cloud cover
pattern for trading Comex gold. In Chapter 10, we showed that gold
normally trends only when it is highly correlated to the CRB. Let's use this
information to develop two different exit rules for our dark cloud,
combined with a 10-day RSI pattern. We will now exit using a 5-day high
when the 50-day correlation between gold and the CRB is above .50. We
will use a limit order at the entry price (-2 x average(range,10)) when
the correlation is below .50. According to the theory, we use a trend type
exit when the market should trend, and a limit order when the market has
a low probability of trending. The code we are using for these rules, and
the results without slippage and commissions, are shown in Table 14.8.

TABLE 14.8 CODE AND RESULTS OF COMBINING
INTERMARKET ANALYSIS AND CANDLESTICKS.

Vars: DC(0),Correl(0),CRB(0),GC(0);
CRB=Close of Data2;
GC=Close;
Correl=RACorrel(CRB,GC,50);
DC=DarkCloud(15,1,1);
If DC>.5 and RSI(close,10)>50 then sell at open;
If Correl>.5 then exitshort at highest(high,5) stop;
If Correl<.5 then exitshort at entryprice-1*average(range,10)
limit;

Net profit       $5,350.00
Trades           12
Wins             9
Losses           3
Win%             75
Average trade    $445.83

The performance of the dark cloud improves dramatically simply by
using intermarket analysis to select an exit method. There are not enough
trades to prove whether this is a reliable trading method for gold. It is
used only as an example of how to use candlestick patterns to develop
mechanical trading systems.
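The idea behind Table 14.8, choosing the exit type from the gold/CRB correlation, can be sketched as a small Python function. This is only an illustration: the function name is hypothetical, the correlation is assumed to be computed elsewhere, and the multiple of the average range used for the limit price is left as a required parameter rather than fixed here.

```python
from typing import Optional, Sequence, Tuple


def choose_short_exit(
    correlation: float,
    recent_highs: Sequence[float],
    recent_ranges: Sequence[float],
    entry_price: float,
    range_multiple: float,
    threshold: float = 0.5,
) -> Optional[Tuple[str, float]]:
    """Pick the exit order for an open short position.

    Above the threshold the market is expected to trend, so a trend-type
    exit is used: a buy stop at the highest high of the recent bars.
    Below it, a profit target is used: a limit order below the entry price.
    Exactly at the threshold neither rule applies, as in the listing.
    """
    if correlation > threshold:
        return ("stop", max(recent_highs))
    if correlation < threshold:
        avg_range = sum(recent_ranges) / len(recent_ranges)
        return ("limit", entry_price - range_multiple * avg_range)
    return None


highs = [390.0, 392.0, 391.0, 389.0, 393.0]   # last 5 highs
ranges = [2.0, 4.0, 3.0, 3.0, 3.0]            # recent high-low ranges
trending_exit = choose_short_exit(0.7, highs, ranges, entry_price=388.0, range_multiple=1.0)
quiet_exit = choose_short_exit(0.2, highs, ranges, entry_price=388.0, range_multiple=1.0)
```

Passing the window contents in as sequences keeps the sketch independent of any particular data feed; the caller decides how many bars make up the high and range windows.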



Combining candlesticks with other Western methods and with
intermarket analysis is a hot area of research for developing mechanical
trading systems. The ability to backtest candlestick patterns will help answer
the question of how well any given candlestick pattern works in a given
market. By evaluating candlestick patterns objectively, at least we know
how well they worked in the past and how much heat we will take when
trading them.


Part Four

TRADING SYSTEM
DEVELOPMENT
AND TESTING
TABLE 15.1 STEPS IN DEVELOPING A SYSTEM.

1. Decide what market and time frame you want to trade.
2. Develop a premise that you will use to design your trading system.
3. Collect and organize the historical market data needed to develop your
   model into development, testing, and out-of-sample sets.
4. Based on the market you want to trade, and your premise, select trading
   methods that are predictive of that market and meet your own risk-reward
   criteria.
5. Design your entries and test them using simple exits.
6. Develop filters that improve your entry rules.
7. After you have developed your entries, design more advanced exit methods
   that will improve your system's performance.
8. When selecting the parameters that will be used in your trading system,
   base your selections not only on system performance but also on
   robustness.
9. After you have developed your rules and selected your parameters, test the
   system on your testing set to see how the system works on new data. If the
   system works well, continue to do more detailed testing. (This is covered in
   Chapter 16.)
10. Repeat steps 3 through 9 until you have a system that you want to test
    further.

system. After you have selected your market(s), it is very important to
decide what time frame you want to trade on and to have an idea of how
long you would like your average trade to last.
Selecting a time frame means deciding whether to use intraday, daily,
or weekly data. Your decision on a time frame should be based on both
how often you want to trade and how long each trade should last. When
you use a shorter time frame, trading frequency increases and length of
trades decreases.
Another little-discussed issue is that each time frame and each market
has its own traits. For example, on the intraday S&P500, the high and/or
low of the day is most likely to occur in the first or last 90 minutes of the
day.
When using daily data, you can have wide variation in trade length,
from a few days to a year, depending on what types of methods are used
to generate your signals. For example, many of the intermarket-based
methods for trading T-Bonds, shown in Chapter 10, have an average trade
length of 10 days, whereas a simple channel breakout system has an
average trade length of 50 to 80 days.
Other issues also have an effect on your choice: how much money you
have to trade, and your own risk-reward criteria. For example, if you only
have $10,000.00, you would not develop an S&P500 system that holds an
overnight position. A good choice would be T-Bonds.


DEVELOPING A PREMISE

The second and most important step in developing a trading system is to
develop a premise or theory about the market you have selected. There
are many rich sources for theories that are useful in developing systems
for various markets. Some of these sources are listed in Table 15.2. Many
of those listed were covered in earlier chapters of this book.

TABLE 15.2 PREMISES FOR TRADING SYSTEMS.

1. Intermarket analysis.
2. Sentiment indicators.
3. Market internals: for example, for the S&P500, we would use data such as
   breadth, arm index, the premium between the cash and futures, and so on.
4. Mechanical models of subjective methods.
5. Trend-based models.
6. Seasonality, day of week, month of year, and so on.
7. Models that analyze technical indicators or price-based patterns.


DEVELOPING DATA SETS

After you select the market and time frame you want to trade, you need
to collect and organize your historical data into three sets: (1) the
development set, which is used to develop your trading rules; (2) a test set,
where your data are used to test these rules; and (3) a blind or
out-of-sample set, which is used only after you have selected the final rules and
parameters for your system. In developing many systems over the years,
I have found that one of the important issues involved in collecting these


data is whether we should use individual contracts or continuous
contracts for the futures data we need to use. It is much easier to use
continuous contracts, at least for the development set, and the continuous
contracts should be back adjusted. Depending on your premise, you might
need to develop data sets not only for the market you are trading but also
for related markets; that is, if we were developing a T-Bond system using
intermarket analysis, we would want to also have a continuous contract for
the CRB futures.
The next important issue is how much data you need. To develop
reliable systems, you should have at least one bull and one bear market in
your data set. Ideally, you should have examples of bull and bear markets
in both your development set and your test set. I have found that having
at least 10 years' daily data is a good rule of thumb.


SELECTING METHODS FOR DEVELOPING
A TRADING SYSTEM

After you have developed your premise and selected the technologies that
can be used to support it, you must prove that your premise is valid and
that you actually can use it to predict the market you are trying to trade.
Another issue that you need to deal with is whether a system that is
based on your premise will fit your trading personality. This issue is often
overlooked but is very important. Let's suppose your premise is based
on the fact that the currency markets trend. This approach will not work
for you if you cannot accept losing at least half your trades or are unable
to handle watching a system give up half or more of its profit on a given
trade.
In another example, you might have a pattern-based premise that
produced a high winning percentage but traded only 10 times a year. Many
people, needing more action than this, would take other trades not based
on a system and would start losing money.
Let's now look at several different premises and how they are
implemented. This information, plus the pros and cons of each premise, is
shown in Table 15.3.

TABLE 15.3 TYPES OF TRADING METHODS
AND IMPLEMENTATION.

Premise:        Trend following
Implementation: Moving averages; channel breakout; consecutive closes
Pros/Cons:      All trend-following methods work badly in nontrending
                markets. The channel breakout and consecutive closes are
                the most robust implementations. You will win only
                30 percent to 50 percent of your trades.

Premise:        Countertrend methods
Implementation: Oscillator divergence and cycle-based methods
Pros/Cons:      Don't trade often. These offer higher winning percentages
                than trend-following methods, but they can suffer large
                losing trades.

Premise:        Cause and effect (intermarket and fundamental analysis)
Implementation: Technical analysis, comparing two or more data series
Pros/Cons:      These systems can work only on the markets they were
                developed to trade. They can suffer high drawdowns but
                yield a good winning percentage.

Premise:        Pattern and statistically based methods
Implementation: Simple rules of three or more conditions
Pros/Cons:      Don't trade often. These need to be tested to make sure
                they are not curve-fitted. They have a good winning
                percentage and drawdown if they are robust.

After you have selected your method, you need to test it. Let's
suppose your premise was that T-Bond prices can be predicted based on
inflation. As in Chapter 1, several different commodities can be used as
measures of inflation; for example, the CRB index, copper, and gold
can all be used to predict T-Bonds. Next, you need to test your premise
that inflation can be used to predict T-Bonds. When inflation is rising
and T-Bonds are also rising, you sell T-Bonds. If inflation is falling and
so are T-Bonds, you buy T-Bonds. You can then test this simple
divergence premise using either the price momentum or prices relative to a
moving average. Once you have proven that your premise is predictive,
you can use a simple reversal system and start to develop your entries
and exits.
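The divergence premise just described can be sketched in Python using prices relative to a moving average. This is a minimal illustration, not a tested trading model: the function names, the moving-average lengths, and the use of a single inflation proxy series (such as the CRB index) are assumptions made for the example.

```python
from typing import Optional, Sequence


def sma(values: Sequence[float], length: int) -> float:
    """Simple moving average of the last `length` values."""
    return sum(values[-length:]) / length


def bond_divergence_signal(
    bond_prices: Sequence[float],
    inflation_prices: Sequence[float],
    bond_len: int,
    inflation_len: int,
) -> Optional[str]:
    """Return 'sell', 'buy', or None for T-Bonds on the latest bar.

    Rising = latest price above its moving average; falling = below it.
    Inflation rising with bonds rising -> sell bonds.
    Inflation falling with bonds falling -> buy bonds.
    """
    if len(bond_prices) < bond_len or len(inflation_prices) < inflation_len:
        raise ValueError("not enough data for the requested lengths")
    bond_avg = sma(bond_prices, bond_len)
    infl_avg = sma(inflation_prices, inflation_len)
    bonds_rising = bond_prices[-1] > bond_avg
    bonds_falling = bond_prices[-1] < bond_avg
    infl_rising = inflation_prices[-1] > infl_avg
    infl_falling = inflation_prices[-1] < infl_avg
    if infl_rising and bonds_rising:
        return "sell"
    if infl_falling and bonds_falling:
        return "buy"
    return None


rising = [100.0, 101.0, 102.0, 103.0, 104.0]
falling = [104.0, 103.0, 102.0, 101.0, 100.0]
sell_case = bond_divergence_signal(rising, rising, 5, 5)
buy_case = bond_divergence_signal(falling, falling, 5, 5)
no_signal = bond_divergence_signal(rising, falling, 5, 5)
```

Used as a reversal system, the last non-empty signal would simply be held until the opposite signal appears, which makes it easy to check whether the premise is predictive before any entry or exit refinements are added.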


DESIGNING ENTRIES

The simplest type of system is the classic reversal, which has been shown
several times in this book. For example, the channel breakout system
shown in Chapter 4 is a reversal-type system. This type of system can be
either long or short, a feature that creates problems because sometimes
a reversal-type system can produce very large losing trades.
Having discussed the simplest type of system entry, let's now examine
the entries themselves. There are two major types of entries: (1) simple
and (2) complex. Simple entries signal a trade when a condition is true.
Complex entries need an event to be true, and another "trigger" event
must occur to actually produce the entry. A simple entry will enter a trade
on the next open or on today's close, if a given event is true. For
example, the following rule is a simple entry:

If today = Monday and T-Bonds > T-Bonds[5], then buy S&P500 at
open.

The rule's conditional part, before the "then," can be as complex as
needed, but the actual order must always occur when the rule is true.
Complex entries use a rule combined with a trigger, for example:

If today = Monday and T-Bonds > T-Bonds[5], then buy S&P500 at
open + .3 x range stop.

This is a complex entry rule because we do not buy only when the rule
is true. We require the trigger event (being 30 percent of yesterday's
range above the open) to occur, or we will not enter the trade.
For both types of entries, the part of the rule before the "then" states
the events that give a statistical edge. In the above examples, Mondays in
which T-Bonds are in an uptrend have a statistically significant upward
bias. Triggers are really filters that increase the statistical edge. Suppose,
for day-trading the S&P500, we use a trigger that buys a breakout of 30
percent of yesterday's range above the open. This works because once a
market moves 30 percent above or below the open in a given direction, the
chance of a large-range day counter to that move drops significantly. This
is important because the biggest problem a trading system can have is
very large losing trades. They increase the drawdown and can make the
system untradable.
The top traders in the world use these types of complex entries; for
example, many of Larry Williams's patterns are based on buying or
selling on a stop based on the open, plus or minus a percentage of
yesterday's range when a condition is true.
Now that you have a basic understanding of developing entries, you
need to learn how to test them. When testing entries, you should use
simple exits. Among the simple exits I use are: holding a position for N bars,
using target profits, and exiting on first profitable opening. Another test
of entries is to use a simple exit and then lag when you actually get into
the trade. I normally test lags of 1 to 5 bars. The less a small lag
affects the results, the more robust the entry. Another thing you can learn
is that, sometimes, when using intermarket or fundamental analysis,
lagging your entries by a few bars can actually help performance. Testing
your entries using simple exits will help you not only to develop better
entries but also to test them for robustness.
When you have found entry methods that work well, see whether there
are any patterns that produce either substandard performance or superior
performance. These patterns can be used as filters for your entry rules.


DEVELOPING FILTERS FOR ENTRY RULES

Developing filters for entry rules is a very important step in the system
development process. Filters are normally discovered during the process
of testing entries; for example, you might find out that your entry rules
do not work well during September and October. You can use this
information to filter out trades in those months.
Another popular type of filter is a trend detection indicator, like ADX,
for filtering a trend-following system. The goal in using these types of
filters is to filter out trades that have lower expectations than the overall
trades produced by a given pattern. For example, our correlation filter in
Chapter 8 was applied to our rule of buying on Monday when T-Bonds
were above their 26-day moving average. It succeeded in filtering out
about 60 trades that produced only $6.00 a trade. With the average trade
over $200.00, filtering out these trades greatly improved the results of this