
elements and connections.

FIGURE 11.2 A simple three-layer neural network.

why each neural network product has its own proprietary version of a backpropagation-like algorithm.

When you develop a solution to a problem using neural networks, you must preprocess your data before showing it to the neural network. Preprocessing is a method of applying to the data transforms that make the relationships more obvious to the neural network. An example of this process would be using the difference between historical prices and the moving average over the past 30 days. The goal is to allow the neural network to easily see the relationships that a human expert would see when solving the problem.

We will be discussing preprocessing and how to use neural networks as part of market timing systems in the next few chapters.

Let's now discuss how you can start using neural networks successfully. The first topic is the place for neural networks in developing market timing solutions. The second is the methodology required to use neural networks successfully in these applications.

Neural networks are not magic. They should be viewed as a tool for developing a new class of powerful leading indicators that can integrate many different forms of analysis. Neural networks work best when used as part of a larger solution.

A neural network can be used to predict an indicator, such as a percent change, N bars into the future; for example, the percent change of the S&P 500 5 weeks into the future. The values predicted by the neural

C4.5

In the late 1940s, Claude Shannon developed a concept called "information theory," which allows us to measure the information content of data by determining the amount of confusion or "entropy" in the data. Information theory has allowed us to develop a class of learning-by-example algorithms that produce decision trees, which minimize entropy. One of these is C4.5. C4.5 and its predecessor, ID3, were developed by J. Ross Quinlan. Using a decision tree, they both classify objects based on a list of attributes. Decision trees can be expressed in the form of rules. Figure 11.3 shows an example of a decision tree.

FIGURE 11.3 A simple decision tree. (Nodes: INCOME, EXPENSES; branches: YES, NO.)
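To make the idea of entropy concrete, here is a minimal sketch of how it can be measured and used to rank attributes. The loan-style records (income, expenses, approve) are hypothetical and only echo the node names of Figure 11.3. The ranking uses plain information gain, the ID3-style criterion; C4.5 itself refines this with a gain ratio, which is not shown.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting on attribute."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


# Hypothetical records, invented for illustration only.
rows = [
    {"income": "high", "expenses": "low", "approve": "yes"},
    {"income": "high", "expenses": "low", "approve": "yes"},
    {"income": "high", "expenses": "low", "approve": "yes"},
    {"income": "low", "expenses": "low", "approve": "yes"},
    {"income": "low", "expenses": "high", "approve": "no"},
    {"income": "low", "expenses": "high", "approve": "no"},
]

gains = {a: information_gain(rows, a, "approve") for a in ("income", "expenses")}
best_attribute = max(gains, key=gains.get)
```

In this toy data the expenses attribute removes all of the confusion about the outcome, so it would be placed at the top of the tree.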

Statistically Based Market Prediction 155

An Overview of Advanced Technologies

154

Let's take a closer look at the binary version of C4.5. It creates a two-way branch at every split in the tree. Attributes are selected for splitting based on the information content of each attribute in terms of classifying the outcome groups. The attributes containing the most information are at the top of the tree. Information content decreases as we move toward the bottom level of the tree, through the "leaves."

For discrete attributes, the values are split between branches so as to maximize information content. Numerical data are broken into bins or ranges. These ranges are developed based on a numeric threshold derived to maximize the information content of the attribute. The output classes or objects must be represented by discrete variables. This requires numerical output classes to be manually split into ranges based on domain expertise.

Both C4.5 and ID3 handle noise by performing significance testing at each node. The attributes must both reduce entropy and pass a significance test in order to split a branch. C4.5 and ID3 use the Chi-square test for significance. Several parameters can be set to help C4.5 develop rule sets that will generalize well. The first parameter is the lower branch limit: the number of data records below which the induction process will terminate the branch and develop a leaf. A good starting point for this parameter is about 2 percent of the number of records in the database. After the decision tree is induced, a process called "pruning" can improve generalization. Noise causes excessive branches to form near the leaves of the tree. Pruning allows us to remove these branches and reduce the effects of noise. There are two types of automatic pruning: (1) error reduction and (2) statistical. Error reduction pruning is based on a complexity/accuracy criterion. Branches that fail the test are pruned.

The statistical pruning algorithm is particularly suited to situations where the noise in the data is caused by not having all the relevant attributes to classify the outcome and by the presence of irrelevant attributes. This is true of the financial markets as well as many other real-world problems.

The statistical pruning algorithm works backward from the leaves to remove all attribute branches of the induced tree that are not statistically significant (using the Chi-square test).

Another type of pruning is based on domain expertise. A domain expert could examine the rules generated and delete any of them that don't make sense in the real-world application. Rules can also be generalized by dropping conditions and then retesting them on unseen data. A domain expert could also specialize a rule by adding a condition to it. When developing machine-induced rules, you don't want to use all the rules that were generated. You only want to use "strong rules," those with enough supporting cases. For this reason, when using C4.5, you need a product that offers statistical information about each of the leaves on the tree. An example of a product that has this feature is XpertRule™ by Attar Software.

Rough Sets

Rough sets is a mathematical technique for working with the imperfections of real-world data. Rough sets theory, proposed by Pawlak in 1982, can be used to discover dependencies in data while ignoring superfluous data. The product of rough sets theory is a set of equivalence classifications that can handle inconsistent data. Rough sets methodology facilitates data analysis, pattern discovery, and predictive modeling in one step. It does not require additional testing of significance, cross-correlation of variables, or pruning of rules.

Let's now try to understand how rough sets work. We will assume that real-world information is given in the form of an information table. Table 11.1 is an example of an information table.

The rows in this table are called examples. Each example is composed of attributes and a decision variable. In Table 11.1, headache, muscle pain, and temperature are attributes, and flu is the decision variable. Rough sets theory uses this type of table to develop rules from data. Rough sets theory is an extension of standard set theory, in which the definition of the set is integrated with knowledge.

TABLE 11.1 ROUGH SETS EXAMPLE 1.

Row   Headache   Muscle Pain   Temperature   Flu
1     Yes        Yes           Normal        No
2     Yes        Yes           High          Yes
3     Yes        Yes           Very high     Yes
4     No         Yes           Normal        No
5     No         No            High          No
6     No         No            Very high     Yes
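As a rough illustration of what "ignoring superfluous data" can mean for an information table like Table 11.1, the sketch below groups the rows that share the same values on a chosen list of attributes. The dictionary layout and the function name elementary_sets are illustrative choices, not part of any rough sets product. If dropping an attribute leaves the grouping unchanged, that attribute added no information.

```python
from collections import defaultdict

# Table 11.1, keyed by row label; the layout is an illustrative choice.
TABLE_11_1 = {
    "R1": {"headache": "yes", "muscle_pain": "yes", "temperature": "normal", "flu": "no"},
    "R2": {"headache": "yes", "muscle_pain": "yes", "temperature": "high", "flu": "yes"},
    "R3": {"headache": "yes", "muscle_pain": "yes", "temperature": "very high", "flu": "yes"},
    "R4": {"headache": "no", "muscle_pain": "yes", "temperature": "normal", "flu": "no"},
    "R5": {"headache": "no", "muscle_pain": "no", "temperature": "high", "flu": "no"},
    "R6": {"headache": "no", "muscle_pain": "no", "temperature": "very high", "flu": "yes"},
}


def elementary_sets(table, attributes):
    """Group row labels that have identical values on the given attributes."""
    groups = defaultdict(set)
    for row, values in table.items():
        key = tuple(values[a] for a in attributes)
        groups[key].add(row)
    return {frozenset(rows) for rows in groups.values()}


all_three = elementary_sets(TABLE_11_1, ["headache", "muscle_pain", "temperature"])
without_muscle_pain = elementary_sets(TABLE_11_1, ["headache", "temperature"])

# True when muscle pain does not change how the rows are grouped.
muscle_pain_is_superfluous = all_three == without_muscle_pain
```

For Table 11.1 both groupings consist of six single-row sets, so muscle pain can be dropped without losing any ability to tell the rows apart.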


To make this explanation easier to understand, let's review some of the basics of set theory.

Subsets

Subsets are made up of only elements contained in a larger set. A superset is the inverse of that makeup. A proper subset is a subset that is not identical to the set it is being compared to. Let's now look at some examples of subsets and supersets.

Let A = {R1,R2,R3,R4,R5,R6,R7,R8,R9}

Let B = {R4,R5,R8}

In this example, B is a subset of A; this is expressed as B ⊂ A. We can also say that A is a superset of B: A ⊃ B.

The union of two sets forms a set containing all of the elements in both sets. For example, let's suppose we have two sets, A and B, as follows:

A = {R1,R2,R3,R4,R5,R6}
B = {R7,R8,R9}

The union of these sets yields the set {R1,R2,R3,R4,R5,R6,R7,R8,R9}. This is expressed as X = A ∪ B.

Let's now calculate the intersection of the following two sets:

A = {R1,R2,R3,R4,R5,R6,R7}
B = {R2,R4,R6,R8,R10}

These two sets intersect to form the set {R2,R4,R6}. The intersection is expressed as X = A ∩ B. Another set function is called the cardinality: the size of a given set.

With this overview of set theory, let's now use these basics to see how rough sets work.

The Basics of Rough Sets

The main concept behind rough sets is collections of rows that have the same values for one or more attributes. These sets, called elementary sets, are said to be indiscernible. In the example shown in Table 11.1, the attributes headache and muscle pain can be used to produce two different multi-row subsets. These are formed by the rows {R1,R2,R3} and {R5,R6}; row R4 forms a single-element set of its own. These subsets make up different elementary sets.

Any union of elementary sets is called a definable set. The concept of indiscernibility relation allows us to define redundant attributes easily.

Using Table 11.1, let's define two sets. The first is based on the attributes headache and temperature. The second will add muscle pain and use all three attributes. Using either set of attributes produces the same elementary sets. These are the sets formed by the single elements {R1},{R2},{R3},{R4},{R5},{R6}. Because these sets of attributes form the same sets, we can say that the attribute muscle pain is redundant: it did not change the definition of the sets by addition or deletion. Sets of attributes with no redundancies are called independents. Sets that contain the same elements as other sets but possess the fewest attributes are called reducts.

In Table 11.2, muscle pain has been removed because it did not add any information.

TABLE 11.2 ROUGH SETS EXAMPLE 2.

Row   Headache   Temperature   Flu
1     Yes        Normal        No
2     Yes        High          Yes
3     Yes        Very high     Yes
4     No         Normal        No
5     No         High          No
6     No         Very high     Yes

Let's now develop elementary sets based on the decisions in Table 11.2. An elementary set that is defined based on a decision variable, which in our case would be yes or no, is called a "concept." For Tables 11.1 and 11.2, these are {R1,R4,R5} and {R2,R3,R6}. These are defined by the sets in which the decision (flu) is no for {R1,R4,R5} and yes for {R2,R3,R6}.

What elementary sets can be formed from the attributes headache and temperature together? These are the single-element sets {R1},{R2},{R3},{R4},{R5},{R6}. Because each of these sets is a subset of one of


our decision-based elementary sets, we can use these relationships to produce the rules shown in Table 11.3.

TABLE 11.3 RULES FOR EXAMPLE 2.

(Temperature, Normal) -> (Flu, No)
(Headache, No) and (Temperature, High) -> (Flu, No)
(Headache, Yes) and (Temperature, High) -> (Flu, Yes)
(Temperature, Very High) -> (Flu, Yes)

Let's now add the two examples (R7,R8) shown in Table 11.4.

TABLE 11.4 ROUGH SETS EXAMPLE 3.

Row   Headache   Temperature   Flu
1     Yes        Normal        No
2     Yes        High          Yes
3     Yes        Very high     Yes
4     No         Normal        No
5     No         High          No
6     No         Very high     Yes
7     No         High          Yes
8     No         Very high     No

Having added these two examples, let's redefine our elementary sets of indiscernibility relationships for the attributes headache and temperature. These sets are: {R1},{R2},{R3},{R4},{R5,R7},{R6,R8}. Our elementary sets, based on our decision variable, are:

For Flu = No, {R1,R4,R5,R8}
For Flu = Yes, {R2,R3,R6,R7}

As shown in Table 11.4, the decision on flu does not depend on the attributes headache and temperature because neither of the elementary sets, {R5,R7} and {R6,R8}, is a subset of any concept. We say that Table 11.4 is inconsistent because the outcomes of {R5} and {R7} are conflicting. For the same attribute values, we have a different outcome.

The heart of rough sets theory is that it can deal with these types of inconsistencies. The method of dealing with them is simple. For each concept X, we define both the greatest definable and least definable sets. The greatest definable set contains all cases in which we have no conflicts and is called the lower approximation. The least definable sets are ones in which we may have conflicts. They are called the upper approximation.

As in our earlier example, when there are no conflicts, we simply create a series of sets of attributes and a set for each decision variable. If the attribute set is a subset of the decision set, we can translate that relationship into a rule. When there are conflicts, there is no relationship and we need to use a different method. We solve this problem by defining two different boundaries that are collections of sets. These are called the upper and lower approximations. The lower approximation consists of all of the objects that surely belong to the concept. The upper approximation consists of any object that possibly belongs to the concept.

Let's now see how this translates into set theory. Let I = an elementary set of attributes, and X = a concept. The lower approximation is defined as:

$$\text{Lower} = \{x \in U : I(x) \subseteq X\}$$

In words, this formula says that the lower approximation is all of the elementary sets that are subsets of the concept X. In fact, the U means the universe, which is a fancy way of saying all.

The upper approximation is defined as:

$$\text{Upper} = \{x \in U : I(x) \cap X \neq \emptyset\}$$

This simply means that the upper approximation is all of the elementary sets that produce a nonempty intersection with the concept X.

The boundary region is the difference between the upper and lower approximations.

Rough sets theory implements a concept called vagueness. In fact, this concept causes rough sets to sometimes be confused with fuzzy logic.

The rough sets membership function is defined as follows:

$$\mu_X(x) = \frac{|X \cap I(x)|}{|I(x)|}$$

This simple formula defines roughness as the cardinality of the intersection of (1) the subset that forms the concept and (2) an elementary set


of attributes, divided by the cardinality of the elementary set. As noted earlier, cardinality is just the number of elements. Let's see how this concept would work, using the set in Table 11.4 in which headache is no and temperature is high, {R5,R7}. If we want to compare the rough set membership of this set to the decision class Flu = Yes, we apply our formula to the attribute set and the Flu = Yes membership set {R2,R3,R6,R7}. The intersection of these two sets has just one element, R7. Our Headache = No and Temperature = High set has two elements, so the rough set membership for this elementary set of attributes and the Flu = Yes concept is 1/2 = 0.5.

This roughness calculation is used to determine the precision of rules produced by rough sets. For example, we can convert this into the following possible rule:

If Headache = No and Temperature = High, Flu = Yes (0.5).

Rough sets technology is very valuable in developing market timing systems. First, rough sets do not make any assumption about the distribution of the data. This is important because financial markets are not based on a Gaussian distribution. Second, this technology not only handles noise well, but also eliminates irrelevant factors.

GENETIC ALGORITHMS: AN OVERVIEW

Genetic algorithms were invented by John Holland during the mid-1970s to solve hard optimization problems. This method uses natural selection, "survival of the fittest," to solve optimization problems using computer software.

There are three main components to a genetic algorithm:

1. A way of describing the problem in terms of a genetic code, like a DNA chromosome.
2. A way to simulate evolution by creating offspring of the chromosomes, each being slightly different than its parents.
3. A method to evaluate the goodness of each of the offspring.

This process is shown in Table 11.5, which gives an overview of the steps involved in a genetic solution.

TABLE 11.5 STEPS USING GENETIC ALGORITHM.

1. Encode the problem into chromosomes.
2. Using the encoding, develop a fitness function for use in evaluating each chromosome's value in solving a given problem.
3. Initialize a population of chromosomes.
4. Evaluate each chromosome in the population.
5. Create new chromosomes by mating two chromosomes. (This is done by mutating and recombining two parents to form two children. We select parents randomly but biased by their fitness.)
6. Evaluate the new chromosome.
7. Delete a member of the population that is less fit than the new chromosome, and insert the new chromosome into the population.
8. If a stopping number of generations is reached, or time is up, then return the best chromosome(s); otherwise, go to step 4.

Genetic algorithms are a simple but powerful tool for finding the best combination of elements to make a good trading system or indicator. We can evolve rules to create artificial traders. The traders can then be used to select input parameters for neural networks, or to develop portfolio and asset class models or composite indexes. Composite indexes are a specially weighted group of assets that can be used to predict other assets or can be traded themselves as a group. Genetic algorithms are also useful for developing rules for integrating multiple system components and indicators. These are only a few of the possibilities. Let's now discuss each component and step of a genetic algorithm in more detail.

DEVELOPING THE CHROMOSOMES

Let's first review some of the biology-based terminology we will use. The initial step in solving a problem using genetic algorithms is to encode the problem into a string of numbers called a "chromosome." These numbers can be binary, real numbers, or discrete values. Each element on the chromosome is called a "gene." The value of each gene is called an "allele." The position of the gene on the chromosome is called the "locus." The string of numbers created must contain all of the encoded information needed to solve the problem.
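The steps of Table 11.5 can be sketched as a short loop. This is only an illustration under stated assumptions: the fitness function is a toy that counts the 1 genes in a binary chromosome (not a trading measure), each pass creates one child rather than two, and the names evolve, mate, and pick_parent are invented for the example.

```python
import random


def fitness(chromosome):
    # Step 2 (toy assumption): fitness is the number of genes whose allele is 1.
    return sum(chromosome)


def pick_parent(population, rng):
    # Parents are chosen at random but biased by fitness.
    # Adding 1 keeps zero-fitness members selectable.
    weights = [fitness(c) + 1 for c in population]
    return rng.choices(population, weights=weights, k=1)[0]


def mate(parent1, parent2, rng, mutation_rate=0.05):
    # Step 5: recombine two parents at a random cut, then mutate a few genes.
    cut = rng.randrange(1, len(parent1))
    child = parent1[:cut] + parent2[cut:]
    return [1 - gene if rng.random() < mutation_rate else gene for gene in child]


def evolve(n_genes=20, pop_size=50, generations=500, seed=0):
    rng = random.Random(seed)
    # Step 3: initialize a population of random chromosomes.
    population = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):  # Step 8: stop after a fixed number of passes.
        child = mate(pick_parent(population, rng), pick_parent(population, rng), rng)
        child_fitness = fitness(child)  # Step 6: evaluate the new chromosome.
        # Step 7: replace a randomly chosen member that is less fit than the child.
        weaker = [i for i, c in enumerate(population) if fitness(c) < child_fitness]
        if weaker:
            population[rng.choice(weaker)] = child
    return max(population, key=fitness)


best = evolve()
```

Because a member is only ever replaced by a fitter child, the best fitness in the population can never go down from one pass to the next.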


As an example of how we can translate a problem to a chromosome, let's suppose we would like to develop trading rules using genetic algorithms. We first need to develop a general form for our rules, for example:

If Indicator (Length) > Trigger and Indicator (Length)[1] < Trigger, then Place order to open and exit N days later.

Note: The variable items (Indicator, Length, Trigger, Place order, and N) are being encoded into chromosomes.

We could also have encoded into the chromosomes the > and < operators, as well as the conjunctive operator "AND" used in this rule template.

Let's see how we can encode a rule of this form. We can assign an integer number to each technical indicator we would like to use. For example: RSI = 1, SlowK = 2, and so on. Trigger would be a simple real number. Place order could be 1 for a buy and -1 for a sell. N is the number of days to hold the position.

Let's see how the following rule could be encoded.

If RSI(9) > 30 and RSI(9)[1] < 30, then Buy at open and Exit 5 days later.

The chromosome for the above rule would be: 1,9,30,1,9,30,1,5.

Having explained the encoding of a chromosome, I now discuss how to develop a fitness function.

EVALUATING FITNESS

A fitness function evaluates chromosomes for their ability or fitness for solving a given problem. Let's discuss what would be required to develop a fitness function for the chromosome in the above example. The first step is to pass the values of the chromosome's genes to a function that can use these values to evaluate the rule represented by the chromosome. We will evaluate this rule for each record in our training data. We will then collect statistics for the rule and evaluate those statistics using a formula that can return a single value representing how fit the chromosome is for solving the problem at hand. For example, we can collect Net Profit, Drawdown, and Winning percentage on each rule and then evaluate their fitness using a simple formula:

Fitness = (Net Profit/Drawdown)*Winning percentage

The goal of the genetic algorithm in this case would be to maximize this function.

INITIALIZING THE POPULATION

Next, we need to initialize the population by creating a number of chromosomes using random values for the allele of each gene. Each numerical value for the chromosomes is randomly selected using valid values for each gene. For example, gene one of our example chromosome would contain only integer values. We must also limit these values to integers that have been assigned to a given indicator. Most of the time, these populations contain at least 50 and sometimes hundreds of members.

THE EVOLUTION

Reproduction is the heart of the genetic algorithm. The reproductive process involves two major steps: (1) the selection of a pair of chromosomes to use as parents for the next set of children, and (2) the process of combining these genes into two children. Let's examine each of the steps in more detail.

The first issue in reproduction is parent selection. A popular method of parent selection is the roulette wheel method,* shown in Table 11.6.

We will now have the two parents produce children. Two major operations are involved in mating: (1) crossover and (2) mutation. (Mating is not the only way to produce members for the next generation. Some genetic algorithms will occasionally clone fit members to produce children. This is called "Elitism.")

*Using this method, we will select two parents who will mate and produce children.
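The encoding, the fitness formula, and the random initialization described above can be sketched together. The indicator codes (RSI = 1, SlowK = 2) and the eight-gene layout come from the example rule; the gene ranges used for random initialization (lengths 2 to 50, triggers 0 to 100, holding periods 1 to 20) are assumptions made only for this illustration, as are the function names.

```python
import random

# Integer codes assigned to indicators, as in the text's example.
INDICATORS = {1: "RSI", 2: "SlowK"}


def decode(chromosome):
    """Turn an eight-gene chromosome back into a readable trading rule."""
    ind1, len1, trig1, ind2, len2, trig2, order, hold = chromosome
    side = "Buy" if order == 1 else "Sell"
    return (f"If {INDICATORS[ind1]}({len1}) > {trig1} and "
            f"{INDICATORS[ind2]}({len2})[1] < {trig2}, "
            f"then {side} at open and Exit {hold} days later.")


def fitness(net_profit, drawdown, winning_pct):
    """Fitness = (Net Profit / Drawdown) * Winning percentage.

    Drawdown is assumed to be positive; a zero drawdown would need special handling.
    """
    return (net_profit / drawdown) * winning_pct


def random_chromosome(rng):
    """One chromosome with a valid random allele for every gene (ranges are assumptions)."""
    codes = list(INDICATORS)
    return [
        rng.choice(codes), rng.randint(2, 50), round(rng.uniform(0, 100), 1),
        rng.choice(codes), rng.randint(2, 50), round(rng.uniform(0, 100), 1),
        rng.choice([1, -1]), rng.randint(1, 20),
    ]


rule = decode([1, 9, 30, 1, 9, 30, 1, 5])
rng = random.Random(42)
population = [random_chromosome(rng) for _ in range(50)]
```

Decoding the chromosome 1,9,30,1,9,30,1,5 gives back the RSI rule from the text, which is a quick way to check that an encoding loses no information.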


TABLE 11.6 PARENT SELECTION.

1. Sum the fitness of all the population members, and call that sum X.
2. Generate a random number between 0 and X.
3. Return the first population member whose fitness, when added to the fitness of the preceding population members, is greater than or equal to the random number from step 2.

There are three popular crossover methods or types: (1) one-point (single-point), (2) two-point, and (3) uniform. All of these methods have their own strengths and weaknesses. Let's now take a closer look at how the various crossover methods work.

The One-Point Crossover

The one-point crossover randomly selects two adjacent genes on the chromosome of a parent and severs the link between the pair so as to cut the chromosome into two parts. We do this to both parents. We then create one child using the left-hand side of parent 1 and the right-hand side of parent 2. The second child will be just the reverse. Figure 11.4 shows how a one-point crossover works.

FIGURE 11.4 An example of a one-point crossover.

The Two-Point Crossover

A two-point crossover is similar to the one-point method except that two cuts are made in the parents, and the genes between those cuts are exchanged to produce children. See Figure 11.5 for an example of a two-point crossover.

FIGURE 11.5 An example of a two-point crossover.

The Uniform Crossover

In the uniform crossover method, we randomly exchange genes between the two parents, based on some crossover probability. An example of a uniform crossover appears in Figure 11.6.

All three of our examples showed crossovers using binary operators. You might wonder how to perform crossovers when the genes of the chromosomes are real numbers or discrete values. The basics of each of the crossover methods are the same. The difference is that, once we have selected the genes that will be affected by the crossover, we develop other operators to combine them instead of just switching them. For example, when using real-number genes, we can use weighted averages of the two parents to produce a child. We can use one set of weighting for child 1 and another for child 2. For processing discrete values, we can just randomly select one of the other classes.
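The roulette wheel selection of Table 11.6 and the three crossover methods can be sketched as follows. This is an illustration, not a reference implementation: fitness values are assumed to be non-negative, chromosomes are plain lists with at least three genes, and the function names, the swap probability of 0.5, and the weight of 0.7 are arbitrary choices for the example.

```python
import random


def roulette_select(population, fitnesses, rng):
    """Table 11.6: pick a member with probability proportional to its fitness."""
    total = sum(fitnesses)                       # step 1
    pick = rng.uniform(0, total)                 # step 2
    running = 0.0
    for member, fit in zip(population, fitnesses):
        running += fit                           # step 3: running sum of fitness
        if running >= pick:
            return member
    return population[-1]                        # guard against round-off


def one_point(p1, p2, rng):
    """Cut both parents at the same random point and swap the right-hand sides."""
    cut = rng.randrange(1, len(p1))
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]


def two_point(p1, p2, rng):
    """Exchange the genes lying between two random cuts."""
    a, b = sorted(rng.sample(range(1, len(p1)), 2))
    return p1[:a] + p2[a:b] + p1[b:], p2[:a] + p1[a:b] + p2[b:]


def uniform(p1, p2, rng, swap_prob=0.5):
    """Swap each gene between the parents independently with probability swap_prob."""
    c1, c2 = list(p1), list(p2)
    for i in range(len(c1)):
        if rng.random() < swap_prob:
            c1[i], c2[i] = c2[i], c1[i]
    return c1, c2


def blend(p1, p2, w=0.7):
    """Real-number genes: weighted averages, one weighting per child."""
    child1 = [w * a + (1 - w) * b for a, b in zip(p1, p2)]
    child2 = [(1 - w) * a + w * b for a, b in zip(p1, p2)]
    return child1, child2
```

With binary parents, every one of the switching crossovers keeps each gene position's pair of alleles intact: whatever one child loses at a locus, the other child gains.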


FIGURE 11.6 An example of a uniform crossover.

Mutation

Mutation is a random changing of a gene on a chromosome. Mutations occur with a low probability because, if we mutated all the time, then the evolutionary process would be reduced to a random search.

Figure 11.7 shows an example of a mutation on a binary chromosome. If we were working with real-number chromosomes, we could add a small random number (within ±10 percent of the average for that gene) to its value to produce a mutation.

FIGURE 11.7 An example of mutation. (Panels: before mutation, after mutation.)

Several concepts are important to genetic algorithms. We will overview these concepts without covering the mathematics behind them.

The first concept we must understand is the concept of similar templates of chromosomes, called schemata. If we are working with binary genes, defining schemata requires the symbols 0, 1, and *, where 0 and 1 are just binary digits, and * means don't care. Figure 11.8 examines a chromosome and two different schemata.

FIGURE 11.8 An example of a schema.

Schema 1 is a template that requires genes 1, 2, 3, and 6 to be 1. Schema 2 requires a 0 in gene 4 and a 1 in gene 5. Our sample chromosome fits both schemata, but this is not always the case. Let's say that, in a population of 100 chromosomes, 30 fit schema 1 and 20 fit schema 2. One of the major concepts of genetic algorithms then applies.

Let's suppose that the average fitness of the chromosomes belonging to schema 1 is 0.75, and the average fitness of those in schema 2 is 0.50. The average fitness of the whole population is 0.375. In this case, schema 1 will have an exponentially increased priority in subsequent generations of reproduction, when compared to schema 2.

Schemata also affect crossovers. The longer a schema, the more easily it can get disrupted by crossover. The length of a schema is measured as the distance between its first and last fixed positions (0 or 1) for a binary chromosome. This distance is called the "defining length." In Figure 11.8, schema 1 has a longer defining length.

Different crossovers also have different properties that affect combining the schemata. For example, some schemata cannot be combined without disrupting them using either a single-point or a two-point crossover. On the other hand, single-point and two-point crossovers are good at not disrupting paired genes used to express a single feature. The uniform crossover method can combine any two schemata that differ by


one or more genes, but has a higher probability of disrupting schemata that require paired genes to express a feature. This point must be kept in mind when selecting encoding or crossover methods for solving problems.

UPDATING A POPULATION

After the genetic algorithm has produced one or more children, we apply the fitness function to each child produced, to judge how well the child solves the problem it was designed for. We compare the fitness of the new children to that of the existing population, and delete a randomly selected member whose fitness is less than that of a child we have evaluated. We then add this child to the population. We repeat this process for each child produced, until we reach a stopping number of generations or time.

Genetic algorithms are an exciting technology to use in developing trading-related applications. To use them effectively, it is important to understand the basic theory and to study case material that offers other applications in your domain or in similar domains. You don't need to understand the mathematics behind the theory, just the concepts.

CHAOS THEORY

Chaos theory is an area of analysis that describes complex models in which not all of the variables or initial conditions are known. One example is weather forecasting; predictions are made using an incomplete series of equations. Chaos theory is not about randomness; it's about how real-world problems require understanding not only the model but also the initial conditions. Even small numerical errors due to round off can lead to a large error in prediction over very short periods of time. When studying these types of problems, standard geometry does not work.

For example, suppose we want to measure the sea shore. If we measure the shore line using a yardstick, we get one distance. If we measure it using a flexible tape measure, we get a longer distance; the length depends on how and with what tools we make the measurement.

Benoit Mandelbrot tried to solve this problem by creating fractal geometry. The fractal dimension is a measure of how "squiggly" a given line is. This number can take values of 1 or higher; the higher the number, the more the measured length curves. It can also be a fractional number, such as 1.85. The fractal dimension of the data is important because systems with similar fractal dimensions have been found to have similar properties. The market will change modes when the fractal dimension changes. A method called rescaled range analysis can give both an indicator that measures whether the market is trending or not (similar to random walk), and the fractal dimension on which the financial data is calculated.

Thanks to Einstein, we know that in particle physics the distance that a random particle covers increases with the square root of the time it has been traveling. In equation form, if we denote by R the distance covered and let T be a time index, we see that:

$$R = \text{constant} \times T^{0.5}$$

Let's begin with a time series of length M. We will first convert this into a time series of length N = M - 1 of logarithmic ratios:

$$N_i = \log\left(\frac{M_{i+1}}{M_i}\right), \quad i = 1, 2, 3, \ldots, (M-1)$$

where M_i is the ith value of the original series.

We now need to divide our time period of length N into A contiguous subperiods of length n so that A × n = N. We then label each of these subperiods I_a, with a = 1,2,3,...,A. We then label each of the elements in I_a as N_{k,a} such that k = 1,2,3,...,n. For each subperiod we define the mean e_a of its elements,

$$e_a = \frac{1}{n}\sum_{k=1}^{n} N_{k,a}$$

and then take the time series of accumulated departures (X_{k,a}) from the mean value e_a in the following form:

$$X_{k,a} = \sum_{i=1}^{k} \left(N_{i,a} - e_a\right)$$

Summing over i = 1 to k, where k = 1,2,3,...,n. The range is defined as the maximum minus the minimum value of X_{k,a} for each subperiod I_a:

$$R_{I_a} = \max(X_{k,a}) - \min(X_{k,a})$$

where 1 ≤ k ≤ n. This adjusted range is the distance that the underlying system travels for time index n. We then calculate the standard deviation of the sample for each subperiod I_a.
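The steps so far (logarithmic ratios, subperiods, accumulated departures, range, and standard deviation) can be sketched in a few lines. Several details are assumptions made for the illustration: the standard deviation divides by n rather than n - 1, leftover ratios that do not fill a whole subperiod are dropped, subperiods with zero standard deviation are skipped, and each range is divided by its standard deviation and the results averaged, which is the normalization the text goes on to describe. The function names are invented for the example.

```python
import math


def log_ratios(prices):
    """Convert a price series of length M into M - 1 logarithmic ratios."""
    return [math.log(later / earlier) for earlier, later in zip(prices, prices[1:])]


def rescaled_range(prices, n):
    """Average range divided by standard deviation over subperiods of length n."""
    ratios = log_ratios(prices)
    subperiods = len(ratios) // n              # A; leftover ratios are dropped
    values = []
    for a in range(subperiods):
        sub = ratios[a * n:(a + 1) * n]
        mean = sum(sub) / n                    # e_a
        departures, running = [], 0.0
        for ratio in sub:
            running += ratio - mean            # X_{k,a}
            departures.append(running)
        spread = max(departures) - min(departures)               # R for this subperiod
        deviation = math.sqrt(sum((r - mean) ** 2 for r in sub) / n)  # S for this subperiod
        if deviation > 0:
            values.append(spread / deviation)
    if not values:
        raise ValueError("need at least one subperiod with non-constant ratios")
    return sum(values) / len(values)


# Hypothetical closing prices: M = 9, so N = 8 ratios and A = 2 subperiods of n = 4.
prices = [100, 102, 101, 105, 107, 106, 110, 108, 111]
rs = rescaled_range(prices, n=4)
```

Recomputing this value over a sliding window of prices would give a series that can be treated like any other indicator.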


where 1 <= k < n. This standard deviation is used to normalize the range R. Hurst generalized Einstein's relation to a time series whose distribution is unknown by dividing the adjusted range by the standard deviation, showing that:

(R/S)_n = Constant x n^H

Now we can calculate H using the relationship

(R/S)_n = n^H

where n is a time index and H is a power called the Hurst exponent, which can lie anywhere between 0 and 1. We can calculate the (R/S) equation for subperiod n and create a moving Hurst exponent. This can now be used like any other indicator to develop trading rules.

This is a simplified version of Peters's published methodology, yet it still gives a good estimate of H.* The normal process for calculating H requires you to do a least squares fit on all (R/S)n. We can skip this step and use the raw H value to develop indicators that are of value in trading systems. This does make H noisier but removes much of the lag. Peters suggests that the Hausdorff dimension can be approximated by the following relationship:

DH = 2 - H

where DH is the fractal dimension and H is the Hurst exponent.

The Hurst exponent H is of interest to traders since a value of 0.5 is simply a random time series. If H is above 0.5, then the series has a memory; in traders' terms this is called "trending." If H is less than 0.5, the market is an antipersistent time series, one in which less distance is covered than in a completely random time series. In trading terms, this is called a "trading range."

We know that the Hurst exponent can be used to tell if the market is trending or if it is in a trading range. The question is: Can changes in the Hurst exponent be used to predict changes in the correlation between markets or in the technical nature of the market?

*See Peters, Edgar E. (1994). Fractal Market Analysis. (New York: John Wiley & Sons).

STATISTICAL PATTERN RECOGNITION

Statistical pattern recognition uses statistical methods to analyze and classify data. Statistical pattern recognition is not just one method; it is a class of methods for analyzing data.

One example of statistical pattern recognition is called case-based reasoning (CBR). CBR compares a library of cases to the current case. It then reports a list of similar cases. This idea is used by traders such as Paul Tudor Jones and even by Moore Research in their monthly publication. This process requires developing an index for the cases, using methods such as C4.5 or various statistical measures. One of the most common methods for developing these indexes is "nearest neighbor matching." Let's see how this matching is done.

If the database depends on numerical data, calculate the mean and the standard deviation of all fields stored in the database. For each record in the database, store how far each field is from the mean for that field in terms of standard deviation. For example, if the mean is 30 with a standard deviation of 3, an attribute with a value of 27 would be -1.0 standard deviation from the mean. Make these calculations for each attribute and case in the database. When a new case is given, develop a similarity score. First, convert the new case in terms of standard deviation from the mean and standard deviation used to build the index. Next, compare each of the attributes in the new case to the standardized index values, and select the cases that are the nearest match. An example of a closeness function is shown in Table 11.7.

Apply this function to each record in the database, and then report the lower scoring cases as the most similar. Use these methods to find similar patterns for automatic pattern recognition.

Similarity analysis can also be done using Pearson's correlation or another type of correlation called "Spearman ranked correlation."
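The nearest neighbor matching steps described above can be sketched in Python as follows. This is only an illustration, not the author's code: the function names are hypothetical, the sample (n - 1) standard deviation is an assumption, and taking the absolute value of each attribute difference inside the closeness score is also an assumption, since the text only says that lower scores mean more similar cases.

```python
import math


def standardize(records):
    """Return (means, sds, z_records) for a list of equal-length numeric records."""
    n_fields = len(records[0])
    count = len(records)
    means = [sum(r[i] for r in records) / count for i in range(n_fields)]
    # Assumption: sample standard deviation (n - 1 denominator); every field
    # must vary, otherwise the division below fails.
    sds = [
        math.sqrt(sum((r[i] - means[i]) ** 2 for r in records) / (count - 1))
        for i in range(n_fields)
    ]
    z_records = [
        [(r[i] - means[i]) / sds[i] for i in range(n_fields)] for r in records
    ]
    return means, sds, z_records


def closeness(new_z, stored_z, weights):
    """Weighted average gap between two standardized cases (lower = closer)."""
    # Assumption: absolute differences, so gaps of opposite sign cannot cancel.
    total = sum(abs(a - b) * w for a, b, w in zip(new_z, stored_z, weights))
    return total / sum(weights)


def nearest_cases(records, new_case, weights, k=3):
    """Return the k lowest (score, record_index) pairs for new_case."""
    means, sds, z_records = standardize(records)
    # Convert the new case with the SAME means and SDs used to build the index.
    new_z = [(x - m) / s for x, m, s in zip(new_case, means, sds)]
    scored = sorted(
        (closeness(new_z, row, weights), idx) for idx, row in enumerate(z_records)
    )
    return scored[:k]


# One-field database with mean 30 and SD 3, matching the example in the text.
database = [[27.0], [30.0], [33.0]]
best = nearest_cases(database, [32.0], [1.0], k=1)
```

With this one-field database, the stored value 27 standardizes to -1.0, as in the text, and the case nearest to a new value of 32 is the record holding 33.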


TABLE 11.7 CLOSENESS FUNCTION.

Closeness = Σ [(New case attribute - Stored case attribute) x Weight] / Total weights

New case attribute is the normalized value of a given attribute for new cases.
Stored case attribute is the normalized value of a given attribute for the current database case being measured.
Total weights is the sum of all of the weighting factors.

Statistical pattern recognition can also be used to develop subgroups of similar data. For example, we can subclassify data based on some statistical measure and then develop a different trading system for each class.

Statistical pattern recognition is a broad area of advanced methods, and this brief explanation only touches the surface. I showed the nearest neighbor matching method because it is simple and useful in developing analogs of current market conditions to previous years.

FUZZY LOGIC

Fuzzy logic is a powerful technology that allows us to solve problems that require dealing with vague concepts such as tall or short. For example, a person who is 6 feet tall might be considered tall compared to the general population, but short if a member of a basketball team. Another issue is: How would we describe a person who is 5 feet 11 inches, if 6 feet is considered tall? Fuzzy logic can allow us to solve both of these problems.

Fuzzy logic operators are made of three parts: (1) membership function(s), (2) fuzzy rule logic, and (3) defuzzifier(s). The membership function shows how relevant data are to the premise of each rule. Fuzzy rule logic performs the reasoning within fuzzy rules. The defuzzifier maps the fuzzy information back into real-world answers.

Let's see how fuzzy logic works, using a simple height example. We want to develop fuzzy rules that predict a one-year-old male child's height in adulthood, based on the height of his mother and father. The first step in developing a fuzzy logic application is to develop fuzzy membership functions for each variable's attributes. We need to develop fuzzy membership functions for the height attributes of the mother, father, and child. For our example, these attributes are tall, normal, and short. We have defined generic membership functions for these height attributes as follows (SD = standard deviation):

Tall = maximum(0, min(1, (X - Average Height)/(SD of height))).
Short = maximum(0, min(1, (Average Height - X)/(SD of height))).
Normal = maximum(0, (1 - (abs(X - Average Height)/(SD of height)))).

When using these membership functions, substitute the following values for average height and standard deviation for the mother, father, and child.

Mother: average height 65 inches, SD 3 inches.
Father: average height 69 inches, SD 4 inches.
Child: average height (12 months) 30 inches, SD 2 inches.

Having developed the membership functions, we can now develop our fuzzy rules. These rules and their supporting facts are shown in Table 11.8.

Using the above facts and our fuzzy membership functions for both the mother and father, we calculate the following output values for each membership function:

Mother's height: short .66, normal .33, tall 0
Father's height: short .5, normal .5, tall 0

TABLE 11.8 RULES FOR CHILD'S HEIGHT.

These two fuzzy rules are in our expert system:
1. If Mother_Short and Father_Short, then Child_Short.
2. If Mother_Short and Father_Normal, then Child_Normal.

We also have the following facts:
Mother is 63 inches tall.
Father is 67 inches tall.
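The membership functions and the facts for the mother and father given in this section can be checked with a short Python sketch. The function and variable names are illustrative only; the formulas are the Tall, Short, and Normal definitions from the text.

```python
def tall(x, average, sd):
    """Degree (0..1) to which height x is 'tall' for this group."""
    return max(0.0, min(1.0, (x - average) / sd))


def short(x, average, sd):
    """Degree (0..1) to which height x is 'short' for this group."""
    return max(0.0, min(1.0, (average - x) / sd))


def normal(x, average, sd):
    """Degree (0..1) to which height x is 'normal' for this group."""
    return max(0.0, 1.0 - abs(x - average) / sd)


def memberships(x, average, sd):
    """Evaluate all three membership functions for one person."""
    return {
        "short": short(x, average, sd),
        "normal": normal(x, average, sd),
        "tall": tall(x, average, sd),
    }


# Facts from Table 11.8 and the group statistics given in the text.
mother = memberships(63, average=65, sd=3)  # short 2/3, normal 1/3, tall 0
father = memberships(67, average=69, sd=4)  # short 0.5, normal 0.5, tall 0
```

The mother's values come out as 2/3 and 1/3, which the text reports truncated as .66 and .33.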


Let's see what happens if we rerun the fuzzy rules using these facts.

1. If Mother_Short (.66) and Father_Short (.5), then Child_Short (.5).
2. If Mother_Short (.66) and Father_Normal (.5), then Child_Normal (.5).

Using the rules of fuzzy logic, we take the minimum of the values associated with the conditions when they are joined by an "and." If they are joined by an "or," we take the maximum.

As you can see, the child is both short and normal. We will now use something called defuzzification to convert the results of these rules back to a real height. First, find the center point of each of the membership functions that apply to the height of the child. In our case, that is 28 for short, 30 for normal, and 32 for tall. Next, multiply the output from the rules associated with each membership function by these center point values. Divide the result by the sum of the outputs from the membership functions. This will convert the fuzzy output back into a real height for our one-year-old male child:

(.5 x 28 + .5 x 30 + 0 x 32)/(.5 + .5) = 29 inches tall

To see how these membership functions interact for the height of our one-year-old child, look at Figure 11.9.

This chapter has given an overview of different advanced technologies that are valuable to traders. We will use these technologies in many examples in the remaining chapters. Now that we have domain expertise in both analyzing the markets and using many different advanced technologies, we are ready to design state-of-the-art trading applications.

[Figure: defuzzification converts fuzzy rule output into numerical values; the horizontal axis shows height from 27.0 to 34.0 inches.]

FIGURE 11.9 An example of a simple defuzzification function for height.
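The rule evaluation and defuzzification arithmetic from this example can be reproduced with a few lines of Python. This is a sketch with illustrative names; it applies the minimum for "and," the maximum for "or," and the weighted average of center points described in the text.

```python
def fuzzy_and(*degrees):
    """Conditions joined by 'and' take the minimum degree."""
    return min(degrees)


def fuzzy_or(*degrees):
    """Conditions joined by 'or' take the maximum degree."""
    return max(degrees)


def defuzzify(outputs, centers):
    """Weighted average of membership center points by rule output."""
    total = sum(outputs.values())
    if total == 0:
        raise ValueError("no rule fired, so there is nothing to defuzzify")
    return sum(outputs[name] * centers[name] for name in outputs) / total


# Membership values for the parents, as computed in the text.
mother = {"short": 2 / 3, "normal": 1 / 3, "tall": 0.0}
father = {"short": 0.5, "normal": 0.5, "tall": 0.0}

child = {
    "short": fuzzy_and(mother["short"], father["short"]),    # rule 1
    "normal": fuzzy_and(mother["short"], father["normal"]),  # rule 2
    "tall": 0.0,                                             # no rule fires
}

# Center points of the child's membership functions, in inches.
centers = {"short": 28.0, "normal": 30.0, "tall": 32.0}
height = defuzzify(child, centers)
```

Both rules fire at .5, so the result is (.5 x 28 + .5 x 30)/(.5 + .5) = 29 inches, matching the worked example.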

Part Three

MAKING SUBJECTIVE METHODS MECHANICAL

12

How to Make Subjective Methods Mechanical

Ways of making subjective forms of analysis mechanical form one of the hottest areas of research in trading systems development. There are two key reasons for this concentrated activity. First, many people use subjective methods and would like to automate them. Second, and more important, we can finally backtest these methods and figure out which ones are predictive and which are just hype.

Based on both my research and the research of others such as Tom Joseph, I have developed a general methodology for making subjective trading methods mechanical. This chapter gives an overview of the process. The next two chapters will show you how to make Elliott Wave and candlestick recognition mechanical, using Omega TradeStation. Let's now discuss the general methodology I use to make subjective methods mechanical.

The first step is to select the subjective method we wish to make mechanical. After we have selected the method, we need to classify it, based on the following categories:

1. Totally visual patterns recognition.

2. Subjective methods definition using fuzzy logic.


3. Human-aided semimechanical methods.

4. Mechanically definable methods.

A subjective form of analysis will belong to one or more of these categories. Let's now get an overview of each one.

TOTALLY VISUAL PATTERNS RECOGNITION

This class of subjective methods includes general chart patterns such as triangles, head and shoulders, and so on. These are the hardest types of subjective methods to make mechanical, and some chart patterns cannot be made totally automated. When designing a mechanical method for this class of pattern, we can develop rules that either will identify a large percentage of that pattern but with many false identifications, or will identify a small percentage of the pattern with a high percentage of accuracy. In most cases, either approach can work, but developing the perfect definition may be impossible.

SUBJECTIVE METHODS DEFINITION USING FUZZY LOGIC

Subjective methods that can be defined using fuzzy logic are much easier than methods that develop a purely visual type of pattern. Candlestick recognition is the best example of this type of subjective method. Candlestick recognition is a combination of fuzzy-logic-based attributes and attributes that can be defined 100 percent mechanically. Once you have developed the fuzzy definitions for the size of the candlesticks, it is very easy to develop codes to identify different candlestick patterns.

HUMAN-AIDED SEMIMECHANICAL METHODS

A human-aided semimechanical method is one in which the analyst is using general rules based on observations and is actually performing the analysis of the chart. There are many classic examples of this method. The first one that comes to mind is divergence between price and an oscillator. This type of pattern is often drawn on a chart by a human, using rules that are understood but may not be easily defined. My work has shown that, in these types of subjective methods, the better approach is to identify only 15 percent to 40 percent of all cases, making sure that each has been defined correctly. The reason is that the eye can identify many different patterns at once.

For example, if we are trying to mechanize divergence between price and an oscillator, we need to define a window of time in which a divergence, once set up, must occur. We also need to define the types of divergence we are looking for. The human eye can pick up many types of divergences, that is, divergences based on swing highs and lows or on the angle between the swing high and the swing low.

FIGURE 12.1 Several different types of divergence can be picked up by the human eye. A product called DivergEngine(TM) is able to identify simple divergences automatically.

Figure 12.1 shows several different types of divergence that can be picked up by the human eye. It also shows how a product called DivergEngine(TM), by Inside Edge Systems, was able to identify several different divergences during late 1994, using a five-period SlowK. One


example is a divergence buy signal set up in late September and early October of 1994. (Divergences are shown by circles above the bars in Figure 12.1.) In early November 1994, we had a sell signal divergence. This divergence led to a 30-point drop in the S&P500 in less than one month.

Another type of analysis that falls into this class is the method of drawing trend lines. When a human expert draws a trend line, he or she is connecting a line between lows (or between highs). Important trend lines often involve more than two points. In these cases, an expert's drawn trend line may not touch all three (or more) points. The subjective part of drawing trend lines involves which points to connect and how close is close enough when the points do not touch the trend line. Figure 12.2 shows an example of a hand-drawn major trend line for the S&P500 during the period from July to October 1994. Notice that not all of the lows touch the trend line. After the market gapped below this trend line, it collapsed 20 points in about three weeks.

FIGURE 12.2 An example of an S&P500 trend line, drawn between July and October 1994.

MECHANICALLY DEFINABLE METHODS

Mechanically definable methods allow us to develop a mathematical formula for the patterns we are trying to define. One example of these types of patterns is the swing highs and lows that are used to define pivot-point trades. Another example would be any gap pattern. There are many examples of this class of methods, and any method that can be defined by a statement or formula falls into this class.

MECHANIZING SUBJECTIVE METHODS

Once you have classified the category that your method belongs to, you need to start developing your mechanical rules. You must begin by identifying your pattern or patterns on many different charts, even charts using different markets.

After you have identified your subjective methods on your charts, you are ready to develop attributes that define your patterns. For example, in candlestick charts, the size and color of the candlestick are the key attributes. With the attributes defined, you can develop a mathematical definition or equivalent for each attribute. Definitions may use fuzzy concepts, such as tall or short, or may be based on how different technical indicators act when the pattern exists. Next, you should test each of your attribute definitions for correctness. This step is very important because if these building blocks do not work, you will not be able to develop an accurate definition for your patterns. After you have developed your building blocks, you can combine them to try to detect your pattern. When using your building blocks' attributes to develop your patterns for making your subjective method mechanical, it is usually better to have many different definitions of your pattern, with each one identifying only 10 percent of the cases but with 90 percent correctness.

Making subjective methods mechanical is not easy and should continue to be a hot area of research for the next 5 to 10 years. Given this outline of how to make a subjective method mechanical, I will now show you two examples: (1) Elliott Wave analysis and (2) candlestick charts. These will be shown in the next two chapters, respectively.
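To make the idea of a mechanically definable method concrete, here is a minimal Python sketch of the two examples named in this chapter, swing highs and lows and a gap pattern. The exact definitions are assumptions chosen for illustration (a swing point must beat `strength` bars on each side; a gap up is a low above the prior bar's high); the chapter does not prescribe specific formulas, and the function names are hypothetical.

```python
def swing_highs(highs, strength=2):
    """Indexes of bars whose high exceeds `strength` bars on each side."""
    found = []
    for i in range(strength, len(highs) - strength):
        neighbors = highs[i - strength:i] + highs[i + 1:i + 1 + strength]
        if all(highs[i] > h for h in neighbors):
            found.append(i)
    return found


def swing_lows(lows, strength=2):
    """Indexes of bars whose low is below `strength` bars on each side."""
    found = []
    for i in range(strength, len(lows) - strength):
        neighbors = lows[i - strength:i] + lows[i + 1:i + 1 + strength]
        if all(lows[i] < l for l in neighbors):
            found.append(i)
    return found


def gap_ups(highs, lows):
    """Indexes of bars whose low is above the previous bar's high."""
    return [i for i in range(1, len(lows)) if lows[i] > highs[i - 1]]


# Small made-up price series, used only to exercise the functions.
example_highs = [1, 2, 5, 3, 2, 4, 6, 4, 3]
example_lows = [5, 4, 1, 3, 4, 2, 0.5, 2, 3]
high_pivots = swing_highs(example_highs)  # [2, 6]
low_pivots = swing_lows(example_lows)     # [2, 6]
```

Because each pattern reduces to a single statement over the price series, these definitions fall into the mechanically definable class: there is no judgment call left for the analyst.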


13

Building the Wave

Elliott Wave analysis is based on the work of R. N. Elliott during the 1930s. Elliott believed that the movements of the markets follow given patterns and relationships based on human psychology. Elliott Wave analysis is a complex subject and has been discussed in detail in many books and articles. Here, we will not go into it in detail but will provide an overview so that you can understand (1) why I think Elliott Wave analysis is predictive, and (2) how to make it mechanical so that it can be used to predict the markets.

AN OVERVIEW OF ELLIOTT WAVE ANALYSIS

Elliott Wave theory is based on the premise that markets will move in ratios and patterns that reflect human nature. The classic Elliott Wave pattern consists of two different types of waves:

1. A five-wave sequence called an impulse wave.

2. A three-wave sequence called a corrective wave.

The classic five-wave patterns and the three-wave corrective wave are shown in Figure 13.1. Normally, but not always, the market will move in a corrective wave after a five-wave move in the other direction.

FIGURE 13.1 Three possible five-wave Elliott Wave patterns. (Panels: normal five-wave sequence; double top five-wave sequence; failed breakout. Historically, the first two patterns occur 70% of the time.)

Let's analyze a classic five-wave sequence to the upside. Wave one is usually a weak rally with only a few traders participating. When wave one is over, the market sells off, creating wave two. Wave two ends when the market fails to make new lows and retraces at least 50 percent, but less than 100 percent, of wave one. Wave two is often identified on a chart by a double-top or head-and-shoulders pattern. After this correction, the market will begin to rally again, slowly at first, but then accelerating as it takes out the top of wave one. This is the start of wave three. As another sign of wave three, the market will gap in the direction of the trend. Commercial traders begin building their long position when the market fails to make new lows. They continue to build this position during wave three as the market continues to accelerate. One of the Elliott Wave rules is that wave three cannot be the shortest wave and is, in fact, normally at least 1.618 times longer than wave one. This 1.618 number was not selected out of thin air. It is a Fibonacci ratio: the ratio of successive Fibonacci numbers, a numerical sequence that occurs often in nature, approaches 1.618. In fact, many of the rules of Elliott Wave relate to Fibonacci numbers.

At some point, profit taking will set in and the market will sell off. This is called wave four. There are two types of wave four: (1) simple and (2) complex. The type of wave four to expect is related to the type of


wave two that occurred. If wave two was simple, wave four will be complex. If wave two was complex, wave four will be simple. After the wave-four correction, the market rallies and usually makes new highs, but the rally is fueled by small traders and lacks the momentum of a wave-three rally. This lack of momentum, as prices rally to new highs or fall to new lows, creates divergence using classic technical indicators. After the five waves are finished, the market should change trend. This trend change will be either corrective or the start of a new five-wave pattern. The mirror image of this pattern exists for a five-wave move to the downside.

Elliott Wave patterns exist on each time frame, and the waves relate to each other the same way. For example, a five-wave pattern can be found on a monthly, weekly, daily, or intraday chart. You must be in the same wave sequence in each time frame. For example, in a five-wave downward pattern, you would be in a wave four in a monthly or weekly time frame, and in a wave three to the upside on a daily or intraday time frame.

When you study an Elliott Wave pattern closely, you will see that each wave is made up of similar patterns. Many times, in a five-wave pattern, wave one, or three, or five will break down into additional five-wave patterns. This is called an extension.

Elliott Wave analysis has many critics because it is usually a subjective form of analysis. This chapter will show you how to make the most important part of Elliott Wave analysis (the pattern of waves three, four, and five) objective and totally mechanical.

TYPES OF FIVE-WAVE PATTERNS

The three possible five-wave patterns have been shown in Figure 13.1. The first two are the classic five-wave sequence and the double-top market. The mirror image of these patterns exists on the downside and, according to Tom Joseph, these two patterns account for 70 percent of all possible historical cases. Finally, when the market fails to hold its trend and the trend reverses, we have a failed breakout sequence pattern. The first two five-wave patterns consist of a large rally; then consolidation occurs, followed by a rally that tests the old highs or sets new ones. The failed breakout pattern occurs 30 percent of the time and is unpredictable.

The classic five-wave pattern can be traded as shown in Table 13.1.

TABLE 13.1 TRADING THE ELLIOTT WAVE.

We can trade the basic five-wave pattern as follows:
1. Enter wave three in the direction of the trend.
2. Stay out of the market during wave four.
3. Enter the wave-five rally in the direction of the trend.
4. Take a countertrend trade at the top of wave five.

Trading the five-wave pattern sounds easy, but the problem is that the current wave count depends on the time frame being analyzed. For example, if we identify a wave three on both the weekly and daily charts, we have a low-risk, high-profit trading opportunity. If we are in a five-wave downward sequence on a weekly chart but a wave-three upward pattern on a daily chart, the trade would be a high-risk trade that may not be worth taking. When trading Elliott Waves, it is important to view the count on multiple time frames.

USING THE ELLIOTT WAVE OSCILLATOR TO IDENTIFY THE WAVE COUNT

Let's now learn how to objectively identify the classic five-wave pattern. In 1987, Tom Joseph, of Trading Techniques, Inc., discovered that using a five-period moving average minus a thirty-five-period moving average of the (High + Low)/2 produced an oscillator that is useful in counting Elliott Waves. He called this discovery the Elliott Wave oscillator. Using this oscillator and an expert system containing the rules for Elliott Wave, he produced software called Advanced GET, published by Trading Techniques, Inc. Advanced GET(TM) also has many Gann methods for trading, and seasonality and pattern matching are built into the package. GET does a good job of objectively analyzing Elliott Waves. It is available for MS-DOS, Windows, and for TradeStation. Tom agreed to share some of his research with us so that we can begin to develop our own TradeStation utility for Elliott Wave analysis.

The Elliott Wave oscillator produces a general pattern that correlates to where you are in the Elliott Wave count. Based on the research of Tom Joseph, we can explain this pattern by identifying a five-wave sequence to the upside. We start this sequence by first detecting the end of a five-wave sequence to the downside. The first rally that occurs after the market makes new lows but the Elliott Wave oscillator does not is


called wave one. After the wave-one rally, the market will have a correction but will fail to set new lows. This is wave two, which can be one of two types. The first is simple; it may last for a few bars and have little effect on the oscillator. The second, less common type is a complex wave two. It will usually last longer, and the oscillator will pull back significantly. There is a relationship between wave two and wave four. If wave two is simple, wave four will be complex. If wave two is complex, wave four will be simple. After wave two is finished, both the market and the oscillator will begin to rise. This is the start of wave three. This move will accelerate as the market takes out the top of wave one. A characteristic of wave three is that both the market and the Elliott Wave oscillator reach new highs. After wave three, there is a profit-taking decline, wave four. After wave four, the market will begin to rally and will either create a double top or set new highs, but the Elliott Wave oscillator will fail to make new highs. This divergence is the classic sign of a wave five. The oscillator and prices could also make new highs after what looks like a wave four. At this point, we have to relabel our wave five a wave three. Another important point is that wave five can extend for a long time in a slow uptrend. For this reason, we cannot be sure the trend has changed until the Elliott Wave oscillator has retraced more than 138 percent of its wave-five peak.

TRADESTATION TOOLS FOR COUNTING ELLIOTT WAVES

The first step in developing our Elliott Wave analysis software is to develop the Elliott Wave oscillator. The code for this oscillator and a user function to tell us whether we are in a five-wave sequence to the upside is shown in Table 13.2, coded in TradeStation EasyLanguage.

TABLE 13.2 USER FUNCTIONS FOR ELLIOTT WAVE TOOL.

Copyright (c) 1996 Ruggiero Associates. This code for the Elliott Wave oscillator is only for personal use and is not to be used to create any commercial product.

Inputs: DataSet(Numeric);
Vars: Osc535(0),Price(0);
Price=(H of Data(DataSet)+L of Data(DataSet))/2;
If Average(Price,35)<>0 then begin
    Osc535=Average(Price,5)-Average(Price,35);
end;
ElliottWaveOsc=Osc535;

Copyright (c) 1996 Ruggiero Associates. This code for the Elliott trend indicator is only for personal use and is not to be used to create any commercial product.

Inputs: DataSet(Numeric),Len(Numeric),Trigger(Numeric);
Vars: Trend(0),Osc(0);
Osc=ElliottWaveOsc(DataSet);
If Osc=Highest(Osc,Len) and Trend=0 then Trend=1;
If Osc=Lowest(Osc,Len) and Trend=0 then Trend=-1;
If Lowest(Osc,Len)<0 and Trend=-1 and Osc>-1*Trigger*Lowest(Osc,Len) then Trend=1;
If Highest(Osc,Len)>0 and Trend=1 and Osc<-1*Trigger*Highest(Osc,Len) then Trend=-1;
ElliottTrend=Trend;

The user function in Table 13.2 starts with the trend set to zero. We initiate the trend based on which occurs first, the oscillator making a "Len" bar high or making a "Len" bar low. If the trend is up, it remains up until the Elliott Wave oscillator retraces the "Trigger" percent of the Len bar high and that high was greater than 0. The inverse is also true if the current trend is down. It will remain down until the market retraces the Trigger percent of the Len bar low as long as the low was less than 0.

This trend indicator normally will change trend at the top of wave one or when wave three takes out the top of one. For this reason, it is not a stand-alone system; it gives up too much of its trading profit on each trade before reversing. Even with this problem, it is still predictive and is profitable as a stand-alone system on many markets. Let's now use this Elliott Trend indicator to build a series of functions that can be used to count the classic 3,4,5 wave sequence. The code for the functions that count a five-wave sequence to the upside is shown in Table 13.3, stated in TradeStation's EasyLanguage.

The code in Table 13.3 has five inputs. The first is the data series we are applying the function to. For example, we could count the wave patterns on both an intraday and a daily time frame by simply calling this function twice, using different data series. Next, not wanting to call these functions many times because they are computationally expensive, we pass in both the Elliott Wave oscillator and the Elliott Trend indicator.
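For readers who want to experiment outside TradeStation, the oscillator and trend logic of Table 13.2 can be approximated in Python. This is a sketch, not a port endorsed by the author: the function names are hypothetical, the oscillator list starts at the first bar with enough history (an assumption that replaces TradeStation's own warm-up handling), and the trend stays at 0 until a full `length`-bar window exists.

```python
def sma(values, length, i):
    """Simple moving average of the `length` values ending at index i."""
    return sum(values[i - length + 1:i + 1]) / length


def elliott_wave_osc(highs, lows, fast=5, slow=35):
    """Fast minus slow average of (High + Low) / 2.

    Assumption: the returned list starts at bar index slow - 1, the first
    bar with enough history, so it is shorter than the input.
    """
    mid = [(h + l) / 2 for h, l in zip(highs, lows)]
    return [sma(mid, fast, i) - sma(mid, slow, i) for i in range(slow - 1, len(mid))]


def elliott_trend(osc, length, trigger):
    """Return +1 / -1 / 0 per oscillator value, following the Table 13.2 rules."""
    trend, out = 0, []
    for i in range(len(osc)):
        if i + 1 < length:          # assumption: wait for a full window
            out.append(0)
            continue
        window = osc[i - length + 1:i + 1]
        hi, lo = max(window), min(window)
        # Initialize on whichever comes first: a new window high or low.
        if trend == 0 and osc[i] == hi:
            trend = 1
        if trend == 0 and osc[i] == lo:
            trend = -1
        # Flip up when the oscillator retraces `trigger` of a negative low.
        if lo < 0 and trend == -1 and osc[i] > -trigger * lo:
            trend = 1
        # Flip down when it retraces `trigger` of a positive high.
        if hi > 0 and trend == 1 and osc[i] < -trigger * hi:
            trend = -1
        out.append(trend)
    return out


# Synthetic oscillator values, used only to show a down flip and an up flip.
demo_osc = [1, 2, 3, 2, 1, -1, -2.5, -3, -2, 1, 2.5, 3.5]
demo_trend = elliott_trend(demo_osc, length=3, trigger=0.7)
```

On the synthetic series the trend starts up at the first three-bar high, flips to -1 when the oscillator drops below -0.7 times the window high, and flips back to +1 once it recovers past 0.7 times the size of the window low.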


TABLE 13.3 SIMPLE ELLIOTT WAVE COUNTER FOR 3,4,5 UP.

Copyright (c) 1996 Ruggiero Associates. This code to count five waves up is only for personal use and is not to be used to create any commercial product.

Inputs: DataSet(Numeric),Osc(NumericSeries),ET(NumericSeries),Len(Numeric),Trig(Numeric);
Vars: Price(0),Wave(0),HiOsc(-999),HiOsc2(-999),HiPrice(-999),HiPrice2(-999);
Price=(High of Data(DataSet)+Low of Data(DataSet))/2;
{ Is current wave sequence up or down }
{ When we change from down to up label it a wave 3 }
{ and save current high osc and price }
If ET=1 and ET[1]=-1 and Osc>0 then begin
    HiOsc=Osc;
    HiPrice=Price;
    Wave=3;
end;
{ If wave 3 and oscillator makes new high save it }
If Wave=3 and HiOsc<Osc then HiOsc=Osc;
{ If wave 3 and price makes new high save it }
If Wave=3 and HiPrice<Price then HiPrice=Price;
{ If you're in a wave 3 and the oscillator pulls back to zero label it a wave 4 }
If Wave=3 and Osc<=0 and ET=1 then Wave=4;
{ If you're in a wave 4 and the oscillator pulls back above zero and prices break out then label it a wave 5 and set up second set of high oscillator and price }
If Wave=4 and Price=Highest(Price,5) and Osc>=0 then begin
    Wave=5;
    HiOsc2=Osc;
    HiPrice2=Price;
end;
If Wave=5 and HiOsc2<Osc then HiOsc2=Osc;
If Wave=5 and HiPrice2<Price then HiPrice2=Price;
{ If oscillator sets a new high relabel this a wave 3 and reset wave 5 levels }
If HiOsc2>HiOsc and HiPrice2>HiPrice and Wave=5 and ET=1 then begin
    Wave=3;
    HiOsc=HiOsc2;
    HiPrice=HiPrice2;
    HiOsc2=-999;
    HiPrice2=-999;
end;
{ If the trend changes in a wave 5 label this a -3 or a wave three down }
{ and reset all variables }
If ET=-1 then begin
    Wave=-3;
    HiOsc=-999;
    HiPrice=-999;
    HiOsc2=-999;
    HiPrice2=-999;
end;
Wave345Up=Wave;

Our final two arguments are (1) the Len used for the window to identify the wave counts and (2) the retracement level required to change the trend.

Let's now use these functions to create the Elliott Wave counter. This code is shown in Table 13.4.

TABLE 13.4 SIMPLE ELLIOTT WAVE COUNTER USER FUNCTION FOR THE UP WAVE SEQUENCE.

Copyright (c) 1996 Ruggiero Associates. The code for this Elliott Wave Counter is only for personal use and is not to be used to create any commercial product.

Inputs: DataSet(Numeric),Len(Numeric),Trig(Numeric);
Vars: WavCount(0);
WavCount=Wave345Up(DataSet,ElliottWaveOsc(DataSet),ElliottTrend(DataSet,Len,Trig),Len,Trig);
Elliott345=WavCount;

The code in Table 13.3 sets the wave value to a three when the trend changes from -1 to 1. After that, it starts saving both the highest oscillator and price values. It continues to call this a wave three until the oscillator retraces to zero and the trend is still up. At this point, it will


label it a wave four. If we are currently in a wave four and the oscillator pulls above zero and the (High + Low)/2 makes a five-day high, we label this a wave five. We then set a second set of peak oscillator values. If the second peak is greater than the first, we change the count back to a wave three. Otherwise, it stays a wave five until the trend indicator flips to -1.

Let's see how to use our functions to develop a simple Elliott Wave trading system. The code for this system is shown in Table 13.5.

TABLE 13.5 CODE FOR ELLIOTT WAVE COUNTER TRADING SYSTEM.

Inputs: Len(50),Trig(.7);
Vars: WavCount(0),Osc(0);
Osc=ElliottWaveOsc(1);
WavCount=Elliott345(1,Len,Trig);
If WavCount=3 and WavCount[1]<=0 then buy at open;
If WavCount=5 and WavCount[1]=4 then buy at open;
If WavCount=3 and WavCount[1]=5 then buy at open;
If Osc<0 then exitlong at open;

Our Elliott Wave system generates a buy signal when the wave count changes to a wave three. We reenter a long position when we move from a wave four to a wave five. Finally, we reenter a long position if the wave count changes from wave five back to wave three. Our exit is the same for all three entries: we exit when the Elliott Wave oscillator retraces to zero.

The entries of this system are relatively good, but if this were a real trading system, we would have developed better exits. We tested this system on the D-Mark, using 67/99 type continuous contracts in the period from 2/13/75 to 3/18/96, and it performed well. Because our goal is to evaluate Elliott Wave analysis as a trading tool, we optimized the system across the complete data set in order to see whether the system was robust. We optimized across a large set of parameters (ranging from 20 to 180 for length, and from .5 to 1.0 for trig) and found that a broad range of parameters performed very well. The set of parameters using a length of 20 and a trigger of .66 produced the results shown in Table 13.6 for the period from 2/13/75 to 3/18/96 (with $50.00 deducted for slippage and commissions).

TABLE 13.6 ELLIOTT WAVE COUNTER SYSTEM RESULTS D-MARK.

Net profit            $35,350.00
Trades                57
Percent profitable    51%
Average trade         $690.35
Drawdown              -$10,237.50
Profit factor         2.10

This was not just an isolated case: over 80 percent of the cases we tested in the above range produced profitable results.

After developing these parameters on the D-Mark, we tested them on the Yen. Once again, we used type 67/99 continuous contracts supplied by Genesis Financial Data Services. We used data for the period from 8/U76 to 308196. The amazing results (with $50.00 deducted for slippage and commissions) are shown in Table 13.7.

TABLE 13.7 THE ELLIOTT WAVE COUNTER SYSTEM RESULTS ON THE YEN.

Net profit            $89,800.00
Trades                51
Percent profitable    51%
Average trade         $1,760.70
Drawdown              -$5,975.00
Profit factor         4.16

This same set of parameters worked not only on the D-Mark and Yen; it also worked on crude oil and coffee as well as many other commodities. These results show that Elliott Wave analysis is a powerful tool for use in developing trading systems. The work done in this chapter is only a starting point for developing mechanical trading systems based on Elliott Waves. Our wave counter needs logic added to detect the wave one and wave two sequence, as well as ratio analysis of the length of each wave. Our system does not detect the top of wave three and wave five. If we can add that feature to the existing code and do even a fair job of detecting the end of both wave three and wave five, we may significantly improve our performance. We could also trade the short side of the market. Even with these issues, our basic mechanical Elliott Wave

194 Buildine the Wave 19s

Making Subjective Methods Mechanical

system shows that Elliott Wave analysis does have predictive value and

can be used to develop filter trading systems that work when applied to

various commodities.
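As a rough illustration, the entry and exit rules of the Elliott Wave counter system can be sketched in Python rather than EasyLanguage. The sketch assumes the wave count and oscillator values have already been computed for each bar; the function name `elliott_signals` and the choice to let the exit take precedence over an entry on the same bar are our own assumptions, not part of the original system. A signal on bar i would be acted on at the next bar's open.

```python
from typing import List


def elliott_signals(wave_count: List[int], osc: List[float]) -> List[str]:
    """Return one signal per bar: 'buy', 'exit_long', or '' (no action).

    wave_count and osc are hypothetical precomputed series standing in for
    the wave counter and the Elliott Wave oscillator.
    """
    signals = [""]  # the first bar has no previous wave count to compare with
    for i in range(1, len(wave_count)):
        now, prev = wave_count[i], wave_count[i - 1]
        if osc[i] < 0:
            # Exit rule: the oscillator has retraced below zero.
            # Assumption: the exit wins if an entry fires on the same bar.
            signals.append("exit_long")
        elif (
            (now == 3 and prev <= 0)      # count changes to a wave three
            or (now == 5 and prev == 4)   # wave four moves to wave five
            or (now == 3 and prev == 5)   # wave five relabeled as wave three
        ):
            signals.append("buy")
        else:
            signals.append("")
    return signals


if __name__ == "__main__":
    waves = [0, 3, 3, 4, 5, 3, 3]
    oscillator = [0.0, 1.0, 2.0, 0.5, 1.0, 2.0, -0.5]
    print(elliott_signals(waves, oscillator))
```

Keeping the signal logic separate from the wave counter makes it easy to swap in better exits, which the text identifies as the weakest part of this system.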

EXAMPLES OF ELLIOTT WAVE SEQUENCES USING ADVANCED GET

We will discuss some examples using charts and Elliott Wave counts generated from Tom Joseph's Advanced GET software. Elliott Wave analysis can be applied to both the futures markets and individual stocks.

In the first example (Figure 13.2), the March 1997 British Pound is shown. From mid-September 1996 through November 1996, the British Pound traded in a very strong Wave Three rally. Then the market enters into a profit-taking stage followed by new highs into January 1997. However, the new high in prices fails to generate a new high in Tom Joseph's Elliott Oscillator, indicating the end of a Five Wave sequence. Once a Five Wave sequence is completed, the market changes its trend.

FIGURE 13.2 British Pound March 1997.

The daily chart of Boise Cascade is shown in Figure 13.3 trading in a Five Wave decline. The new low in Wave Five does not generate a new low in Tom Joseph's Elliott Oscillator, indicating the end of a Five Wave sequence. Once a Five Wave sequence is completed, the market changes its trend.

FIGURE 13.3 Boise Cascade--Daily Stock Chart.

FIGURE 13.4 British Pound with Profit Taking Index (PTI).

Using the Profit-Taking Index (PTI)

When a Wave Four is complete, the major question confronting the trader is whether the market will make a new high in Wave Five. Tom Joseph and his staff at Trading Techniques Inc. have devised a model that will predict the potential for a new high. This model is called the Profit Taking Index (PTI). The PTI is calculated by measuring the area under Wave Three and comparing it with the area under Wave Four. If the PTI is greater than 35, a new high is expected (Figure 13.4).

If the PTI is less than 35, the market fails to make a new high, which usually results in a failed Fifth Wave or Double Top (Figure 13.5).

FIGURE 13.5 Weekly Boise Cascade Stock. Double Top.

Trading Techniques Inc. provides free information on mechanically counting Elliott Waves and other studies. They can be contacted at (330) 645.0077, or the information can be downloaded from their web site www.tradingtech.com.

14

Mechanically Identifying and Testing Candlestick Patterns

Candlestick chart analysis is a subjective form of analysis. The analyst must first identify the patterns and then judge their significance. For example, a white hammer pattern is more bullish after a major downtrend. Several software vendors have developed software to automatically identify candlestick patterns. Some of these products also generate mechanical trading signals. Generally, these packages do well at identifying the patterns, but they have mixed results in using their mechanical trading signals.

In this chapter, we will use fuzzy logic to identify several candlestick patterns using TradeStation. We will also show you how to integrate other forms of technical analysis with candlesticks to develop mechanical trading signals.

HOW FUZZY LOGIC JUMPS OVER THE CANDLESTICK

Let's now see how fuzzy logic can be used to analyze candlestick charts. In our height example, we saw that the first step in developing a fuzzy logic application is to list the variables involved and then develop a list of

attributes for each variable. For a single candlestick, the attributes are as shown in Table 14.1.

TABLE 14.1 A CANDLE'S ATTRIBUTES.

Color               White or black
Shape               Long, small, or about equal
Upper Shadow Size   Long, small, or about none
Lower Shadow Size   Long, small, or about none

Not all variables require fuzzy logic. In our list, color does not, because color is simply the sign of the close minus the open. We will now develop a membership function for each of these variables. The "shape" candlestick variable is represented graphically in Figure 14.1.

FIGURE 14.1 A fuzzy logic function that identifies tall candlesticks.
[Plot of the long membership function against the measuring stick: membership runs from .00 to 1.00 between the average times the zero trigger and the average times the one trigger. "Tall" is two times the average height.]

FUZZY PRIMITIVES FOR CANDLESTICKS

A single candlestick has the following characteristics: color, shape, upper shadow size, and lower shadow size. Not all characteristics require fuzzy logic. As noted above, color does not require fuzzy logic. Let's look at an example of a fuzzy logic function that identifies a candle with a long shape. The code for this function in TradeStation's EasyLanguage is shown in Table 14.2.

TABLE 14.2 CODE FOR FUZZY LONG FUNCTION.

Inputs: OPrice(NUMERICSERIES),CPrice(NUMERICSERIES),LBack(NUMERIC),
OneCof(NUMERIC),ZeroCof(NUMERIC);
Vars: PrevLong(0),CRange(0),AveRange(0),ZTrig(0),OneTrig(0),Tall(0),Scale(0);
{ Calculate the range for the candle }
CRange=absvalue(OPrice-CPrice);
{ Calculate what level represents a 0 }
ZTrig=Average(CRange,LBack)*ZeroCof;
{ Calculate what level represents a 1 }
OneTrig=Average(CRange,LBack)*OneCof;
{ Calculate the difference between the zero and one level }
Scale=OneTrig-ZTrig;
{ If One Level and Zero Level are the same set to 99.99 so it can be a large bar }
if Scale=0 then Scale=99.99;
{ Calculate the fuzzy membership to tall }
Tall=maxlist(0,minlist(1,(CRange-OneTrig)/(Scale)));
{ If previous bar is big relax requirements }
if Tall[1]=1 and CRange[1]-ZTrig<>0 then Tall=maxlist(0,minlist(1,(CRange-CRange[1])/(CRange[1]-ZTrig)));
FuzzyLong=Tall;

The function in Table 14.2 will return a 1 when the current candle size is greater than or equal to OneCof times the average candle size over the last lookback days, and a zero when it is less than ZeroCof times the average size. When the candle size is between these range limits, it returns a scaled value between 0 and 1. This function can also handle a case where the previous candle was very long and the next candle should also be long, but, using the rule based on the average size, the candle would

not have been identified correctly. We also handle divide-by-zero conditions that occur when the open, high, low, and close are all the same.

To identify most of the common candlestick patterns, we need functions that can classify all of the attributes associated with a candlestick. The shape of a candlestick can be long, small, or doji. The upper and lower wick can be large, small, or none. We also need to be able to identify whether there are gaps or whether one candle engulfs another. After we have developed functions to identify these attributes, we can start to identify more complex patterns.

DEVELOPING A CANDLESTICK RECOGNITION UTILITY STEP-BY-STEP

The first step in developing a candlestick recognition tool is to decide what patterns we want to identify. In this chapter, we will identify the following patterns: dark cloud, bullish engulf, and evening star. Next, we need to develop a profile of each of these patterns. The plates for the patterns have been illustrated by Steve Nison in his first book, Japanese Candlestick Charting Techniques, published by John Wiley & Sons, Inc., 1990.

Let's now describe each of these three patterns, beginning with the dark cloud cover. The dark cloud cover consists of two candlesticks: (1) a white candle with a significant body and (2) a black candle that opens above the high of the white candle but closes below the midpoint of the white candle. This is a bearish pattern in an uptrend.

The bullish engulfing pattern also consists of two candlesticks. The first is a black candle. The second is a white candle that engulfs the black candle. This is a bullish sign in a downtrend.

Our final pattern is an evening star. This pattern is a little more complex. It consists of three candles: (1) a significant white candle, (2) a small candle of either color, and (3) a black candle. The middle candle gaps above both the white and black candlesticks. The black candle opens higher than the close of the white but then closes below the midpoint of the white. This is a bearish pattern in an uptrend.

Let's now see how we can translate the candlestick definitions into TradeStation's EasyLanguage code. The primitive functions that we will use in identifying the dark cloud, bullish engulf, and evening star are shown in Table 14.3.

TABLE 14.3 CANDLESTICK PRIMITIVE FUNCTIONS.

Candlestick Color
CandleColor(Open,Close)

Candlestick Shape
FuzzyLong(Open,Close,LookBack,OneTrigger,ZeroTrigger)
FuzzySmall(Open,Close,LookBack,OneTrigger,ZeroTrigger)

Miscellaneous Functions
EnGulfing(Open,Close,RefBar)
WindowDown(Open,High,Low,Close,LookBack)
WindowUp(Open,High,Low,Close,LookBack)

Let's now discuss some of the inputs to these functions, beginning with the parameter LookBack. This is the period used to calculate a moving average of the body size of each candlestick. The moving average is used as a reference point to compare how small or large the current candle is, relative to recent candles.

The OneTrigger is the percentage of the average candle size that will cause the function to output a 1, and the ZeroTrigger is the percentage of the average candle size for outputting a zero. The RefBar parameter is used by the engulfing function to reference which candlestick the current candlestick needs to engulf.

Another important issue when using these functions is that the OneTrigger is smaller than the ZeroTrigger for functions that identify small or doji candles. When using the long candle size function, the OneTrigger is larger than the ZeroTrigger.

The engulfing function returns a 1 if the current candle engulfs the RefBar candle. The window-up and window-down functions return a number greater than zero when there is a gap in the proper direction. The exact return value from these functions is based on the size of the gap relative to the average candle size over the past LookBack days.

Let's now see how to combine these functions to identify the three candlestick formations discussed earlier in the chapter. We will start with the dark cloud.

The dark cloud is a bearish formation. Many times, it signals a top in the market or at least the end of a trend and the start of a period of consolidation. The EasyLanguage code for the dark cloud is shown in Table 14.4.
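The long membership function described earlier can be sketched in Python as a rough illustration. This is not a port of the EasyLanguage function: it follows the prose description (0 at ZeroCof times the average body, 1 at OneCof times the average body, scaled linearly in between), it omits the relaxation for a very long previous candle, and it assumes that early bars use whatever shorter window is available. The name `fuzzy_long` and its parameter names are our own.

```python
from typing import List, Sequence


def fuzzy_long(
    opens: Sequence[float],
    closes: Sequence[float],
    lookback: int,
    one_cof: float,
    zero_cof: float,
) -> List[float]:
    """Membership of each candle body in the fuzzy set 'long' (0.0 to 1.0)."""
    bodies = [abs(o - c) for o, c in zip(opens, closes)]
    memberships = []
    for i, body in enumerate(bodies):
        # Average body size over the last `lookback` bars, current bar included.
        # Assumption: bars early in the series use a shorter window.
        window = bodies[max(0, i - lookback + 1): i + 1]
        average = sum(window) / len(window)
        zero_trig = average * zero_cof  # body size that maps to 0
        one_trig = average * one_cof    # body size that maps to 1
        scale = one_trig - zero_trig
        if scale == 0:
            # Guard against division by zero (for example, when all bodies are zero).
            scale = 99.99
        memberships.append(max(0.0, min(1.0, (body - zero_trig) / scale)))
    return memberships


if __name__ == "__main__":
    o = [10.0, 10.0, 10.0, 10.0, 10.0]
    c = [11.0, 11.0, 11.0, 11.0, 12.5]
    print(fuzzy_long(o, c, lookback=4, one_cof=2.0, zero_cof=1.0))
```

A small-candle function can be built the same way by making the one trigger smaller than the zero trigger, which flips the direction of the ramp.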

TABLE 14.4 CODE FOR DARK CLOUD FORMATION.

Inputs: LookBack(Numeric),OneCof(Numeric),ZeroCof(Numeric);
Vars: Color(0),SBody(0);
Vars: FuzzyRange(0),Return(0);
Color=CandleColor(O,C);
{ Fuzzy Small has the following arguments
FuzzySmall(LookBack,OneCof,ZeroCof) }
{ We reversed OneCof and ZeroCof so that we can test for Not Small as input to
the dark cloud function }
SBody=FuzzySmall(O,C,LookBack,ZeroCof*.3,OneCof*1);
Return=0;
FuzzyRange=Close-(Open[1]+Close[1])/2;
if Color=-1 and Color[1]=1 and open>High[1] and FuzzyRange<0 then begin
Return=1-SBody[1];
end;
DarkCloud=Return;

Let's walk through the code in Table 14.4. First, we save the color of each candlestick. Next, we save the membership of each candlestick to the Fuzzy Small set. Notice that we inverted the OneCof and ZeroCof arguments. (The dark cloud requires the first white candle to have a significant body.) We did this by inverting the small membership function. If we had used the long membership function, we would have missed many dark clouds because the first candle was significant but not quite long. Next, we calculate whether the second candlestick is black and falls below the midpoint of the first candle that is white. The second candle must also open above the high of the first.

If the candle qualifies as a dark cloud, we return the fuzzy inverse membership of the first candle to the class Fuzzy Small as the value of the fuzzy dark cloud.

How do we identify a bullish engulfing pattern? The EasyLanguage code for this pattern is shown in Table 14.5.

TABLE 14.5 CODE FOR BULLISH ENGULF PATTERN.

Inputs: LookBack(Numeric),OneCof(Numeric),ZeroCof(Numeric);
Vars: Color(0),SBody(0),LBody(0);
Color=CandleColor(O,C);
SBody=FuzzySmall(O,C,LookBack,OneCof*.3,ZeroCof*1);
LBody=FuzzyLong(O,C,LookBack,OneCof*2,ZeroCof*1);
if EnGulfing(O,C,1)=1 and Color=1 and Color[1]=-1 then BullEngulf=
minlist(SBody[1],LBody)
else
BullEngulf=0;

When identifying a bullish engulf, the first thing we do is to evaluate the color and size of each candle. If the current candle is white and engulfs the first candle that is black, we have a possible bullish engulf. The significance of the bullish engulf pattern is measured by using our fuzzy logic functions for candle size. We take a fuzzy "AND" between the membership of the previous candle in the Small class and the membership of the current candle in the Large class. This value measures the importance of the engulfing pattern. If the pattern does not qualify as a bullish engulfing pattern, we return a 0.

Table 14.6 shows the code of an evening star. The code first tests the color of each candle as well as its membership in the Small class. Next, we test to see where the close of the current candle falls in the range of the first candle in the formation.

TABLE 14.6 CODE FOR THE EVENING STAR PATTERN.

Inputs: LookBack(Numeric),OneCof(Numeric),ZeroCof(Numeric);
Vars: Color(0),SBody(0);
Vars: FuzzyRange(0),Return(0);
Color=CandleColor(O,C);
SBody=FuzzySmall(O,C,LookBack,OneCof*.3,ZeroCof*1);
Return=0;
FuzzyRange=Close-(Close[2]+Open[2])/2;
if Color=-1 and Color[2]=1 and WindowUp(O,H,L,C,1)[1]>0 and
open>open[1] and FuzzyRange<0 then begin
Return=minlist(SBody[1],1-SBody[2]);
end;
EveningStar=Return;
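The three pattern tests can also be sketched in Python as a rough illustration of how the crisp conditions (color, gap, midpoint) gate a fuzzy score. The sketch takes the fuzzy memberships as precomputed lists (`small` and `long_`, both hypothetical inputs on a 0 to 1 scale). The engulfing test (the current real body covers the previous real body) and the gap test (the middle candle's low is above the first candle's high) are our own assumptions standing in for the EnGulfing and WindowUp functions, whose code is not shown in the text.

```python
from typing import Sequence


def candle_color(o: float, c: float) -> int:
    """1 for a white candle, -1 for a black candle, 0 when open equals close."""
    return (c > o) - (c < o)


def dark_cloud(o: Sequence[float], h: Sequence[float], c: Sequence[float],
               small: Sequence[float], i: int) -> float:
    """Fuzzy dark cloud score for bar i (0.0 when the pattern does not qualify)."""
    if i < 1:
        return 0.0
    midpoint = (o[i - 1] + c[i - 1]) / 2
    if (candle_color(o[i], c[i]) == -1 and candle_color(o[i - 1], c[i - 1]) == 1
            and o[i] > h[i - 1] and c[i] < midpoint):
        return 1.0 - small[i - 1]  # inverse of the first candle's Small membership
    return 0.0


def bullish_engulf(o: Sequence[float], c: Sequence[float], small: Sequence[float],
                   long_: Sequence[float], i: int) -> float:
    """Fuzzy bullish engulfing score for bar i."""
    if i < 1:
        return 0.0
    # Assumed engulfing test: the current real body covers the previous one.
    engulfs = (min(o[i], c[i]) <= min(o[i - 1], c[i - 1])
               and max(o[i], c[i]) >= max(o[i - 1], c[i - 1]))
    if engulfs and candle_color(o[i], c[i]) == 1 and candle_color(o[i - 1], c[i - 1]) == -1:
        return min(small[i - 1], long_[i])  # fuzzy AND of the two memberships
    return 0.0


def evening_star(o: Sequence[float], h: Sequence[float], l: Sequence[float],
                 c: Sequence[float], small: Sequence[float], i: int) -> float:
    """Fuzzy evening star score for bar i."""
    if i < 2:
        return 0.0
    midpoint = (o[i - 2] + c[i - 2]) / 2
    gapped_up = l[i - 1] > h[i - 2]  # assumed stand-in for the window-up test
    if (candle_color(o[i], c[i]) == -1 and candle_color(o[i - 2], c[i - 2]) == 1
            and gapped_up and o[i] > o[i - 1] and c[i] < midpoint):
        return min(small[i - 1], 1.0 - small[i - 2])
    return 0.0
```

Because each function returns a graded score rather than a yes or no, a trading rule can require a minimum significance (for example, a score above .5) before acting on a pattern.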

For an evening star, we need a black current candle and a white candle two candlesticks ago. Next, we need the second candle in the formation to have gapped higher. Finally, the current candle must open higher than the middle candle but must close at or below the middle of the first candle. All of these requirements must be met in order for the formation to qualify as an evening star. We then return the fuzzy "AND" of one candle ago to the class Small and the "AND" of two candles ago to the inverse of the Small class. If the formation does not qualify as an evening star, the function returns a 0.

We can repeat this process to identify any candlestick patterns we wish. Once we have written the code, we need to test it. To test the code given here, we used the plates from Nison's book Japanese Candlestick Charting Techniques, and tested our routines on the same charts. If you are not identifying the patterns you want, you can adjust the LookBack period as well as the scaling coefficients. In general, mechanical identification will miss some patterns that can be detected by the human eye. After you have developed routines to identify your candlestick patterns, you can use them to develop or improve various trading systems.

How can we use candlesticks to develop mechanical trading strategies? We will test use of the dark cloud cover pattern on Comex gold during the period from S/1/86 to 12/26/95. We will go short on the next open when we identify a dark cloud cover pattern and have a 10-day RSI greater than 50. We will then exit at a five-day high.

Table 14.7 shows the code and results, without slippage and commissions.

TABLE 14.7 CODE AND RESULTS FOR SIMPLE DARK CLOUD SYSTEM.

Vars: DC(0);
DC=DarkCloud(15,1,1);
If DC>.5 and RSI(close,10)>50 then sell at open;
exitshort at highest(high,5) stop;

Net profit      $150.00
Trades          13
Wins            4
Losses          9
Win%            31
Average trade   $11.54

These results are horrible and cannot even cover slippage and commissions. Let's now see how combining the correlation analysis between the CRB and gold can improve the performance of the dark cloud cover pattern for trading Comex gold. In Chapter 10, we showed that gold normally trends only when it is highly correlated to the CRB. Let's use this information to develop two different exit rules for our dark cloud, combined with a 10-day RSI pattern. We will now exit using a 5-day high when the 50-day correlation between gold and the CRB is above .50. We will use a limit order at the entry price (-2 x average(range,10)) when the correlation is below .50. According to the theory, we use a trend type exit when the market should trend, and a limit order when the market has a low probability of trending. The code we are using for these rules, and the results without slippage and commissions, are shown in Table 14.8.

TABLE 14.8 CODE AND RESULTS OF COMBINING INTERMARKET ANALYSIS AND CANDLESTICKS.

Vars: DC(0),Correl(0),CRB(0),GC(0);
CRB=Close of Data2;
GC=Close;
Correl=RACorrel(CRB,GC,50);
DC=DarkCloud(15,1,1);
If DC>.5 and RSI(close,10)>50 then sell at open;
If Correl>.5 then exitshort at highest(high,5) stop;
If Correl<.5 then exitshort at entryprice-1*average(range,10) limit;

Net profit      $5,350.00
Trades          12
Wins            9
Losses          3
Win%            75
Average trade   $445.83

The performance of the dark cloud is incredibly improved by simply using intermarket analysis to select an exit method. There are not enough trades to prove whether this is a reliable trading method for gold. It is used only as an example of how to use candlestick patterns to develop mechanical trading systems.
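The idea of letting intermarket correlation choose the exit can be sketched in Python as a rough illustration. The sketch assumes a plain Pearson correlation of closing prices over the last 50 bars as a stand-in for the correlation function used in the system, whose exact definition is not given here. The function names, the tuple return format, and the `range_mult` parameter (set to 2, following the rule as described in the text) are our own.

```python
from math import sqrt
from typing import Optional, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length series (0.0 if either is flat)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def choose_short_exit(
    gold: Sequence[float],
    crb: Sequence[float],
    highs: Sequence[float],
    ranges: Sequence[float],
    entry_price: float,
    corr_len: int = 50,
    threshold: float = 0.5,
    range_mult: float = 2.0,  # assumption: multiplier taken from the prose rule
) -> Tuple[str, Optional[float]]:
    """Pick the exit order for an open short position on the latest bar."""
    corr = pearson(gold[-corr_len:], crb[-corr_len:])
    if corr > threshold:
        # Market likely to trend: trend-type exit, buy stop at the 5-bar high.
        return ("stop", max(highs[-5:]))
    if corr < threshold:
        # Market unlikely to trend: take profits with a limit order below entry.
        recent = ranges[-10:]
        return ("limit", entry_price - range_mult * sum(recent) / len(recent))
    return ("none", None)  # correlation exactly at the threshold: no rule given
```

Separating the correlation measure from the exit choice makes it straightforward to test other thresholds or other intermarket series without touching the entry logic.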

Combining candlesticks with other Western methods and with intermarket analysis is a hot area of research for developing mechanical trading systems. The ability to backtest candlestick patterns will help answer the question of how well any given candlestick pattern works in a given market. By evaluating candlestick patterns objectively, at least we know how well they worked in the past and how much heat we will take when trading them.

Part Four

TRADING SYSTEM DEVELOPMENT AND TESTING

TABLE 15.1 STEPS IN DEVELOPING A SYSTEM.

1. Decide what market and time frame you want to trade.
2. Develop a premise that you will use to design your trading system.
3. Collect and organize the historical market data needed to develop your model into development, testing, and out-of-sample sets.
4. Based on the market you want to trade, and your premise, select trading methods that are predictive of that market and meet your own risk-reward criteria.
5. Design your entries and test them using simple exits.
6. Develop filters that improve your entry rules.
7. After you have developed your entries, design more advanced exit methods that will improve your system's performance.
8. When selecting the parameters that will be used in your trading system, base your selections not only on system performance but also on robustness.
9. After you have developed your rules and selected your parameters, test the system on your testing set to see how the system works on new data. If the system works well, continue to do more detailed testing. (This is covered in Chapter 16.)
10. Repeat steps 3 through 9 until you have a system that you want to test further.

system. After you have selected your market(s), it is very important to decide what time frame you want to trade on and to have an idea of how long you would like your average trade to last.

Selecting a time frame means deciding whether to use intraday, daily, or weekly data. Your decision on a time frame should be based on both how often you want to trade and how long each trade should last. When you use a shorter time frame, trading frequency increases and length of trades decreases.

Another little discussed issue is that each time frame and each market has its own traits. For example, on the intraday S&P500, the high and/or low of the day is most likely to occur in the first or last 90 minutes of the day.

When using daily data, you can have wide variation in trade length, from a few days to a year, depending on what types of methods are used to generate your signals. For example, many of the intermarket-based methods for trading T-Bonds, shown in Chapter 10, have an average trade length of 10 days, whereas a simple channel breakout system has an average trade length of 50 to 80 days.

Other issues also have an effect on your choice: how much money you have to trade, and your own risk-reward criteria. For example, if you only have $10,000.00, you would not develop an S&P500 system that holds an overnight position. A good choice would be T-Bonds.

DEVELOPING A PREMISE

The second and most important step in developing a trading system is to develop a premise or theory about the market you have selected. There are many rich sources for theories that are useful in developing systems for various markets. Some of these sources are listed in Table 15.2. Many of those listed were covered in earlier chapters of this book.

TABLE 15.2 PREMISES FOR TRADING SYSTEMS.

1. Intermarket analysis.
2. Sentiment indicators.
3. Market internals: for example, for the S&P500, we would use data such as breadth, arm index, the premium between the cash and futures, and so on.
4. Mechanical models of subjective methods.
5. Trend-based models.
6. Seasonality, day of week, month of year, and so on.
7. Models that analyze technical indicators or price-based patterns.

DEVELOPING DATA SETS

After you select the market and time frame you want to trade, you need to collect and organize your historical data into three sets: (1) the development set, which is used to develop your trading rules; (2) a test set, where your data are used to test these rules; and (3) a blind or out-of-sample set, which is used only after you have selected the final rules and parameters for your system. In developing many systems over the years, I have found that one of the important issues involved in collecting these

data is whether we should use individual contracts or continuous contracts for the futures data we need to use. It is much easier to use continuous contracts, at least for the development set, and the continuous contracts should be back adjusted. Depending on your premise, you might need to develop data sets not only for the market you are trading but also for related markets; that is, if we were developing a T-Bond system using intermarket analysis, we would want to also have a continuous contract for the CRB futures.

The next important issue is how much data you need. To develop reliable systems, you should have at least one bull and one bear market in your data set. Ideally, you should have examples of bull and bear markets in both your development set and your test set. I have found that having at least 10 years' daily data is a good rule of thumb.

SELECTING METHODS FOR DEVELOPING A TRADING SYSTEM

After you have developed your premise and selected the technologies that can be used to support it, you must prove that your premise is valid and that you actually can use it to predict the market you are trying to trade.

Another issue that you need to deal with is whether a system that is based on your premise will fit your trading personality. This issue is often overlooked but is very important. Let's suppose your premise is based on the fact that the currency markets trend. This approach will not work for you if you cannot accept losing at least half your trades or are unable to handle watching a system give up half or more of its profit on a given trade.

In another example, you might have a pattern-based premise that produced a high winning percentage but traded only 10 times a year. Many people, needing more action than this, would take other trades not based on a system and would start losing money.

Let's now look at several different premises and how they are implemented. This information, plus the pros and cons of each premise, is shown in Table 15.3.

TABLE 15.3 TYPES OF TRADING METHODS AND IMPLEMENTATION.

Premise: Trend following
Implementation: Moving averages; channel breakout; consecutive closes
Pros/Cons: All trend-following methods work badly in nontrending markets. The channel breakout and consecutive closes are the most robust implementations. You will win only 30 percent to 50 percent of your trades.

Premise: Countertrend methods
Implementation: Oscillator divergence and cycle-based methods
Pros/Cons: Don't trade often. These offer higher winning percentages than trend-following methods, but they can suffer large losing trades.

Premise: Cause and effect (intermarket and fundamental analysis)
Implementation: Technical analysis, comparing two or more data series
Pros/Cons: These systems can work only on the markets they were developed to trade. They can suffer high drawdowns but yield a good winning percentage.

Premise: Pattern and statistically based methods
Implementation: Simple rules of three or more conditions
Pros/Cons: Don't trade often. These need to be tested to make sure they are not curve-fitted. They have a good winning percentage and drawdown if they are robust.

After you have selected your method, you need to test it. Let's suppose your premise was that T-Bond prices can be predicted based on inflation. As in Chapter 1, several different commodities can be used as measures of inflation; for example, the CRB index, copper, and gold can all be used to predict T-Bonds. Next, you need to test your premise that inflation can be used to predict T-Bonds. When inflation is rising and T-Bonds are also rising, you sell T-Bonds. If inflation is falling and so are T-Bonds, you buy T-Bonds. You can then test this simple divergence premise using either the price momentum or prices relative to a moving average. Once you have proven that your premise is predictive, you can use a simple reversal system and start to develop your entries and exits.
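The divergence premise above can be sketched in Python using price momentum, as one rough way to turn the rule into testable signals. The function name `divergence_signals`, the 10-bar default lookback, and the use of a simple price difference as "momentum" are our own assumptions; the inflation series could be any of the measures mentioned, such as the CRB index.

```python
from typing import List, Sequence


def divergence_signals(
    bonds: Sequence[float],
    inflation: Sequence[float],
    lookback: int = 10,  # assumed momentum period, chosen only for illustration
) -> List[int]:
    """Return -1 (sell T-Bonds), 1 (buy T-Bonds), or 0 (no signal) for each bar."""
    signals = []
    for i in range(len(bonds)):
        if i < lookback:
            signals.append(0)  # not enough history to measure momentum
            continue
        bond_mom = bonds[i] - bonds[i - lookback]
        infl_mom = inflation[i] - inflation[i - lookback]
        if bond_mom > 0 and infl_mom > 0:
            signals.append(-1)  # inflation rising while bonds rise: sell bonds
        elif bond_mom < 0 and infl_mom < 0:
            signals.append(1)   # inflation falling while bonds fall: buy bonds
        else:
            signals.append(0)
    return signals


if __name__ == "__main__":
    t_bonds = [100.0, 101.0, 102.0, 101.0, 100.0]
    crb = [200.0, 201.0, 202.0, 201.0, 200.0]
    print(divergence_signals(t_bonds, crb, lookback=2))
```

Counting how often a signal is followed by a move in the expected direction on the development set is a simple first check of whether the premise is predictive before any entries or exits are designed.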

pose your premise was that T-Bond prices can be predicted based on

DESIGNING ENTRIES

The simplest type of system is the classic reversal, which has been shown several times in this book. For example, the channel breakout system shown in Chapter 4 is a reversal-type system. This type of system can be either long or short, a feature that creates problems because sometimes a reversal-type system can produce very large losing trades.

Having discussed the simplest type of system entry, let's now examine the entries themselves. There are two major types of entries: (1) simple and (2) complex. Simple entries signal a trade when a condition is true. Complex entries need an event to be true, and another "trigger" event must occur to actually produce the entry. A simple entry will enter a trade on the next open or on today's close, if a given event is true. For example, the following rule is a simple entry:

If today = Monday and T-Bonds > T-Bonds[5], then buy S&P500 at open.

The rule's conditional part, before the "then," can be as complex as needed, but the actual order must always occur when the rule is true.

Complex entries use a rule combined with a trigger, for example:

If today = Monday and T-Bonds > T-Bonds[5], then buy S&P500 at open + .3 x range stop.

This is a complex entry rule because we do not buy only when the rule is true. We require the trigger event (being 30 percent of yesterday's range above the open) to occur, or we will not enter the trade.

For both types of entries, the part of the rule before the "then" states the events that give a statistical edge. In the above examples, Mondays in which T-Bonds are in an uptrend have a statistically significant upward bias. Triggers are really filters that increase the statistical edge. Suppose, for day-trading the S&P500, we use a trigger that buys a breakout of 30 percent of yesterday's range above the open. This works because once a market moves 30 percent above or below the open in a given direction, the chance of a large-range day counter to that move drops significantly. This is important because the biggest problem a trading system can have is very large losing trades. They increase the drawdown and can make the system untradable.

The top traders in the world use these types of complex entries; for example, many of Larry Williams's patterns are based on buying or selling on a stop based on the open, plus or minus a percentage of yesterday's range when a condition is true.

Now that you have a basic understanding of developing entries, you need to learn how to test them. When testing entries, you should use simple exits. Among the simple exits I use are: holding a position for N bars, using target profits, and exiting on first profitable opening. Another test of entries is to use a simple exit and then lag when you actually get into the trade. I normally test lags of 1 to 5 bars. The less a small lag affects the results, the more robust the entry. Another thing you can learn is that, sometimes, when using intermarket or fundamental analysis, lagging your entries by a few bars can actually help performance. Testing your entries using simple exits will help you not only to develop better entries but also to test them for robustness.

When you have found entry methods that work well, see whether there are any patterns that produce either substandard performance or superior performance. These patterns can be used as filters for your entry rules.

DEVELOPING FILTERS FOR ENTRY RULES

Developing filters for entry rules is a very important step in the system development process. Filters are normally discovered during the process of testing entries; for example, you might find out that your entry rules do not work well during September and October. You can use this information to filter out trades in those months.

Another popular type of filter is a trend detection indicator, like ADX, for filtering a trend-following system. The goal in using these types of filters is to filter out trades that have lower expectations than the overall trades produced by a given pattern. For example, our correlation filter in Chapter 8 was applied to our rule of buying on Monday when T-Bonds were above their 26-day moving average. It succeeded in filtering out about 60 trades that produced only $6.00 a trade. With the average trade over $200.00, filtering out these trades greatly improved the results of this
