

assured by one of the fundamental theorems of probability called the
law of large numbers. This theorem asserts that, for any ε > 0,

    Pr[|x/n − p| < ε]

tends to 1 as n increases indefinitely.
It is important to understand what this theorem says and what it
does not say. Let us illustrate its meaning in the case of coin tossing.
We are going to toss a coin n times and we want the probability to be
very high, say greater than .99, that the fraction of heads which turn
up will be very close, say within .001 of the value .5. The law of large
numbers assures us that we can have this if we simply choose n large
enough. The theorem itself gives us no information about how large n
must be. Let us however consider this question.
To say that the fraction of the times success results is near p is the
same as saying that the actual number of successes x does not deviate
too much from the expected number np. To see the kind of deviations
which might be expected we can study the value of Pr[|x − np| ≥ d]. A
table of these values for p = .3 and various values of n and d is given
in Figure 4.19. Let us ask how large d must be before a deviation as
large as d could be considered surprising. For example, let us see for
each n the value of d which makes Pr[|x − np| ≥ d] about .04. From
the table, we see that d should be 7 for n = 50, 9 for n = 80, 10 for
n = 100, etc. To see deviations which might be considered more typical
we look for the values of d which make Pr[|x − np| ≥ d] approximately
1/3. Again from the table, we see that d should be 3 or 4 for n = 50, 4
or 5 for n = 80, 5 for n = 100, etc. The answers to these two questions

are given in the last two columns of the table. An examination of these
numbers shows us that deviations which we would consider surprising
are approximately √n, while those which are more typical are about
one-half as large, or √n/2.

This suggests that √n, or a suitable multiple of it, might be taken
as a unit of measurement for deviations. Of course, we would also have
to study how Pr[|x − np| ≥ d] depends on p. When this is done, one
finds that √(npq) is a natural unit; it is called a standard deviation. It
can be shown that for large n the following approximations hold.

    Pr[|x − np| ≥ √(npq)] ≈ .3174
    Pr[|x − np| ≥ 2√(npq)] ≈ .0455
    Pr[|x − np| ≥ 3√(npq)] ≈ .0027
That is, a deviation from the expected value of one standard devi-
ation is rather typical, while a deviation of as much as two standard
deviations is quite surprising and three very surprising. For values of p
not too near 0 or 1, the value of √(pq) is approximately 1/2. Thus these
approximations are consistent with the results we observed from our
table.
For large n, Pr[x − np ≥ k√(npq)] or Pr[x − np ≤ −k√(npq)] can be
shown to be approximately the same. Hence these probabilities can be
estimated for k = 1, 2, 3 by taking 1/2 the values given above.
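These approximations are easy to check against exact binomial tail
probabilities. A minimal sketch in Python (the function name and the
choice n = 400, p = .3 are ours, for illustration):

```python
from math import comb, sqrt

def deviation_prob(n, p, k):
    # Exact Pr[|x - np| >= k*sqrt(npq)] for the number of successes x
    # in n independent trials with probability p for success.
    sd = sqrt(n * p * (1 - p))
    return sum(comb(n, x) * p**x * (1 - p)**(n - x)
               for x in range(n + 1)
               if abs(x - n * p) >= k * sd)

for k in (1, 2, 3):
    # Compare with the limiting values .3174, .0455, .0027.
    print(k, round(deviation_prob(400, 0.3, k), 4))
```

For moderate n the exact tails land near, but not exactly on, the
limiting values; the k = 1 tail, for instance, comes out slightly below
.3174 here because x is integer-valued.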

Example 4.19 In throwing an ordinary coin 10,000 times, the ex-
pected number of heads is 5000, and the standard deviation for the
number of heads is √(10,000 · (1/2) · (1/2)) = 50. Thus the probability that
the number of heads which turn up deviates from 5000 by as much as
one standard deviation, or 50, is approximately .317. The probability
of a deviation of as much as two standard deviations, or 100, is ap-
proximately .046. The probability of a deviation of as much as three
standard deviations, or 150, is approximately .003. ♦

Example 4.20 Assume that in a certain large city, 900 people are
chosen at random and asked if they favor a certain proposal. Of the
900 asked, 550 say they favor the proposal and 350 are opposed. If, in
fact, the people in the city are equally divided on the issue, would it be
unlikely that such a large majority would be obtained in a sample of 900
of the citizens? If the people were equally divided, we would assume
that the 900 people asked would form an independent trials process

with probability 1/2 for a “yes” answer and 1/2 for a “no” answer. Then
the standard deviation for the number of “yes” answers in 900 trials is
√(900 · (1/2) · (1/2)) = 15. Then it would be very unlikely that we would obtain
a deviation of more than 45 from the expected number of 450. The
fact that the deviation in the sample from the expected number was
100, then, is evidence that the hypothesis that the voters were equally
divided is incorrect. The assumption that the true proportion is any
value less than 1/2 would also lead to the fact that a number as large
as 550 favoring in a sample of 900 is very unlikely. Thus we are led to
suspect that the true proportion is greater than 1/2. On the other hand,
if the number who favored the proposal in the sample of 900 were 465,
we would have only a deviation of one standard deviation, under the
assumption of an equal division of opinion. Since such a deviation is
not unlikely, we could not rule out this possibility on the evidence of
the sample. ♦
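The two deviations used in this example are quick to compute. A
minimal sketch in Python (the variable names are ours):

```python
from math import sqrt

n, p = 900, 0.5               # the equal-division hypothesis
expected = n * p              # 450
sd = sqrt(n * p * (1 - p))    # 15
for observed in (550, 465):
    # Deviation from the expected number, in standard deviations.
    print(observed, (observed - expected) / sd)   # 6.67 and 1.0
```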


Example 4.21 A certain Ivy League college would like to admit 800
students in their freshman class. Experience has shown that if they
admit 1250 students they will have acceptances from approximately
800. If they admit as many as 50 too many students they will have
to provide additional dormitory space. Let us find the probability that
this will happen assuming that the acceptances of the students can be
considered to be an independent trials process. We take as our estimate
for the probability of an acceptance p = 800/1250 = .64. Then the
expected number of acceptances is 800 and the standard deviation for
the number of acceptances is √(1250 · .64 · .36) ≈ 17. The probability that the number
accepted is three standard deviations or 51 from the mean is approx-
imately .0027. This probability takes into account a deviation above
the mean or below the mean. Since in this case we are only interested
in a deviation above the mean, the probability we desire is half of this
or approximately .0013. Thus we see that it is highly unlikely that the
college will have to have new dormitory space under the assumptions
we have made. ♦

We finish this discussion of the law of large numbers with some final
remarks about the interpretation of this important theorem.
Of course no matter how large n is we cannot prevent the coin from
coming up heads every time. If this were the case we would observe a
fraction of heads equal to 1. However, this is not inconsistent with the
theorem, since the probability of this happening is (1/2)ⁿ, which tends to

0 as n increases. Thus a fraction of 1 is always possible, but becomes
increasingly unlikely.
The law of large numbers is often misinterpreted in the following
manner. Suppose that we plan to toss the coin 1000 times and after
500 tosses we have already obtained 400 heads. Then we must obtain
less than one-half heads in the remaining 500 tosses to have the fraction
come out near 1/2. It is tempting to argue that the coin therefore owes
us some tails and it is more likely that tails will occur in the last 500
tosses. Of course this is nonsense, since the coin has no memory. The
point is that something very unlikely has already happened in the first
500 tosses. The final result can therefore also be expected to be a result
not predicted before the tossing began.
We could also argue that perhaps the coin is a biased coin but this
would make us predict more heads than tails in the future. Thus the
law of averages, or the law of large numbers, should not give you great
comfort if you have had a series of very bad hands dealt you in your
last 100 poker hands. If the dealing is fair, you have the same chance
as ever of getting a good hand.
Early attempts to define the probability p that success occurs on
a single experiment sounded like this. If the experiment is repeated
indefinitely, the fraction of successes obtained will tend to a number p,
and this number p is called the probability of success on a single exper-
iment. While this fails to be satisfactory as a definition of probability,
the law of large numbers captures the spirit of this frequency concept
of probability.
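This frequency concept is easy to illustrate by simulation. A minimal
sketch in Python (the seed and the sample sizes are arbitrary choices
of ours):

```python
import random

random.seed(0)
for n in (100, 10_000, 1_000_000):
    heads = sum(random.random() < 0.5 for _ in range(n))
    # The observed fraction of heads tends to settle near p = .5,
    # although no single run is guaranteed to do so.
    print(n, heads / n)
```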

Exercises
1. If an ordinary die is thrown 20 times, what is the expected number
of times that a six will turn up? What is the standard deviation
for the number of sixes that turn up?
[Ans. 10/3; 5/3.]


2. Suppose that an ordinary die is thrown 450 times. What is the
expected number of throws that result in either a three or a four?
What is the standard deviation for the number of such throws?

3. In 16 tosses of an ordinary coin, what is the expected number
of heads that turn up? What is the standard deviation for the
number of heads that occur?

[Ans. 8; 2.]

4. In 16 tosses of a coin, find the exact probability that the number
of heads that turn up differs from the expected number by (a)
as much as one standard deviation, and (b) by more than one
standard deviation. Do the same for the case of two standard
deviations, and for the case of three standard deviations. Show
that the approximations given for large n lie between the values
obtained, but are not very accurate for so small an n.

[Ans. .454; .210; .077; .021; .004; .001.]

5. Consider n independent trials with probability p for success. Let r
and s be numbers such that p < r < s. What does the law of large
numbers say about Pr[r < x/n < s] as we increase n indefinitely?
Answer the same question in the case that r < p < s.

6. A drug is known to be effective in 20 per cent of the cases where
it is used. A new agent is introduced, and in the next 900 times
the drug is used it is effective 250 times. What can be said about
the effectiveness of the drug?

7. In a large number of independent trials with probability p for
success, what is the approximate probability that the number of
successes will deviate from the expected number by more than
one standard deviation but less than two standard deviations?

[Ans. .272.]

8. What is the approximate probability that, in 10,000 throws of an
ordinary coin, the number of heads which turn up lies between
4850 and 5150? What is the probability that the number of heads
lies in the same interval, given that in the first 1900 throws there
were 1600 heads?

9. Suppose that it is desired that the probability be approximately
.95 that the fraction of sixes that turn up when a die is thrown n
times does not deviate by more than .01 from the value 1/6. How
large should n be?

[Ans. Approximately 5555.]

10. Suppose that for each roll of a fair die you lose $1 when an odd
number comes up and win $1 when an even number comes up.
Then after 10,000 rolls you can, with approximately 84 per cent
confidence, expect to have lost not more than $(how much?).

11. Assume that 10 per cent of the people in a certain city have
cancer. If 900 people are selected at random from the city, what
is the expected number which will have cancer? What is the
standard deviation? What is the approximate probability that
more than 108 of the 900 chosen have cancer?

[Ans. 90; 9; .023.]

12. Suppose that in Exercise 11, the 900 people are chosen at random
from those people in the city who smoke. Under the hypothesis
that smoking has no effect on the incidence of cancer, what is the
expected number in the 900 chosen that have cancer? Suppose
that more than 120 of the 900 chosen have cancer, what might be
said concerning the hypothesis that smoking has no effect on the
incidence of cancer?

13. In Example 4.20, we made the assumption in our calculations
that, if the true proportion of voters in favor of the proposal
were p, then the 900 people chosen at random represented an
independent trials process with probability p for a “yes” answer,
and 1 − p for a “no” answer. Give a method for choosing the 900
people which would make this a reasonable assumption. Criticize
the following methods.

(a) Choose the first 900 people in the list of registered Republi-
cans.
(b) Choose 900 names at random from the telephone book.
(c) Choose 900 houses at random and ask one person from each
house, the houses being visited in the mid-morning.

14. For n throws of an ordinary coin, let tₙ be such that

    Pr[−tₙ < x/n − 1/2 < tₙ] = .997,

where x is the number of heads that turn up. Find tₙ for n = 10⁴,
n = 10⁶, and n = 10²⁰.

[Ans. .015; .0015; .000,000,000,15.]

15. Assume that a calculating machine carries out a million opera-
tions to solve a certain problem. In each operation the machine
gives the answer 10⁻⁵ too small, with probability 1/2, and 10⁻⁵ too
large, with probability 1/2. Assume that the errors are independent
of one another. What is a reasonable accuracy to attach to the
answer? What if the machine carries out 10¹⁰ operations?

[Ans. ±.01; ±1.]

16. A computer tosses a coin 1 million times, and obtains 499,588
heads. Is this number reasonable?


4.11 Independent trials with more than
two outcomes
By extending the results of Section 4.8, we shall study the case of
independent trials in which we allow more than two outcomes. We
assume that we have an independent trials process where the possible
outcomes are a1 , a2 , . . . , ak , occurring with probabilities p1 , p2 , . . . , pk ,
respectively. We denote by

f (r1 , r2 , . . . , rk ; p1 , p2 , . . . , pk )

the probability that, in n = r1 + r2 + . . . + rk such trials, there will be r1
occurrences of a1 , r2 occurrences of a2 , etc. In the case of two outcomes
this notation would be f(r1, r2; p1, p2). In Section 4.8 we wrote this as
f(n, r1; p), since r2 and p2 are determined from n, r1, and p1. We
shall indicate how this probability is found in general, but carry out
the details only for a special case. We choose k = 3, and n = 5 for
purposes of illustration. We shall find f(1, 2, 2; p1, p2, p3).
We show in Figure 4.20 enough of the tree for this process to indicate
the branch probabilities for a path (heavy lined) corresponding to the
outcomes a2 , a3 , a1 , a2 , a3 . The tree measure assigns weight p2 · p3 · p1 ·
p2 · p3 = p1 · p2² · p3² to this path.
There are, of course, other paths through the tree corresponding to
one occurrence of a1 , two of a2 , and two of a3 . However, they would all
be assigned the same weight p1 · p2² · p3² by the tree measure. Hence to




Figure 4.20

find f(1, 2, 2; p1, p2, p3) we must multiply this weight by the number of
paths having the specified number of occurrences of each outcome.
We note that the path a2 , a3 , a1 , a2 , a3 can be speci¬ed by the three-
cell partition [{3}, {1, 4}, {2, 5}] of the numbers from 1 to 5. Here the
first cell shows the experiment which resulted in a1, the second cell
shows the two that resulted in a2 , and the third shows the two that
resulted in a3 . Conversely, any such partition of the numbers from 1
to 5 with one element in the first cell, two in the second, and two in
the third corresponds to a unique path of the desired kind. Hence the
number of paths is the number of such partitions. But this is
    (5 choose 1, 2, 2) = 5!/(1!2!2!)
(see 3.4), so that the probability of one occurrence of a1 , two of a2 , and
two of a3 is
    (5 choose 1, 2, 2) · p1 · p2² · p3².
The above argument carried out in general leads, for the case of
independent trials with outcomes a1 , a2 , . . . , ak occurring with proba-
bilities p1 , p2 , . . . , pk , to the following.
The probability for r1 occurrences of a1 , r2 occurrences of
a2 , etc., is given by
    f(r1, r2, . . . , rk; p1, p2, . . . , pk) = (n choose r1, r2, . . . , rk) · p1^r1 · p2^r2 · . . . · pk^rk.



Example 4.22 A die is thrown 12 times. What is the probability that
each number will come up twice? Here there are six outcomes, 1, 2,
3, 4, 5, 6 corresponding to the six sides of the die. We assign each
outcome probability 1/6. We are then asked for

    f(2, 2, 2, 2, 2, 2; 1/6, 1/6, 1/6, 1/6, 1/6, 1/6),

which is

    (12 choose 2, 2, 2, 2, 2, 2) · (1/6)²(1/6)²(1/6)²(1/6)²(1/6)²(1/6)² = .0034.

♦
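The general formula is mechanical to evaluate. A minimal sketch in
Python (the function name is ours); it reproduces the value .0034 just
computed:

```python
from math import factorial, prod

def multinomial_prob(rs, ps):
    # f(r1,...,rk; p1,...,pk): the probability of exactly r_i
    # occurrences of outcome a_i in n = r1 + ... + rk trials.
    n = sum(rs)
    coeff = factorial(n)
    for r in rs:
        coeff //= factorial(r)          # the multinomial coefficient
    return coeff * prod(p**r for p, r in zip(ps, rs))

# Example 4.22: each face of a die exactly twice in 12 throws.
print(round(multinomial_prob([2] * 6, [1 / 6] * 6), 4))   # 0.0034
```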


Example 4.23 Suppose that we have an independent trials process
with four outcomes a1, a2, a3, a4 occurring with probability p1, p2, p3,
p4, respectively. It might be that we are interested only in the proba-
bility that r1 occurrences of a1 and r2 occurrences of a2 will take place
with no specification about the number of each of the other possible
outcomes. To answer this question we simply consider a new exper-
iment where the outcomes are a1, a2, ā3. Here ā3 corresponds to an
occurrence of either a3 or a4 in our original experiment. The corre-
sponding probabilities would be p1, p2, and p̄3 with p̄3 = p3 + p4. Let
r̄3 = n − (r1 + r2). Then our question is answered by finding the prob-
ability in our new experiment for r1 occurrences of a1, r2 of a2, and r̄3
of ā3, which is

    (n choose r1, r2, r̄3) · p1^r1 · p2^r2 · p̄3^r̄3. ♦
The same procedure can be carried out for experiments with any
number of outcomes where we specify the number of occurrences of
such particular outcomes. For example, if a die is thrown ten times
the probability that a one will occur exactly twice and a three exactly
three times is given by

    (10 choose 2, 3, 5) · (1/6)²(1/6)³(4/6)⁵ = .043.
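The same computation can be done in Python by lumping the four
remaining faces into a single outcome of probability 4/6, exactly as in
Example 4.23 (a sketch; the variable names are ours):

```python
from math import factorial

# Exactly two 1's, three 3's, and five other faces in ten throws.
coeff = factorial(10) // (factorial(2) * factorial(3) * factorial(5))
print(round(coeff * (1 / 6)**2 * (1 / 6)**3 * (4 / 6)**5, 3))   # 0.043
```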

Exercises
1. Suppose that in a city 60 per cent of the population are Democrats,
30 per cent are Republicans, and 10 per cent are Independents.
What is the probability that if three people are chosen at random
there will be one Republican, one Democrat, and one Independent
voter?

[Ans. .108.]

2. Three horses, A, B, and C, compete in four races. Assuming
that each horse has an equal chance in each race, what is the
probability that A wins two races and B and C win one each?
What is the probability that the same horse wins all four races?
[Ans. 4/27; 1/27.]

3. Assume that in a certain large college 40 per cent of the students
are freshmen, 30 per cent are sophomores, 20 per cent are juniors,
and 10 per cent are seniors. A committee of eight is chosen at
random from the student body. What is the probability that
there are equal numbers from each class on the committee?
4. Let us assume that when a batter comes to bat, he or she has
probability .6 of being put out, .1 of getting a walk, .2 of getting
a single, .1 of getting an extra base hit. If he or she comes to bat
five times in a game, what is the probability that
(a) He gets two walks and three singles?
[Ans. .0008.]
(b) He gets a walk, a single, an extra base hit (and is out twice)?
[Ans. .043.]
(c) He has a perfect day (i.e., never out)?
[Ans. .010.]
5. Assume that a single torpedo has a probability 1/2 of sinking a
ship, probability 1/4 of damaging it, and probability 1/4 of missing.
Assume further that two damaging shots sink the ship. What
is the probability that four torpedos will succeed in sinking the
ship?

[Ans. 251/256.]


6. Jones, Smith, and Green live in the same house. The mailman has
observed that Jones and Smith receive the same amount of mail
on the average, but that Green receives twice as much as Jones
(and hence also twice as much as Smith). If he or she has four
letters for this house, what is the probability that each resident
receives at least one letter?

7. If three dice are thrown, find the probability that there is one six
and two fives, given that all the outcomes are greater than three.

[Ans. 1/9.]


8. An athlete plays a tournament consisting of three games. In each
game he or she has probability 1/2 for a win, 1/4 for a loss, and 1/4 for a
draw, independently of the outcomes of other games. To win the
tournament he or she must win more games than he or she loses.
What is the probability that he or she wins the tournament?

9. Assume that in a certain course the probability that a student
chosen at random will get an A is .1, that he or she will get a B
is .2, that he or she will get a C is .4, that he or she will get a D
is .2, and that he or she will get an F is .1. What distribution of
grades is most likely in the case of four students?

[Ans. One B, two C's, one D.]

10. Let us assume that in a World Series game a batter has probability
1/4 of getting no hits, 1/2 for getting one hit, 1/4 for getting two hits,
assuming that the probability of getting more than two hits is
negligible. In a four-game World Series, find the probability that
the batter gets

(a) Exactly two hits.
[Ans. 7/64.]

(b) Exactly three hits.
[Ans. 7/32.]

(c) Exactly four hits.

[Ans. 35/128.]

(d) Exactly five hits.
[Ans. 7/32.]

(e) Fewer than two hits or more than five.
[Ans. 23/128.]

11. Gypsies sometimes toss a thick coin for which heads and tails are
equally likely, but which also has probability 1/5 of standing on
edge (i.e., neither heads nor tails). What is the probability of
exactly one head and four tails in five tosses of a gypsy coin?

12. A family car is driven by the father, two sons, and the mother.
The fenders have been dented four times, three times while the
mother was driving. Is it fair to say that the mother is a worse
driver than the men?


4.12 Expected value
In this section we shall discuss the concept of expected value. Although
it originated in the study of gambling games, it enters into almost any
detailed probabilistic discussion.
Definition. If in an experiment the possible outcomes are numbers,
a1, a2, . . . , ak, occurring with probability p1, p2, . . . , pk, then the expected
value is defined to be

    E = a1·p1 + a2·p2 + . . . + ak·pk.
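In code the definition is a one-line weighted sum. A minimal sketch
in Python (the function name is ours):

```python
def expected_value(outcomes, probs):
    # E = a1*p1 + a2*p2 + ... + ak*pk
    return sum(a * p for a, p in zip(outcomes, probs))

# The $1 bet on heads discussed below: win $1 or lose $1, each with
# probability 1/2.
print(expected_value([1, -1], [0.5, 0.5]))   # 0.0
```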

The term “expected value” is not to be interpreted as the value that
will necessarily occur on a single experiment. For example, if a person
bets $1 that a head will turn up when a coin is thrown, he or she may
either win $1 or lose $1. His expected value is (1)(1/2) + (−1)(1/2) = 0,
which is not one of the possible outcomes. The term, expected value,
had its origin in the following consideration. If we repeat an experiment
with expected value E a large number of times, and if we expect a1 a
fraction p1 of the time, a2 a fraction p2 of the time, etc., then the average
that we expect per experiment is E. In particular, in a gambling game
E is interpreted as the average winning expected in a large number
of plays. Here the expected value is often taken as the value of the
game to the player. If the game has a positive expected value, the

game is said to be favorable; if the game has expected value zero it
is said to be fair; and if it has negative expected value it is described
as unfavorable. These terms are not to be taken too literally, since
many people are quite happy to play games that, in terms of expected
value, are unfavorable. For instance, the buying of life insurance may
be considered an unfavorable game which most people choose to play.

Example 4.24 For the first example of the application of expected
value we consider the game of roulette as played at Monte Carlo. There
are several types of bets which the gambler can make, and we consider
two of these.
The wheel has the number 0 and the numbers from 1 to 36 marked
on equally spaced slots. The wheel is spun and a ball comes to rest
in one of these slots. If the player puts a stake, say of $1, on a given
number, and the ball comes to rest in this slot, then he or she receives
from the croupier 36 times the stake, or $36. The player wins $35
with probability 1/37 and loses $1 with probability 36/37. Hence his or her
expected winnings are

    35 · (1/37) − 1 · (36/37) = −.027.
This can be interpreted to mean that in the long run the player can
expect to lose about 2.7 per cent of his or her stakes.
A second way to play is the following. A player may bet on “red”
or “black”. The numbers from 1 to 36 are evenly divided between the
two colors. If a player bets on “red”, and a red number turns up, the
player receives twice the stake. If a black number turns up, the player
loses the stake. If 0 turns up, then the wheel is spun until it stops on
a number different from 0. If this is black, the player loses; but if it is
red, the player receives only the original stake, not twice it. For this
type of play, the player wins $1 with probability 18/37, breaks even with
probability (1/2)(1/37), and loses $1 with probability 18/37 + (1/2)(1/37). Hence his
or her expected winning is

    1 · (18/37) + 0 · (1/74) − 1 · (37/74) = −.0135.
In this case the player can expect to lose about 1.35 per cent of his
or her stakes in the long run. Thus the expected loss in this case is
only half as great as in the previous case. ♦
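Both roulette expectations can be checked directly (a sketch in
Python; the variable names are ours):

```python
# Expected winnings per $1 stake for the two Monte Carlo bets.
single_number = 35 * (1 / 37) - 1 * (36 / 37)
red = 1 * (18 / 37) + 0 * (1 / 74) - 1 * (37 / 74)
print(round(single_number, 4))   # -0.027
print(round(red, 4))             # -0.0135
```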

Example 4.25 A player rolls a die and receives a number of dollars
corresponding to the number of dots on the face which turns up. What
should the player pay for playing, to make this a fair game? To answer
this question, we note that the player wins 1, 2, 3, 4, 5 or 6 dollars,
each with probability 1/6. Hence, the player's expected winning is

    1(1/6) + 2(1/6) + 3(1/6) + 4(1/6) + 5(1/6) + 6(1/6) = 3.5.
Thus if the player pays $3.50, the expected winnings will be zero. ♦

Example 4.26 What is the expected number of successes in the case
of four independent trials with probability 1/3 for success? We know that
the probability of x successes is (4 choose x)(1/3)^x (2/3)^(4−x). Thus

    E = 0 · (4 choose 0)(1/3)⁰(2/3)⁴ + 1 · (4 choose 1)(1/3)¹(2/3)³
        + 2 · (4 choose 2)(1/3)²(2/3)² + 3 · (4 choose 3)(1/3)³(2/3)¹
        + 4 · (4 choose 4)(1/3)⁴(2/3)⁰
      = 0 + 32/81 + 48/81 + 24/81 + 4/81 = 108/81 = 4/3.

In general, it can be shown that in n trials with probability p for success,
the expected number of successes is np. ♦

Example 4.27 In the game of craps a pair of dice is rolled by one of
the players. If the sum of the spots shown is 7 or 11, he or she wins.
If it is 2, 3, or 12, he or she loses. If it is another sum, he or she must
continue rolling the dice until he or she either repeats the same sum or
rolls a 7. In the former case he or she wins, in the latter he or she loses.
Let us suppose that he or she wins or loses $1. Then the two possible
outcomes are +1 and ’1. We will compute the expected value of the
game. First we must find the probability that he or she will win.
We represent the possibilities by a two-stage tree shown in Figure
4.21. While it is theoretically possible for the game to go on indefi-
nitely, we do not consider this possibility. This means that our analysis
applies only to games which actually stop at some time.
The branch probabilities at the first stage are determined by think-
ing of the 36 possibilities for the throw of the two dice as being equally
likely and taking in each case the fraction of the possibilities which




Figure 4.21


correspond to the branch as the branch probability. The probabilities
for the branches at the second level are obtained as follows. If, for
example, the first outcome was a 4, then when the game ends, a 4 or 7
must have occurred. The possible outcomes for the dice were

{(3, 1), (1, 3), (2, 2), (4, 3), (3, 4), (2, 5), (5, 2), (1, 6), (6, 1)}.

Again we consider these possibilities to be equally likely and assign to
the branch considered the fraction of the outcomes which correspond to
this branch. Thus to the 4 branch we assign a probability 3/9 = 1/3. The
other branch probabilities are determined in a similar way. Having the
tree measure assigned, to find the probability of a win we must simply
add the weights of all paths leading to a win. If this is done, we obtain
244/495. Thus the player's expected value is

    1 · (244/495) + (−1) · (251/495) = −7/495 = −.0141.
Hence the player can expect to lose 1.41 per cent of his or her stakes
in the long run. It is interesting to note that this is just slightly less
favorable than the losses in betting on “red” in roulette. ♦
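Rather than adding the path weights from the tree, the probability
244/495 can be computed by summing over the outcome of the first
roll: once a point is set, only the point itself or a 7 matters. A sketch
in Python with exact fractions (the variable names are ours):

```python
from fractions import Fraction

# Number of ways each sum can be rolled with two ordinary dice.
ways = {s: sum(1 for a in range(1, 7) for b in range(1, 7) if a + b == s)
        for s in range(2, 13)}

win = Fraction(ways[7] + ways[11], 36)       # win outright on 7 or 11
for point in (4, 5, 6, 8, 9, 10):
    p_point = Fraction(ways[point], 36)      # point established
    p_make = Fraction(ways[point], ways[point] + ways[7])
    win += p_point * p_make                  # point repeated before a 7

print(win)                    # 244/495
print(float(2 * win - 1))     # expected value, about -0.0141
```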

Exercises
1. Suppose that A tosses two coins and receives $2 if two heads
appear, $1 if one head appears, and nothing if no heads appear.
What is the expected value of the game to A?

[Ans. $1.]

2. Smith and Jones are matching coins. If the coins match, Smith
gets $1, and if they do not, Jones gets $1.
(a) If the game consists of matching twice, what is the expected
value of the game for Smith?
(b) Suppose that Smith quits if he or she wins the first round,
and plays the second round if he or she loses the first.
Jones is not allowed to quit. What is the expected
value of the game for Smith?
3. If five coins are thrown, what is the expected number of heads
that will turn up?
[Ans. 5/2.]

4. A coin is thrown until the first time a head comes up or until
three tails in a row occur. Find the expected number of times the
coin is thrown.
5. A customer wishes to purchase a five cent newspaper. The cus-
tomer has in his or her pocket one dime and five pennies. The
news agent offers to let the customer have the paper in exchange
for one coin drawn at random from the customer's pocket.
(a) Is this a fair proposition and, if not, to whom is it favorable?
[Ans. Favorable to customer.]
(b) Answer the same question assuming that the news agent
demands two coins drawn at random from the customer™s
pocket.
[Ans. Fair proposition.]
6. A bets 50 cents against B's x cents that, if two cards are dealt
from a shuffled pack of ordinary playing cards, both cards will be
of the same color. What value of x will make this bet fair?

7. Prove that if the expected value of a given experiment is E, and
if a constant c is added to each of the outcomes, the expected
value of the new experiment is E + c.

8. Prove that, if the expected value of a given experiment is E, and
if each of the possible outcomes is multiplied by a constant k, the
expected value of the new experiment is k · E.

9. A gambler plays the following game: A card is drawn from a
bridge deck; if it is an ace, the gambler wins $5; if it is a jack, a
queen or a king, he or she wins $2; for any other card he or she
loses $1. What is the expected winning per play?

10. An urn contains two black and three white balls. Balls are suc-
cessively drawn from the urn without replacement until a black
ball is obtained. Find the expected number of draws required.

11. Using the result of Exercises 13 and 14 of Section 4.6, find the
expected number of games in the World Series (a) under the as-
sumption that each team has probability 1/2 of winning each game
and (b) under the assumption that the stronger team has proba-
bility .6 of winning each game.

[Ans. 5.81; 5.75.]

12. Suppose that we modify the game of craps as follows: On a 7 or
11 the player wins $2, on a 2, 3, or 12 he or she loses $3; otherwise
the game is as usual. Find the expected value of the new game,
and compare it with the old value.

13. Suppose that in roulette at Monte Carlo we place 50 cents on
“red” and 50 cents on “black”. What is the expected value on
the game? Is this better or worse than placing $1 on “red”?

14. Betting on “red” in roulette can be described roughly as follows.
We win with probability .49, get our money back with probability
.01, and lose with probability .50. Draw the tree for three plays
of the game, and compute (to three decimals) the probability of
each path. What is the probability that we are ahead at the end
of three bets?

[Ans. .485.]

15. Assume that the odds are r : s that a certain statement will be
true. If a gambler receives s dollars if the statement turns out
to be true, and gives r dollars if not, what is his or her expected
winning?

16. Referring to Exercise 9 of Section 4.3, find the expected number
of languages that a student chosen at random reads.

17. Referring to Exercise 5 of Section 4.4, find the expected number
of men who get their own hats.

[Ans. 1.]

18. A pair of dice is rolled. Each die has the number 1 on two opposite
faces, the number 2 on two opposite faces, and the number 3 on
two opposite faces. The “roller” wins a dollar if

(i) the sum of four occurs on the first roll; or

(ii) the sum of three or five occurs on the first roll and the same
sum occurs on a subsequent roll before the sum of four occurs.

Otherwise he or she loses a dollar.

(a) What is the probability that the person rolling the dice wins?
[Ans. 23/45.]

(b) What is the expected value of the game?
[Ans. 1/45.]



4.13 Markov chains
In this section we shall study a more general kind of process than the
ones considered in the last three sections.
We assume that we have a sequence of experiments with the fol-
lowing properties. The outcome of each experiment is one of a finite
number of possible outcomes a1, a2, . . . , ar. It is assumed that the prob-
ability of outcome aj on any given experiment is not necessarily inde-
pendent of the outcomes of previous experiments but depends at most
upon the outcome of the immediately preceding experiment. We as-
sume that there are given numbers pij which represent the probability




Figure 4.22


of outcome aj on any given experiment, given that outcome ai oc-
curred on the preceding experiment. The outcomes a1 , a2 , . . . , ar are
called states, and the numbers pij are called transition probabilities. If
we assume that the process begins in some particular state, then we
have enough information to determine the tree measure for the process
and can calculate probabilities of statements relating to the over-all se-
quence of experiments. A process of the above kind is called a Markov
chain process.
The transition probabilities can be exhibited in two different ways.
The first way is that of a square array. For a Markov chain with states
a1, a2, and a3, this array is written as

        ( p11  p12  p13 )
    P = ( p21  p22  p23 ).
        ( p31  p32  p33 )

Such an array is a special case of a matrix. Matrices are of fundamental
importance to the study of Markov chains as well as being important
in the study of other branches of mathematics. They will be studied in
detail in ??.
A second way to show the transition probabilities is by a transition
diagram. Such a diagram is illustrated for a special case in Figure
4.22. The arrows from each state indicate the possible states to which
a process can move from the given state.
The matrix of transition probabilities which corresponds to this

diagram is the matrix
          a1    a2    a3
    a1 (   0     1     0  )
    a2 (   0    1/2   1/2 ).
    a3 (  1/3    0    2/3 )

An entry of 0 indicates that the transition is impossible.
Notice that in the matrix P the sum of the elements of each row is
1. This must be true in any matrix of transition probabilities, since the
elements of the ith row represent the probabilities for all possibilities
when the process is in state ai .
The kind of problem in which we are most interested in the study
of Markov chains is the following. Suppose that the process starts in
state i. What is the probability that after n steps it will be in state
j? We denote this probability by pij^(n). Notice that we do not mean by
this the nth power of the number pij. We are actually interested in this
probability for all possible starting positions i and all possible terminal
positions j. We can represent these numbers conveniently again by a
matrix. For example, for n steps in a three-state Markov chain we write
these probabilities as the matrix

             ( p11^(n)  p12^(n)  p13^(n) )
    P^(n) =  ( p21^(n)  p22^(n)  p23^(n) ).
             ( p31^(n)  p32^(n)  p33^(n) )

Example 4.28 Let us find for a Markov chain with transition proba-
bilities indicated in Figure 4.22 the probability of being at the various
possible states after three steps, assuming that the process starts at
state a1. We find these probabilities by constructing a tree and a tree
measure as in Figure 4.23.
The probability p13^(3), for example, is the sum of the weights assigned
by the tree measure to all paths through our tree which end at state
a3. That is,

    p13^(3) = 1 · (1/2) · (1/2) + 1 · (1/2) · (2/3) = 7/12.

Similarly

    p12^(3) = 1 · (1/2) · (1/2) = 1/4

and

    p11^(3) = 1 · (1/2) · (1/3) = 1/6.




Figure 4.23

By constructing a similar tree measure, assuming that we start
at state a2, we could find p21^(3), p22^(3), and p23^(3). The same is true for
p31^(3), p32^(3), and p33^(3). If this is carried out (see Exercise 7) we can write
the results in matrix form as follows:

                   a1      a2      a3
         a1   (  1/6     1/4     7/12  )
P^(3) =  a2   (  7/36    7/24    37/72 ).
         a3   (  4/27    7/18    25/54 )

Again the rows add up to 1, corresponding to the fact that if we start
at a given state we must reach some state after three steps. Notice
now that all the elements of this matrix are positive, showing that it is
possible to reach any state from any state in three steps. In the next
chapter we will develop a simple method of computing P^(n). ♦
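Anticipating that method, P^(3) can be obtained by multiplying the
matrix P of Figure 4.22 by itself three times. A minimal sketch in
Python (the helper name is ours):

```python
# Transition matrix for the chain of Figure 4.22.
P = [[0, 1, 0],
     [0, 1/2, 1/2],
     [1/3, 0, 2/3]]

def matmul(A, B):
    # Ordinary matrix product of two square matrices.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

for row in matmul(P, matmul(P, P)):
    print([round(x, 4) for x in row])
# First row: [0.1667, 0.25, 0.5833], i.e. 1/6, 1/4, 7/12 as found above.
```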

Example 4.29 Suppose that we are interested in study-
ing the way in which a given state votes in a series of national elec-
tions. We wish to make long-term predictions and so will not consider
conditions peculiar to a particular election year. We shall base our
predictions only on the past history of the outcomes of the elections,
Republican or Democratic. It is clear that a knowledge of these past
results would influence our predictions for the future. As a first ap-
proximation, we assume that the knowledge of the past beyond the last
election would not cause us to change the probabilities for the outcomes
on the next election. With this assumption we obtain a Markov chain
with two states R and D and matrix of transition probabilities
          R       D
    R ( 1 − a     a   )
    D (   b     1 − b )

The numbers a and b could be estimated from past results as follows.
We could take for a the fraction of the previous years in which the
outcome has changed from Republican in one year to Democratic in
the next year, and for b the fraction of reverse changes.
We can obtain a better approximation by taking into account the
previous two elections. In this case our states are RR, RD, DR, and
DD, indicating the outcome of two successive elections. Being in state
RR means that the last two elections were Republican victories. If the
next election is a Democratic victory, we will be in state RD. If the
election outcomes for a series of years is DDDRDRR, then our process
has moved from state DD to DD to DR to RD to DR, and finally to RR.
Notice that the first letter of the state to which we move must agree
with the second letter of the state from which we came, since these
refer to the same election year. Our matrix of transition probabilities
will then have the form,
            RR      DR      RD      DD
    RR ( 1 − a      0       a       0   )
    DR (   b        0     1 − b     0   )
    RD (   0        c       0     1 − c )
    DD (   0        d       0     1 − d )
Again the numbers a, b, c, and d would have to be estimated. The
study of this example is continued in ??. ♦

Example 4.30 The following example of a Markov chain has been
used in physics as a simple model for diffusion of gases. We shall see
later that a similar model applies to an idealized problem in changing
populations.
We imagine n black balls and n white balls which are put into
two urns so that there are n balls in each urn. A single experiment
consists in choosing a ball from each urn at random and putting the ball
obtained from the ¬rst urn into the second urn, and the ball obtained
from the second urn into the ¬rst. We take as state the number of
black balls in the ¬rst urn. If at any time we know this number, then
we know the exact composition of each urn. That is, if there are j black
balls in urn 1, there must be n − j black balls in urn 2, n − j white
balls in urn 1, and j white balls in urn 2. If the process is in state j,
then after the next exchange it will be in state j − 1, if a black ball is
chosen from urn 1 and a white ball from urn 2. It will be in state j if a
ball of the same color is drawn from each urn. It will be in state j + 1

if a white ball is drawn from urn 1 and a black ball from urn 2. The
transition probabilities are then given by (see Exercise 12)

    pj,j−1 = (j/n)²,               j > 0
    pj,j   = 2j(n − j)/n²
    pj,j+1 = ((n − j)/n)²,         j < n
    pj,k   = 0                     otherwise.

A physicist would be interested, for example, in predicting the compo-
sition of the urns after a certain number of exchanges have taken place.
Certainly any predictions about the early stages of the process would
depend upon the initial composition of the urns. For example, if we
started with all black balls in urn 1, we would expect that for some time
there would be more black balls in urn 1 than in urn 2. On the other
hand, it might be expected that the effect of this initial distribution
would wear off after a large number of exchanges. We shall see later,
in ??, that this is indeed the case. ♦
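The transition probabilities are mechanical to tabulate for any n. A
minimal sketch in Python (the function name is ours; n = 3 is an
arbitrary illustration):

```python
from fractions import Fraction

def urn_matrix(n):
    # State j = number of black balls in urn 1, for j = 0, 1, ..., n.
    P = [[Fraction(0)] * (n + 1) for _ in range(n + 1)]
    for j in range(n + 1):
        if j > 0:
            P[j][j - 1] = Fraction(j, n) ** 2        # black out, white in
        P[j][j] = Fraction(2 * j * (n - j), n * n)   # same color drawn
        if j < n:
            P[j][j + 1] = Fraction(n - j, n) ** 2    # white out, black in
    return P

for row in urn_matrix(3):
    print([str(x) for x in row])   # each row sums to 1
```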


Exercises
1. Draw a state diagram for the Markov chain with transition prob-
abilities given by the following matrices.
    ( 1/2  1/2   0  )
    (  0    1    0  ),
    (  0   1/2  1/2 )

    ( 1/3  1/3  1/3 )
    ( 1/3  1/3  1/3 ),
    ( 1/3  1/3  1/3 )

    ( 0  1 )
    ( 1  0 ),

    ( 0    1    0    0  )
    ( 1    0    0    0  )
    ( 0    0   1/2  1/2 ).
    ( 0   1/2  1/2   0  )




Figure 4.24

2. Give the matrix of transition probabilities corresponding to the
transition diagrams in Figure 4.24.

3. Find the matrix P^(2) for the Markov chain determined by the
matrix of transition probabilities

    P = ( 1/2  1/2 )
        ( 1/3  2/3 ).


[Ans. ( 5/12   7/12  )
      ( 7/18   11/18 ).]


4. What is the matrix of transition probabilities for the Markov
chain in Example 4.30, for the case of two white balls and two
black balls?

5. Find the matrices P^(2), P^(3), P^(4) for the Markov chain deter-
mined by the transition probabilities

    ( 1  0 )
    ( 0  1 ).

Find the same for the Markov chain determined by the matrix

    ( 0  1 )
    ( 1  0 ).

6. Suppose that a Markov chain has two states, a1 and a2, and
transition probabilities given by the matrix

    ( 1/3  2/3 )
    ( 1/2  1/2 ).

By means of a separate chance device we choose a state in which
to start the process. This device chooses a1 with probability 1/2
and a2 with probability 1/2. Find the probability that the process
is in state a1 after the first step. Answer the same question in the
case that the device chooses a1 with probability 1/3 and a2 with
probability 2/3.

[Ans. 5/12; 4/9.]


7. Referring to the Markov chain with transition probabilities indi-
cated in Figure 4.22, construct the tree measures and determine
the values of

    p21^(3), p22^(3), p23^(3)

and

    p31^(3), p32^(3), p33^(3).

8. A certain calculating machine uses only the digits 0 and 1. It is
supposed to transmit one of these digits through several stages.
However, at every stage there is a probability p that the digit
which enters this stage will be changed when it leaves. We form a
Markov chain to represent the process of transmission by taking
as states the digits 0 and 1. What is the matrix of transition
probabilities?

9. For the Markov chain in Exercise 8, draw a tree and assign a tree
measure, assuming that the process begins in state 0 and moves
through three stages of transmission. What is the probability
that the machine after three stages produces the digit 0, i.e., the
correct digit? What is the probability that the machine never
changed the digit from 0?

10. Assume that a man's profession can be classified as professional,
skilled laborer, or unskilled laborer. Assume that of the sons
of professional men 80 per cent are professional, 10 per cent are
skilled laborers, and 10 per cent are unskilled laborers. In the




Figure 4.25

case of sons of skilled laborers, 60 per cent are skilled laborers, 20
per cent are professional, and 20 per cent are unskilled laborers.
Finally, in the case of unskilled laborers, 50 per cent of the sons
are unskilled laborers, and 25 per cent each are in the other two
categories. Assume that every man has a son, and form a Markov
chain by following a given family through several generations. Set
up the matrix of transition probabilities. Find the probability
that the grandson of an unskilled laborer is a professional man.

[Ans. .375.]

11. In Exercise 10 we assumed that every man has a son. Assume
instead that the probability a man has a son is .8. Form a Markov
chain with four states. The first three states are as in Exercise
10, and the fourth state is such that the process enters it if a man
has no son, and that the state cannot be left. This state repre-
sents families whose male line has died out. Find the matrix of
transition probabilities and find the probability that an unskilled
laborer has a grandson who is a professional man.

[Ans. .24.]

12. Explain why the transition probabilities given in Example 4.30
are correct.

Supplementary exercises.

13. Five points are marked on a circle. A process moves clockwise
from a given point to its neighbor with probability 2/3, or counter-
clockwise to its neighbor with probability 1/3.

(a) Considering the process to be a Markov chain process, find
the matrix of transition probabilities.
(b) Given that the process starts in a state 3, what is the prob-
ability that it returns to the same state in two steps?

14. In northern New England, years for apples can be described as
good, average, or poor. Suppose that following a good year the
probabilities of good, average, or poor years are respectively .4, .4,
and .2. Following a poor year the probabilities of good, average,
or poor years are .2, .4, and .4 respectively. Following an average
year the probabilities that the next year will be good or poor are
each .2, and of an average year, .6.

(a) Set up the transition matrix of this Markov chain.
(b) 1965 was a good year. Compute the probabilities for 1966,
1967, and 1968.
[Ans. For 1967: .28, .48, .24.]
15. In Exercise 14 suppose that there is probability 1/4 for a good
year, 1/2 for an average year, and 1/4 for a poor year. What are the
probabilities for the following year?

16. A teacher in an oversized mathematics class finds, after grading
all homework papers for the first two assignments, that it is nec-
essary to reduce the amount of time spent in such grading. He
therefore designs the following system: Papers will be marked
satisfactory or unsatisfactory. All papers of students receiving a
mark of unsatisfactory on any assignment will be read on each of
the two succeeding days. Of the remaining papers, the teacher
will read one-fifth, chosen at random. Assuming that each paper
has a probability of one-fifth of being classified “unsatisfactory”,

(a) Set up a three-state Markov chain to describe the process.
(b) Suppose that a student has just handed in a satisfactory
paper. What are the probabilities for the next two assign-
ments?

17. In another model for diffusion, it is assumed that there are two
urns which together contain N balls numbered from 1 to N . Each
second a number from 1 to N is chosen at random, and the ball
with the corresponding number is moved to the other urn. Set

up a Markov chain by taking as state the number of balls in urn
1. Find the transition matrix.


4.14 The central limit theorem
We continue our discussion of the independent trials process with two
outcomes. As usual, let p be the probability of success on a trial, and
f (n, p; x) be the probability of exactly x successes in n trials.
In Figure 4.26 we have plotted bar graphs which represent f (n, .3; x)
for n = 10, 50, 100, and 200. We note first of all that the graphs are
drifting off to the right. This is not surprising, since their peaks occur
at np, which is steadily increasing. We also note that while the total
area is always 1, this area becomes more and more spread out.
We want to redraw these graphs in a manner that prevents the
drifting and the spreading out. First of all, we replace x by x − np,
assuring that our peak always occurs at 0. Next we introduce a new
unit for measuring the deviation, which depends on n, and which gives
comparable scales. As we saw in Section 4.10, the standard deviation
√(npq) is such a unit.
We must still insure that probabilities are represented by areas in
the graph. In Figure 4.26 this is achieved by having a unit base for each
rectangle, and having the probability f (n, p; x) as height. Since we are
now representing a standard deviation as a single unit on the horizontal
axis, we must take f(n, p; x) · √(npq) as the heights of our rectangles.
resulting curves for n = 50 and 200 are shown in Figures 4.27 and 4.28,
respectively.
We note that the two figures look very much alike. We have also
shown in Figure 4.28 that it can be approximated by a bell-shaped
curve. This curve represents the function

    f(x) = (1/√(2π)) e^(−x²/2)
and is known as the normal curve. It is a fundamental theorem of
probability theory that as n increases, the appropriately rescaled bar
graphs more and more closely approach the normal curve. The theorem
is known as the Central Limit Theorem, and we have illustrated it
graphically.
More precisely, the theorem states that for any two numbers a and




Figure 4.26




Figure 4.27




Figure 4.28




Figure 4.29 (table of areas under the normal curve between 0 and d)

b, with a < b,
    Pr[a < (x − np)/√(npq) < b]
approaches the area under the normal curve between a and b, as n
increases. This theorem is particularly interesting in that the normal
curve is symmetric about 0, while f (n, p; x) is symmetric about the
expected value np only for the case p = 1/2. It should also be noted that
we always arrive at the same normal curve, no matter what the value
of p is.
In Figure 4.29 we give a table for the area under the normal curve
between 0 and d. Since the total area is 1, and since it is symmetric
about the origin, we can compute arbitrary areas from this table. For
example, suppose that we wish the area between −1 and +2. The area
between 0 and 2 is given in the table as .477. The area between −1
and 0 is the same as between 0 and 1, and hence is given as .341. Thus
the total area is .818. The area outside the interval (−1, 2) is then

1 − .818 = .182.
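In place of the table, these areas can be computed from the error
function. A minimal sketch in Python (the helper names are ours); the
exact areas differ from the table's rounded entries only in the third
decimal place:

```python
from math import erf, sqrt

def phi(t):
    # Area under the normal curve to the left of t.
    return 0.5 * (1 + erf(t / sqrt(2)))

def area(a, b):
    # Area under the normal curve between a and b.
    return phi(b) - phi(a)

print(round(area(-1, 2), 3))       # 0.819 (rounded table entries give .818)
print(round(1 - area(-1, 2), 3))   # 0.181
```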

Example 4.31 Let us find the probability that x differs from the ex-
pected value np by as much as d standard deviations.

    Pr[|x − np| ≥ d√(npq)] = Pr[|(x − np)/√(npq)| ≥ d],

and hence the approximate answer should be the area outside the
interval (−d, d) under the normal curve. For d = 1, 2, 3 we obtain
1 − (2 · .341) = .318, 1 − (2 · .477) = .046, 1 − (2 · .4987) = .0026, re-
spectively. These agree with the values given in Section 4.10, to within
rounding errors. In fact, the Central Limit Theorem is the basis of
those estimates. ♦

Example 4.32 In Example 4.19 we considered the example of throw-
ing a coin 10,000 times. The expected number of heads that turn up
is 5000 and the standard deviation is √(10,000 · (1/2) · (1/2)) = 50. We ob-
served that the probability of a deviation of more than two standard
deviations (or 100) was very unlikely. On the other hand, consider the
probability of a deviation of less than .1 standard deviation. That is,
of a deviation of less than five. The area from 0 to .1 under the normal
curve is .040 and hence the probability of a deviation from 5000 of less
than five is approximately .08. Thus, while a deviation of 100 is very
unlikely, it is also very unlikely that a deviation of less than five will
occur. ♦

Example 4.33 The normal approximation can be used to estimate
the individual probabilities f (n, x; p) for large n. For example, let us
estimate f (200, 65; .3). The graph of the probabilities f (200, x; .3) was
given in Figure 4.28 together with the normal approximation. The
desired probability is the area of the bar corresponding to x = 65. An
inspection of the graph suggests that we should take the area under the
normal curve between 64.5 and 65.5 as an estimate for this probability.
In normalized units this is the area between
    4.5/√(200 · (.3) · (.7))   and   5.5/√(200 · (.3) · (.7)),

or between .6944 and .8487. Our table is not fine enough to find this
area, but from more complete tables, or by machine computation, this
area may be found to be .046 to three decimal places. The exact value
to three decimal places is .045. This procedure gives us a good estimate.
If we check all of the values of f(200, x; .3) we find in each case
that we would make an error of at most .001 by using the normal
approximation. There is unfortunately no simple way to estimate the
error caused by the use of the Central Limit Theorem. The error will
clearly depend upon how large n is, but it also depends upon how near
p is to 0 or 1. The greatest accuracy occurs when p is near 1/2. ♦

Example 4.34 Suppose that a drug has been administered to a num-
ber of patients and found to be effective a fraction p̄ of the time. Assum-
ing an independent trials process, it is natural to take p̄ as an estimate
for the unknown probability p for success on any one trial. It is useful
to have a method of estimating the reliability of this estimate. One
method is the following. Let x be the number of successes for the drug
given to n patients. Then by the Central Limit Theorem
    Pr[|(x − np)/√(npq)| ≤ 2] ≈ .95.

This is the same as saying

    Pr[|(x/n − p)/√(pq/n)| ≤ 2] ≈ .95.

Putting p̄ = x/n, we have

    Pr[|p̄ − p| ≤ 2√(pq/n)] ≈ .95.

Using the fact that pq ≤ 1/4 (see Exercise 12) we have

    Pr[|p̄ − p| ≤ 1/√n] ≥ .95.

This says that no matter what p is, with probability ≥ .95, the true
value will not deviate from the estimate p̄ by more than 1/√n. It is
customary then to say that

    p̄ − 1/√n ≤ p ≤ p̄ + 1/√n

with confidence .95. The interval

    [p̄ − 1/√n, p̄ + 1/√n]

is called a 95 per cent confidence interval. Had we started with

    Pr[|(x − np)/√(npq)| ≤ 3] ≈ .99,

we would have obtained the 99 per cent confidence interval

    [p̄ − 3/(2√n), p̄ + 3/(2√n)].
For example, if in 400 trials the drug is found effective 124 times,
or .31 of the times, the 95 per cent confidence interval for p is

    [.31 − 1/20, .31 + 1/20] = [.26, .36]

and the 99 per cent confidence interval is

    [.31 − 3/40, .31 + 3/40] = [.235, .385]. ♦
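The interval arithmetic is immediate to carry out. A minimal sketch
in Python (the function name is ours):

```python
from math import sqrt

def confidence_interval(x, n, z):
    # p_bar +/- z/(2*sqrt(n)), using the bound pq <= 1/4; z = 2 gives
    # roughly 95 per cent confidence, z = 3 roughly 99 per cent.
    p_bar = x / n
    half = z / (2 * sqrt(n))
    return (round(p_bar - half, 3), round(p_bar + half, 3))

# The drug example: effective 124 times in 400 trials.
print(confidence_interval(124, 400, 2))   # (0.26, 0.36)
print(confidence_interval(124, 400, 3))   # (0.235, 0.385)
```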

Exercises
1. Let x be the number of successes in n trials of an independent
trials process with probability p for success. Let x* = (x − np)/√(npq).
For large n estimate the following probabilities.

(a) Pr[x* < −2.5].

[Ans. .006.]

(b) Pr[x* < 2.5].
(c) Pr[x* ≥ −.5].
(d) Pr[−1.5 < x* < 1].

[Ans. .774.]
2. A coin is biased in such a way that a head comes up with proba-
bility .8 on a single toss. Use the normal approximation to esti-
mate the probability that in a million tosses there are more than
