Probability Distribution Lecture 7

Uploaded by

Dan Cuenca

Probability Distribution

Lecture 7

Prepared by: RN Capuyan


Discrete Probability Distribution



We now look at problems that can be put into a
probabilistic framework by assessing the probabilities of
certain events from actual past data, and then we can
consider specific probability models that fit our problems.

Ophthalmology: Retinitis pigmentosa is a progressive
ocular disease that in some cases eventually results in
blindness. The three main genetic types of the disease
are dominant, recessive, and sex-linked. Each genetic
type has a different rate of progression, with the
dominant mode being the slowest to progress and the
sex-linked mode the fastest.



Suppose the prior history of disease in a family is
unknown, but one of the two male children is affected,
and the one female child is not affected. Can this
information help identify the genetic type?

The binomial distribution can be applied to calculate the
probability of this event occurring (one of two males
affected, none of one female affected) under each of the
genetic types mentioned, and these results can then be
used to infer the most likely genetic type. In fact, this
distribution can be used to make an inference for any
family for which we know k1 of n1 male children are
affected and k2 of n2 female children are affected.



Cancer: A second example of a commonly used probability
model concerns a cancer scare in Woburn,
Massachusetts. A news story reported an “excessive”
number of cancer deaths in young children in this town
and speculated about whether this high rate was due to
the dumping of industrial wastes in the northeastern part
of town. Suppose 12 cases of leukemia were reported in a
town where 6 would normally be expected. Is this
enough evidence to conclude that the town has an
excessive number of leukemia cases?



The Poisson distribution can be used to calculate the
probability of 12 or more cases if this town had typical
national rates for leukemia. If this probability were small
enough, we would conclude that the number was
excessive; otherwise, we would decide that longer
surveillance of the town was needed before arriving at a
conclusion.



Random Variable

• A random variable is a function that assigns numeric
values to different events in a sample space.
• A random variable for which there exists a discrete set
of numeric values is a discrete random variable.

Examples

1. Let X be the random variable that represents the number
of episodes of otitis media in the first 2 years of life. Then
X is a discrete random variable, which takes on the values
0, 1, 2, and so on.



2. Let X be the number of patients of four whose blood
pressure is brought under control. Then X is a
discrete random variable, which takes on the values 0,
1, 2, 3, 4.

A random variable whose possible values cannot be
enumerated is a continuous random variable.

Environmental Health: Possible health effects on workers
of exposure to low levels of radiation over long periods of
time are of public health interest. One problem in
assessing this issue is how to measure the cumulative
exposure of a worker.
A study was performed at the Portsmouth Naval
Shipyard, where each exposed worker wore a badge, or
dosimeter, which measured annual radiation exposure in
rem. The cumulative exposure over a worker’s lifetime
could then be obtained by summing the yearly exposures.
Cumulative lifetime exposure to radiation is a good
example of a continuous random variable because it
varied in this study from 0.000 to 91.414 rem; this would
be regarded as taking on an essentially infinite number of
values, which cannot be enumerated.



A probability-mass function is a mathematical
relationship, or rule, that assigns to any possible value r
of a discrete random variable X the probability P(X=r).
This assignment is made for all values r that have positive
probability. The probability-mass function is sometimes
also called a probability distribution.

The probability-mass function can be displayed in a table
giving the values and their associated probabilities, or it
can be expressed as a mathematical formula giving the
probabilities of all possible values.



Example

Suppose that, from previous experience with the hypertension
drug, the drug company expects that for any clinical
practice the probability that r of 4 treated patients are
brought under control is as given in the following table.

Probability-mass function for the hypertension-control
example

r           0      1      2      3      4
P(X = r)    .008   .076   .265   .411   .240



Notice that for any probability-mass function, the
probability of any particular value must be between 0 and
1 and the sum of the probabilities of all values must
exactly equal 1. Thus 0 < P(X=r) ≤ 1, Ʃ P(X=r) = 1, where
the summation is taken over all possible values that have
positive probability.
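
The two conditions above can be checked mechanically. The following is a minimal Python sketch, assuming the probability-mass function is stored as a dictionary mapping each value r to P(X = r); the function name is_valid_pmf and the tolerance are illustrative choices, not part of the lecture.

```python
# Probability-mass function for the hypertension-control example
# (values taken from the table in the lecture).
pmf = {0: 0.008, 1: 0.076, 2: 0.265, 3: 0.411, 4: 0.240}


def is_valid_pmf(pmf, tol=1e-9):
    """Return True if every probability is in (0, 1] and they sum to 1.

    `tol` is an assumed tolerance that absorbs floating-point rounding.
    """
    each_ok = all(0 < p <= 1 for p in pmf.values())
    total_ok = abs(sum(pmf.values()) - 1.0) <= tol
    return each_ok and total_ok


print(is_valid_pmf(pmf))
```

A table whose entries do not sum to 1, for example {0: 0.5, 1: 0.6}, fails the check.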



Relationship of Probability Distributions to Frequency
Distributions

• A frequency distribution is described as a list of each
value in the data set and a corresponding count of
how frequently the value occurs.
• If each count is divided by the total number of points
in the sample, then the frequency distribution can be
considered as a sample analog to a probability
distribution.
• A probability distribution can be thought of as a model
based on an infinitely large sample, giving the fraction
of data points in a sample that should be allocated to
each specific value.



• Frequency distribution gives the actual proportion of
points in a sample that correspond to specific values;
the appropriateness of the model can be assessed by
comparing the observed sample-frequency
distribution with the probability distribution.
• The formal statistical procedure for making this
comparison is called a goodness-of-fit test.



Comparison of the sample-frequency distribution and the
theoretical-probability distribution for the
hypertension-control example

Number of hypertensives    Probability distribution    Frequency
under control = r          P(X = r)                    distribution
0                          .008                        .000 = 0/100
1                          .076                        .090 = 9/100
2                          .265                        .240 = 24/100
3                          .411                        .480 = 48/100
4                          .240                        .190 = 19/100



The role of statistical inference is to compare the two
distributions to judge whether differences between the
two can be attributed to chance or whether real
differences exist between the drug’s performance in
actual clinical practice and expectations from previous
drug-company experience.

Where does a probability-mass function come from? In some
instances, previous data can be obtained on the same
type of random variable being studied, and the
probability-mass function can be computed from these
data. In other instances, previous data may not be
available, but the probability-mass function from some
well-known distribution can be used to see how well it
fits actual sample data.
In the comparison table from the sample of 100 physician
practices, the probability-mass function was derived from
the binomial distribution.



Expected Value of a Discrete Random Variable

• If a random variable has a large number of values with
positive probability, the probability-mass function is
not a useful summary measure. Indeed, we face the
same problem as in trying to summarize a sample by
enumerating each data value.
• Measures of location and spread can be developed for
a random variable in much the same way as they were
developed for samples.



The expected value of a discrete random variable X, denoted
by E(X) or $\mu$, is obtained by multiplying each possible value by its
respective probability and summing these products over all the
values that have positive (that is, nonzero) probability.

$$E(X) = \mu = \sum_{i=1}^{R} x_i P(X = x_i)$$

where the $x_i$'s are the values the random variable
assumes with positive probability. Note that the sum in
the definition of $\mu$ is over R possible values. R may be
either finite or infinite. In either case, the individual
values must be distinct from each other.



Examples

1. Find the expected value for the random variable
shown in the hypertension-control example.

$$E(X) = 0(.008) + 1(.076) + 2(.265) + 3(.411) + 4(.240) = 2.80$$

Thus on average about 2.8 hypertensives would be
expected to be brought under control for every 4 who are
treated.



2. What is the expected number of episodes of otitis
media in the first 2 years of life? Suppose this random
variable has a probability-mass function as given in
the table:

r           0      1      2      3      4      5      6
P(X = r)    .129   .264   .271   .185   .095   .039   .017

$$E(X) = 0(.129) + 1(.264) + 2(.271) + 3(.185) + 4(.095) + 5(.039) + 6(.017) = 2.038$$

Thus, on average a child would be expected to have
about two episodes of otitis media in the first 2 years of
life.

The probability-mass function for the random variable
representing the number of previously untreated
hypertensives brought under control was compared with
the actual number of hypertensives brought under
control in 100 clinical practices. In much the same way,
the expected value of a random variable can be
compared with the actual sample mean in a data set ($\bar{x}$).

Example

Compare the average number of hypertensives brought
under control in the 100 clinical practices ($\bar{x}$) with the
expected number of hypertensives brought under control
($\mu$) per 4-patient practice.

$$\bar{x} = \frac{0(0) + 1(9) + 2(24) + 3(48) + 4(19)}{100} = 2.77$$

hypertensives controlled per 4-patient clinical practice,
while $\mu$ = 2.80. This agreement is rather good. The
specific methods for comparing the observed average
value and expected value of a random variable ($\bar{x}$ and $\mu$)
are covered under statistical inference. Notice that $\bar{x}$
could be written in the form

$$\bar{x} = 0\left(\frac{0}{100}\right) + 1\left(\frac{9}{100}\right) + 2\left(\frac{24}{100}\right) + 3\left(\frac{48}{100}\right) + 4\left(\frac{19}{100}\right)$$

that is, a weighted average of the number of
hypertensives brought under control, where the
weights are the observed probabilities. The
expected value, in comparison, can be written as a
similar weighted average, where the weights are
the theoretical probabilities:

$$\mu = 0(.008) + 1(.076) + 2(.265) + 3(.411) + 4(.240)$$



The Variance of a Discrete Random Variable

• The analog to the sample variance ($s^2$) for a random
variable is called the variance of the random variable,
or population variance, and is denoted by Var(X) or
$\sigma^2$.
• It represents the spread, relative to the expected
value, of all values that have positive probability.
• It is obtained by multiplying the squared distance of
each possible value from the expected value by its
respective probability and summing over all the values
that have positive probability.

$$\mathrm{Var}(X) = \sigma^2 = \sum_{i=1}^{R} (x_i - \mu)^2 P(X = x_i)$$

where the $x_i$'s are the values for which the random variable
takes on positive probability. The standard deviation of a
random variable X, denoted by sd(X) or $\sigma$, is defined as the
square root of its variance.

The population variance can also be expressed in a different
(“short”) form as follows:

$$\sigma^2 = E(X - \mu)^2 = \sum_{i=1}^{R} x_i^2 P(X = x_i) - \mu^2$$



Example

Compute the variance and standard deviation for the
otitis media random variable. We know that $\mu$
= 2.038. Furthermore,

$$\sum_{i=1}^{R} x_i^2 P(X = x_i) = 0^2(.129) + 1^2(.264) + 2^2(.271) + 3^2(.185) + 4^2(.095) + 5^2(.039) + 6^2(.017)$$
$$= 0(.129) + 1(.264) + 4(.271) + 9(.185) + 16(.095) + 25(.039) + 36(.017) = 6.12$$

Thus, $\mathrm{Var}(X) = \sigma^2 = 6.12 - (2.038)^2 = 1.967$. The
standard deviation of X is $\sigma = \sqrt{1.967} = 1.402$.

How can we interpret the standard deviation of a
random variable? The following often-used principle is
true for many, but not all, random variables:

Approximately 95% of the probability mass falls within two
standard deviations ($2\sigma$) of the mean of a random
variable.

If $1.96\sigma$ is substituted for $2\sigma$, this statement holds
exactly for normally distributed random variables and
approximately for certain other random variables.

Example

Find a, b such that approximately 95% of infants will have
between a and b episodes of otitis media in the first 2
years of life.

Recall that the random variable has mean ($\mu$) = 2.038 and
standard deviation ($\sigma$) = 1.402. The interval $\mu \pm 2\sigma$ is
given by

$$2.038 \pm 2(1.402) = 2.038 \pm 2.805$$



or -0.77 to 4.84. Because only nonnegative-integer values are
possible for this random variable, the valid range is from
a = 0 to b = 4 episodes. As shown in the previous
probability-mass function table, the probability of having
≤ 4 episodes is

.129 + .264 + .271 + .185 + .095 = .944

The rule lets us quickly summarize the range of values
that have most of the probability mass for a random
variable without specifying each individual value.
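
The otitis media calculations (expected value, variance, standard deviation, and the probability inside the two-standard-deviation range) can be reproduced with a short Python sketch. It uses the "short" form of the variance; the helper names mean and variance are illustrative, not from the lecture.

```python
import math

# pmf of the number of otitis media episodes in the first 2 years of life
# (values taken from the table in the lecture).
pmf = {0: 0.129, 1: 0.264, 2: 0.271, 3: 0.185, 4: 0.095, 5: 0.039, 6: 0.017}


def mean(pmf):
    """Expected value: sum of x * P(X = x) over all values x."""
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    """Short form of the variance: sum of x^2 * P(X = x) minus mu^2."""
    mu = mean(pmf)
    return sum(x * x * p for x, p in pmf.items()) - mu ** 2


mu = mean(pmf)
sigma = math.sqrt(variance(pmf))

# Probability mass that lies within mu +/- 2 sigma.
low, high = mu - 2 * sigma, mu + 2 * sigma
prob_in_range = sum(p for x, p in pmf.items() if low <= x <= high)

print(round(mu, 3), round(variance(pmf), 3), round(sigma, 3))
print(round(low, 2), round(high, 2), round(prob_in_range, 3))
```

Only the values 0 through 4 fall inside the interval, so the last figure is the probability of having at most 4 episodes.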



The Cumulative-Distribution Function of a Discrete
Random Variable

• Many random variables are displayed in tables or
figures in terms of a cumulative-distribution function (cdf)
rather than a distribution of probabilities of individual
values.
• The basic idea is to assign to each individual value the
sum of probabilities of all values that are no larger
than the value being considered.
• The cdf of the random variable X is denoted by F(X) and,
for a specific value x of X, is defined by P(X ≤ x) and
denoted by F(x).

Example

The cdf for the otitis media random variable is given by

F(x) = 0       if x < 0
F(x) = .129    if 0 ≤ x < 1
F(x) = .393    if 1 ≤ x < 2
F(x) = .664    if 2 ≤ x < 3
F(x) = .849    if 3 ≤ x < 4
F(x) = .944    if 4 ≤ x < 5
F(x) = .983    if 5 ≤ x < 6
F(x) = 1.0     if x ≥ 6



For a discrete random variable, the cdf looks like a series
of steps and is sometimes called a step function. For a
continuous random variable, the cdf is a smooth curve.
As the number of values increases, the cdf for a discrete
random variable approaches that of a smooth curve.
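
The step behaviour of a discrete cdf can be seen by evaluating F(x) = P(X ≤ x) directly from the probability-mass function. The following Python sketch does this for the otitis media example; the function name cdf is an illustrative choice.

```python
# pmf of the number of otitis media episodes (values from the lecture table).
pmf = {0: 0.129, 1: 0.264, 2: 0.271, 3: 0.185, 4: 0.095, 5: 0.039, 6: 0.017}


def cdf(x, pmf=pmf):
    """F(x) = P(X <= x): sum the probabilities of all values no larger than x."""
    return sum(p for value, p in pmf.items() if value <= x)


# F is flat between integers and jumps at each integer (a step function).
for x in (-1, 0, 0.5, 1, 2, 3, 4, 5, 6, 10):
    print(x, round(cdf(x), 3))
```

Evaluating at a non-integer point such as 0.5 returns the same value as at 0, which is the flat part of a step.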



[Figure: Cumulative-distribution function for the number of
episodes of otitis media in the first 2 years of life]

Assignment

Number of boys in families with 4 children

x    P(X = x)
0    1/16
1    1/4
2    3/8
3    1/4
4    1/16

Questions:
1. What is the expected value of X?
2. What is the standard deviation of X?
3. What is the cdf of X?

Interpret the answers to questions 1 and 2.



Permutations


• Permutations and combinations are important topics in
probability and are needed to study the binomial distribution.
• How many ways can k objects be selected out of n
where the order of selection matters?
• Note that the first object can be selected in any one of
n = (n + 1) – 1 ways.
• Given that the first object has been selected, the
second object can be selected in any one of n – 1 =
(n+1) – 2 ways,…; the kth object can be selected in any
one of n – (k – 1) = n – k + 1 = (n + 1) – k ways.



The number of permutations of n things taken k at a time
is

nPk = n(n – 1) x … x (n – k + 1)

It represents the number of ways of selecting k items of
n, where the order of selection is important.

Example
1. Suppose we identify 5 men ages 50-59 with
schizophrenia in a community, and we wish to match
these subjects with normal controls of the same sex
and age living in the same community. Suppose we
want to employ a matched-pair design, where each
case is matched with a normal control of the same sex
and age. Five psychologists are employed by the study,
each of whom interviews a single case and his
matched control. If there are 10 eligible 50- to 59-
year-old male controls in the community (labeled A, B,
… , J), then how many ways are there of choosing
controls for the study if a control can never be used
more than once?

The first control can be chosen in 10 ways (any of
A to J). Once the first control is chosen, he can no longer
be selected as the second control. The second control can
therefore be chosen in 9 ways, so the first two
controls can be chosen in 10 x 9 = 90 ways. Continuing
in this way, there are 10 x 9 x 8 x 7 x 6 = 30,240 ways of
choosing the 5 controls.

For example, one possible selection is ACDFE. This means
control A is matched to the first case, control C to the
second case, and so on. The selection order of the
controls is important because different psychologists may
be assigned to interview each matched pair. Thus, the
selection ABCDE differs from CBAED, even though the
same group of controls is selected.
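
The count of 30,240 ordered selections can be checked in Python, both with the product formula and by listing every ordered selection. This is a sketch: the helper n_permutations is an illustrative name, and math.perm requires Python 3.8 or later.

```python
import math
from itertools import permutations


def n_permutations(n, k):
    """nPk = n (n - 1) ... (n - k + 1), built up one factor at a time."""
    result = 1
    for i in range(k):
        result *= n - i
    return result


controls = "ABCDEFGHIJ"  # the 10 eligible controls, labeled A to J

by_formula = n_permutations(10, 5)
by_library = math.perm(10, 5)
# Enumerate every ordered selection of 5 distinct controls and count them.
by_enumeration = sum(1 for _ in permutations(controls, 5))

print(by_formula, by_library, by_enumeration)
```

All three counts agree, and ordered selections such as ABCDE and CBAED appear as separate entries in the enumeration.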



2. Suppose 3 schizophrenic women ages 50-59 and 6
eligible controls live in the same community. How
many ways are there of selecting 3 controls?

Consider the number of permutations of 6 things taken 3
at a time. 6P3 = 6 x 5 x 4 = 120.
Thus, there are 120 ways of choosing the controls. For
example, one way is to match control A to case 1, control
B to case 2, and control C to case 3 (ABC). Another way
would be to match control F to case 1, control C to case 2,
and control D to case 3 (FCD). The order of selection is
important because, for example, the selection ABC differs
from the selection BCA.

In some instances, we are interested in a special type of
permutation: selecting n objects out of n, where order of
selection matters (ordering n objects). By the preceding
principle,

nPn = n(n - 1) x … x [n – (n – 1)] = n(n – 1) x…x 2 x 1

The special symbol generally used for this quantity is n!.

Example:

5! = 5 x 4 x 3 x 2 x 1 = 120

The quantity 0! has no intuitive meaning, but for
consistency it will be defined as 1.

An alternative formula expressing permutations in terms
of factorials is given by

nPk = n! / (n – k)!



Example 1

Suppose 4 schizophrenic women and 7 eligible controls
live in the same community. How many ways are there of
selecting 4 controls?

The number of ways = 7P4 = 7(6)(5)(4) = 840

Alternatively, 7P4 = 7!/3! = 5040/6 = 840.



Example 2

In how many ways can the 5 starting positions on a
basketball team be filled with 8 men who can play any
position?

The number of ways = 8P5 = (8)(7)(6)(5)(4) = 6,720

Alternatively, 8P5 = 8!/3! = 40,320/6 = 6,720.



Combinations

• Combinations count the number of ways of selecting k
objects out of n without respect to order.
• This can be generalized to evaluate the number of
combinations of n things taken k at a time.
• Note that for every selection of k distinct items of n,
there are k(k – 1) x … x 2 x 1 = k! ways of ordering the
items among themselves.
• The number of combinations of n things taken k at a
time is

$${}_{n}C_{k} = \binom{n}{k} = \frac{n(n-1) \times \dots \times (n-k+1)}{k!}$$

Example
Consider a somewhat different design for the study of
persons with schizophrenia. Suppose an unmatched study
design, in which all cases and controls are interviewed by
the same psychologist, is used. If there are 10 eligible
controls, then how many ways are there of choosing 5
controls for the study?

In this case, because the same psychologist interviews all
patients, what is important is which controls are selected,
not the order of selection. Thus, the question becomes
how many ways can 5 of 10 eligible controls be selected,
where order is not important?

Note that for each set of 5 controls (say A, B, C, D, E),
there are 5 x 4 x 3 x 2 x 1 = 5! ways of ordering the
controls among themselves (e.g., ACBED and DBCAE are
two possible orders).

Thus, the number of ways of selecting 5 of 10 controls for
the study without respect to order = (number of ways of
selecting 5 controls of 10 where order is important)/5! =
10P5/5! = (10 x 9 x 8 x 7 x 6)/120 = 30,240/120 = 252 ways.

Thus, ABCDE and CDFIJ are two possible selections. Also,
ABCDE and BCADE are not counted twice.
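
The division by 5! can be verified in Python by comparing the ordered count with the unordered count and by listing the unordered selections. This is a sketch under the same labeling of controls (A to J); math.perm and math.comb require Python 3.8 or later.

```python
import math
from itertools import combinations

controls = "ABCDEFGHIJ"  # the 10 eligible controls, labeled A to J

ordered = math.perm(10, 5)                    # order matters: 10P5
unordered = ordered // math.factorial(5)      # divide out the 5! orderings
selections = list(combinations(controls, 5))  # each set of 5 listed once

print(ordered, unordered, math.comb(10, 5), len(selections))

# Each selection is reported in sorted order, so a set of controls
# appears exactly once no matter how its members are arranged.
print(tuple("ABCDE") in selections, tuple("BCADE") in selections)
```

The second line shows that ABCDE is present while the rearrangement BCADE is not listed as a separate selection.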



Alternatively, we can express combinations in terms of
factorials,

$${}_{n}C_{k} = \binom{n}{k} = \frac{n!}{k!\,(n-k)!}$$

It represents the number of ways of selecting k objects
out of n where the order of selection does not matter.

Example 1

Evaluate 7C3.

$${}_{7}C_{3} = \frac{7 \times 6 \times 5}{3 \times 2 \times 1} = 7 \times 5 = 35$$



Example 2

Suppose the population consists of N = 5 children: Janine,
Josiel, Jan, Eryll, and Eariel.

By definition of simple random sampling (SRS), we must give
all possible samples of size n the same chance of selection.
Suppose n = 2 and, under SRS without replacement (SRSWOR), we
do not allow for the possibility of including an
element more than once in our sample.

5C2 = 5!/(2! 3!) = 120/12 = 10.

Each of the 10 possible samples therefore has the same
probability of selection, 1/10, and each child is included
in the sample with probability n/N = 2/5.


Example 3

Find the number of ways of selecting the winning
numbers in Super Lotto 6/49.

49C6 = 49!/(6! 43!) = 13,983,816

Therefore, the probability of winning in Super Lotto is
1/13,983,816 = 0.000000072.



Henceforth, for consistency we will always use the more
common notation $\binom{n}{k}$ for combinations. In words, this is
expressed as “n choose k.”

A special situation arises upon evaluation of $\binom{n}{0}$. By
definition, $\binom{n}{0} = n!/(0!\,n!)$, and 0! was defined as 1. Hence,
$\binom{n}{0} = 1$ for any n.

Frequently, $\binom{n}{k}$ will need to be computed for k = 0, 1, …, n.
The combinatorials have the following symmetry
property, which makes this calculation easier than it
appears at first.

For any non-negative integers n, k, where n ≥ k,

$$\binom{n}{k} = \binom{n}{n-k}$$

If n – k is substituted for k in the expression

$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$$

then we obtain

$$\binom{n}{n-k} = \frac{n!}{(n-k)!\,[n-(n-k)]!} = \frac{n!}{k!\,(n-k)!} = \binom{n}{k}$$



Intuitively, this result makes sense because $\binom{n}{k}$
represents the number of ways of selecting k objects of n
without regard to order. However, for every selection of k
objects, we have also, in a sense, identified the other n –
k objects that were not selected. Thus, the number of
ways of selecting k objects of n without regard to order
should be the same as the number of ways of selecting n
– k objects of n without regard to order.

Hence, we need only evaluate combinatorials $\binom{n}{k}$ for the
integers k ≤ n/2. If k > n/2, then the relationship
$\binom{n}{n-k} = \binom{n}{k}$ can be used.



Example

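Evaluate $\binom{7}{0}, \binom{7}{1}, \dots, \binom{7}{7}$.

$$\binom{7}{0} = 1, \quad \binom{7}{1} = 7, \quad \binom{7}{2} = \frac{7(6)}{2(1)} = 21, \quad \binom{7}{3} = \frac{7(6)(5)}{3(2)(1)} = 35$$

$$\binom{7}{4} = \binom{7}{3} = 35, \quad \binom{7}{5} = \binom{7}{2} = 21, \quad \binom{7}{6} = \binom{7}{1} = 7, \quad \binom{7}{7} = \binom{7}{0} = 1$$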

The Binomial Distribution

• Consider a sample of n independent trials, each of which can
have only two possible outcomes, which are denoted
as “success” and “failure.”
• The probability of a success at each trial is assumed to
be some constant p, and hence the probability of a
failure at each trial is 1 – p = q.
• The term “success” is used in a general way, without
any specific contextual meaning.
• The distribution of the number of successes in n
statistically independent trials, where the probability of
success on each trial is p, is known as the binomial
distribution and has a probability-mass function given
by

$$P(X = k) = \binom{n}{k} p^k q^{n-k}, \quad k = 0, 1, \dots, n$$



Example 1

One of the most common laboratory tests performed in any
routine medical examination is a blood count. The two main
aspects of a blood count are (1) counting the number of white
blood cells (the “white count”) and (2) differentiating the
white blood cells that do exist into five categories – namely,
neutrophils, lymphocytes, monocytes, eosinophils, and
basophils (called the “differential”). Both the white count and
the differential are used extensively in making clinical
diagnosis. We concentrate here on the differential,
particularly on the distribution of the number of neutrophils k
out of 100 white blood cells (which is the typical number
counted). The number of neutrophils follows a binomial
distribution.
Example 2

Reconsider the preceding example with 5 cells rather than
100, and ask the more limited question: What is the
probability that the second and fifth cells considered will be
neutrophils and the remaining cells non-neutrophils, given a
probability of .6 that any one cell is a neutrophil? If a
neutrophil is denoted by an “x” and a non-neutrophil by an “o”,
then the question being asked is: What is the probability of
the outcome “oxoox” = P(oxoox)? Because the probabilities of
success and failure are given, respectively, by .6 and .4, and
the outcomes for different cells are presumed to be
independent, the probability is

$$q \times p \times q \times q \times p = p^2 q^3 = (.6)^2 (.4)^3$$

Example 3

Now consider the more general question: What is the
probability that any 2 cells out of 5 will be neutrophils?
The arrangement “oxoox” is only one of 10 possible
orderings that result in 2 neutrophils.

Possible orderings for 2 neutrophils of 5 cells

xxooo    oxxoo    ooxox
xoxoo    oxoxo    oooxx
xooxo    oxoox
xooox    ooxxo

In terms of combinations, the number of orderings = the
number of ways of selecting 2 cells to be neutrophils out
of 5 cells = $\binom{5}{2}$ = (5 x 4) / (2 x 1) = 10.

The probability of any of the orderings in the table is the
same as that for the ordering “oxoox”, namely, $(.6)^2 (.4)^3$.
Thus the probability of obtaining 2 neutrophils in 5 cells is

$$\binom{5}{2} (.6)^2 (.4)^3 = 10\,(.6)^2 (.4)^3 = .230$$

Suppose the neutrophil problem is now considered more
generally, with n trials rather than 5 trials, and the
question is asked: What is the probability of k
successes (rather than 2 successes) in these n trials?

The probability that the k successes will occur at k
specific trials within the n trials and that the
remaining trials will be failures is given by
$p^k (1 - p)^{n-k}$. To compute the probability of k
successes in any of the n trials, this probability
must be multiplied by the number of ways in
which k trials for the successes and n – k trials for
the failures can be selected, which is $\binom{n}{k}$.
This gives the binomial
distribution.
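
The argument above (probability of one specific arrangement, multiplied by the number of arrangements) translates directly into code. The following is a minimal Python sketch; the function name binomial_pmf is an illustrative choice, and math.comb requires Python 3.8 or later.

```python
import math


def binomial_pmf(k, n, p):
    """P(X = k) for n independent trials with success probability p.

    p**k * (1 - p)**(n - k) is the probability of one specific arrangement
    of k successes; math.comb(n, k) counts how many arrangements there are.
    """
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


# Neutrophil example: 2 neutrophils among 5 cells, p = .6
print(round(binomial_pmf(2, 5, 0.6), 4))

# The probabilities over k = 0, 1, ..., n sum to 1, as for any pmf.
print(round(sum(binomial_pmf(k, 5, 0.6) for k in range(6)), 10))
```

The same function covers any n, k, and p, so it can be reused for the other binomial examples in this lecture.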



Example 4

What is the probability of obtaining 2 boys out of 5
children if the probability of a boy is .51 at each birth and
the sexes of successive children are considered
independent random variables?

Use the binomial distribution with n = 5, p = .51, k = 2.

$$P(X = k) = \binom{n}{k} p^k q^{n-k}$$

$$P(X = 2) = \binom{5}{2} (.51)^2 (.49)^3 = \frac{5(4)}{2(1)} (.51)^2 (.49)^3 = 10\,(.51)^2 (.49)^3 = .306$$

Example 5

An investigator notices that children develop chronic
bronchitis in the first year of life in 3 of 20 households in
which both parents have chronic bronchitis, as compared
with the national incidence of chronic bronchitis, which is
5% in the first year of life. Is this difference “real,” or can
it be attributed to chance? Specifically, how likely are
infants in at least 3 of 20 households to develop chronic
bronchitis if the probability of developing disease in any
one household is .05?
Suppose the underlying rate of disease in the offspring is
.05. Under this assumption, the number of households in
which the infants develop chronic bronchitis will follow a
binomial distribution with parameters n = 20, p = .05.
Thus among 20 households the probability of observing k
with bronchitic children is given by

$$\binom{20}{k} (.05)^k (.95)^{20-k}, \quad k = 0, 1, \dots, 20$$

The question is: What is the probability of observing at
least 3 households with a bronchitic child?

$$P(X \ge 3) = \sum_{k=3}^{20} \binom{20}{k} (.05)^k (.95)^{20-k} = 1 - \sum_{k=0}^{2} \binom{20}{k} (.05)^k (.95)^{20-k}$$

The three probabilities in the second sum can be evaluated
using BINOMDIST provided by Microsoft Excel.



P(X = 0) = .3585 = BINOMDIST(0,20,.05,false)
P(X = 1) = .3774 = BINOMDIST(1,20,.05,false)
P(X = 2) = .1887 = BINOMDIST(2,20,.05,false)

Thus P(X ≥ 3) = 1 − (.3585 + .3774 + .1887) = .0754

Thus X ≥ 3 is an unusual event, but not very unusual.
Usually .05 or less is the range of probabilities used to
identify unusual events. This criterion is discussed in
more detail under p-values. If 3 infants of 20 were to develop
the disease, it would be difficult to judge whether the
familial aggregation was real until a larger sample was
available.
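
For readers without Excel, the "at least 3 of 20" probability can be computed with the complement rule in Python. This is a sketch; the function names binomial_pmf and prob_at_least are illustrative, and math.comb requires Python 3.8 or later.

```python
import math


def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with parameters n and p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def prob_at_least(k, n, p):
    """P(X >= k) = 1 - [P(X = 0) + ... + P(X = k - 1)]."""
    return 1.0 - sum(binomial_pmf(j, n, p) for j in range(k))


# Chronic bronchitis example: n = 20 households, p = .05
for j in range(3):
    print(j, round(binomial_pmf(j, 20, 0.05), 4))
print(round(prob_at_least(3, 20, 0.05), 4))
```

The result differs from the hand calculation only in the fourth decimal place, because the hand calculation sums probabilities that were already rounded.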
One question sometimes asked is why a criterion of P(X ≥
3 cases), rather than P(X = 3 cases), was used to define
unusualness in the chronic bronchitis example. The latter is what
we actually observe. An intuitive answer is that if the number
of households studied in which both parents had chronic
bronchitis were very large (for example, n = 1500), then the
probability of any specific occurrence would be small. For
example, suppose 75 cases occurred among 1500 households
in which both parents had chronic bronchitis. If the incidence
of chronic bronchitis were .05 in such families, then the
probability of 75 cases among 1500 households would be

$$\binom{1500}{75} (.05)^{75} (.95)^{1425} = .047$$
75



This result is exactly consistent with the national incidence
rate (5% of households with cases in the first year of life) and
yet yields a small probability. This doesn’t make intuitive
sense. The alternative approach is to calculate the probability
of obtaining a result at least as extreme as the one obtained
(a probability of at least 75 cases out of 1500 households) if
the incidence rate of .05 were applicable to families in which
both parents had chronic bronchitis. This would yield a
probability of approximately .50 in the preceding example and
would indicate that nothing very unusual is occurring in such
families, which is clearly the correct conclusion. If this
probability were small enough, then it would cast doubt on
the assumption that the true incidence rate was .05 for such
families. This approach was used in the “at least” example.



One question that arises is what to do if the probability of
success on an individual trial (p) is greater than .5. Recall
that

$$\binom{n}{k} = \binom{n}{n-k}$$

Let X be a binomial random variable with parameters n
and p, and let Y be a binomial random variable with
parameters n and q = 1 – p. Then

$$P(X = k) = \binom{n}{k} p^k q^{n-k}$$

can be rewritten as

$$P(X = k) = \binom{n}{k} p^k q^{n-k} = \binom{n}{n-k} q^{n-k} p^k = P(Y = n - k)$$

In words, the probability of obtaining k successes for a
binomial random variable X with parameters n and p is
the same as the probability of obtaining n – k successes
for a binomial random variable Y with parameters n and
q.



Example 1

Evaluate the probabilities of obtaining k neutrophils out
of 5 cells for k = 0, 1, 2, 3, 4, 5, where the probability that
any one cell is a neutrophil is .6.

Because p > .5, refer to the random variable Y with
parameters n = 5, p = 1 - .6 = .4.

$$P(X = k) = \binom{n}{k} p^k q^{n-k} = \binom{n}{n-k} q^{n-k} p^k = P(Y = n - k)$$

$$P(X = 0) = \binom{5}{0} (.6)^0 (.4)^5 = \binom{5}{5} (.4)^5 (.6)^0 = P(Y = 5) = .0102$$

P(X = 1) = P(Y = 4) = .0768 = BINOMDIST(4,5,.4,false)
P(X = 2) = P(Y = 3) = .2304 = BINOMDIST(3,5,.4,false)
P(X = 3) = P(Y = 2) = .3456 = BINOMDIST(2,5,.4,false)
P(X = 4) = P(Y = 1) = .2592 = BINOMDIST(1,5,.4,false)
P(X = 5) = P(Y = 0) = .0778 = BINOMDIST(0,5,.4,false)



Example 2

Compute the probability of obtaining exactly 75 cases of
chronic bronchitis and the probability of obtaining at
least 75 cases of chronic bronchitis in the first year of life
among 1500 families in which both parents have chronic
bronchitis, if the underlying incidence rate of chronic
bronchitis in the first year of life is .05.

P(X = 75) = .047 = BINOMDIST(75,1500,.05,FALSE)

P(X ≤ 74) = .483 = BINOMDIST(74,1500,.05,TRUE)

P(X ≥ 75) = 1 – P(X ≤ 74) = 1 - .483 = .517

Hence, obtaining 75 cases out of 1500 children is clearly not
unusual.



Expected Value and Variance of the Binomial
Distribution

General formula for the expected value of a discrete
random variable:

\[ E(X) = \sum_{i} x_i P(X = x_i) \]

In the special case of a binomial distribution, the only
values that take on positive probability are 0, 1, 2, …, n,
and these values occur with probabilities

\[ \binom{n}{0} p^0 q^n, \; \binom{n}{1} p^1 q^{n-1}, \; \ldots \]

Thus

\[ E(X) = \sum_{k=0}^{n} k \binom{n}{k} p^k q^{n-k} \]

This summation reduces to the simple expression np.


Similarly, using

\[ Var(X) = \sigma^2 = \sum_{i} (x_i - \mu)^2 P(X = x_i) \]

we can show

\[ Var(X) = \sum_{k=0}^{n} (k - np)^2 \binom{n}{k} p^k q^{n-k} = npq \]
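
The reduction of these sums to np and npq can be verified numerically for particular n and p. The sketch below is illustrative only, not a derivation; the helper names are hypothetical and the (n, p) pairs are chosen to match the figures that follow.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for a binomial random variable with parameters n and p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_mean_var(n, p):
    """Mean and variance evaluated directly from the defining sums."""
    mean = sum(k * binom_pmf(k, n, p) for k in range(n + 1))
    var = sum((k - mean) ** 2 * binom_pmf(k, n, p) for k in range(n + 1))
    return mean, var

for n, p in [(10, 0.05), (10, 0.5), (10, 0.95)]:
    mean, var = binom_mean_var(n, p)
    # Compare the summed values with the closed forms np and npq.
    print(f"n={n}, p={p}: mean={mean:.4f} (np={n * p:.4f}), "
          f"var={var:.4f} (npq={n * p * (1 - p):.4f})")
```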



Remarks:

• The expected number of successes in n trials is simply
the probability of success on one trial multiplied by n,
which equals np.
• For a given number of trials n, the binomial distribution
has the highest variance when p = ½.
• The variance of the distribution decreases as p moves
away from ½ in either direction, becoming 0 when p =
0 or 1. This is because when p = 0 there must be 0
successes in n trials and when p = 1 there must be n
successes in n trials, so there is no variability in either
instance.
• When p is near 0 or near 1, the distribution of the
number of successes is clustered near 0 and n,
respectively, and there is comparatively little variability
as compared with the situation when p = ½.
• For the same number of trials n, the distribution is
skewed to the right when p = .05, skewed to the left
when p = .95, and symmetric when p = .50.



The binomial distribution when p = .05 and n = 10



The binomial distribution when p = .95 and n = 10



The binomial distribution when p = .50 and n = 10



Assignment

The probability of a woman developing breast cancer
over a lifetime is about 1/9.

1. What is the probability that exactly 2 women of 10 will
develop breast cancer over a lifetime?
2. What is the probability that at least 2 women of 10 will
develop breast cancer over a lifetime?


The Poisson Distribution

• Usually associated with rare events.
• Example: the distribution of the number of deaths attributed
to typhoid fever over 1 year (time), assuming the probability
of a new death in any one day is very small and the numbers
of cases reported in any two distinct periods of time are
independent random variables.
• Rare events can also occur over a surface area, such as the
distribution of the number of bacterial colonies growing on
an agar plate.


The probability of finding any bacterial colonies, say on
a 100-cm² agar plate, at any one point a (or more
precisely in a small area around a) is very small, and
the events of finding bacterial colonies at any two
points a1, a2 are independent.

Consider the number of deaths attributed to typhoid
fever. Ask the question: What is the distribution of the
number of deaths caused by typhoid fever from time 0
to time t (where t is some long period of time, such as
1 year or 20 years)?


Three assumptions must be made about the incidence of
the disease. Consider any general small subinterval of the
time period t, denoted by Δt.

1. Assume that
a. The probability of observing 1 death is directly
proportional to the length of the time interval Δt.
That is, P(1 death) = λΔt for some constant λ.
b. The probability of observing 0 deaths over Δt is
approximately 1 – λΔt.
c. The probability of observing more than 1 death over
this time interval is essentially 0.


2. Stationarity is the assumption that the number of
deaths per unit time is the same throughout the entire
time interval t.

Thus, an increase in the incidence of the disease as
time goes on within the time period t would violate
this assumption. Note that t should not be overly long
because this assumption is less likely to hold as t
increases.

3. Independence means that if a death occurs within one
time subinterval, then it has no bearing on the
probability of death in the next time subinterval.
This assumption would be violated in an epidemic
situation because if a new case of disease occurs, then
subsequent deaths are likely to build up over a short
period of time until after the epidemic subsides.

Given these assumptions, the Poisson probability
distribution can be derived:

The probability of k events occurring in a time period t for
a Poisson random variable with parameter λ is

\[ P(X = k) = \frac{e^{-\mu} \mu^k}{k!}, \quad k = 0, 1, 2, \ldots \]

where μ = λt and e is approximately 2.71828.

Remarks:

• The Poisson distribution depends on a single parameter μ =
λt. The parameter λ represents the expected number
of events per unit time, whereas the parameter μ
represents the expected number of events over time
period t.
• For a binomial distribution there are a finite number of
trials n, and the number of events can be no larger
than n. For a Poisson distribution the number of trials
is essentially infinite and the number of events (or
number of deaths) can be indefinitely large, although
the probability of k events becomes very small as k
increases.


Example 1

Consider the typhoid-fever example. Suppose the
number of deaths from typhoid fever over a 1-year
period is Poisson distributed with parameter μ = 4.6.
What is the probability distribution of the number of
deaths over a 6-month period? A 3-month period?

Let X = the number of deaths in 6 months. Because μ =
4.6 and t = 1 year, it follows that λ = 4.6 deaths per year. For a
6-month period we have λ = 4.6 deaths per year, t = .5
year. Thus μ = λt = 2.3. Therefore,

\[ P(X = k) = \frac{e^{-\mu} \mu^k}{k!} \]

\[ P(X = 0) = e^{-2.3} = .100 \]
\[ P(X = 1) = \frac{2.3}{1!} e^{-2.3} = .231 \]
\[ P(X = 2) = \frac{2.3^2}{2!} e^{-2.3} = .265 \]
\[ P(X = 3) = \frac{2.3^3}{3!} e^{-2.3} = .203 \]
\[ P(X = 4) = \frac{2.3^4}{4!} e^{-2.3} = .117 \]
\[ P(X = 5) = \frac{2.3^5}{5!} e^{-2.3} = .054 \]
\[ P(X \geq 6) = 1 - (.100 + .231 + .265 + .203 + .117 + .054) = .030 \]

In Excel,

P(X = 0) = .100 = POISSON(0,2.3,false)
P(X = 1) = .231 = POISSON(1,2.3,false)
P(X = 2) = .265 = POISSON(2,2.3,false)
P(X = 3) = .203 = POISSON(3,2.3,false)
P(X = 4) = .117 = POISSON(4,2.3,false)
P(X = 5) = .054 = POISSON(5,2.3,false)

Let Y = the number of deaths in 3 months. For a 3-month
period, we have λ = 4.6 deaths per year, t = .25 year, μ =
λt = 1.15. Therefore,

\[ P(Y = k) = \frac{e^{-\mu} \mu^k}{k!} \]

\[ P(Y = 0) = e^{-1.15} = .317 \]
\[ P(Y = 1) = \frac{1.15}{1!} e^{-1.15} = .364 \]
\[ P(Y = 2) = \frac{1.15^2}{2!} e^{-1.15} = .209 \]
\[ P(Y = 3) = \frac{1.15^3}{3!} e^{-1.15} = .080 \]
\[ P(Y \geq 4) = 1 - (.317 + .364 + .209 + .080) = .030 \]

In Excel,

P(Y = 0) = .317 = POISSON(0,1.15,false)
P(Y = 1) = .364 = POISSON(1,1.15,false)
P(Y = 2) = .209 = POISSON(2,1.15,false)
P(Y = 3) = .080 = POISSON(3,1.15,false)
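
Both tables in the typhoid-fever example can be reproduced from the Poisson formula with μ = λt. The following Python sketch is an added illustration using only the standard library; the function name poisson_pmf and the variable names are hypothetical.

```python
from math import exp, factorial

def poisson_pmf(k, mu):
    """P(X = k) for a Poisson random variable with parameter mu."""
    return exp(-mu) * mu**k / factorial(k)

rate_per_year = 4.6   # lambda: expected typhoid-fever deaths per year

# (label, t in years, largest k listed explicitly on the slide)
for label, t, k_max in [("6 months", 0.5, 5), ("3 months", 0.25, 3)]:
    mu = rate_per_year * t                      # mu = lambda * t
    probs = [poisson_pmf(k, mu) for k in range(k_max + 1)]
    print(f"{label}: mu = {mu:.2f}")
    for k, pk in enumerate(probs):
        print(f"  P(k = {k}) = {pk:.3f}")
    # The remaining probability mass is the upper tail.
    print(f"  P(k >= {k_max + 1}) = {1 - sum(probs):.3f}")
```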

The distribution tends to become more symmetric as the
time interval increases or, more specifically, as μ
increases.

Distribution of the number of deaths attributable to
typhoid fever, 6 months



Distribution of the number of deaths attributable to
typhoid fever, 3 months



Example 2

Assuming that the probability of finding 1 colony in an
area the size of ΔA at any point on the plate is λΔA for
some λ and that the numbers of bacterial colonies found
at 2 different points of the plate are independent random
variables, then the probability of finding k bacterial
colonies in an area of size A is \( e^{-\mu} \mu^k / k! \), where μ = λA.
Assuming A = 100 cm² and λ = .02 colonies per cm²,
calculate the probability distribution of the number of
bacterial colonies.


We have μ = λA = 100(.02) = 2. Let X = the number of
colonies.

\[ P(X = 0) = e^{-2} = .135 \]
\[ P(X = 1) = e^{-2} 2^1 / 1! = 2e^{-2} = .271 \]
\[ P(X = 2) = e^{-2} 2^2 / 2! = 2e^{-2} = .271 \]
\[ P(X = 3) = e^{-2} 2^3 / 3! = \tfrac{4}{3} e^{-2} = .180 \]
\[ P(X = 4) = e^{-2} 2^4 / 4! = \tfrac{2}{3} e^{-2} = .090 \]
\[ P(X \geq 5) = 1 - P(X \leq 4) = 1 - (.135 + .271 + .271 + .180 + .090) = .053 \]

Clearly, the larger λ is, the more bacterial colonies we
would expect to find.

Expected Value and Variance of the Poisson
Distribution

• The mean and the variance are both equal to the parameter μ.
• If we have a data set from a discrete distribution where
the mean and variance are about the same, then we
can preliminarily identify it as a Poisson distribution.


Example

A public-health issue arose concerning the possible
carcinogenic potential of food ingredients containing ethylene
dibromide (EDB). In some instances foods were removed from
public consumption if they were shown to have excessive
quantities of EDB. A previous study had looked at mortality in
161 white male employees of two plants in Texas and
Michigan who were exposed to EDB over the time period
1940 – 1975. Seven deaths from cancer were observed among
these employees. For this time period, 5.8 cancer deaths were
expected as calculated from overall mortality rates for U.S.
white men. Was the observed number of cancer deaths
excessive in this group?



Estimate the parameter μ from the expected number of
cancer deaths from U.S. white male mortality rates; that
is, μ = 5.8. Then calculate P(X ≥ 7), where X is a
Poisson random variable with parameter μ = 5.8. Use
the relationship

\[ P(X \geq 7) = 1 - P(X \leq 6) \]

where \( P(X = k) = e^{-5.8} (5.8)^k / k! \).

Thus P(X ≥ 7) = 1 − P(X ≤ 6) = 1 − .638 = .362.

Clearly, the observed number of cancer deaths is not
excessive in this group.
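
The tail probability used in the EDB example follows the pattern P(X ≥ k) = 1 − P(X ≤ k − 1). A small sketch of that calculation is given here as an added illustration; the helper names poisson_pmf and poisson_upper_tail are hypothetical.

```python
from math import exp, factorial

def poisson_pmf(k, mu):
    """P(X = k) for a Poisson random variable with parameter mu."""
    return exp(-mu) * mu**k / factorial(k)

def poisson_upper_tail(k, mu):
    """P(X >= k), computed as 1 - P(X <= k - 1)."""
    return 1 - sum(poisson_pmf(j, mu) for j in range(k))

expected_deaths = 5.8   # mu, from U.S. white male mortality rates
observed_deaths = 7

p_tail = poisson_upper_tail(observed_deaths, expected_deaths)
print(f"P(X >= {observed_deaths}) = {p_tail:.3f}")
```

A probability of this size means that 7 or more deaths would occur by chance in roughly one such group out of three.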
Assignment

Suppose the expected number of deaths from bladder
cancer for all workers in a tire plant on January 1, 1964,
over the next 20 years (1/1/64 – 12/31/83) based on U.S.
mortality rates is 1.8. If the Poisson distribution is
assumed to hold and 6 reported deaths are caused by
bladder cancer among the tire workers, how unusual is
this event?


Poisson Approximation to the Binomial Distribution

• Applies to the binomial distribution with large n and small p.
• The mean and variance of the binomial distribution are np
and npq. For small p, q ≈ 1, and thus npq ≈ np, so the mean
and variance are approximately equal, as they are for a
Poisson distribution.
• Conservative rule: n ≥ 100 and p ≤ .01.
• The binomial distribution with large n and small p can
be accurately approximated by a Poisson distribution
with parameter μ = np.
• The rationale for using this approximation is that the
Poisson distribution is easier to work with than the
binomial distribution, which involves expressions such as
\( \binom{n}{k} \) and \( (1 - p)^{n-k} \) that are cumbersome for large
n.

Example

Suppose we are interested in the genetic susceptibility to
breast cancer. We find that 4 of 1000 women ages 40–49
whose mothers have had breast cancer also develop
breast cancer over the next year of life.


An example of the Poisson approximation to the binomial
distribution for n = 100, p = .01, k = 0, 1,..., 5.



We would expect from large population studies that 1 in
1000 women of this age group will develop a new case of
the disease over this period of time. How unusual is this
event?

The exact binomial probability could be computed by
letting n = 1000, p = 1/1000. Hence

\[ P(X \geq 4) = 1 - P(X \leq 3) = 1 - \left[ \binom{1000}{0} (.001)^0 (.999)^{1000} + \binom{1000}{1} (.001)^1 (.999)^{999} + \binom{1000}{2} (.001)^2 (.999)^{998} + \binom{1000}{3} (.001)^3 (.999)^{997} \right] \]

This is equal to 1 – (.3677 + .3681 + .1840 + .0613) =
.0189, the exact binomial probability of obtaining 4 or
more breast-cancer cases.

Instead, use the Poisson approximation with μ =
1000(.001) = 1, which is obtained as follows:

\[ P(X \geq 4) = 1 - [P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)] \]

P(X = 0) = .3679
P(X = 1) = .3679
P(X = 2) = .1839
P(X = 3) = .0613

Thus,

P(X ≥ 4) = 1 − (.3679 + .3679 + .1839 + .0613) =
1 − .9810 = .0190.

This event is indeed unusual and suggests a genetic
susceptibility to breast cancer among daughters of
women who have had breast cancer. The corresponding
exact binomial probability of obtaining 4 or more
breast-cancer cases is .0189, which agrees almost exactly with
the Poisson approximation of .0190.
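
The closeness of the two answers can be reproduced by evaluating both tails side by side. This is a sketch under the example's own numbers (n = 1000, p = .001), added for illustration; the variable names are hypothetical.

```python
from math import comb, exp, factorial

n, p = 1000, 0.001      # 1000 women, expected incidence 1 in 1000
mu = n * p              # Poisson parameter for the approximation

# Exact binomial: P(X >= 4) = 1 - sum of P(X = k) for k = 0..3
binom_tail = 1 - sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(4))

# Poisson approximation with mu = np
poisson_tail = 1 - sum(exp(-mu) * mu**k / factorial(k) for k in range(4))

print(f"exact binomial P(X >= 4) = {binom_tail:.4f}")
print(f"Poisson approx P(X >= 4) = {poisson_tail:.4f}")
```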



Continuous Probability
Distribution



Continuous Probability Distributions

• The probability-density function (pdf) of the random
variable X is a function such that the area under the
density-function curve between any two points a and b
is equal to the probability that the random variable X
falls between a and b. Thus, the total area under the
density-function curve over the entire range of possible
values for the random variable is 1.
• The cumulative-distribution function (cdf) is defined
similarly to that for a discrete random variable. The cdf
for the random variable X evaluated at the point a is
defined as the probability that X will take on values ≤
a. It is represented by the area under the pdf to the
left of a.

Example 1

A pdf for DBP in 35- to 44-year-old men is shown on the next
slide. Areas A, B, and C correspond to the probabilities of
being mildly hypertensive, moderately hypertensive, and
severely hypertensive, respectively. Furthermore, the
most likely range of values for DBP occurs around 80 mm
Hg, with the values becoming increasingly less likely as
we move farther away from 80.


The pdf of DBP in 35- to 44-year-old men



Example 2

Serum triglyceride level is an asymmetric, positively
skewed, continuous random variable whose pdf appears
below.


Example 3

The pdf for the random variable representing the
distribution of birthweights in the general population is
shown below.


The cdf evaluated at 88 oz = Pr(X ≤ 88) is represented by
the area under this curve to the left of 88 oz. The region X
≤ 88 oz has a special meaning in obstetrics because 88 oz
is the cutoff point obstetricians usually use for identifying
low-birthweight infants. Such infants are generally at
higher risk for various unfavorable outcomes, such as
mortality in the first year of life.



The expected value of a continuous random variable X,
denoted by E(X), or μ, is the average value taken on by
the random variable.

The variance of a continuous random variable X, denoted
by Var(X), or σ², is the average squared distance of each
value of the random variable from its expected value,
which is given by \( E(X - \mu)^2 \) and can be re-expressed in
short form as \( E(X^2) - \mu^2 \).


The Normal Distribution

• Most widely used continuous distribution.
• Frequently called the Gaussian distribution, after the
well-known mathematician Karl Friedrich Gauss.
• Many other distributions that are not themselves
normal can be made approximately normal by
transforming the data onto a different scale, for example
with a log transformation.
• Most estimation procedures and hypothesis tests
assume the random variable has an underlying normal
distribution.
• Serves as an approximating distribution to other
distributions. The normal distribution is generally more
convenient to work with than any other distribution,
particularly in hypothesis testing. Thus, if an accurate
normal approximation to some other distribution can be
found, we often will want to use it.


The normal distribution is defined by its pdf, which is
given by

\[ f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[ -\frac{1}{2\sigma^2} (x - \mu)^2 \right], \quad -\infty < x < \infty \]

The exp function merely implies that the quantity to the
right in brackets is the power to which “e” (≈2.71828) is
raised.

The pdf for a normal distribution with μ = 50 and σ² =
100 is shown on the next slide.


The density function follows a bell-shaped curve, with the
mode at μ and the most frequently occurring values
around μ. The curve is symmetric about μ, with points of
inflection on either side of μ at μ − σ and μ + σ,
respectively.
A point of inflection is a point at which the slope of the
curve changes direction. In the figure, the slope of the
curve increases to the left of μ − σ and then starts to
decrease to the right of μ − σ and continues to decrease
until reaching μ + σ, after which it starts increasing again.
Thus, distances from μ to points of inflection provide a
good visual sense of the magnitude of the parameter σ.
You may wonder why the parameters μ and σ² have been
used to define the normal distribution when the
expected value and variance of an arbitrary distribution
were previously defined as μ and σ². Indeed, from the
definition of the normal distribution it can be shown,
using calculus methods, that μ and σ² are, respectively,
the expected value and variance of this distribution.

The area under any normal density function must be 1.

A normal distribution with mean μ and variance σ² will
generally be referred to as an N(μ, σ²) distribution. The
second parameter is always the variance σ², not the
standard deviation σ.


Comparison of two normal distributions with the same
variance and different means



Comparison of two normal distributions with the same
means and different variances



A normal distribution with mean 0 and variance 1 is
called a standard, or unit, normal distribution. This
distribution is also called an N(0,1) distribution.

The pdf reduces to

\[ f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}, \quad -\infty < x < \infty \]

This distribution is symmetric about 0, because f(x) = f(-x).


It can be shown that about 68% of the area under the
standard normal density lies between +1 and -1, about
95% of the area lies between +2 and -2, and about 99%
lies between +2.5 and -2.5.

These relationships can be expressed more precisely by
saying that

P(-1 < X < 1) = .6827
P(-1.96 < X < 1.96) = .95
P(-2.576 < X < 2.576) = .99

Thus, the standard normal distribution slopes off very
rapidly, and absolute values greater than 3 are unlikely.
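
These central areas can be recomputed from the standard normal cdf. The sketch below is an added illustration that assumes Python 3.8 or later, where statistics.NormalDist is available in the standard library; the helper name central_area is hypothetical.

```python
from statistics import NormalDist

Z = NormalDist(mu=0, sigma=1)   # standard normal, N(0, 1)

def central_area(c):
    """P(-c < X < c) for X ~ N(0, 1)."""
    return Z.cdf(c) - Z.cdf(-c)

for c in (1, 1.96, 2, 2.5, 2.576, 3):
    print(f"P(-{c} < X < {c}) = {central_area(c):.4f}")
```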
The pdf of a standard normal distribution



Empirical properties of the standard normal distribution



The cumulative-distribution function for a standard
normal distribution is denoted by

Φ(x) = P(X ≤ x)

where X follows an N(0, 1) distribution.

The symbol ~ is used as shorthand for the phrase “is
distributed as.” Thus, X ~ N(0, 1) means that the random
variable X is distributed as an N(0, 1) distribution.


The cdf [Φ(x)] for a standard normal distribution



Symmetry Properties of the Standard Normal
Distribution

Φ(- x) = P(X ≤ - x) = P(X ≥ x) = 1 – P(X ≤ x) = 1 – Φ(x)

where X follows an N(0, 1) distribution.



Illustration of the symmetry properties of the normal
distribution for x = 1



Example

Calculate P(X ≤ -1.96) assuming X ~ N(0, 1).

P(X ≤ -1.96) = P(X ≥ 1.96) = .0250, using
=NORMSDIST(-1.96) in Excel.

Furthermore, for any numbers a, b we have P(a ≤ X ≤ b)
= P(X ≤ b) – P(X ≤ a), and thus we can evaluate P(a ≤ X ≤
b) for any a, b.


Example 1

Compute P(-1 ≤ X ≤ 1.5) assuming X ~ N(0, 1).

P(-1 ≤ X ≤ 1.5) = P(X ≤ 1.5) – P(X ≤ -1) = .9332 - .1587
= .7745

In Excel,
=NORMSDIST(1.5) = .9332 and =NORMSDIST(-1) = .1587

Note:
P(X ≤ -1) = P(X ≥ 1) = 1 – P(X ≤ 1)
=NORMSDIST(1) = .8413, and 1 - .8413 = .1587
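
Outside Excel, Φ(x) can be evaluated through the error function, using the standard identity Φ(x) = ½[1 + erf(x/√2)]. The sketch below is an added illustration and not the lecture's method; the function names phi and prob_between are hypothetical.

```python
from math import erf, sqrt

def phi(x):
    """Standard normal cdf, using Phi(x) = (1 + erf(x / sqrt(2))) / 2."""
    return 0.5 * (1 + erf(x / sqrt(2)))

def prob_between(a, b):
    """P(a <= X <= b) for X ~ N(0, 1), computed as Phi(b) - Phi(a)."""
    return phi(b) - phi(a)

print(f"Phi(1.5)  = {phi(1.5):.4f}")
print(f"Phi(-1)   = {phi(-1):.4f}")
print(f"1-Phi(1)  = {1 - phi(1):.4f}")    # symmetry: equals Phi(-1)
print(f"P(-1 <= X <= 1.5) = {prob_between(-1, 1.5):.4f}")
```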
Example 2

Forced vital capacity (FVC), a standard measure of
pulmonary function, is the volume of air a person can
expel in 6 seconds. Current research looks at potential
risk factors, such as cigarette smoking, air pollution,
indoor allergens, or the type of stove used in the home,
that may affect FVC in grade-school children. One
problem is that age, gender, and height affect pulmonary
function, and these variables must be corrected for
before considering other risk factors. One way to make
these adjustments for a particular child is to find the
mean μ and standard deviation σ for children of the same
age (in 1-year age groups), gender, and height (in 2-in.
height groups) from large national surveys and compute
a standardized FVC, which is defined as (X − μ)/σ,
where X is the original FVC. The standardized FVC then
approximately follows an N(0, 1) distribution if the
distribution of the original FVC values was bell-shaped.
Suppose a child is considered in poor pulmonary health if
his or her standardized FVC < -1.5. What percentage of
children are in poor pulmonary health?

P(X < -1.5) = P(X > 1.5) = .0668

=NORMSDIST(-1.5) = .0668

Thus, about 7% of children are in poor pulmonary health.


Example 3

Suppose a child is considered to have normal lung growth
if his or her standardized FVC is within 1.5 standard
deviations of the mean. What proportion of children are
within the normal range?

P(-1.5 ≤ X ≤ 1.5) = P(X ≤ 1.5) – P(X ≤ -1.5) = .9332 - .0668 = .8664

In Excel 2007 and earlier versions,
=NORMSDIST(1.5) = .9332 and =NORMSDIST(-1.5) = .0668

In Excel 2010 and later versions,
=NORM.S.DIST(z,cumulative)
For the cumulative argument, select “TRUE” for the cdf;
otherwise, select “FALSE” for the pdf.

Thus, about 87% of children have normal lung function.

The percentiles of a normal distribution are frequently
used in statistical inference. For example, we
might be interested in the upper and lower fifth
percentiles of the distribution of FVC in children in order
to define a normal range of values.


Graphic display of the (100 × µ)th percentile of a standard
normal distribution (z_µ)

Definition

The (100 × µ)th percentile of a standard normal
distribution is denoted by \( z_\mu \). It is defined by the
relationship

\[ P(Z < z_\mu) = \mu, \quad \text{where } Z \sim N(0, 1) \]

Here µ denotes a probability between 0 and 1, not the
mean of a distribution.

The function \( z_\mu \) is referred to as the inverse normal
function.

To obtain \( z_\mu \), reverse the previous operation, then apply the
symmetry property of the standard normal distribution.
Example 4

Compute \( z_{.975} \), \( z_{.95} \), and \( z_{.05} \).

In Excel 2007 and earlier versions,
=NORMSINV(.975) = 1.96
=NORMSINV(.95) = 1.645
=NORMSINV(.5) = 0
=NORMSINV(.05) = -1.645
=NORMSINV(.025) = -1.96

where \( z_{.95} \) was obtained by interpolating between 1.64 and 1.65
to give 1.645.

In Excel 2010 and later versions,
=NORM.S.INV(probability)

Example 5

Compute the value x such that the area to the left of x
under a standard normal density = .85. In Excel,

=NORM.S.INV(.85) = 1.036

Thus, the area to the left of 1.036 under a standard
normal density is .85.
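
The inverse normal function is also available outside Excel. The following sketch is an added illustration that assumes Python 3.8 or later, where statistics.NormalDist provides inv_cdf; the variable names are hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal, N(0, 1)

# inv_cdf(u) returns the value z with P(Z <= z) = u,
# the counterpart of =NORM.S.INV(u) in Excel.
for u in (0.975, 0.95, 0.85, 0.5, 0.05, 0.025):
    print(f"percentile for u = {u}: {Z.inv_cdf(u):.3f}")
```

By symmetry, the value for u and the value for 1 − u differ only in sign, which is why a table of upper percentiles is enough.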
Conversion from an N(μ, σ²) Distribution to an N(0, 1) Distribution

Example

Suppose a mild hypertensive is defined as a person
whose DBP is between 90 and 100 mm Hg inclusive, and
the subjects are 35- to 44-year-old men whose blood
pressures are normally distributed with mean 80 and
variance 144. What is the probability that a randomly
selected person from this population will be a mild
hypertensive?

If X ~ N(80, 144), then what is P(90 < X < 100)?

If X ~ N(μ, σ²), then what is P(a < X < b) for any a,
b? To solve this, we convert the probability statement
about an N(μ, σ²) distribution to an equivalent
probability statement about an N(0, 1) distribution.
Consider the random variable Z = (X − μ)/σ. We can
show that the following relationship holds:

If X ~ N(μ, σ²) and Z = (X − μ)/σ, then Z ~ N(0, 1).

If X ~ N(μ, σ²) and Z = (X − μ)/σ, then

\[ P(a < X < b) = P\left( \frac{a - \mu}{\sigma} < Z < \frac{b - \mu}{\sigma} \right) \]

Evaluation of probabilities for any normal distribution
using standardization

Solution

The probability of being a mild hypertensive among the
group of 35- to 44-year-old men can now be calculated.

\[ P(90 < X < 100) = P\left( \frac{90 - 80}{12} < Z < \frac{100 - 80}{12} \right) = P(.833 < Z < 1.667) = .9522 - .7977 = .155 \]

Thus, about 15.5% of this population will have mild
hypertension.
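
The standardization step can be written out explicitly in code. This is a minimal sketch, assuming Python 3.8 or later for statistics.NormalDist; the function name normal_prob_between is hypothetical, and it takes the variance as its argument to match the N(μ, σ²) notation.

```python
from statistics import NormalDist

def normal_prob_between(a, b, mean, variance):
    """P(a < X < b) for X ~ N(mean, variance), via Z = (X - mean) / sd."""
    sd = variance ** 0.5
    z_a = (a - mean) / sd          # standardized lower limit
    z_b = (b - mean) / sd          # standardized upper limit
    Z = NormalDist()               # N(0, 1)
    return Z.cdf(z_b) - Z.cdf(z_a)

# Mild hypertension: DBP between 90 and 100 mm Hg, X ~ N(80, 144)
p_mild = normal_prob_between(90, 100, mean=80, variance=144)
print(f"P(90 < X < 100) = {p_mild:.3f}")
```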



Definition

The pth percentile of a general normal distribution (x) can
also be written in terms of the percentiles of a standard
normal distribution as follows:

\[ x = \mu + z_p \sigma \]

Example 1

Glaucoma is an eye disease that is manifested by high
intraocular pressure (IOP). The distribution of IOP in the
general population is approximately normal with mean =
16 mm Hg and standard deviation = 3 mm Hg. If the
normal range for IOP is considered to be between 12 and
20 mm Hg, then what percentage of the general
population would fall within this range?

Solution

Because IOP can only be measured to the nearest integer,
we will associate the recorded value of 12 mm Hg with a
range of actual IOP values from 11.5 to 12.5 mm Hg.
Similarly, we associate a recorded IOP value of 20 mm Hg
with a range of actual IOP values from 19.5 to 20.5 mm
Hg. Hence, we want to calculate P(11.5 ≤ X ≤ 20.5),
where X ~ N(16, 9), as shown in the figure. The process
of associating a specific observed value (such as 12 mm
Hg) with an actual range of values (11.5 ≤ X ≤ 12.5) is
called “incorporating a continuity correction.”


Calculation of the proportion of people with IOP in the
normal range




p1 = P[X ≤ 20.5 | X ~ N(16, 9)] = NORM.DIST(20.5, 16, 3, TRUE) = .9332

p2 = P[X ≤ 11.5 | X ~ N(16, 9)] = NORM.DIST(11.5, 16, 3, TRUE) = .0668

In Excel 2007 or earlier versions,
=NORMDIST(x,mean,standard deviation,cumulative)

Thus, P(11.5 ≤ X ≤ 20.5) = p1 – p2 = .866, so 86.6% of
the population has IOP in the normal range.
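
The continuity correction amounts to widening the recorded range by half a unit on each side before applying the normal cdf. The sketch below illustrates this under the example's assumptions (IOP recorded to the nearest integer); the function name prob_recorded_range is hypothetical and Python 3.8 or later is assumed.

```python
from statistics import NormalDist

def prob_recorded_range(low, high, mean, sd):
    """P(low <= recorded value <= high) for integer-recorded data,
    using a half-unit continuity correction on each side."""
    X = NormalDist(mean, sd)
    return X.cdf(high + 0.5) - X.cdf(low - 0.5)

# IOP ~ N(16, 9), so sd = 3; recorded normal range 12 to 20 mm Hg
p_normal_iop = prob_recorded_range(12, 20, mean=16, sd=3)
print(f"P(11.5 <= X <= 20.5) = {p_normal_iop:.3f}")
```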

Example 2

Suppose the distribution of DBP in 35- to 44-year-old men
is normally distributed with mean = 80 mm Hg and
variance = 144. Find the upper and lower fifth
percentiles of this distribution.

Solution

\[ x_{.05} = 80 + z_{.05}(12) = 80 - 1.645(12) = 60.3 \text{ mm Hg} \]
\[ x_{.95} = 80 + z_{.95}(12) = 80 + 1.645(12) = 99.7 \text{ mm Hg} \]

In Excel, either use =NORMSINV(probability) or
=NORM.S.INV(probability), where .05 and .95 are the
probabilities.
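
The relation x = μ + z_p σ translates directly into code. The sketch below is an added illustration, assuming Python 3.8 or later; the function name normal_percentile is hypothetical and it takes the variance to match the N(μ, σ²) notation.

```python
from statistics import NormalDist

def normal_percentile(p, mean, variance):
    """pth percentile of N(mean, variance), using x = mean + z_p * sd."""
    sd = variance ** 0.5
    z_p = NormalDist().inv_cdf(p)   # percentile of the standard normal
    return mean + z_p * sd

lower_fifth = normal_percentile(0.05, mean=80, variance=144)
upper_fifth = normal_percentile(0.95, mean=80, variance=144)
print(f"x_.05 = {lower_fifth:.1f} mm Hg, x_.95 = {upper_fifth:.1f} mm Hg")
```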



Normal Approximation to the Binomial Distribution

• If n is large, the binomial distribution is very cumbersome
to work with.
• If n is moderately large and p is either near 0 or near 1,
then the binomial distribution will be very positively or
negatively skewed, respectively.
• When n is small, for any p, the distribution tends to be
skewed.
• If n is moderately large and p is not too extreme, then
the binomial distribution tends to be symmetric.


Symmetry properties of the binomial distribution



Normal Approximation to the Binomial Distribution

If X is a binomial random variable with parameters n and
p, then P(a ≤ X ≤ b) is approximated by the area
under an N(np, npq) curve from a − ½ to b + ½. This rule
implies that for the special case a = b, the binomial
probability P(X = a) is approximated by the area under
the normal curve from a − ½ to a + ½. The only exception
to this rule is that P(X = 0) and P(X = n) are
approximated by the area under the normal curve to the
left of ½ and to the right of n − ½, respectively.


The normal approximation to the binomial distribution is
a special case of a very important statistical principle, the
Central-Limit Theorem. Under this principle, for large n, a
sum of n random variables is approximately normally
distributed even if the individual random variables being
summed are not themselves normal.

Definition

Let X_i be a random variable that takes on the value 1 with
probability p and the value 0 with probability q = 1 – p.
This type of random variable is referred to as a Bernoulli
trial. This is a special case of a binomial random variable
with n = 1.
We know from the definition of an expected value that
\( E(X_i) = 1(p) + 0(q) = p \) and that \( E(X_i^2) = 1^2(p) + 0^2(q) = p \).
Therefore,

\[ Var(X_i) = E(X_i^2) - [E(X_i)]^2 = p - p^2 = p(1 - p) = pq \]

Consider the random variable

\[ X = \sum_{i=1}^{n} X_i \]

This random variable represents the number of successes
among the n trials.
Example 1

Interpret X_1, …, X_n and X in the case of the number of
neutrophils among 100 white blood cells.

Solution

In this case, n = 100 and X_i = 1 if the ith white blood
cell is a neutrophil and X_i = 0 if the ith white blood cell is
not a neutrophil, where i = 1, …, 100. X represents the
number of neutrophils among n = 100 white blood cells.

\[ E(X) = E\left( \sum_{i=1}^{n} X_i \right) = p + p + \cdots + p = np \]

and, because the trials are independent,

\[ Var(X) = Var\left( \sum_{i=1}^{n} X_i \right) = \sum_{i=1}^{n} Var(X_i) = pq + pq + \cdots + pq = npq \]

We therefore approximate the distribution of X by a normal
distribution with mean = np and variance = npq.


Example 2

Suppose a binomial distribution has parameters n = 25,
p = .4. How can P(7 ≤ X ≤ 12) be approximated?

Solution

We have np = 25(.4) = 10, npq = 25(.4)(.6) = 6.0.
Thus, this distribution is approximated by a normal
random variable Y with mean 10 and variance 6. We
specifically want to compute the area under this normal
curve from 6.5 to 12.5.


The approximation of the binomial random variable X
with parameters n = 25, p = .4 by the normal random
variable Y with mean = 10 and variance = 6.

\[ P(6.5 \leq Y \leq 12.5) = \Phi\left( \frac{12.5 - 10}{\sqrt{6}} \right) - \Phi\left( \frac{6.5 - 10}{\sqrt{6}} \right) \]
= Φ(1.021) – Φ(-1.429)
= .8463 – .0765 = .770

In Excel, =NORM.S.DIST(z,TRUE) or =NORMSDIST(z)

For comparison, using the =BINOMDIST function of Excel,
P(7 ≤ X ≤ 12) = .773

Recall,
=BINOMDIST(x,n,p,false)
P(X ≥ 7) = 1 – P(X ≤ 6), x = 0,1,…,6
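
The comparison between the approximation and the exact value can be reproduced in a few lines. This is a sketch under the example's parameters (n = 25, p = .4), added for illustration; the function names are hypothetical and Python 3.8 or later is assumed.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_range_exact(a, b, n, p):
    """Exact P(a <= X <= b) for X binomial with parameters n and p."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(a, b + 1))

def binom_range_normal(a, b, n, p):
    """Normal approximation: area under N(np, npq) from a - 1/2 to b + 1/2."""
    Y = NormalDist(n * p, sqrt(n * p * (1 - p)))
    return Y.cdf(b + 0.5) - Y.cdf(a - 0.5)

exact = binom_range_exact(7, 12, n=25, p=0.4)
approx = binom_range_normal(7, 12, n=25, p=0.4)
print(f"exact binomial       P(7 <= X <= 12) = {exact:.3f}")
print(f"normal approximation P(7 <= X <= 12) = {approx:.3f}")
```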
Assignment
1. Suppose we want to compute the probability that
between 50 and 75 of 100 white blood cells will be
neutrophils, where the probability that any one cell is
a neutrophil is .6. These values are chosen as
proposed limits to the range of neutrophils in normal
people, and we wish to predict what proportion of
people will be in the normal range according to this
definition. Please interpret the answer.


The exact probability is given by

\[ \sum_{k=50}^{75} \binom{100}{k} (.6)^k (.4)^{100-k} \]

2. Suppose a neutrophil count is defined as abnormally
high if the number of neutrophils is ≥ 76 and
abnormally low if the number of neutrophils is ≤ 49.
Calculate the proportion of people whose neutrophil
counts are abnormally high or low.


Definition

The normal distribution with mean np and variance npq
can be used to approximate a binomial distribution with
parameters n and p when npq ≥ 5. This condition is
sometimes called “the rule of five.”

This condition is satisfied if n is moderately large and p is
not too small or too large. To illustrate this condition, the
binomial probability distributions for p = .1, n = 10, 20,
50, and 100 are plotted, and p = .2, n = 10, 20, 50, and
100 are also plotted.
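
The rule of five is easy to tabulate for the combinations of n and p plotted on the following slides. The short sketch below is an added illustration; the function name normal_approx_ok and its threshold parameter are hypothetical.

```python
def normal_approx_ok(n, p, threshold=5):
    """Rule of five: the normal approximation is considered
    adequate when npq >= 5."""
    return n * p * (1 - p) >= threshold

for p in (0.1, 0.2):
    for n in (10, 20, 50, 100):
        npq = n * p * (1 - p)
        verdict = "satisfied" if normal_approx_ok(n, p) else "not satisfied"
        print(f"p = {p}, n = {n:3d}: npq = {npq:5.2f} -> {verdict}")
```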



Normal Approximation to the Poisson Distribution
