BITS Pilani
Pilani Campus
Course No: MATH F432
Applied Statistical Methods
Hypothesis Testing
(Lecture 7)
Sumanta Pasari
Assistant Professor,
Department of Mathematics,
BITS Pilani, Pilani Campus
Revise: Confidence Interval on p
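Interval Estimate = Point Estimate ± Margin of Error
Recall that if \(np \ge 5\) and \(n(1-p) \ge 5\), we get (using the CLT)
\[ \hat{p} = \frac{X}{n} \sim N\!\left(p,\ \frac{p(1-p)}{n}\right) \quad\Rightarrow\quad \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} \sim N(0, 1). \]
Taking two points \(\pm z_{\alpha/2}\) symmetrically about the origin, we get
\[ P\!\left(-z_{\alpha/2} \le \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} \le z_{\alpha/2}\right) \approx 1 - \alpha. \]
Here \(1 - \alpha\) is known as the confidence level.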
Confidence Interval on p
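\[ P\!\left(\hat{p} - z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}\right) \approx 1 - \alpha \]
As \(p\) is unknown, the above confidence bounds are not statistics. So replace \(p\) by its
unbiased estimator \(\hat{p}\); the CI on \(p\) with confidence level \(1 - \alpha\) is then
\[ \left(\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},\ \ \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right). \]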
The endpoints of the confidence interval are called confidence limits.
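As a rough illustration of the interval above, the following Python sketch computes the large-sample confidence interval for a proportion. The function name `proportion_ci` and the data (100 successes in 400 trials) are hypothetical, and the validity check uses the estimate of p in place of the unknown p, which is an assumption made only for this sketch.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Large-sample confidence interval for a population proportion p."""
    p_hat = successes / n
    # Assumption: check the normal-approximation condition with p_hat,
    # since the true p is unknown.
    if n * p_hat < 5 or n * (1 - p_hat) < 5:
        raise ValueError("need n*p_hat >= 5 and n*(1 - p_hat) >= 5")
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)      # z_{alpha/2}
    margin = z * sqrt(p_hat * (1 - p_hat) / n)   # margin of error
    return p_hat - margin, p_hat + margin

# Hypothetical data: 100 successes in a sample of 400.
low, high = proportion_ci(100, 400)
print(f"95% CI for p: ({low:.4f}, {high:.4f})")
```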
Sample Size for Estimating p
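We can be \(100(1-\alpha)\%\) sure that \(\hat{p}\) and \(p\) differ by
at most \(E\), where \(E\) is given by
\[ E = z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}. \]
Thus, the sample size for estimating \(p\), when a prior
estimate \(\hat{p}\) is available, is
\[ n = \frac{z_{\alpha/2}^{2}\,\hat{p}(1-\hat{p})}{E^{2}}. \]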
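The sample-size formula can be sketched in Python as follows. The function name `sample_size_for_proportion`, the prior estimate 0.20, and the margin of error 0.03 are illustrative assumptions, not values from the lecture; the result is rounded up because a sample size must be a whole number.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(p_prior, margin, confidence=0.95):
    """Smallest n for which the margin of error is at most `margin`."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)            # z_{alpha/2}
    n = z ** 2 * p_prior * (1 - p_prior) / margin ** 2
    return ceil(n)                                     # round up to a whole unit

# Hypothetical inputs: prior estimate 0.20, margin of error E = 0.03.
n_needed = sample_size_for_proportion(0.20, 0.03)
print(f"Required sample size: {n_needed}")
```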
Testing of Hypothesis
Chapter 9
Objectives
• Understanding hypothesis testing
• Constructing null and alternative hypotheses
• Type I and Type II errors
• Power of a test
• Test for population mean and population
proportion
Testing of Hypothesis
• Often we end up making decisions based on samples: the decision may
be correct or it may be incorrect.
• Testing of hypothesis is used to decide whether a statement about the value
of a population parameter should be rejected or not.
• The statement is assessed based on the information available from
random samples.
• Either the statement is rejected or it cannot be rejected (loosely
speaking, it is "accepted") based on the information available from samples.
• Two types of statements: null hypothesis and alternative hypothesis
Testing of Hypothesis
• The null hypothesis, denoted by H0, is a tentative preconceived
assumption about a population parameter. It always includes the
‘statement of equality’, that is, the equality part always appears with H0.
• The alternative hypothesis, denoted by Ha or H1, is the opposite of
what is stated in the null hypothesis. The alternative hypothesis is
what the test is attempting to establish.
• If the information available from sample data contradicts the null
hypothesis, we reject it (which amounts to accepting the alternative
hypothesis); otherwise, we say “we fail to reject” the null hypothesis.
Examples: Testing of Hypothesis
A criminal trial:
In a trial, a jury must decide between two hypotheses. The null hypothesis
is
H0: The defendant is innocent
The alternative hypothesis or research hypothesis is
H1: The defendant is guilty
The jury does not know which hypothesis is true. It must make a
decision on the basis of the evidence presented.
Examples: Testing of Hypothesis
In the language of statistics, convicting the defendant is called rejecting the null
hypothesis in favor of the alternative hypothesis. That is, the jury is saying
that there is enough evidence to conclude that the defendant is guilty (i.e.,
there is enough evidence to support the alternative hypothesis).
If the jury acquits, it is stating that there is not enough evidence to support the
alternative hypothesis. Notice that the jury is not saying that the defendant is
innocent, only that there is not enough evidence to support the alternative
hypothesis. By the same logic, we do not say that we accept the null
hypothesis; rather, we say that “we fail to reject the null hypothesis” based on
the information available from the sample.
Types of Errors
                      Decision
Reality        H0 accepted                        H0 rejected
H0 true        No error                           Type I error (probability = α)
H0 false       Type II error (probability = β)    No error
• H0: the null hypothesis and H1: the alternative hypothesis
• Type-I error: rejecting the null hypothesis when it is actually true; Prob(type-I) = α
• Type-II error: failing to reject the null hypothesis when it is false; Prob(type-II) = β
• Power of a test (1-β): probability of rejecting the null hypothesis when it is false
Types of Errors
[Figure: a critical value separating the "Accept H0" region from the "Reject H0" region]
For a fixed sample size, it is not possible to reduce both type-I and type-II errors together.
However, one can try to make either type of error reasonably small!
Level of Significance
• The decision depends on the value of the test
statistic on a sample and hence has randomness in
it.
• There is a chance that the null hypothesis is rejected
when it is true, that is, we have committed a type I
error.
• The probability of a type I error is
P[H0 is rejected | H0 is true].
• This is also called the level of significance and is denoted
by α.
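To make the meaning of the level of significance concrete, here is a small simulation sketch. It repeatedly draws samples from a population in which H0 is exactly true and counts how often an upper-tailed z-test rejects H0. The function name, the number of trials, and the seed are assumptions for illustration; the population values (mean 500, standard deviation 96, sample size 50) are borrowed from an example used elsewhere in this lecture.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def simulate_type_one_rate(mu0, sigma, n, alpha=0.05, trials=5000, seed=0):
    """Estimate P(reject H0 | H0 true) for an upper-tailed z-test."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha)   # z_alpha
    rejections = 0
    for _ in range(trials):
        # Sample from a normal population whose mean equals mu0 (H0 true).
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / sqrt(n))
        if z > critical:
            rejections += 1
    return rejections / trials

rate = simulate_type_one_rate(mu0=500, sigma=96, n=50)
print(f"Estimated type I error rate: {rate:.3f} (nominal alpha = 0.05)")
```

The estimated rejection rate should be close to the chosen α, up to simulation noise.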
Type-I Error
A type I error is an error made when the null
hypothesis is rejected, in spite of it being true. The
probability of committing a type I error is called the
‘level of significance’ of the test and is denoted by ‘α’.
The set of values of the test statistic that leads us to
reject the null hypothesis is termed as ‘Critical
Region’.
Type-II Error
We design the test so that the probability of
committing a type I error is approximately the value
we desire.
Sometimes, the observed value of the test statistic
does not fall in the rejection region even though the
null hypothesis is not true and should be rejected.
This is a type-II error. The probability of its
occurrence is denoted by beta (β).
Definition 8.3.3: Power of a Test
Consider a test of hypothesis. The probability that the null
hypothesis will be rejected when, in fact, the research hypothesis (H1)
is true is called the power of the test (1-β).
Note: When H1 is true, we either fail to reject the null hypothesis with
probability β or reject the null hypothesis with probability equal to
the power, so
β + power = 1
Note: Our objective is always to keep α and β as small as
possible and the power of the test as high as possible.
This is usually achieved by choosing an appropriate sample
size.
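The relation between β, power, and sample size can be illustrated with a short sketch for an upper-tailed z-test with known σ. The function name `upper_tailed_power` and the alternative mean 530 are hypothetical choices; the values 500, 96 and 50 are borrowed from an example used elsewhere in this lecture.

```python
from math import sqrt
from statistics import NormalDist

def upper_tailed_power(mu0, mu1, sigma, n, alpha=0.05):
    """Return (beta, power) of an upper-tailed z-test when the true mean is mu1."""
    std_normal = NormalDist()
    se = sigma / sqrt(n)
    # H0 is rejected when the sample mean exceeds this cutoff.
    cutoff = mu0 + std_normal.inv_cdf(1 - alpha) * se
    # beta = P(sample mean <= cutoff | true mean = mu1)
    beta = std_normal.cdf((cutoff - mu1) / se)
    return beta, 1 - beta

# Hypothetical alternative mean of 530 with two different sample sizes.
beta_50, power_50 = upper_tailed_power(500, 530, 96, 50)
beta_100, power_100 = upper_tailed_power(500, 530, 96, 100)
print(f"n = 50:  beta = {beta_50:.3f}, power = {power_50:.3f}")
print(f"n = 100: beta = {beta_100:.3f}, power = {power_100:.3f}")
```

With α held fixed, the larger sample gives a smaller β and therefore a higher power.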
Constructing null and alternative
hypotheses
• One-tailed and two-tailed test:
\(H_0: \mu \ge \mu_0\)        \(H_0: \mu \le \mu_0\)        \(H_0: \mu = \mu_0\)
\(H_1: \mu < \mu_0\)          \(H_1: \mu > \mu_0\)          \(H_1: \mu \ne \mu_0\)
One-tailed                    One-tailed                    Two-tailed
(lower-tail, or               (upper-tail, or
left-tailed)                  right-tailed)
• The probability of a Type I error is α = P(H0 is rejected | H0 is true).
This is also called the level of significance.
• The probability of a Type II error is β = P(H0 is accepted | H0 is false)
One-tailed or two-tailed?
Ex. From long experience of the Coca-Cola company, it is known that
yield is normally distributed with a mean of 500 units and a standard deviation
of 96 units. For a modified process, the mean yield is 535 units for a sample of size 50.
At the 5% significance level, did the modified process increase the yield?
Sol. Here \(H_0: \mu = 500\) (this specifies a single value for the parameter).
Actually, we shall assume \(H_0: \mu \le 500\),
\(H_1: \mu > 500\) (this is what we want to test).
One-tailed test; test for \(\mu\), \(\sigma\) known.
Is the calculated value greater than the critical value?
One-tailed or two-tailed?
Ex. A department store manager determines that a new billing system will be
cost-effective only if the mean monthly account is more than $170. A random sample
of 400 monthly accounts is drawn, for which the sample mean is $178. It is known
that the accounts are approximately normally distributed with an s.d. of $65.
At the 5% significance level, can we conclude that the new system will be cost-effective?
Sol.
The system is cost-effective if the mean account balance for all customers (population)
is greater than $170, that is, if \(\mu > 170\).
Our null hypothesis is thus \(H_0: \mu \le 170\),
\(H_1: \mu > 170\) (this is what we want to test).
One-tailed test; test for \(\mu\), \(\sigma\) known.
One-tailed or two-tailed?
Ex. A drug is given to 10 patients, and the increments in their blood pressure
were recorded as 3, 6, 2, 4, 4, 1, 6, 0, 0, 2. Is it reasonable to believe that the
drug has no effect on change of the mean blood pressure? Test at 5% significance
level, assuming that the population is normal with variance 1.
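Sol. Formulate the hypotheses: \(H_0: \mu = 0\),
\(H_1: \mu \ne 0\)
Two-tailed test; test for \(\mu\), \(\sigma\) known.
[Figure: two-tailed test, with the acceptance region between the two critical values]
Does the calculated value fall in the rejection region of H0 (that is, beyond
the critical values)?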
One-tailed or two-tailed?
Ex. The mean weekly sales of a magazine was 146 units. After an advertisement
campaign, the mean of weekly sales in 22 stores for a typical week increased to 154 with
a standard deviation of 15 units. Was the advertisement successful at the 5% significance
level? It is given that the weekly sales of the magazine follow a normal distribution.
Sol. Formulate the hypotheses: \(H_0: \mu \le 146\),
\(H_1: \mu > 146\)
One-tailed test; test for \(\mu\), \(\sigma\) unknown.
Ex. A state highway patrol periodically samples vehicle speeds at various
locations on a particular highway. The sample of vehicle speeds is used to test the
hypothesis \(H_0: \mu \le 65\). A sample of 64 vehicles shows a mean speed of 66.2 kmph
with an s.d. of 4.2 kmph. Use \(\alpha = 0.05\) to test \(H_0\). Assume normality of the population.
Sol. Formulate the hypotheses: \(H_0: \mu \le 65\),
\(H_1: \mu > 65\). One-tailed test; test for \(\mu\), \(\sigma\) unknown.
One-tailed or two-tailed?
Ex. A marketing research firm conducted a survey 10 years ago and found that
the average household income in Pilani is Rs. 12000. Mr. Agrawal, who has recently
joined the firm, wants to verify the accuracy of the data. For this, the firm decides to take
a random sample of 200 households. The sample mean and sample s.d. are Rs. 13000 and
Rs. 100. Verify Mr. Agrawal's doubt at \(\alpha = 0.05\), assuming normality of the population.
Sol. Formulate the hypotheses: \(H_0: \mu = 12000\),
\(H_1: \mu \ne 12000\)
Two-tailed test; test for \(\mu\), \(\sigma\) unknown (sample size \(n = 200\)).
Ex. A CFL manufacturing company supplies its products to various retailers. The
company has received complaints from retailers that the average life of its CFL is not
24 months, as the company claims. To verify this, the company collected a random sample
of 150 CFLs and found that the average life is 23 months. Assuming \(\sigma = 5\) months, test the
average population life of CFLs at \(\alpha = 0.08\).
Sol. Formulate the hypotheses: \(H_0: \mu = 24\), \(H_1: \mu \ne 24\). Two-tailed; test for \(\mu\), \(\sigma\) known.
One-tailed or two-tailed?
Ex. At a golf course, over the past years, 20% of the players were women. In an effort
to increase the proportion of women players, a special promotion was implemented. Now the
manager would like to see whether the promotion helped to increase the proportion of women
players. A random sample of 400 players was selected, and 100 of the players were women.
Test the hypothesis at the 5% significance level.
Sol. Formulate the hypotheses: \(H_0: p \le 0.20\),
\(H_1: p > 0.20\)
One-tailed test; test for \(p\) (sample size \(n = 400\)).
Steps of Hypothesis Testing
Step 1. Develop the null and alternative hypotheses; determine
appropriate statistical test.
Step 2. Specify the level of significance α.
Step 3. Collect the sample data and compute the test statistic.
Step 4. Based on α, identify critical values.
Step 5. Reject H0 if the calculated test statistic value falls in
the rejection region.
Test Statistics
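1. Test statistic for population mean:
(a) when the population variance is known:
\[ Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \]
(b) when the population variance is unknown (requires normality of the population):
\[ T_{n-1} = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} \]
2. Test statistic for population proportion:
\[ Z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}} \]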
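As a sketch of how the proportion test statistic is used, the snippet below applies it to the golf-course data from this lecture (100 women in a sample of 400, with a hypothesized proportion of 0.20) as an upper-tailed test at the 5% level. The function name `proportion_z_statistic` is an assumption, and this worked computation is an illustration rather than a solution given in the lecture.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_statistic(successes, n, p0):
    """Z statistic for testing a population proportion against p0."""
    p_hat = successes / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Golf-course data: 100 women among 400 sampled players, p0 = 0.20.
z = proportion_z_statistic(100, 400, 0.20)
critical = NormalDist().inv_cdf(1 - 0.05)   # z_{0.05} for an upper-tailed test
reject = z > critical
print(f"z = {z:.3f}, critical value = {critical:.3f}, reject H0: {reject}")
```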
Lower-tailed test for population
mean (σ known)
\(H_0: \mu \ge \mu_0\)
\(H_1: \mu < \mu_0\)
Upper-tailed test for population
mean (σ known)
\(H_0: \mu \le \mu_0\)
\(H_1: \mu > \mu_0\)
Two-tailed test for population
mean (σ known)
[Figure: two-tailed test; the central region is labelled "Do Not Reject H0 (Acceptance Region)"]
\(H_0: \mu = \mu_0\)
\(H_1: \mu \ne \mu_0\)
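The lower-tailed, upper-tailed and two-tailed rejection rules for a mean with known σ can be collected in one small helper. This is a minimal sketch: the function name `z_test_mean` and its `tail` argument are assumptions, and the usage example applies it to the CFL data from this lecture (sample mean 23, claimed mean 24, σ = 5, n = 150, α = 0.08), for which no worked solution appears on the slides.

```python
from math import sqrt
from statistics import NormalDist

def z_test_mean(x_bar, mu0, sigma, n, alpha=0.05, tail="two"):
    """z-test for a population mean with known sigma.

    tail is "lower", "upper" or "two"; returns (z, critical value, reject H0?).
    """
    z = (x_bar - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()
    if tail == "lower":
        critical = std_normal.inv_cdf(alpha)           # -z_alpha
        reject = z < critical
    elif tail == "upper":
        critical = std_normal.inv_cdf(1 - alpha)       # z_alpha
        reject = z > critical
    elif tail == "two":
        critical = std_normal.inv_cdf(1 - alpha / 2)   # z_{alpha/2}
        reject = abs(z) > critical
    else:
        raise ValueError("tail must be 'lower', 'upper' or 'two'")
    return z, critical, reject

# CFL data: two-tailed test of H0: mu = 24 at alpha = 0.08.
z, critical, reject = z_test_mean(23, 24, 5, 150, alpha=0.08, tail="two")
print(f"z = {z:.3f}, critical value = {critical:.3f}, reject H0: {reject}")
```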
Examples: Hypothesis Testing
Ex. From long experience of the Coca-Cola company, it is known that
yield is normally distributed with a mean of 500 units and a standard deviation
of 96 units. For a modified process, the mean yield is 535 units for a sample of size 50.
At the 5% significance level, did the modified process increase the yield?
Sol.
Step 1: Here \(H_0: \mu \le 500\), \(H_1: \mu > 500\); one-tailed (right-tailed) test; test for \(\mu\), \(\sigma\) known.
Step 2: From the sample data, we compute
\[ z_{\text{calculated}} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{535 - 500}{96/\sqrt{50}} \approx 2.58 \]
Step 3: At the 95% confidence level (\(\alpha = 0.05\)), \(z_{0.05} \approx 1.645\) from the one-tailed Z-table.
Step 4: As \(z_{\text{calculated}} > z_{0.05}\), reject the null hypothesis (i.e., there is enough evidence to
accept the alternative hypothesis).
Examples: Hypothesis Testing
Ex. A department store manager determines that a new billing system will be
cost-effective only if the mean monthly account is more than $170. A random sample
of 400 monthly accounts is drawn, for which the sample mean is $178. It is known
that the accounts are approximately normally distributed with an s.d. of $65.
At the 5% significance level, can we conclude that the new system will be cost-effective?
Sol.
Step 1: Here \(H_0: \mu \le 170\), \(H_1: \mu > 170\); one-tailed (right-tailed) test; test for \(\mu\), \(\sigma\) known.
Step 2: From the sample data, we compute
\[ z_{\text{calculated}} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{178 - 170}{65/\sqrt{400}} \approx 2.46 \]
Step 3: At the 95% confidence level (\(\alpha = 0.05\)), \(z_{0.05} \approx 1.645\) from the one-tailed Z-table.
Step 4: As \(z_{\text{calculated}} > z_{0.05}\), reject the null hypothesis (i.e., accept \(H_1\)).
Examples: Hypothesis Testing
Ex. A drug is given to 10 patients, and the increments in their blood pressure
were recorded as 3, 6, 2, 4, 4, 1, 6, 0, 0, 2. Is it reasonable to believe that the
drug has no effect on change of the mean blood pressure? Test at 95% confidence
level, assuming that the population is normal with variance 1.
Sol.
Step 1: Formulate the hypotheses: \(H_0: \mu = 0\), \(H_1: \mu \ne 0\)
Two-tailed test for \(\mu\); \(\sigma\) is known.
Step 2: From the sample data, we compute
\[ z_{\text{calculated}} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{0.4 - 0}{1/\sqrt{10}} \approx 1.265 \]
Step 3: At the 95% confidence level, the two-tailed Z-table gives the critical values
\(\pm z_{\alpha/2}\): \(z_{0.025} = 1.96\) and \(-z_{0.025} = -1.96\).
Step 4: As \(z_{\text{calculated}}\) does not fall in the rejection region, we fail to reject \(H_0\).
We can believe that the drug has no effect on the change of the mean blood pressure.
Examples: Hypothesis Testing
Ex. The mean weekly sales of a magazine was 146 units. After an advertisement
campaign, the mean of weekly sales in 22 stores for a typical week increased to 154 with
a standard deviation of 15 units. Was the advertisement successful at the 5% significance
level? It is given that the weekly sales of the magazine follow a normal distribution.
Sol.
Step 1: Formulate the hypotheses: \(H_0: \mu \le 146\),
\(H_1: \mu > 146\)
One-tailed test; test for \(\mu\), \(\sigma\) unknown.
Step 2: From the sample data, we compute
\[ t_{\text{calculated}} = \frac{\bar{x} - \mu_0}{S/\sqrt{n}} = \frac{154 - 146}{15/\sqrt{22}} \approx 2.501 \]
Step 3: For \(\alpha = 0.05\) and 21 degrees of freedom, \(t_{21,\,0.05} = 1.721\) from the one-tailed T-table.
Step 4: As \(t_{\text{calculated}} > t_{21,\,0.05}\), reject the null hypothesis (i.e., accept \(H_1\)).
We can conclude that the advertisement was successful.
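The arithmetic of this t-test can be checked with a few lines of Python. The function name `t_statistic` is an assumption; since the Python standard library has no t-distribution, the critical value 1.721 is simply taken from the T-table quoted in the solution above rather than computed.

```python
from math import sqrt

def t_statistic(x_bar, mu0, s, n):
    """T statistic for a population mean when sigma is unknown."""
    return (x_bar - mu0) / (s / sqrt(n))

# Magazine example: sample mean 154, mu0 = 146, S = 15, n = 22.
t_calc = t_statistic(154, 146, 15, 22)
t_table = 1.721            # t_{21, 0.05}, read from the T-table in the lecture
reject = t_calc > t_table  # upper-tailed rejection rule
print(f"t = {t_calc:.3f}, critical value = {t_table}, reject H0: {reject}")
```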
Examples: Hypothesis Testing
HW. To test the null hypothesis that the population mean is 4 (\(H_0: \mu = 4\)) against the
alternative hypothesis \(H_1: \mu = 5\), a test is designed based on a random sample of size 49.
It is decided that the null hypothesis will be rejected if the observed sample mean
\(\bar{x} > 4.3\). If the population variance is 9, find (a) the distribution of \(\bar{X}\), assuming \(H_0\) is
true, (b) the distribution of \(\bar{X}\), assuming \(H_1\) is true, (c) the probabilities of type-I and
type-II errors, and (d) the power of the test.