CH 10

Chapter 10 covers building confidence intervals and hypothesis testing for means of normal populations, especially when sample sizes are small. It introduces the Student's t distribution for estimating population means and outlines procedures for testing hypotheses about differences between two means, including paired samples. The chapter also discusses the use of chi-square distribution for variance analysis.
Chapter 10

• Objectives: Building confidence intervals and testing hypotheses about a mean, and about the difference of two means, for normal populations when the sample size is not large enough for the CLT to apply.
10.1 Sampling distribution of z for small samples
• When we take a sample from a normal population, the sample mean x̄ has a normal distribution for any sample size n, and

z = (x̄ − µ) / (σ/√n)

has a standard normal distribution.
• But if σ is unknown, and we must use s to estimate it, the resulting statistic is not normal:

(x̄ − µ) / (s/√n)  is not normal!
• Fortunately, this standardized statistic does have a sampling distribution that is well known to statisticians, called the Student's t distribution, with n − 1 degrees of freedom:

t = (x̄ − µ) / (s/√n)

• We can use this distribution to create estimation and testing procedures for the population mean µ.
10.2 Properties of the Student's t-distribution
• Mound-shaped and symmetric about 0.
• More variable than z, with "heavier tails."
• Shape depends on the sample size n, or the degrees of freedom, n − 1.
• As n increases, the shapes of the t and z distributions become almost identical.
Table for the t distribution (Table 4)
• Table 4 gives the values of t that cut off certain areas in the tail of the t distribution.
• Index the df and the appropriate tail area a to find t_a, the value of t with area a to its right.

Example: For a random sample of size n = 10, find the value of t that cuts off .025 in the right tail.
Row = df = n − 1 = 9
Column subscript = a = .025
t.025 = 2.262
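The table lookup can be checked with software. A minimal sketch in Python, assuming SciPy is available; `t.ppf` is the inverse CDF, so the value with area .025 to its right is the .975 quantile.

```python
from scipy.stats import t

# Value of t with area .025 to its right, for df = n - 1 = 9
df = 9
t_025 = t.ppf(1 - 0.025, df)
print(round(t_025, 3))  # 2.262, matching Table 4
```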
10.3 Inference for the population mean:
Confidence interval
• A 100(1 − α)% confidence interval for the population mean µ is

x̄ ± t_{α/2} (s/√n)

where t_{α/2} is the value of t that cuts off area α/2 in the tail of a t-distribution with df = n − 1.
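As an illustration of the formula, a short Python sketch that builds this interval; the sample values here are hypothetical and serve only to show the computation.

```python
import numpy as np
from scipy.stats import t

# Hypothetical small sample from a normal population (illustrative values only)
x = np.array([9.8, 10.4, 10.1, 9.6, 10.7, 10.2])
n = len(x)
xbar, s = x.mean(), x.std(ddof=1)        # sample mean and sample std deviation

alpha = 0.05
t_crit = t.ppf(1 - alpha / 2, df=n - 1)  # t that cuts off alpha/2 in the tail
half_width = t_crit * s / np.sqrt(n)
print(f"{xbar:.3f} +/- {half_width:.3f}")
```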
Hypothesis testing
• The basic procedures are the same as those used for large samples. For a test of hypothesis:

Test H0: µ = µ0 versus Ha: one or two tailed,
using the test statistic

t = (x̄ − µ0) / (s/√n)

with p-values or a rejection region based on a t-distribution with df = n − 1.
Example
A sprinkler system is designed so that the average time for the sprinklers to activate after being turned on is no more than 15 seconds. A test of 6 systems gave the following times:
17, 31, 12, 17, 13, 25
Is the system working as specified? Test using α = .05.

H0: µ = 15 (working as specified)
Ha: µ > 15 (not working as specified)
Example
Data: 17, 31, 12, 17, 13, 25
First, calculate the sample mean and standard deviation, using your calculator or the formulas in Chapter 2.

x̄ = Σxi / n = 115/6 = 19.167

s = √[(Σxi² − (Σxi)²/n) / (n − 1)] = √[(2477 − 115²/6) / 5] = 7.387
Example continued
Data: 17, 31, 12, 17, 13, 25
Calculate the test statistic and find the rejection region for α = .05.

Test statistic: t = (x̄ − µ0) / (s/√n) = (19.167 − 15) / (7.387/√6) = 1.38
Degrees of freedom: df = n − 1 = 6 − 1 = 5

Rejection region: Reject H0 if t > 2.015. If the test statistic falls in the rejection region, its p-value will be less than α = .05.
Example continued
Data: 17, 31, 12, 17, 13, 25
Compare the observed test statistic to the rejection region, and draw conclusions.

H0: µ = 15
Ha: µ > 15
Test statistic: t = 1.38
Rejection region: Reject H0 if t > 2.015.

Conclusion: For our example, t = 1.38 does not fall in the rejection region and H0 is not rejected. There is insufficient evidence to indicate that the average activation time is greater than 15 seconds.
Example continued (p-value approach)
• You can only approximate the p-value for the test using Table 4.
• Since the observed value of t = 1.38 is smaller than t.10 = 1.476, p-value > .10.

Use of software to get the exact p-value
• You can get the exact p-value using some calculators or a computer:
p-value = .113, which is greater than .10, as we approximated using Table 4.
One-Sample T: Times
Test of mu = 15 vs mu > 15

Variable   N    Mean   StDev   SE Mean
Times      6   19.17    7.39      3.02

Variable   95.0% Lower Bound      T       P
Times                  13.09   1.38   0.113
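The same analysis can be reproduced in Python; a minimal sketch assuming SciPy 1.6 or later (needed for the one-sided `alternative` option of `ttest_1samp`).

```python
import numpy as np
from scipy.stats import ttest_1samp

times = np.array([17, 31, 12, 17, 13, 25])

# One-sided test of H0: mu = 15 vs Ha: mu > 15
result = ttest_1samp(times, popmean=15, alternative='greater')
print(f"t = {result.statistic:.2f}, p-value = {result.pvalue:.3f}")
# Expected output roughly: t = 1.38, p-value = 0.113
```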
10.4 Inferences about the difference
between two population means:
Independent samples case
As in Chapter 9, independent random samples of size n1 and n2 are drawn from populations 1 and 2 with means µ1 and µ2 and variances σ1² and σ2². Since the sample sizes are small, the two populations must be normal.
• To test:
H0: µ1 − µ2 = D0 versus Ha: one of three alternatives,
where D0 is some hypothesized difference, usually 0.
• The test statistic used in Chapter 9 does not have either a z or a t distribution, and cannot be used for small-sample inference.
• We need to make one more assumption: that the population variances, although unknown, are equal.
• Once we assume equality of the two population variances, instead of estimating each population variance separately, we estimate the common variance by

s² = [(n1 − 1)s1² + (n2 − 1)s2²] / (n1 + n2 − 2)

and the resulting statistic

t = (x̄1 − x̄2 − D0) / √[s²(1/n1 + 1/n2)]

has a t distribution with n1 + n2 − 2 degrees of freedom.
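A small helper that evaluates the pooled variance and this t statistic from summary statistics; this is our own sketch rather than a library routine, and the function name is ours.

```python
import math

def pooled_t(xbar1, xbar2, s1, s2, n1, n2, d0=0.0):
    """Pooled two-sample t statistic (equal-variance case); hypothetical helper."""
    s2_pooled = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    se = math.sqrt(s2_pooled * (1 / n1 + 1 / n2))
    t_stat = (xbar1 - xbar2 - d0) / se
    return t_stat, n1 + n2 - 2   # statistic and its degrees of freedom
```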
Confidence interval for the difference
between two means
• You can also create a 100(1 − α)% confidence interval for µ1 − µ2:

(x̄1 − x̄2) ± t_{α/2} √[s²(1/n1 + 1/n2)]

with s² = [(n1 − 1)s1² + (n2 − 1)s2²] / (n1 + n2 − 2).

Remember the three assumptions:
1. Original populations normal
2. Samples random and independent
3. Equal population variances
Example
• Two training procedures are compared by measuring the time that it takes trainees to assemble a device. A different group of trainees is taught using each method. Is there a difference in the two methods? Use α = .01.

H0: µ1 − µ2 = 0
Ha: µ1 − µ2 ≠ 0

Time to Assemble   Method 1   Method 2
Sample size            10         12
Sample mean            35         31
Sample std dev        4.9        4.5

Test statistic:
t = (x̄1 − x̄2 − 0) / √[s²(1/n1 + 1/n2)]
Example continued
• Solve this problem by approximating the p-value using Table 4.

Time to Assemble   Method 1   Method 2
Sample size            10         12
Sample mean            35         31
Sample std dev        4.9        4.5

Calculate:
s² = [(n1 − 1)s1² + (n2 − 1)s2²] / (n1 + n2 − 2) = [9(4.9²) + 11(4.5²)] / 20 = 21.942

Test statistic:
t = (35 − 31) / √[21.942(1/10 + 1/12)] = 1.99
Example continued
p-value: P(t > 1.99) + P(t < −1.99), so P(t > 1.99) = ½(p-value)
df = n1 + n2 − 2 = 10 + 12 − 2 = 20
.025 < ½(p-value) < .05, so .05 < p-value < .10

Since the p-value is greater than α = .01, H0 is not rejected. There is insufficient evidence to indicate a difference in the population means.
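For an exact p-value, SciPy can run the pooled test directly from the summary statistics; a minimal sketch using `ttest_ind_from_stats`, where `equal_var=True` gives the pooled test of this section.

```python
from scipy.stats import ttest_ind_from_stats

# Pooled two-sample t test from the summary statistics above
res = ttest_ind_from_stats(mean1=35, std1=4.9, nobs1=10,
                           mean2=31, std2=4.5, nobs2=12,
                           equal_var=True)
print(f"t = {res.statistic:.2f}, p-value = {res.pvalue:.3f}")
# Expected roughly: t = 1.99, with a p-value between .05 and .10
```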
Excel descriptive-statistics output for Column1 (shown alongside a frequency histogram of the data):

Mean                       8.311724138
Standard Error             0.146738173
Median                     8.9
Mode                       10
Standard Deviation         1.766961589
Sample Variance            3.122153257
Kurtosis                   4.29444484
Skewness                  -1.948890679
Range                      9.5
Minimum                    0.5
Maximum                    10
Sum                        1205.2
Count                      145
Confidence Level (95.0%)   0.290039379

[Histogram: frequency by bin for Column1]
Equal variance assumption
• Rule of thumb for checking equality of variances:

If the ratio (larger s²) / (smaller s²) ≤ 3,
the equal variance assumption is reasonable.

If the ratio (larger s²) / (smaller s²) > 3,
use an alternative test statistic.
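A short sketch of this rule-of-thumb check in Python; the helper name and the cutoff-as-parameter are ours.

```python
def equal_variance_ok(s1, s2, ratio_limit=3.0):
    """Rule-of-thumb check: ratio of larger to smaller sample variance (hypothetical helper)."""
    v1, v2 = s1**2, s2**2
    ratio = max(v1, v2) / min(v1, v2)
    return ratio, ratio <= ratio_limit

# For the training example: s1 = 4.9, s2 = 4.5
print(equal_variance_ok(4.9, 4.5))   # ratio about 1.19 -> equal variance assumption reasonable
```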
10.5 Testing the difference between
means of two dependent samples (paired
difference test)
• Sometimes the assumption of independent samples is intentionally violated, resulting in a matched-pairs or paired-difference test.
• By designing the experiment in this way, we can eliminate unwanted variability in the experiment by analyzing only the differences,
di = x1i − x2i
to see if there is a difference in the two population means, µ1 − µ2.
Example
Car 1 2 3 4 5
Type A 10.6 9.8 12.3 9.7 8.8
Type B 10.2 9.4 11.8 9.1 8.3

• One Type A and one Type B tire are randomly assigned to each of the rear wheels of five cars. Compare the average tire wear for types A and B using a test of hypothesis.

H0: µ1 − µ2 = 0
Ha: µ1 − µ2 ≠ 0

• But the samples are not independent. The pairs of responses are linked because measurements are taken on the same car.
To test H0: µ1 − µ2 = 0 we test H0: µd = 0, using the test statistic

t = (d̄ − 0) / (sd/√n)

where n = number of pairs, and d̄ and sd are the mean and standard deviation of the differences, di.
Use the p-value or a rejection region based on a t-distribution with df = n − 1.
Example continued
Car 1 2 3 4 5
Type A 10.6 9.8 12.3 9.7 8.8
Type B 10.2 9.4 11.8 9.1 8.3
Difference .4 .4 .5 .6 .5

H0: µ1 − µ2 = 0
Ha: µ1 − µ2 ≠ 0

Calculate:
d̄ = Σdi / n = .48
sd = √[(Σdi² − (Σdi)²/n) / (n − 1)] = .0837

Test statistic:
t = (d̄ − 0) / (sd/√n) = (.48 − 0) / (.0837/√5) = 12.8
Example continued: using rejection
region based on alpha=.05
Car 1 2 3 4 5
Type A 10.6 9.8 12.3 9.7 8.8
Type B 10.2 9.4 11.8 9.1 8.3
Difference .4 .4 .5 .6 .5

Rejection region: Reject H0 if t > 2.776 or t < −2.776.
Conclusion: Since t = 12.8, H0 is rejected. There is a difference in the average tire wear for the two types of tires.
• You can construct a 100(1 − α)% confidence interval for a paired experiment using

d̄ ± t_{α/2} (sd/√n)
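The tire example can be checked with a paired t test in SciPy; a minimal sketch using `ttest_rel`, followed by the paired confidence interval from the formula above.

```python
import numpy as np
from scipy.stats import ttest_rel, t

type_a = np.array([10.6, 9.8, 12.3, 9.7, 8.8])
type_b = np.array([10.2, 9.4, 11.8, 9.1, 8.3])

# Paired t test of H0: mu_d = 0
res = ttest_rel(type_a, type_b)
print(f"t = {res.statistic:.1f}, p-value = {res.pvalue:.5f}")   # t = 12.8

# 95% confidence interval for mu_d = mu1 - mu2
d = type_a - type_b
n = len(d)
half = t.ppf(0.975, n - 1) * d.std(ddof=1) / np.sqrt(n)
print(f"{d.mean():.2f} +/- {half:.3f}")
```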
10.6 Inference about population variance
• Sometimes the primary parameter of interest is not the population mean µ but rather the population variance σ². We choose a random sample of size n from a normal distribution.
• The sample variance s² can be used in its standardized form

χ² = (n − 1)s² / σ²

which has a chi-square distribution with n − 1 degrees of freedom.
Reading the table of the chi-square
distribution
• Table 5 gives both upper and lower critical values of the chi-square statistic for a given df.
• For example, the value of chi-square that cuts off .05 in the upper tail of the distribution with df = 5 is χ² = 11.07.
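The same critical values can be read off with SciPy's chi-square quantile function; a minimal sketch.

```python
from scipy.stats import chi2

# Chi-square value cutting off .05 in the upper tail for df = 5
print(round(chi2.ppf(1 - 0.05, df=5), 2))   # 11.07

# Lower critical value (area .05 in the lower tail), also tabulated in Table 5
print(round(chi2.ppf(0.05, df=5), 3))       # about 1.145
```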
Testing hypotheses and building
confidence intervals for the variance

To test H0: σ² = σ0² versus Ha: one or two tailed,
we use the test statistic

χ² = (n − 1)s² / σ0²

with a rejection region based on a chi-square distribution with df = n − 1.

Confidence interval:
(n − 1)s² / χ²_{α/2}  <  σ²  <  (n − 1)s² / χ²_{1−α/2}
Example
• A cement manufacturer claims that his cement has a compressive strength with a standard deviation of 10 kg/cm² or less. A sample of n = 10 measurements produced a mean and standard deviation of 312 and 13.96, respectively.

A test of hypothesis:
H0: σ² = 100 (claim is correct)
Ha: σ² > 100 (claim is wrong)

uses the test statistic:
χ² = (n − 1)s² / σ0² = 9(13.96²) / 100 = 17.5
Example continued
• Do these data produce sufficient evidence to reject the manufacturer's claim? Use α = .05.

Rejection region: Reject H0 if χ² > 16.919 (α = .05, df = 9).
Conclusion: Since χ² = 17.5, H0 is rejected. The standard deviation of the cement strengths is more than 10.
Example continued
p-value: P(χ² > 17.5) with df = n − 1 = 9
.025 < p-value < .05

Since the p-value is less than α = .05, H0 is rejected. There is sufficient evidence to reject the manufacturer's claim.
Example continued
• (As an exercise) Find a 95% C.I. for the variance σ².
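A sketch of how the cement example (and the exercise above) could be worked in Python, assuming SciPy; the confidence limits follow the interval formula from the previous slide.

```python
from scipy.stats import chi2

n, s, sigma0_sq = 10, 13.96, 100     # sample size, sample std dev, hypothesized variance
df = n - 1

chi_sq = df * s**2 / sigma0_sq
p_value = chi2.sf(chi_sq, df)        # upper-tail area
print(f"chi-square = {chi_sq:.1f}, p-value = {p_value:.3f}")

# 95% confidence interval for sigma^2
lower = df * s**2 / chi2.ppf(0.975, df)
upper = df * s**2 / chi2.ppf(0.025, df)
print(f"{lower:.1f} < sigma^2 < {upper:.1f}")
```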
Inference Concerning
Two Population Variances
• We can make inferences about the ratio of two population variances. We choose two independent random samples of size n1 and n2 from normal distributions.
• If the two population variances are equal, the statistic

F = s1² / s2²

has an F distribution with df1 = n1 − 1 and df2 = n2 − 1 degrees of freedom.
• Table 6 gives only upper critical values of the F statistic for a given pair of df1 and df2.

• For example, the value of F that cuts off .05 in the upper tail of the distribution with df1 = 5 and df2 = 8 is F = 3.69.
To test H0: σ1² = σ2² versus Ha: one or two tailed,
we use the test statistic

F = s1² / s2², where s1² is the larger of the two sample variances,

with a rejection region based on an F distribution with df1 = n1 − 1 and df2 = n2 − 1.

Confidence interval:
(s1²/s2²)(1/F_{df1,df2})  <  σ1²/σ2²  <  (s1²/s2²) F_{df2,df1}
Example
• An experimenter has performed a lab experiment using two groups of rats. He wants to test H0: µ1 = µ2, but first he wants to make sure that the population variances are equal.

                 Standard (2)   Experimental (1)
Sample size          10              11
Sample mean         13.64           12.42
Sample std dev       2.3             5.8

Preliminary test:
H0: σ1² = σ2² versus Ha: σ1² ≠ σ2²
Example continued
                 Standard (2)   Experimental (1)
Sample size          10              11
Sample std dev       2.3             5.8

H0: σ1² = σ2²
Ha: σ1² ≠ σ2²

Test statistic:
F = s1² / s2² = 5.8² / 2.3² = 6.36

• We designate the sample with the larger standard deviation as sample 1, to force the test statistic into the upper tail of the F distribution.
Example continued
H0: σ1² = σ2²
Ha: σ1² ≠ σ2²

Test statistic:
F = s1² / s2² = 5.8² / 2.3² = 6.36

• The rejection region is two-tailed, with α = .05, but we only need to find the upper critical value, which has α/2 = .025 to its right.
• From Table 6, with df1 = 10 and df2 = 9, we reject H0 if F > 3.96.
CONCLUSION: Reject H0. There is sufficient evidence to indicate that the variances are unequal. Do not rely on the assumption of equal variances for your t test!
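A minimal Python sketch of this preliminary F test, assuming SciPy; the two-tailed p-value is taken as twice the upper-tail area, since the larger variance has been placed in the numerator.

```python
from scipy.stats import f

s1, n1 = 5.8, 11   # experimental group (larger std dev -> numerator)
s2, n2 = 2.3, 10   # standard group

F = s1**2 / s2**2
df1, df2 = n1 - 1, n2 - 1

upper_crit = f.ppf(1 - 0.025, df1, df2)   # upper critical value for a two-tailed .05 test
p_value = 2 * f.sf(F, df1, df2)           # two-tailed p-value
print(f"F = {F:.2f}, critical value = {upper_crit:.2f}, p-value = {p_value:.4f}")
# Expected roughly: F = 6.36, critical value = 3.96 -> reject H0
```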
Examples for chapter 10
