Stat

Chapter 8 of STAT 201 covers estimation techniques, focusing on forming confidence intervals for mean differences in both dependent and independent samples, as well as for population proportions. It includes methods for calculating confidence intervals when population variances are known or unknown, and provides examples to illustrate these concepts. The chapter emphasizes the importance of assumptions regarding normal distribution and independence of samples.

Uploaded by

dorian gray

STAT 201

Chapter 8

Estimation: Additional Topics


Chapter Goals
After completing this chapter, you should be able to:
- Form confidence intervals for the mean difference from dependent samples
- Form confidence intervals for the difference between two independent population means (standard deviations known or unknown)
- Compute confidence interval limits for the difference between two independent population proportions

Chapter Topics

- Population means, dependent samples. Example: same group before vs. after treatment
- Population means, independent samples. Example: Group 1 vs. independent Group 2
- Population proportions. Example: Proportion 1 vs. Proportion 2
- Population variance (Ch. 7). Example: variance of a normal distribution
Dependent Samples
Tests Means of 2 Related Populations
- Paired or matched samples
- Repeated measures (before/after)
- Use the difference between paired values: $d_i = x_i - y_i$
  - Eliminates variation among subjects
  - Assumption: both populations are normally distributed
Mean Difference
The ith paired difference is $d_i$, where

$$d_i = x_i - y_i$$

The point estimate for the population mean paired difference is $\bar{d}$:

$$\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i$$

The sample standard deviation is:

$$S_d = \sqrt{\frac{\sum_{i=1}^{n} (d_i - \bar{d})^2}{n - 1}}$$

n is the number of matched pairs in the sample.
Confidence Interval for
Mean Difference
The confidence interval for the difference between population means, $\mu_d$, is

$$\bar{d} - t_{n-1,\alpha/2}\frac{S_d}{\sqrt{n}} < \mu_d < \bar{d} + t_{n-1,\alpha/2}\frac{S_d}{\sqrt{n}}$$

where n is the sample size (the number of matched pairs in the paired sample).
Confidence Interval for Mean Difference (continued)

- The margin of error is

$$ME = t_{n-1,\alpha/2}\frac{S_d}{\sqrt{n}}$$

- $t_{n-1,\alpha/2}$ is the value from the Student's t distribution with (n – 1) degrees of freedom for which

$$P(t_{n-1} > t_{n-1,\alpha/2}) = \frac{\alpha}{2}$$
Paired Samples Example

Six people sign up for a weight loss program. You collect the following data:

Weight:
Person   Before (x)   After (y)   Difference, d_i
1        136          125          11
2        205          195          10
3        157          150           7
4        138          140          -2
5        175          165          10
6        166          160           6
Total                              42

$$\bar{d} = \frac{\sum d_i}{n} = \frac{42}{6} = 7.0$$

$$S_d = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n - 1}} = 4.82$$
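Paired Samples Example (continued)

- For a 95% confidence level, the appropriate t value is $t_{n-1,\alpha/2} = t_{5,0.025} = 2.571$
- The 95% confidence interval for the difference between means, $\mu_d$, is

$$\bar{d} - t_{n-1,\alpha/2}\frac{S_d}{\sqrt{n}} < \mu_d < \bar{d} + t_{n-1,\alpha/2}\frac{S_d}{\sqrt{n}}$$

$$7 - (2.571)\frac{4.82}{\sqrt{6}} < \mu_d < 7 + (2.571)\frac{4.82}{\sqrt{6}}$$

$$1.94 < \mu_d < 12.06$$

The margin of error is $(2.571)(4.82/\sqrt{6}) \approx 5.06$, so the lower limit is positive. Since this interval does not contain zero, the data support, at the 95% confidence level, the conclusion that the weight loss program produces a positive mean weight loss, although the sample is small.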
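
The paired-sample arithmetic can be checked with a short script. This is a minimal sketch using only the Python standard library; the function name `paired_interval` is illustrative, and the t value 2.571 is copied from the example rather than computed.

```python
import math
import statistics

# Before/after weights for the six participants in the example.
before = [136, 205, 157, 138, 175, 166]
after = [125, 195, 150, 140, 165, 160]

# t value for 5 degrees of freedom and alpha/2 = 0.025, taken from the example.
T_CRIT = 2.571


def paired_interval(x, y, t_crit):
    """Return (mean difference, margin of error, lower limit, upper limit)."""
    diffs = [xi - yi for xi, yi in zip(x, y)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)  # sample standard deviation, divisor n - 1
    me = t_crit * s_d / math.sqrt(n)
    return d_bar, me, d_bar - me, d_bar + me


d_bar, me, lower, upper = paired_interval(before, after, T_CRIT)
print(f"d_bar = {d_bar:.2f}, ME = {me:.2f}, interval = ({lower:.2f}, {upper:.2f})")
```

With the six pairs from the example, the limits come out at about 1.94 and 12.06.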
Difference Between Two Means

Population means, independent samples

Goal: Form a confidence interval for the difference between two population means, $\mu_x - \mu_y$

- Different data sources
  - Unrelated
  - Independent: the sample selected from one population has no effect on the sample selected from the other population
- The point estimate is the difference between the two sample means: $\bar{x} - \bar{y}$
Difference Between Two Means (continued)

Population means, independent samples:
- $\sigma_x^2$ and $\sigma_y^2$ known: the confidence interval uses $z_{\alpha/2}$
- $\sigma_x^2$ and $\sigma_y^2$ unknown, either assumed equal or assumed unequal: the confidence interval uses a value from the Student's t distribution
σx² and σy² Known

Assumptions:
- Samples are randomly and independently drawn
- Both population distributions are normal
- Population variances are known
σx² and σy² Known (continued)

When $\sigma_x$ and $\sigma_y$ are known and both populations are normal, the variance of $\bar{X} - \bar{Y}$ is

$$\sigma^2_{\bar{X}-\bar{Y}} = \frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}$$

and the random variable

$$Z = \frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{\sqrt{\dfrac{\sigma_x^2}{n_x} + \dfrac{\sigma_y^2}{n_y}}}$$

has a standard normal distribution.

Confidence Interval, σx² and σy² Known

The confidence interval for $\mu_x - \mu_y$ is:

$$(\bar{x} - \bar{y}) - z_{\alpha/2}\sqrt{\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}} < \mu_x - \mu_y < (\bar{x} - \bar{y}) + z_{\alpha/2}\sqrt{\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}}$$
σx² and σy² Unknown, Assumed Equal

Assumptions:
- Samples are randomly and independently drawn
- Populations are normally distributed
- Population variances are unknown but assumed equal
σx² and σy² Unknown, Assumed Equal (continued)

Forming interval estimates:
- The population variances are assumed equal, so use the two sample standard deviations and pool them to estimate σ
- Use a t value with $(n_x + n_y - 2)$ degrees of freedom
σx² and σy² Unknown, Assumed Equal (continued)

The pooled variance is

$$s_p^2 = \frac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{n_x + n_y - 2}$$
Confidence Interval, σx² and σy² Unknown, Equal

The confidence interval for $\mu_x - \mu_y$ is:

$$(\bar{x} - \bar{y}) - t_{n_x+n_y-2,\alpha/2}\sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}} < \mu_x - \mu_y < (\bar{x} - \bar{y}) + t_{n_x+n_y-2,\alpha/2}\sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}}$$

where

$$s_p^2 = \frac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{n_x + n_y - 2}$$
Pooled Variance Example 1
You are testing two computer processors for speed. Form a confidence interval for the difference in CPU speed. You collect the following speed data (in MHz):

                  CPUx    CPUy
Number tested     17      14
Sample mean       3004    2538
Sample std dev    74      56

Assume both populations are normal with equal variances, and use 95% confidence.
Calculating the Pooled Variance

The pooled variance is:

$$s_p^2 = \frac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{(n_x - 1) + (n_y - 1)} = \frac{(17 - 1)74^2 + (14 - 1)56^2}{(17 - 1) + (14 - 1)} = 4427.03$$

The t value for a 95% confidence interval is:

$$t_{n_x+n_y-2,\alpha/2} = t_{29,0.025} = 2.045$$


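Calculating the Confidence Limits

- The 95% confidence interval is

$$(\bar{x} - \bar{y}) - t_{n_x+n_y-2,\alpha/2}\sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}} < \mu_x - \mu_y < (\bar{x} - \bar{y}) + t_{n_x+n_y-2,\alpha/2}\sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}}$$

$$(3004 - 2538) - (2.045)\sqrt{\frac{4427.03}{17} + \frac{4427.03}{14}} < \mu_x - \mu_y < (3004 - 2538) + (2.045)\sqrt{\frac{4427.03}{17} + \frac{4427.03}{14}}$$

$$416.89 < \mu_x - \mu_y < 515.11$$

The square root evaluates to about 24.01, so with $t_{29,0.025} = 2.045$ the margin of error is about 49.11. We are 95% confident that the mean difference in CPU speed is between 416.89 and 515.11 MHz.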
Pooled Variance Example 2
Two catalysts in a batch chemical process are being compared for their effect on the output of the process reaction. Form a confidence interval for the difference between population means.

                  Catalyst 1   Catalyst 2
Sample size       10           11
Sample mean       86           77
Sample std dev    3            6

Assume both populations are normal with equal variances, and use 95% confidence.
Calculating the Pooled Variance

The pooled variance is:

$$s_p^2 = \frac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{(n_x - 1) + (n_y - 1)} = \frac{(10 - 1)3^2 + (11 - 1)6^2}{(10 - 1) + (11 - 1)} = 23.21$$

The t value for a 95% confidence interval is:

$$t_{n_x+n_y-2,\alpha/2} = t_{19,0.025} = 2.093$$


Calculating the Confidence Limits

- The 95% confidence interval is

$$(\bar{x} - \bar{y}) - t_{n_x+n_y-2,\alpha/2}\sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}} < \mu_x - \mu_y < (\bar{x} - \bar{y}) + t_{n_x+n_y-2,\alpha/2}\sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}}$$

$$(86 - 77) - (2.093)\sqrt{\frac{23.21}{10} + \frac{23.21}{11}} < \mu_x - \mu_y < (86 - 77) + (2.093)\sqrt{\frac{23.21}{10} + \frac{23.21}{11}}$$

$$4.59 < \mu_x - \mu_y < 13.41$$

We are 95% confident that the mean difference in catalyst performance is between 4.59 and 13.41.
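
The pooled-variance steps can be wrapped in one small function. This is a sketch under the equal-variance assumption, using only summary statistics; the name `pooled_interval` is illustrative, and the t values are passed in from the examples rather than computed.

```python
import math


def pooled_interval(n_x, mean_x, s_x, n_y, mean_y, s_y, t_crit):
    """Equal-variance interval for mu_x - mu_y; returns (s_p^2, lower, upper)."""
    df = n_x + n_y - 2
    # Pool the two sample variances, weighting each by its degrees of freedom.
    sp2 = ((n_x - 1) * s_x**2 + (n_y - 1) * s_y**2) / df
    me = t_crit * math.sqrt(sp2 / n_x + sp2 / n_y)
    diff = mean_x - mean_y
    return sp2, diff - me, diff + me


# Catalyst example: t value for 19 degrees of freedom, as given in the example.
cat_sp2, cat_lower, cat_upper = pooled_interval(10, 86, 3, 11, 77, 6, t_crit=2.093)
print(f"catalysts: s_p^2 = {cat_sp2:.2f}, ({cat_lower:.2f}, {cat_upper:.2f})")

# CPU example: t value for 29 degrees of freedom, t_{29,0.025} = 2.045.
cpu_sp2, cpu_lower, cpu_upper = pooled_interval(17, 3004, 74, 14, 2538, 56, t_crit=2.045)
print(f"CPUs: s_p^2 = {cpu_sp2:.2f}, ({cpu_lower:.2f}, {cpu_upper:.2f})")
```

Passing the critical value as an argument keeps the sketch free of third-party dependencies; a statistics library could supply it instead.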
σx² and σy² Unknown, Assumed Unequal

Assumptions:
- Samples are randomly and independently drawn
- Populations are normally distributed
- Population variances are unknown and assumed unequal
σx² and σy² Unknown, Assumed Unequal (continued)

Forming interval estimates:
- The population variances are assumed unequal, so a pooled variance is not appropriate
- Use a t value with $v$ degrees of freedom, where

$$v = \frac{\left(\dfrac{s_x^2}{n_x} + \dfrac{s_y^2}{n_y}\right)^2}{\left(\dfrac{s_x^2}{n_x}\right)^2 / (n_x - 1) + \left(\dfrac{s_y^2}{n_y}\right)^2 / (n_y - 1)}$$
Confidence Interval, σx² and σy² Unknown, Unequal

The confidence interval for $\mu_x - \mu_y$ is:

$$(\bar{x} - \bar{y}) - t_{v,\alpha/2}\sqrt{\frac{s_x^2}{n_x} + \frac{s_y^2}{n_y}} < \mu_x - \mu_y < (\bar{x} - \bar{y}) + t_{v,\alpha/2}\sqrt{\frac{s_x^2}{n_x} + \frac{s_y^2}{n_y}}$$

where

$$v = \frac{\left(\dfrac{s_x^2}{n_x} + \dfrac{s_y^2}{n_y}\right)^2}{\left(\dfrac{s_x^2}{n_x}\right)^2 / (n_x - 1) + \left(\dfrac{s_y^2}{n_y}\right)^2 / (n_y - 1)}$$
Example
From a random sample of seven students in a marketing
research class that uses group-learning techniques, the
mean examination score was found to be 78.25 and the
sample standard deviation was 2.87. For an
independent random sample of ten students in another
marketing research class that does not use group-
learning techniques, the sample mean and standard
deviation of exam scores were 74.94 and 9.15,
respectively.

Estimate with 95% confidence the difference between the two population mean scores; do not assume equal population variances.
Example
ANSWER:
$n_x = 7$, $\bar{x} = 78.25$, $s_x = 2.87$, $n_y = 10$, $\bar{y} = 74.94$, $s_y = 9.15$

The degrees of freedom are given by

$$v = \frac{\left(\dfrac{s_x^2}{n_x} + \dfrac{s_y^2}{n_y}\right)^2}{\left(\dfrac{s_x^2}{n_x}\right)^2 / (n_x - 1) + \left(\dfrac{s_y^2}{n_y}\right)^2 / (n_y - 1)} \approx 11$$

$\alpha = 0.05$, so $t_{v,\alpha/2} = t_{11,0.025} = 2.201$

The 95% confidence interval for the difference between the two population mean scores is

$$(\bar{x} - \bar{y}) \pm t_{v,\alpha/2}\sqrt{\frac{s_x^2}{n_x} + \frac{s_y^2}{n_y}} = (78.25 - 74.94) \pm (2.201)(3.09)$$

$3.31 \pm 6.80$, or $-3.49 < \mu_x - \mu_y < 10.11$
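
The degrees-of-freedom formula is easy to mistype by hand, so a small check helps. This sketch uses illustrative function names (`unequal_variance_df`, `unequal_variance_interval`), and rounding the degrees of freedom down to a whole number is an assumption about the convention; it matches the value 11 used in the example.

```python
import math


def unequal_variance_df(s_x, n_x, s_y, n_y):
    """Approximate degrees of freedom v when variances are not assumed equal."""
    vx = s_x**2 / n_x
    vy = s_y**2 / n_y
    return (vx + vy) ** 2 / (vx**2 / (n_x - 1) + vy**2 / (n_y - 1))


def unequal_variance_interval(n_x, mean_x, s_x, n_y, mean_y, s_y, t_crit):
    """Return (lower, upper) limits for mu_x - mu_y without pooling."""
    me = t_crit * math.sqrt(s_x**2 / n_x + s_y**2 / n_y)
    diff = mean_x - mean_y
    return diff - me, diff + me


v = unequal_variance_df(2.87, 7, 9.15, 10)
df = math.floor(v)  # assumption: round down to be conservative

# t value for 11 degrees of freedom and alpha/2 = 0.025, taken from the example.
lower, upper = unequal_variance_interval(7, 78.25, 2.87, 10, 74.94, 9.15, t_crit=2.201)
print(f"v = {v:.2f} (use {df}), interval = ({lower:.2f}, {upper:.2f})")
```

The unrounded value of v is a little above 11, which is why the example looks up the t value with 11 degrees of freedom.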


Additional Exercises for the Difference Between Two Population Means
Example 1
A dependent random sample from two normally distributed populations gives the following results: $n = 15$, $\bar{d} = 20.5$, and $s_d = 2.4$.

Find the margin of error for a 95% confidence interval for the difference in the means of the two populations.

ANSWER:
$n = 15$, $df = n - 1 = 14$, $t_{n-1,\alpha/2} = t_{14,0.025} = 2.1448$

Margin of error: $ME = t_{n-1,\alpha/2}\, s_d / \sqrt{n} = (2.1448)(2.4/\sqrt{15}) = 1.329$

$\bar{d} \pm ME = 20.5 \pm 1.329$, or $19.171 < \mu_d < 21.829$

Example 2
Independent random sampling from two normally distributed populations gives the following results: $n_x = 64$, $\bar{x} = 441$, $\sigma_x = 20$, $n_y = 36$, $\bar{y} = 361$, and $\sigma_y = 25$.

Find the margin of error for a 90% confidence interval for the difference in the means of the two populations.

ANSWER:
$\alpha = 0.10$, so $z_{\alpha/2} = z_{0.05} = 1.645$

Margin of error:

$$ME = z_{\alpha/2}\sqrt{\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}} = 1.645\sqrt{\frac{400}{64} + \frac{625}{36}} = (1.645)(4.859) = 7.993$$
Example 2 (continued)

Find the 90% confidence interval for the difference in the means of the two populations, using the margin of error found above.

ANSWER:
$(\bar{x} - \bar{y}) \pm ME = 80 \pm 7.993$, or $72.007 < \mu_x - \mu_y < 87.993$

Based on the above confidence interval, is there evidence that the population means are different?

ANSWER:
This confidence interval does not include zero, indicating strong evidence that the population means are different.
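
When the population standard deviations are known, the z value can be computed with the standard library instead of being read from a table. This is a minimal sketch; `z_interval` is an illustrative name, and `statistics.NormalDist` requires Python 3.8 or later.

```python
import math
from statistics import NormalDist


def z_interval(n_x, mean_x, sigma_x, n_y, mean_y, sigma_y, confidence):
    """Return (z, margin of error, lower, upper) for mu_x - mu_y, sigmas known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 point of N(0, 1)
    me = z * math.sqrt(sigma_x**2 / n_x + sigma_y**2 / n_y)
    diff = mean_x - mean_y
    return z, me, diff - me, diff + me


# Example 2 data at 90% confidence.
z, me, lower, upper = z_interval(64, 441, 20, 36, 361, 25, confidence=0.90)
print(f"z = {z:.3f}, ME = {me:.3f}, interval = ({lower:.3f}, {upper:.3f})")
```

The computed z is about 1.645, so the margin of error and limits agree with the hand calculation to three decimal places.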
Example 3
Suppose, for a random sample of 250 firms that revalued their fixed assets, the mean ratio of debt to tangible assets was 0.528 and the sample standard deviation was 0.151. For an independent random sample of 500 firms that did not revalue their fixed assets, the mean ratio was 0.502 and the sample standard deviation was 0.162. Find a 99% confidence interval for the difference between the two population means.

ANSWER:
$n_x = 250$, $\bar{x} = 0.528$, $s_x = 0.151$, $n_y = 500$, $\bar{y} = 0.502$, $s_y = 0.162$, $z_{0.005} = 2.575$

Both samples are large, so the answer uses a z value with the sample standard deviations. The 99% confidence interval for the difference between the two population means is

$$(\bar{x} - \bar{y}) \pm z_{\alpha/2}\sqrt{\frac{s_x^2}{n_x} + \frac{s_y^2}{n_y}} = (0.528 - 0.502) \pm (2.575)(0.01199)$$

$0.026 \pm 0.031$, or $-0.005 < \mu_x - \mu_y < 0.057$

Confidence Interval for Two Population Proportions
Two Population Proportions

Goal: Form a confidence interval for the difference between two population proportions, $p_x - p_y$

Assumptions:
Both sample sizes are large (generally at least 40 observations in each sample)

The point estimate for the difference is $\hat{p}_x - \hat{p}_y$
Two Population Proportions (continued)

- The random variable

$$Z = \frac{(\hat{p}_x - \hat{p}_y) - (p_x - p_y)}{\sqrt{\dfrac{\hat{p}_x(1 - \hat{p}_x)}{n_x} + \dfrac{\hat{p}_y(1 - \hat{p}_y)}{n_y}}}$$

is approximately normally distributed.

Confidence Interval for Two Population Proportions

The confidence limits for $p_x - p_y$ are:

$$(\hat{p}_x - \hat{p}_y) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_x(1 - \hat{p}_x)}{n_x} + \frac{\hat{p}_y(1 - \hat{p}_y)}{n_y}}$$
Example: Two Population Proportions
Form a 90% confidence interval for the difference between the proportion of men and the proportion of women who have college degrees.

- In a random sample, 26 of 50 men and 28 of 40 women had an earned college degree
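Example: Two Population Proportions (continued)

Men: $\hat{p}_x = \dfrac{26}{50} = 0.52$

Women: $\hat{p}_y = \dfrac{28}{40} = 0.70$

$$\sqrt{\frac{\hat{p}_x(1 - \hat{p}_x)}{n_x} + \frac{\hat{p}_y(1 - \hat{p}_y)}{n_y}} = \sqrt{\frac{0.52(0.48)}{50} + \frac{0.70(0.30)}{40}} = 0.1012$$

For 90% confidence, $z_{\alpha/2} = 1.645$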

Example: Two Population Proportions (continued)

The confidence limits are:

$$(\hat{p}_x - \hat{p}_y) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_x(1 - \hat{p}_x)}{n_x} + \frac{\hat{p}_y(1 - \hat{p}_y)}{n_y}} = (0.52 - 0.70) \pm 1.645\,(0.1012)$$

so the confidence interval is

$$-0.3465 < p_x - p_y < -0.0135$$

Since this interval does not contain zero, we are 90% confident that the two proportions are not equal.
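
The same calculation can be scripted from the raw counts. This is a sketch using only the standard library; `proportion_diff_interval` is an illustrative name, and the z value is computed with `statistics.NormalDist` rather than taken from a table.

```python
import math
from statistics import NormalDist


def proportion_diff_interval(successes_x, n_x, successes_y, n_y, confidence):
    """Return (difference, standard error, lower, upper) for p_x - p_y."""
    p_x = successes_x / n_x
    p_y = successes_y / n_y
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = math.sqrt(p_x * (1 - p_x) / n_x + p_y * (1 - p_y) / n_y)
    diff = p_x - p_y
    return diff, se, diff - z * se, diff + z * se


# 26 of 50 men and 28 of 40 women hold a degree; 90% confidence.
diff, se, lower, upper = proportion_diff_interval(26, 50, 28, 40, confidence=0.90)
print(f"diff = {diff:.2f}, se = {se:.4f}, interval = ({lower:.4f}, {upper:.4f})")
```

Both limits are negative, consistent with the conclusion that the proportion of degree holders is lower among the men sampled than among the women.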
Example

In a random sample of 125 large retailers, 90 used regression as a method of forecasting. In an independent random sample of 160 small retailers, 80 used regression as a method of forecasting.

Find a 95% confidence interval for the difference between the two population proportions.
Example

ANSWER:
$n_x = 125$, $\hat{p}_x = 90/125 = 0.72$, $n_y = 160$, $\hat{p}_y = 80/160 = 0.50$, $z_{\alpha/2} = z_{0.025} = 1.96$

The margin of error ME is

$$ME = z_{\alpha/2}\sqrt{\frac{\hat{p}_x(1 - \hat{p}_x)}{n_x} + \frac{\hat{p}_y(1 - \hat{p}_y)}{n_y}} = 1.96\sqrt{\frac{(0.72)(0.28)}{125} + \frac{(0.50)(0.50)}{160}} = 0.1104$$

The 95% confidence interval for the difference between the two population proportions is

$(\hat{p}_x - \hat{p}_y) \pm ME = (0.72 - 0.50) \pm 0.1104$

$0.22 \pm 0.1104$, or $0.1096 < p_x - p_y < 0.3304$
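
The large-sample assumption behind this interval can be enforced in code before the margin of error is computed. This sketch treats the "at least 40 observations in each sample" guideline as a hard check, which is a design choice for illustration; the name `proportion_margin_of_error` is hypothetical.

```python
import math
from statistics import NormalDist

MIN_SAMPLE_SIZE = 40  # rule of thumb stated in the assumptions for this interval


def proportion_margin_of_error(successes_x, n_x, successes_y, n_y, confidence):
    """Margin of error for p_x - p_y; rejects samples below the rule of thumb."""
    if min(n_x, n_y) < MIN_SAMPLE_SIZE:
        raise ValueError("normal approximation needs at least 40 observations per sample")
    p_x = successes_x / n_x
    p_y = successes_y / n_y
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * math.sqrt(p_x * (1 - p_x) / n_x + p_y * (1 - p_y) / n_y)


# Retailer example: 90 of 125 large and 80 of 160 small retailers, 95% confidence.
me = proportion_margin_of_error(90, 125, 80, 160, confidence=0.95)
diff = 90 / 125 - 80 / 160
print(f"ME = {me:.4f}, interval = ({diff - me:.4f}, {diff + me:.4f})")
```

Whether to raise an error or only warn when a sample is small depends on the application; the check simply makes the assumption visible.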
Chapter Summary
- Compared two dependent samples (paired samples)
- Formed confidence intervals for the paired difference
- Compared two independent samples
- Formed confidence intervals for the difference between two means, population variance known, using z
- Formed confidence intervals for the differences between two means, population variance unknown, using t
- Formed confidence intervals for the differences between two population proportions
- Determined required sample size to meet confidence and margin of error requirements
