Comparison of Two Sample Means

This document discusses testing whether the means of two independent populations are equal or not. It considers several cases including when population variances are known or unknown, equal or unequal, and when sample sizes are large or small. The test statistics and decision rules for both one-tailed and two-tailed tests are provided for each case.

Uploaded by

tithy bhuiyan

Comparison of Two Independent Sample Means

Suppose that we have two independent random samples of sizes $n_1$ and $n_2$, respectively, from two populations. These populations are assumed to be normal with means $\mu_1$ and $\mu_2$ and variances $\sigma_1^2$ and $\sigma_2^2$, respectively.

Let $\bar{x}_1$ and $\bar{x}_2$ be the sample means, and $s_1^2$ and $s_2^2$ the sample variances.
We want to test whether the population means are equal or not. That is,
H0: μ1 = μ2.
The test statistic depends on whether the variances $\sigma_1^2$ and $\sigma_2^2$ are known, whether they are equal, and whether the samples are large or small. We therefore consider the following three cases:

a. $\sigma_1^2$ and $\sigma_2^2$ are known
b. $\sigma_1^2$ and $\sigma_2^2$ are unknown but equal
   i) Samples are large
   ii) Samples are small
c. $\sigma_1^2$ and $\sigma_2^2$ are unknown and unequal
   i) Samples are large
   ii) Samples are small

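a. $\sigma_1^2$ and $\sigma_2^2$ are known:
We are to test
H0: μ1 = μ2 or H0: μ1 - μ2 = 0
against H1: μ1 ≠ μ2
Assumptions:
i) The samples are taken from two independent normal populations.
ii) $\sigma_1^2$ and $\sigma_2^2$ are known.
Here, $(\bar{x}_1 - \bar{x}_2)$ is our relevant statistic.

$$V(\bar{x}_1 - \bar{x}_2) = V(\bar{x}_1) + V(\bar{x}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$$

Since $E(\bar{x}_1 - \bar{x}_2) = \mu_1 - \mu_2 = 0$ under H0, the test statistic is

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - E(\bar{x}_1 - \bar{x}_2)}{\sqrt{V(\bar{x}_1 - \bar{x}_2)}} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \sim N(0,1) \quad [\text{for known } \sigma_1^2 \text{ and } \sigma_2^2]$$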

Let the level of significance be α.

Since the test is two-tailed, the critical values of the test statistic are $z_{\alpha/2}$ and $z_{1-\alpha/2}$, where $z_p$ denotes the $p$-th quantile of N(0,1).

Comment: If $z > z_{1-\alpha/2}$ or $z < z_{\alpha/2}$, we may reject the null hypothesis and conclude that the population means are not equal (i.e. μ1 ≠ μ2).
Decision rule for one-tailed test:
For right-tailed test: H0: μ1 = μ2 against H1: μ1 > μ2
Reject H0 if $z > z_{1-\alpha}$
For left-tailed test: H0: μ1 = μ2 against H1: μ1 < μ2
Reject H0 if $z < z_{\alpha}$
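
The decision rules for case (a) can be sketched in Python. This is a minimal illustration using only the standard library and is not part of the original notes; the function name `two_sample_z_test`, its parameters, and the numbers in the usage lines are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(mean1, mean2, var1, var2, n1, n2, alpha=0.05, tail="two"):
    """Two-sample z test with known population variances (hypothetical helper).

    tail: "two" (H1: mu1 != mu2), "right" (H1: mu1 > mu2), "left" (H1: mu1 < mu2).
    Returns (z, reject), where reject is True when H0 is rejected at level alpha.
    """
    # Standard error of (x1_bar - x2_bar) when both variances are known
    se = sqrt(var1 / n1 + var2 / n2)
    z = (mean1 - mean2) / se  # E(x1_bar - x2_bar) = 0 under H0

    std_normal = NormalDist()  # N(0, 1)
    if tail == "two":
        upper = std_normal.inv_cdf(1 - alpha / 2)  # z_{1-alpha/2}
        reject = z > upper or z < -upper           # z_{alpha/2} = -z_{1-alpha/2}
    elif tail == "right":
        reject = z > std_normal.inv_cdf(1 - alpha)  # z_{1-alpha}
    elif tail == "left":
        reject = z < std_normal.inv_cdf(alpha)      # z_{alpha}
    else:
        raise ValueError("tail must be 'two', 'right' or 'left'")
    return z, reject


# Hypothetical inputs, chosen only to exercise the function
z, reject = two_sample_z_test(10.0, 9.0, 4.0, 4.0, 50, 50)
print(round(z, 2), reject)  # 2.5 True
```

The quantiles come from `NormalDist().inv_cdf`, so no printed table is needed for the normal case.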

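b. When $\sigma_1^2$ and $\sigma_2^2$ are unknown but equal:

We are to test
H0: μ1 = μ2 or H0: μ1 - μ2 = 0
against H1: μ1 ≠ μ2
Assumptions:
i) The samples are taken from two independent normal populations.
ii) $\sigma_1^2$ and $\sigma_2^2$ are unknown but equal.
Here, $(\bar{x}_1 - \bar{x}_2)$ is our relevant statistic.

Let $\sigma_1^2 = \sigma_2^2 = \sigma^2$. Now

$$V(\bar{x}_1 - \bar{x}_2) = V(\bar{x}_1) + V(\bar{x}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} = \frac{\sigma^2}{n_1} + \frac{\sigma^2}{n_2} = \sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)$$

The pooled estimate of $\sigma^2$ is

$$S_{pooled}^2 = \frac{\sum (x_{1i} - \bar{x}_1)^2 + \sum (x_{2i} - \bar{x}_2)^2}{(n_1 - 1) + (n_2 - 1)} = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = s^2 \text{ (say)}$$

So $V(\bar{x}_1 - \bar{x}_2)$ is estimated by

$$\hat{V}(\bar{x}_1 - \bar{x}_2) = s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)$$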

i) When samples are large (that is, $n_1 \ge 30$ and $n_2 \ge 30$)

The test statistic is

$$z = \frac{\bar{x}_1 - \bar{x}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim N(0,1) \text{ (approximately, for large samples)}$$

Let the level of significance be α.

Since the test is two-tailed, the critical values of the test statistic are $z_{\alpha/2}$ and $z_{1-\alpha/2}$.

Comment: If $z > z_{1-\alpha/2}$ or $z < z_{\alpha/2}$, we may reject the null hypothesis and conclude that the population means are not equal (i.e. μ1 ≠ μ2).
Decision rule for one-tailed test:
For right-tailed test: H0: μ1 = μ2 against H1: μ1 > μ2
Reject H0 if $z > z_{1-\alpha}$
For left-tailed test: H0: μ1 = μ2 against H1: μ1 < μ2
Reject H0 if $z < z_{\alpha}$
ii) When samples are small (that is, $n_1 < 30$ and $n_2 < 30$)
The test statistic is

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim t_{n_1+n_2-2}$$

with $n_1 + n_2 - 2$ degrees of freedom.

Let the level of significance be α.

Since the test is two-tailed, the critical values of the test statistic are $\pm t_{\alpha/2,\,n_1+n_2-2}$, where $t_{p,\,\nu}$ denotes the value exceeded with probability $p$ by a $t$ variable with $\nu$ degrees of freedom.

Comment: If $t < -t_{\alpha/2,\,n_1+n_2-2}$ or $t > t_{\alpha/2,\,n_1+n_2-2}$, we may reject the null hypothesis and conclude that the population means are not equal (i.e. μ1 ≠ μ2).
Decision rule for one-tailed test:
For right-tailed test: H0: μ1 = μ2 against H1: μ1 > μ2
Reject H0 if $t > t_{\alpha,\,n_1+n_2-2}$

For left-tailed test: H0: μ1 = μ2 against H1: μ1 < μ2
Reject H0 if $t < -t_{\alpha,\,n_1+n_2-2}$
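
The pooled-variance statistic of case (b) can be sketched as follows. This is an illustrative sketch, not part of the original notes: the names `pooled_t_statistic` and `reject_h0` are hypothetical, the usage numbers are made up, and the critical value is assumed to be read from a t table (or from the N(0,1) table when both samples are large).

```python
from math import sqrt


def pooled_t_statistic(mean1, mean2, s1, s2, n1, n2):
    """Return (t, df) for the equal-variance two-sample test (hypothetical helper)."""
    df = n1 + n2 - 2
    # Pooled estimate s^2 of the common variance sigma^2
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    t = (mean1 - mean2) / (sqrt(pooled_var) * sqrt(1 / n1 + 1 / n2))
    return t, df


def reject_h0(t, t_crit, tail="two"):
    """Apply the decision rule; t_crit is the positive table value
    (t_{alpha/2, df} for tail="two", t_{alpha, df} for one-tailed tests)."""
    if tail == "two":
        return abs(t) > t_crit
    if tail == "right":
        return t > t_crit
    if tail == "left":
        return t < -t_crit
    raise ValueError("tail must be 'two', 'right' or 'left'")


# Hypothetical inputs; 2.145 is the table value t_{0.025, 14}
t, df = pooled_t_statistic(12.0, 10.0, 2.0, 2.0, 8, 8)
print(t, df, reject_h0(t, 2.145))  # 2.0 14 False
```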

c. When $\sigma_1^2$ and $\sigma_2^2$ are unknown and unequal:

We are to test
H0: μ1 = μ2 or H0: μ1 - μ2 = 0
against H1: μ1 ≠ μ2
Assumptions:
i) The samples are taken from two independent normal populations.
ii) $\sigma_1^2$ and $\sigma_2^2$ are unknown and unequal.
In this case,

$$V(\bar{x}_1 - \bar{x}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}, \quad \text{which is estimated by} \quad \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2},$$

where $s_1^2$ and $s_2^2$ are the sample estimates of the population variances $\sigma_1^2$ and $\sigma_2^2$.

i) When samples are large ($n_1 \ge 30$ and $n_2 \ge 30$)

The test statistic is

$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \sim N(0,1) \text{ (approximately, for large samples)}$$

Let the level of significance be α.

Since the test is two-tailed, the critical values of the test statistic are $z_{\alpha/2}$ and $z_{1-\alpha/2}$.

Comment: If $z > z_{1-\alpha/2}$ or $z < z_{\alpha/2}$, we may reject the null hypothesis and conclude that the population means are not equal (i.e. μ1 ≠ μ2).
Decision rule for one-tailed test:
For right-tailed test: H0: μ1 = μ2 against H1: μ1 > μ2
Reject H0 if $z > z_{1-\alpha}$
For left-tailed test: H0: μ1 = μ2 against H1: μ1 < μ2
Reject H0 if $z < z_{\alpha}$
ii) When samples are small ($n_1 < 30$ and $n_2 < 30$)
The test statistic is

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \sim t_v$$

with $v$ degrees of freedom, where

$$v = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$$

Let the level of significance be α.

Since the test is two-tailed, the critical values of the test statistic are $\pm t_{\alpha/2,\,v}$.

Comment: If $t < -t_{\alpha/2,\,v}$ or $t > t_{\alpha/2,\,v}$, we may reject the null hypothesis and conclude that the population means are not equal (i.e. μ1 ≠ μ2).
Decision rule for one-tailed test:
For right-tailed test: H0: μ1 = μ2 against H1: μ1 > μ2
Reject H0 if $t > t_{\alpha,\,v}$

For left-tailed test: H0: μ1 = μ2 against H1: μ1 < μ2
Reject H0 if $t < -t_{\alpha,\,v}$
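
The unequal-variance statistic and its degrees of freedom $v$ can be computed as in the sketch below. It is an illustration only, not part of the original notes; the function name `unequal_variance_t` and the usage numbers are hypothetical. In general $v$ is not an integer, so a table lookup needs rounding (rounding down is a common conservative choice, assumed here rather than taken from the notes).

```python
from math import sqrt


def unequal_variance_t(mean1, mean2, s1, s2, n1, n2):
    """Return (t, v) for the two-sample test with unequal variances
    (hypothetical helper following the formulas of case c)."""
    a = s1**2 / n1  # estimated variance of x1_bar
    b = s2**2 / n2  # estimated variance of x2_bar
    t = (mean1 - mean2) / sqrt(a + b)
    # Approximate degrees of freedom v
    v = (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))
    return t, v


# Hypothetical inputs: with equal s and equal n, v reduces to n1 + n2 - 2
t, v = unequal_variance_t(12.0, 10.0, 2.0, 2.0, 5, 5)
print(round(t, 3), round(v, 3))  # 1.581 8.0
```

The check with equal standard deviations and equal sample sizes is a quick sanity test: in that situation $v$ equals the pooled-test value $n_1 + n_2 - 2$.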

Problem: An employer in a garments factory argues that it is justified to pay male workers more than female workers on the ground that they are more productive. Can the employer's claim be sustained at the 5% level if the weekly output of a sample of 15 male workers had a mean of 350 items of men's wear with a standard deviation of 18, and the weekly output of 10 female workers had a mean of 345 such items with a standard deviation of 21? Assume equality of variance in the population.
Solution:
Hypothesis: H0: μ1 = μ2 against H1: μ1 > μ2
Level of significance, α = 0.05
The test statistic is

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim t_{n_1+n_2-2}$$

with $n_1 + n_2 - 2$ degrees of freedom, where $\bar{x}_1 = 350$, $\bar{x}_2 = 345$, $s_1 = 18$, $s_2 = 21$, $n_1 = 15$, $n_2 = 10$.

$$s^2 = S_{pooled}^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(15 - 1) \cdot 18^2 + (10 - 1) \cdot 21^2}{23} = 369.783$$

Hence, $S_{pooled} = 19.23$.

Now,

$$t_{cal} = \frac{350 - 345}{19.23\sqrt{\dfrac{1}{15} + \dfrac{1}{10}}} = 0.64$$

Critical value: $t_{\alpha,\,n_1+n_2-2} = t_{0.05,\,23} = 1.714$

Critical region: $t > 1.714$

Conclusion: Since the calculated value of t is less than 1.714, it falls in the acceptance region, and we do not reject the null hypothesis. That is, the difference between the sample means is not statistically significant at the 5% level. Hence the employer's claim is not justified by the data.
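
The arithmetic of this solution can be re-traced with a short Python check. It uses only the figures given in the problem and the table value 1.714 quoted in the solution; the variable names are illustrative. Carried to three decimals, the statistic is about 0.637.

```python
from math import sqrt

# Data from the problem statement
x1_bar, x2_bar = 350, 345   # sample means (male, female)
s1, s2 = 18, 21             # sample standard deviations
n1, n2 = 15, 10             # sample sizes

df = n1 + n2 - 2
pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
pooled_sd = sqrt(pooled_var)
t_cal = (x1_bar - x2_bar) / (pooled_sd * sqrt(1 / n1 + 1 / n2))

t_crit = 1.714  # t_{0.05, 23}, the table value quoted in the solution

print(df, round(pooled_var, 3), round(pooled_sd, 2), round(t_cal, 3))
# 23 369.783 19.23 0.637
print(t_cal > t_crit)  # False: t is outside the critical region t > 1.714
```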
Problem: Suppose we want to see if there is any difference between the average number of children born to the tribal women and the non-tribal women of a certain administrative division in Bangladesh. A random sample of 12 women from the tribal population and 15 women from the non-tribal population were drawn and their numbers of children recorded. The average numbers of children born to these two groups of women were 4.5 and 3.4, respectively. The population variances were 1 and 1.5, respectively. Use the 5 percent level of significance to see if the sample data reflect any difference in the mean number of children born in the population.
Solution:
Hypothesis: H0: μ1 = μ2 against H1: μ1 ≠ μ2
Level of significance, α = 0.05
The test statistic is

$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \sim N(0,1)$$

where $\bar{x}_1 = 4.5$, $\bar{x}_2 = 3.4$, $\sigma_1^2 = 1$, $\sigma_2^2 = 1.5$, $n_1 = 12$, $n_2 = 15$.

$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{4.5 - 3.4}{\sqrt{\dfrac{1}{12} + \dfrac{1.5}{15}}} = 2.57$$

Critical region: $z < -1.96$ or $z > 1.96$

Conclusion: Since the calculated value of z falls in the critical region, we may reject the null hypothesis at the 5% level of significance. So we can conclude that there is a difference between the true average number of children born to tribal women and non-tribal women.
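
As a quick numerical check of this solution, the sketch below recomputes z from the figures in the problem and obtains the two-tailed critical value from the standard library instead of a table. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Data from the problem statement
x1_bar, x2_bar = 4.5, 3.4   # sample means (tribal, non-tribal)
var1, var2 = 1.0, 1.5       # known population variances
n1, n2 = 12, 15             # sample sizes
alpha = 0.05

z = (x1_bar - x2_bar) / sqrt(var1 / n1 + var2 / n2)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # z_{1-alpha/2}

print(round(z, 2), round(z_crit, 2), abs(z) > z_crit)  # 2.57 1.96 True
```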
Comparison of Two Correlated Sample Means
In certain situations the two sample means under comparison are related to each other because the observations occur in pairs. Thus, instead of having two random samples, we have in effect one random sample of pairs. Each pair of observations is often obtained from the same individual.
For example, to test the effectiveness of a weight-reducing diet, we may collect the weights of a set of patients before the experiment starts and after the experiment is completed. In such situations, we use the paired t test.
Suppose we have n independent pairs of observations $(x_i, y_i)$, i = 1, 2, …, n.
We want to test the hypothesis
H0: μx = μy, or equivalently, H0: μd = μx - μy = 0
against H1: μd ≠ 0, where $d_i = x_i - y_i$.
Assumption:
$d_1, d_2, \ldots, d_n$ constitute a random sample from a normal population $N(\mu_d, \sigma_d^2)$.
To test the hypothesis, the test statistic is

$$t = \frac{\bar{d}}{s_d/\sqrt{n}} \sim t_{n-1}$$

with $(n-1)$ degrees of freedom, where $\bar{d}$ and $s_d$ are the mean and standard deviation of the differences $d_i$.

Let the level of significance be α.

Since the test is two-tailed, the critical values of the test statistic are $\pm t_{\alpha/2,\,n-1}$.

Comment: If $t < -t_{\alpha/2,\,n-1}$ or $t > t_{\alpha/2,\,n-1}$, we may reject the null hypothesis and conclude that the population means are not equal (i.e. μx ≠ μy).
Decision rule for one-tailed test:
For right-tailed test: H0: μd = 0 against H1: μd > 0
Reject H0 if $t > t_{\alpha,\,n-1}$

For left-tailed test: H0: μd = 0 against H1: μd < 0
Reject H0 if $t < -t_{\alpha,\,n-1}$
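
The paired statistic can be sketched in a few lines of Python. This is an illustration, not part of the original notes; the function name `paired_t_statistic` and the usage data are hypothetical, and the critical value is assumed to come from a t table.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(x, y):
    """Return (t, df) for the paired t test on equal-length samples x and y
    (hypothetical helper)."""
    if len(x) != len(y):
        raise ValueError("x and y must contain the same number of observations")
    d = [xi - yi for xi, yi in zip(x, y)]  # d_i = x_i - y_i
    n = len(d)
    d_bar = mean(d)
    s_d = stdev(d)  # sample standard deviation (divisor n - 1)
    t = d_bar / (s_d / sqrt(n))
    return t, n - 1


# Hypothetical data: differences are 1, 2, 3
t, df = paired_t_statistic([5, 7, 9], [4, 5, 6])
print(round(t, 3), df)  # 3.464 2
```

Working on the differences reduces the problem to a one-sample t test of H0: μd = 0, which is why only n - 1 degrees of freedom are available.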

Problem: The following data give the annual sales (in million taka) for 2014 and 2015 of 10 randomly selected companies in Dhaka city. Test at the 5% level whether there is any difference between the sales records of the two years.
Company A B C D E F G H I J
Sales 2014 126 56 86 62 96 36 52 50 35 53
Sales 2015 123 49 79 59 92 35 48 48 32 50

Solution:
Hypothesis: H0: μx - μy = 0 or H0: μd = 0 against H1: μd ≠ 0, where μx and μy are the average sales (in million taka) in Dhaka city for the years 2014 and 2015, respectively.
Level of significance, α = 0.05
Test statistic:

$$t = \frac{\bar{d}}{S_d/\sqrt{n}} \sim t_{n-1}$$

To carry out the test, let us compute the mean and standard deviation of the differences:
Company A B C D E F G H I J
Sales 2014 (x) 126 56 86 62 96 36 52 50 35 53
Sales 2015 (y) 123 49 79 59 92 35 48 48 32 50
d_i = x_i - y_i 3 7 7 3 4 1 4 2 3 3
d_i^2 9 49 49 9 16 1 16 4 9 9

$$\bar{d} = \frac{\sum d_i}{n} = \frac{37}{10} = 3.7, \qquad S_d = \sqrt{\frac{\sum d_i^2 - n\bar{d}^2}{n-1}} = \sqrt{\frac{171 - 10 \cdot (3.7)^2}{9}} = 1.95$$

Now,

$$t_{cal} = \frac{3.7}{1.95/\sqrt{10}} = 6.0$$

Critical values: $\pm t_{\alpha/2,\,n-1} = \pm t_{0.025,\,9} = \pm 2.262$

Critical region: t < -2.262 or t > 2.262

Conclusion: Since the calculated value of t falls in the critical region, we may reject the null hypothesis. That is, there is significant evidence that the sales of the companies differ between 2014 and 2015.
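
The computations in this solution can be reproduced from the raw sales figures. The sketch below uses only the data in the table and the table value 2.262 quoted in the solution; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Annual sales (million taka) for companies A to J
sales_2014 = [126, 56, 86, 62, 96, 36, 52, 50, 35, 53]
sales_2015 = [123, 49, 79, 59, 92, 35, 48, 48, 32, 50]

d = [x - y for x, y in zip(sales_2014, sales_2015)]  # d_i = x_i - y_i
n = len(d)
d_bar = mean(d)
s_d = stdev(d)                   # divisor n - 1
t_cal = d_bar / (s_d / sqrt(n))

t_crit = 2.262  # t_{0.025, 9}, the table value quoted in the solution

print(d)                                      # [3, 7, 7, 3, 4, 1, 4, 2, 3, 3]
print(d_bar, round(s_d, 2), round(t_cal, 2))  # 3.7 1.95 6.01
print(abs(t_cal) > t_crit)                    # True
```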
Exercise: The memory capacity of a group of students was tested before and after a training programme. From the following scores, state whether the training was effective. Use α = 0.05 to test the hypothesis.
Students # 1 2 3 4 5 6 7 8 9
Score before Training 12 9 15 19 13 18 7 10 17
Score after Training 5 23 20 13 8 21 8 16 12
[Hint: H0: μx - μy = 0 or H0: μd = 0 against H1: μd > 0, where μx = mean score after training and μy = mean score before training]
