Sociology 360
Statistics for Sociologists I
Chapter 19
Two-Sample Problems
Chapter 19 main topics
Two-sample t procedures
Robustness of two-sample t procedures
Details of the t approximation
Avoid the pooled two-sample t procedures
Avoid inference about standard deviations
Topic to omit:
The F test for comparing two standard deviations
Chapter 19 homework assignment
Problems: 19.6, .7, .9, .14, .16, .30, .32
One- vs. two-sample t-tests
One-sample test:
Is a population mean >, <, or different from some fixed value?
Two-sample test:
Goal: Compare responses to two treatments or characteristics of two
populations.
Independent samples for each treatment or population (i.e., the data
are not matched pairs).
Are the population means the same as each other, or is one greater
than the other?
Examples of two-sample tests
Research questions:
Do men have higher salaries than women?
Where do people travel farther to work, Detroit or Los Angeles?
Two-sample t-test: assumptions
We have an SRS from each of two populations or an experiment with
two randomly assigned groups.
We can consider subgroups of individuals in a random sample (e.g.,
men, women) as independent samples from their respective
populations.
The samples are independent.
The individuals that make up the two samples are not related to each
other (cannot be paired or matched).
If cases in the two samples can be paired or matched, use a matched
pairs design.
Problems: Identify the appropriate method
Is this a one-sample, two-sample, or matched pairs problem?
Would you perform a hypothesis test or find a confidence interval?
1. Do Burger King Whoppers have more than 670 calories?
2. Do Whoppers have more calories than Big Macs?
3. Mothers of twins were surveyed and asked how often in the past
month strangers had asked whether the twins were identical.
4. Are parents equally strict with boys and girls? In a random
sample of families, researchers asked a brother and sister from
each family to rate how strict their parents were.
Null hypothesis for a two-sample test
Most frequently, the null hypothesis is that the two means are the
same.
Option 1:
$H_0: \mu_1 = \mu_2$
Option 2:
$H_0: \mu_1 - \mu_2 = 0$
Of course both options mean the same thing since Option 2 is obtained
algebraically from Option 1 by subtracting $\mu_2$ from both sides.
Option 2, however, in stating that the difference between the two
population means is zero, focusses our attention on the proper sample
statistic for inference, the difference in sample means:
$(\bar{x}_1 - \bar{x}_2)$
Possible alternative hypotheses
Two-tailed:
Option 1: $H_A: \mu_1 \neq \mu_2$ or
Option 2: $H_A: \mu_1 - \mu_2 \neq 0$
One-tailed (right):
Option 1: $H_A: \mu_1 > \mu_2$ or
Option 2: $H_A: \mu_1 - \mu_2 > 0$
One-tailed (left):
Option 1: $H_A: \mu_1 < \mu_2$ or
Option 2: $H_A: \mu_1 - \mu_2 < 0$
In each case, Option 1 is equivalent to Option 2.
Whoppers and Big Macs
Do Whoppers have more calories than Big Macs?
Let $\mu_w$ = Mean calories in Whoppers
Let $\mu_{bm}$ = Mean calories in Big Macs
Write the null and alternative hypotheses using both methods
(option 1 and option 2)
Sampling distribution of the difference in means
Our interest centers on the difference between the two population
means, $\mu_1 - \mu_2$, which I will emphasize is a single numerical value by
writing it within parentheses, like this: $(\mu_1 - \mu_2)$.
We can estimate $(\mu_1 - \mu_2)$ by its sample analog, $(\bar{x}_1 - \bar{x}_2)$.
Since $(\bar{x}_1 - \bar{x}_2)$ is a number calculated only from sample
information, it is a statistic.
As a statistic, $(\bar{x}_1 - \bar{x}_2)$ has a sampling distribution.
The sampling distribution of $(\bar{x}_1 - \bar{x}_2)$ will be Normal under the right
circumstances.
And the mean of that sampling distribution will be $(\mu_1 - \mu_2)$.
All that remains to be discovered about the sampling distribution is
its standard error (or estimated standard deviation).
Standard error
The two-sample t statistic follows approximately the t distribution with
a standard error SE reflecting variation from both samples.
In fact, its standard error is simply the square root of the sum of the
squared standard errors of each sample considered separately:
$SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
Degrees of freedom
Since we are using a standard error estimated from the data, rather
than a known standard deviation, the procedures will be t based rather
than z based.
That means we need to have a value for the degrees of freedom of
the t distribution.
A conservative approach is to use the smaller of $(n_1 - 1)$ and $(n_2 - 1)$ as
the degrees of freedom.
This rule is conservative in that it may give a critical value $t^*$ larger
than is really appropriate, which leads to wider confidence intervals and
larger P-values (meaning we are a bit less likely to reject $H_0$).
You should use this rule for problems done by hand; for example, on
the exam.
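The standard error formula and the conservative degrees-of-freedom rule can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the summary statistics in the example call are made up rather than taken from any data set in these notes.

```python
import math

def two_sample_se(s1, n1, s2, n2):
    """Standard error of the difference in sample means:
    sqrt(s1^2/n1 + s2^2/n2)."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)

def conservative_df(n1, n2):
    """Conservative rule: the smaller of (n1 - 1) and (n2 - 1)."""
    return min(n1 - 1, n2 - 1)

# Made-up summary statistics, for illustration only
se = two_sample_se(s1=10.0, n1=25, s2=12.0, n2=36)
df = conservative_df(25, 36)
print(round(se, 3), df)  # prints: 2.828 24
```

Here each sample contributes 4 to the sum under the square root (100/25 and 144/36), so SE is the square root of 8.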
Two-sample t-test
The null hypothesis is that both population means $\mu_1$ and $\mu_2$ are equal,
thus their difference is equal to zero:
$H_0: (\mu_1 - \mu_2) = 0$
with either a one-sided or a two-sided alternative hypothesis.
We construct a t statistic via the usual comparison of the observed
statistic to the hypothesized value:
$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)_0}{SE} = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
This statistic has an approximate t distribution if $H_0$ is true.
Ideal number of children
Do men and women have different
beliefs about the ideal number of
children in a family?
The 2004 General Social Survey asked,
"What do you think is the ideal
number of children for a family to
have?"
Here is a summary of the responses:
Gender    Mean   s      n
Male      2.58   0.89   374
Female    2.62   0.92   416
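As a rough check on the arithmetic, the t statistic for the ideal-number-of-children summary can be computed directly. This is a sketch, not a full test procedure: the variable names are illustrative, and the P-value uses the standard Normal distribution in place of the t distribution, an assumption that is reasonable only because both samples are large.

```python
import math
from statistics import NormalDist

# Summary statistics from the 2004 GSS table: mean, s, n
mean_m, s_m, n_m = 2.58, 0.89, 374   # male
mean_f, s_f, n_f = 2.62, 0.92, 416   # female

# Standard error of the difference in sample means
se = math.sqrt(s_m**2 / n_m + s_f**2 / n_f)

# t statistic for H0: (mu_1 - mu_2) = 0
t_stat = ((mean_m - mean_f) - 0) / se

# Assumption: with samples this large the t distribution is very
# close to the standard Normal, so a z-based P-value is used here.
p_two_sided = 2 * (1 - NormalDist().cdf(abs(t_stat)))

print(f"SE = {se:.3f}, t = {t_stat:.2f}, two-sided P = {p_two_sided:.2f}")
```

The statistic comes out at about -0.62, and the two-sided P-value is far above any conventional significance level, so these data give no evidence of a difference between men and women.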
Ideal number of children
Gender    Mean   s      n
Male      2.58   0.89   374
Female    2.62   0.92   416
What are the null and alternative
hypotheses?
Choose an $\alpha$ level.
Draw a picture of the sampling
distribution and the p-value you are
looking for.
Perform the test and evaluate the
result.
Note:
$\sqrt{\frac{0.89^2}{374} + \frac{0.92^2}{416}} = 0.064$
Confidence interval
As before, we often supplement a
hypothesis test by a CI.
(And sometimes we omit the test.)
For two-sample problems, the
goal is to estimate the difference
between the two population means,
$(\mu_1 - \mu_2)$.
The statistic continues to be
$(\bar{x}_1 - \bar{x}_2)$
and the confidence interval is
$CI = (\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
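The confidence interval formula translates directly into code. The sketch below uses a hypothetical helper, `two_sample_ci`, and applies it to the male and female summary statistics; the critical value 1.96 is an assumption that holds only because the degrees of freedom are large enough for $t^*$ to be essentially equal to $z^*$.

```python
import math

def two_sample_ci(mean1, s1, n1, mean2, s2, n2, t_star):
    """CI for (mu_1 - mu_2): (xbar_1 - xbar_2) +/- t* x SE."""
    diff = mean1 - mean2
    margin = t_star * math.sqrt(s1**2 / n1 + s2**2 / n2)
    return diff - margin, diff + margin

# GSS summary statistics (male first, female second).
# Assumption: t* is taken as 1.96 because df is large.
low, high = two_sample_ci(2.58, 0.89, 374, 2.62, 0.92, 416, t_star=1.96)
print(f"95% CI: ({low:.2f}, {high:.2f})")  # prints: 95% CI: (-0.17, 0.09)
```

The interval contains zero, which agrees with a two-sided test at the 5% level that fails to reject the hypothesis of equal means.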
95% CI for the example
Gender    Mean   s      n
Male      2.58   0.89   374
Female    2.62   0.92   416
df = min(373, 415) = 373
For C = .95, $t^*_{373} \approx z^* = 1.96$.
$CI = (\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
Note:
$\sqrt{\frac{0.89^2}{374} + \frac{0.92^2}{416}} = 0.064$
Effects of Reading Program
on Reading Comprehension
New reading activities for
elementary school children
Randomly assign (RA) 3rd graders to treatment group
and control group
Compare reading comprehension
Calculate a 95% CI for the effect
of the new reading activities on
reading comprehension
Note:
$\sqrt{\frac{11.01^2}{21} + \frac{17.15^2}{23}} = 4.31$
Robustness of the two-sample t procedures
We must have an SRS or randomized comparative experiment.
t procedures are only exact if the population distribution is exactly
normal.
But, we will consider two-sample t procedures good enough
approximations in these cases:
1. When $n_1 + n_2 < 15$, the data from both samples must be close to
normal (roughly symmetric, single peak) and without outliers.
2. When $15 \le n_1 + n_2 < 40$, mild skewness is acceptable, but not
outliers.
3. When $n_1 + n_2 \ge 40$, the t statistic will be valid even with strong
skewness.
Details of the t approximation
The actual distribution of the two-sample t statistic is not really t (!).
But it is a distribution that can be very closely approximated by a t
distribution with this number of degrees of freedom:
$df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1 - 1}\left(\frac{s_1^2}{n_1}\right)^2 + \frac{1}{n_2 - 1}\left(\frac{s_2^2}{n_2}\right)^2}$
This is known as the Satterthwaite approximation.
The formula typically produces a non-integer degree of freedom value.
Computers routinely calculate this approximation.
You should recognize it when you see it.
But on exams, use the smaller of $(n_1 - 1)$ and $(n_2 - 1)$ instead.
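To see how the Satterthwaite degrees of freedom compare with the conservative rule, the formula can be evaluated for the two sets of standard deviations and sample sizes used earlier in these notes. This is a minimal sketch; the function name `satterthwaite_df` is illustrative.

```python
def satterthwaite_df(s1, n1, s2, n2):
    """Approximate df for the two-sample t statistic
    (Satterthwaite approximation)."""
    v1 = s1**2 / n1   # squared standard error, sample 1
    v2 = s2**2 / n2   # squared standard error, sample 2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

# Ideal number of children (GSS) and the reading program example
df_gss = satterthwaite_df(0.89, 374, 0.92, 416)
df_reading = satterthwaite_df(11.01, 21, 17.15, 23)

print("Satterthwaite:", round(df_gss, 1), round(df_reading, 2))
print("Conservative: ", min(374 - 1, 416 - 1), min(21 - 1, 23 - 1))
```

In both cases the approximation is a non-integer that lies between the conservative value, the smaller of $(n_1 - 1)$ and $(n_2 - 1)$, and $n_1 + n_2 - 2$. That is why the hand rule yields wider intervals than software does.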
Avoid the pooled two-sample t procedures
Your textbook's author, Moore, recommends completely avoiding the
pooled two-sample t procedures, and I agree.
Pooled procedures are often the default choice in stat packages (e.g.,
Stata, including the current version, 10.0).
The reasons that the pooled approach is often used are: 1) it was
historically easier to calculate; 2) it leads to a smaller estimated
standard error when the assumptions are met; 3) it amounts to a special
case of a very important technique called the analysis of variance.
But Moore is right to emphasize: 1) the assumptions of normality and
equal variances can't be tested effectively when the sample sizes are
small (i.e., when the pooled procedure would be most advantageous);
2) the pooled procedure can lead to incorrect inferences when the
assumptions aren't met; 3) the reduction in SEs is small for large n's.
So you are asked to know not to accept a default assumption of equal
(pooled) variances, and why not!
Avoid inference about standard deviations
In an extension of the ideas behind not using the pooled t procedures,
Moore also warns us not to try to make inferences about standard
deviations at all, at least in smaller samples, and at least without expert
statistical help.
The problem is that it is hard to make a useful test of the hypothesis
that the standard deviations in two populations are the same unless we
are willing to assume the shapes of the two distributions are the same.
(Things are even easier if we assume the shapes are normal.)
But when the sample is small there is no easy way to tell if the shapes
of two distributions are the same.
So, says Moore, avoid testing of hypotheses that standard deviations
are the same.
My only reservation about this recommendation would be in cases
where there are strong reasons to expect normality in both populations.
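For readers curious how much the pooled and unpooled standard errors actually differ, the sketch below computes both for the two examples in these notes. The pooled formula is not given in these slides; the version here is the standard textbook one, which combines the two sample variances weighted by their degrees of freedom, and it is included as an assumption for illustration. The function names are hypothetical.

```python
import math

def unpooled_se(s1, n1, s2, n2):
    """SE used throughout these notes: sqrt(s1^2/n1 + s2^2/n2)."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)

def pooled_se(s1, n1, s2, n2):
    """Standard textbook pooled SE (assumes equal population variances)."""
    # Pooled variance: sample variances weighted by their df
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return math.sqrt(sp2 * (1 / n1 + 1 / n2))

examples = {
    "GSS": (0.89, 374, 0.92, 416),
    "Reading": (11.01, 21, 17.15, 23),
}
for label, args in examples.items():
    print(label, round(unpooled_se(*args), 4), round(pooled_se(*args), 4))
```

For the large GSS samples the two values differ by well under 0.001, so the choice hardly matters there. For the small reading samples, where the two sample standard deviations are quite different, the gap is more noticeable. That is the setting in which the equal-variance assumption matters most and is hardest to check.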