Biometry242 MultipleComparisons


Multiple Comparisons

BMT 242, Zar Chapter 11, page 240-250, 261-262


Slides created by CS vd Westhuizen, J Nienkemper-Swanepoel, M Vosloo
© The content of this presentation is confidential.
Objectives

• Understand why multiple t tests are inappropriate for multisample hypothesis tests.
• Understand and perform the following multiple comparison procedures (MCP):
• Tukey, Tukey-Kramer & Games-Howell
• Dunnett test
• Fisher’s LSD
• Bonferroni correction
• List and explain the assumptions underlying the various multiple comparison
procedures.
• Construct and interpret confidence intervals for the differences in means.

Background

• Multiple comparison procedure (MCP)


• Compare k means simultaneously
• Significance level 𝛼 → the probability of committing at least one Type I error.

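• The Type I error rate increases with multiple testing → the probability of incorrectly rejecting at least one of the $H_0$'s (assuming that all $H_0$'s are true). For $C$ independent tests, each at level $\alpha$ (here $C = 3$ and $\alpha = 0.05$):

$$P(\text{at least one Type I error}) = 1 - (1 - \alpha)^C = 1 - 0.95^3 = 0.14$$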

• This is called the familywise error rate, because the error rate belongs to a family of hypothesis
tests (or a family of comparisons).
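
The growth of the familywise error rate can be checked numerically. The following is a minimal sketch, assuming the tests are independent; the function name `familywise_error_rate` is chosen here for illustration and does not come from the slides.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one Type I error across n_tests independent
    tests, each carried out at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


# The slide's case: three tests at alpha = 0.05 gives about 0.14.
fwer_three = familywise_error_rate(0.05, 3)
print(round(fwer_three, 2))

# The rate keeps growing as more tests are added.
for n_tests in (1, 3, 6, 10):
    print(n_tests, round(familywise_error_rate(0.05, n_tests), 3))
```

With six tests (all pairs among four groups) the rate is already above 0.25, which is why a single two-sample t test per pair is not appropriate.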
Background

Determining the number of possible two-sample tests:

$$\frac{k(k-1)}{2} = {}_kC_2$$

Suppose that $k = 4$:

$$\frac{4(3)}{2} = 6$$

Each test compares two unique means at a time, resulting in 6 sets of hypotheses ($H_0: \mu_A = \mu_B$ and $H_1: \mu_A \neq \mu_B$, where A and B are arbitrary labels for the two groups being compared):
𝐻0 : 𝜇1 = 𝜇2 ; 𝐻0 : 𝜇1 = 𝜇3 ; 𝐻0 : 𝜇1 = 𝜇4 ;
𝐻0 : 𝜇2 = 𝜇3 ; 𝐻0 : 𝜇2 = 𝜇4 ; 𝐻0 : 𝜇3 = 𝜇4
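
The count of pairwise tests and the list of null hypotheses can be generated directly. This is a small sketch using only the Python standard library; the helper name `pairwise_hypotheses` is an illustrative choice.

```python
from itertools import combinations
from math import comb


def pairwise_hypotheses(k: int) -> list:
    """List the null hypotheses for all unique pairs among k group means."""
    return [f"H0: mu{a} = mu{b}" for a, b in combinations(range(1, k + 1), 2)]


k = 4
print(comb(k, 2))  # number of two-sample tests, k(k-1)/2
for hypothesis in pairwise_hypotheses(k):
    print(hypothesis)
```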
Tukey test

• Also known as:


• Honestly Significant Difference test (HSD test)
• Wholly Significant Difference test (WSD test)
• Not necessary to have a significant result from the ANOVA analysis

• Maintains the probability of familywise Type I error at or below 𝛼.

• Assumptions
• 𝑘 samples are drawn from normally distributed populations
• However, the Tukey test (as ANOVA) is robust against deviations from this assumption.
• Severe deviations from normality can be addressed by data transformations or using non-
parametric alternative tests.
• Homoscedasticity of variances
• Tukey is sensitive to deviations from homoscedasticity, more than ANOVA.
Tukey test
1. Calculate the mean of each sample (𝑘)
2. Arrange and number the means in increasing order (small to large)
3. Tabulate the pairwise mean differences
4. Calculate the pairwise difference in the following manner (Zar: Page 243):
• largest vs the smallest, largest vs the second smallest,…,largest vs the second largest
• Second largest vs the smallest, second largest vs the second smallest,…,second largest vs the third
largest, etc.
5. Calculate an appropriate standard error for the Tukey test:

$$SE = \sqrt{\frac{s^2}{n}}$$

where $s^2$ is the pooled variance (MSE) and $n$ is the number of replications per group.

6. Calculate the Tukey test statistic ($q$):

$$q = \frac{\bar{X}_A - \bar{X}_B}{SE}$$
7. Compare the test statistic ($q$) to a critical value ($q_{\alpha;\nu;k}$), where $\alpha$ is the significance level, $\nu$ the error degrees of freedom and $k$ the number of groups.
If $q \geq q_{\alpha;\nu;k}$, reject $H_0$.
Procedural rule: If $H_0$ is not rejected for a pair of means, no significant difference exists between any means enclosed by those two means, therefore no further testing of the enclosed means is performed (Zar: Page 244).
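
The seven steps, including the procedural rule, can be sketched in code. This is an illustration rather than a reference implementation: it assumes equal sample sizes, takes the critical value as an input (looked up in a table by the user), and uses the function name `tukey_stepwise`, which is invented here. The numbers are the group means, MSE and critical value of Example 1 (Zar 242-243), which follows.

```python
import math


def tukey_stepwise(means, mse, n, q_crit):
    """Tukey comparisons from the widest pair inward (equal sample sizes).

    means  : dict mapping group name -> sample mean
    mse    : pooled variance (error mean square) from the ANOVA
    n      : number of replications per group
    q_crit : critical value q(alpha; nu; k), supplied by the caller
    Returns tuples (larger group, smaller group, difference, q, decision).
    """
    ranked = sorted(means.items(), key=lambda item: item[1])  # small to large
    se = math.sqrt(mse / n)
    not_rejected = []  # (low rank, high rank) spans where H0 was not rejected
    results = []
    for hi in range(len(ranked) - 1, 0, -1):  # largest mean first
        for lo in range(hi):                  # versus the smallest mean first
            pair = (ranked[hi][0], ranked[lo][0])
            # Procedural rule: skip pairs enclosed by a non-significant span.
            if any(a <= lo and hi <= b for a, b in not_rejected):
                results.append((*pair, None, None, "do not test"))
                continue
            diff = ranked[hi][1] - ranked[lo][1]
            q = diff / se
            if q >= q_crit:
                decision = "reject H0"
            else:
                decision = "do not reject H0"
                not_rejected.append((lo, hi))
            results.append((*pair, diff, q, decision))
    return results


strontium_means = {
    "Grayson's Pond": 32.1,
    "Beaver Lake": 40.2,
    "Appletree Lake": 41.1,
    "Angler's Cove": 44.1,
    "Rock River": 58.3,
}
results = tukey_stepwise(strontium_means, mse=9.765, n=6, q_crit=4.155)
for row in results:
    print(row)
```

The nested loops follow the order given in step 4, so that a non-significant wide span is found before the pairs it encloses are reached.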
Example 1, Zar 242-243
Comparing strontium (Sr) concentrations in different bodies of water

Do different bodies of water have different concentrations of strontium?

[Figure: research cycle with the stages: identify focus problem; develop plan of action; collect data; analyze data and form conclusions; report results; adjust theory and begin again]

• Confounding factors? Environment, climate, organisms, etc.
• Use experimental units with the same general characteristics.
• Collect a random sample of 6 strontium concentrations from each of the following bodies of water: Grayson's Pond, Beaver Lake, Angler's Cove, Appletree Lake and Rock River.
• We need to make sure that our observations are independent.
• Measure the concentration as mg/ml of water. What is the experimental unit?
• ANOVA has already been performed.
𝐻0 : 𝜇1 = 𝜇2 = 𝜇3 = 𝜇4 = 𝜇5
𝐻1 : At least one 𝜇𝑖 is different
Example 1, Zar 242-243
Strontium concentration (measured as mg/ml) by body of water:

Grayson's Pond   Beaver Lake   Angler's Cove   Appletree Lake   Rock River
28.2             39.6          46.3            41.0             56.3
33.2             40.8          42.1            44.1             54.1
36.4             37.9          43.5            46.4             59.4
34.6             37.1          48.8            40.2             62.7
29.1             43.6          43.7            38.6             60.0
31.0             42.4          40.1            36.3             57.3

ANOVA

Source of variation                           SS         DF   MS        F
Water bodies                                  2193.442    4   548.361   56.156
Residual (factors other than water bodies)     244.130   25     9.765
Total                                         2437.572   29

Decision: From the ANOVA, we reject the null hypothesis, because $F = 56.156 > F_{0.05(1),4,25} = 2.76$.
Conclusion: At least one body of water has a different mean strontium concentration.
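
The ANOVA table can be reproduced from the raw data. The sketch below is a plain-Python one-way ANOVA; the function name `one_way_anova` and the dictionary layout are choices made for this illustration, and the critical F value is not computed (it is read from a table, as in the slides).

```python
def one_way_anova(groups):
    """One-way ANOVA sums of squares, mean squares and F ratio.

    groups: dict mapping group name -> list of observations
    """
    all_values = [x for values in groups.values() for x in values]
    grand_mean = sum(all_values) / len(all_values)
    group_means = {name: sum(v) / len(v) for name, v in groups.items()}

    ss_between = sum(
        len(v) * (group_means[name] - grand_mean) ** 2 for name, v in groups.items()
    )
    ss_within = sum(
        (x - group_means[name]) ** 2 for name, v in groups.items() for x in v
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within  # the pooled variance s^2 (MSE)
    return {
        "means": group_means,
        "ss_between": ss_between, "df_between": df_between, "ms_between": ms_between,
        "ss_within": ss_within, "df_within": df_within, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


strontium = {
    "Grayson's Pond": [28.2, 33.2, 36.4, 34.6, 29.1, 31.0],
    "Beaver Lake": [39.6, 40.8, 37.9, 37.1, 43.6, 42.4],
    "Angler's Cove": [46.3, 42.1, 43.5, 48.8, 43.7, 40.1],
    "Appletree Lake": [41.0, 44.1, 46.4, 40.2, 38.6, 36.3],
    "Rock River": [56.3, 54.1, 59.4, 62.7, 60.0, 57.3],
}
anova = one_way_anova(strontium)
for key in ("ss_between", "ss_within", "ms_within", "F"):
    print(key, round(anova[key], 3))
```

The residual mean square returned as `ms_within` is the $s^2$ used in the Tukey standard error.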
Example 1, Zar 242-243
Steps 1 & 2

Level of factor   Rank   Group mean
Grayson's Pond    1      32.1
Beaver Lake       2      40.2
Appletree Lake    3      41.1
Angler's Cove     4      44.1
Rock River        5      58.3

$$SE = \sqrt{\frac{9.765}{6}} = 1.276$$

Steps 3 to 7

Comparison (Step 3)   Difference in means (Step 4)   SE (Step 5)   q (Step 6)             Decision (Step 7)
5 vs 1                58.3 − 32.1 = 26.2             1.276         26.2/1.276 = 20.533    Reject H0
5 vs 2                58.3 − 40.2 = 18.1             1.276         14.185                 Reject H0
5 vs 3                58.3 − 41.1 = 17.2             1.276         13.480                 Reject H0
5 vs 4                58.3 − 44.1 = 14.2             1.276         11.129                 Reject H0
4 vs 1                44.1 − 32.1 = 12.0             1.276         9.404                  Reject H0
4 vs 2                44.1 − 40.2 = 3.9              1.276         3.056                  Do not reject H0
4 vs 3                Do not test
3 vs 1                41.1 − 32.1 = 9.0              1.276         7.053                  Reject H0
3 vs 2                Do not test
2 vs 1                40.2 − 32.1 = 8.1              1.276         6.348                  Reject H0

Critical value by linear interpolation: $q_{0.05,25,5} \approx 4.155$


Example 1, Zar 242-243

Show results in graphical presentations

• Using spans of overlap (means joined by a common line, shown here in brackets, do not differ significantly)

$\bar{X}_1$   [ $\bar{X}_2$  $\bar{X}_3$  $\bar{X}_4$ ]   $\bar{X}_5$

• Using symbols (means sharing a symbol do not differ significantly)

Level of factor   Group mean   Symbol
5 (Rock)          58.3         a
4 (Angler)        44.1         b
3 (Appletree)     41.1         b
2 (Beaver)        40.2         b
1 (Grayson)       32.1         c

Conclusion: From the MCP we see that Rock River has a different mean concentration of strontium compared to all the other bodies of water. Angler's Cove, Appletree Lake and Beaver Lake do not differ significantly in mean concentration of strontium. Grayson's Pond has a different mean concentration of strontium compared to all the other bodies of water.
Example 3, Zar 215-216
(Chapter 10)

Comparing potassium content in different varieties of wheat

Do different varieties of wheat have different contents of potassium?

[Figure: research cycle diagram]

• We have seen that the ANOVA led to significant results.
$H_0: \mu_1 = \mu_2 = \mu_3$
$H_1$: At least one $\mu_i$ is different

Critical value, based on a choice of α:
$F = 20.97 > F_{0.05(1);2;15} = 3.68$
⇒ Reject H0 at a 5% level
Conclusion: There is at least one wheat variety that has a different mean potassium content.

Now perform a MCP.


Example 3, Zar 215-216
(Chapter 10)
Source of variation                              SS       DF   MS       F ratio
Variety of wheat                                 46.804    2   23.402   20.97
Residual (factors other than wheat variety)      16.736   15    1.116
Total                                            63.540   17

Level of factor   Rank   Group mean
Variety A         1      25.667
Variety G         2      26.983
Variety L         3      29.550

$$SE = \sqrt{\frac{1.116}{6}} = 0.431, \qquad q_{0.05;15;3} = 3.674$$

Comparison   Difference between means   SE      q                       Decision
3 vs. 1      3.883                      0.431   3.883/0.431 = 9.009     Reject H0
3 vs. 2      2.567                      0.431   2.567/0.431 = 5.956     Reject H0
2 vs. 1      1.316                      0.431   1.316/0.431 = 3.053     Do not reject H0

Spans of overlap (means in brackets are joined by a common line): [ $\bar{X}_A$  $\bar{X}_G$ ]   $\bar{X}_L$

Level of factor   Group mean   Symbol
3: L              29.550       a
2: G              26.983       b
1: A              25.667       b

Conclusion: From the MCP we see that Variety L has a different mean potassium content compared to varieties A and G. Varieties A and G do not differ significantly in mean potassium content.
Confidence intervals for
group means

Confidence intervals for group population means: wheat varieties


Variety A:

$$\bar{X}_A \pm t_{\alpha(2);\nu}\sqrt{\frac{s^2}{n}}$$

$$25.667 \pm 2.131\sqrt{\frac{1.116}{6}}$$

$$(24.748 \; ; \; 26.586)$$

Similar for L and G.

Conclusion: From the MCP we see that Variety L has a different mean potassium content compared to varieties A and G (the confidence intervals do not overlap). Varieties A and G do not differ in mean potassium content (the confidence intervals overlap).
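
The intervals for all three varieties can be computed in the same way. This is a sketch using the values from the slides ($s^2 = 1.116$, $n = 6$, $t_{0.05(2);15} = 2.131$); the helper names `mean_ci` and `intervals_overlap` are illustrative.

```python
import math


def mean_ci(mean, mse, n, t_crit):
    """Confidence interval for one group mean, using the pooled variance."""
    half_width = t_crit * math.sqrt(mse / n)
    return mean - half_width, mean + half_width


def intervals_overlap(ci_1, ci_2):
    """True if two (lower, upper) intervals share at least one point."""
    return ci_1[0] <= ci_2[1] and ci_2[0] <= ci_1[1]


wheat_means = {"A": 25.667, "G": 26.983, "L": 29.550}
wheat_cis = {
    name: mean_ci(mean, mse=1.116, n=6, t_crit=2.131)
    for name, mean in wheat_means.items()
}
for name, (lower, upper) in wheat_cis.items():
    print(name, round(lower, 3), round(upper, 3))

print("A and G overlap:", intervals_overlap(wheat_cis["A"], wheat_cis["G"]))
print("A and L overlap:", intervals_overlap(wheat_cis["A"], wheat_cis["L"]))
print("G and L overlap:", intervals_overlap(wheat_cis["G"], wheat_cis["L"]))
```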
Report the differences
using a Tukey confidence interval

$$(\bar{X}_G - \bar{X}_A) \pm q_{\alpha;\nu;k}\, SE$$

Level of factor   Rank   Group mean
Variety A         1      25.667
Variety G         2      26.983
Variety L         3      29.550

Variety G vs A:

$$(26.983 - 25.667) \pm 3.674\sqrt{\frac{1.116}{6}}$$

$$(-0.269 \; ; \; 2.901)$$

Similar for the other differences.

Conclusion: From the MCP we see that Variety L has a different mean potassium content compared to varieties A and G (the confidence intervals do not contain zero). Varieties A and G do not differ in mean potassium content (the confidence interval contains zero).
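
All three Tukey intervals for the wheat example can be produced with a few lines. The sketch assumes equal sample sizes and takes the critical value $q_{0.05;15;3} = 3.674$ from the slides; the function name `tukey_ci` is an illustrative choice.

```python
import math
from itertools import combinations


def tukey_ci(mean_a, mean_b, mse, n, q_crit):
    """Tukey confidence interval for mean_a - mean_b (equal sample sizes)."""
    half_width = q_crit * math.sqrt(mse / n)
    diff = mean_a - mean_b
    return diff - half_width, diff + half_width


wheat_means = {"A": 25.667, "G": 26.983, "L": 29.550}
ranked_names = sorted(wheat_means, key=wheat_means.get)  # small to large

intervals = {}
for low, high in combinations(ranked_names, 2):
    intervals[(high, low)] = tukey_ci(
        wheat_means[high], wheat_means[low], mse=1.116, n=6, q_crit=3.674
    )

for (high, low), (lower, upper) in intervals.items():
    contains_zero = lower <= 0.0 <= upper
    print(f"{high} - {low}: ({lower:.3f}; {upper:.3f}) contains zero: {contains_zero}")
```

An interval that contains zero corresponds to "do not reject $H_0$" in the Tukey test for that pair.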
Tukey-Kramer and Games-Howell

• Unequal sample sizes (Tukey-Kramer test):
• $SE$ formula for pairwise comparisons with unequal sample sizes:

$$SE = \sqrt{\frac{s^2}{2}\left(\frac{1}{n_A} + \frac{1}{n_B}\right)}$$

• Unequal variances (Welch approximation) (not for assessment in BMT 242):
• $SE$ formula for unequal population variances:

$$SE = \sqrt{\frac{1}{2}\left(\frac{s_A^2}{n_A} + \frac{s_B^2}{n_B}\right)}$$

Approximate $\nu'$ as with the Behrens-Fisher approach.
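
The two standard errors can be written as small functions. This is a sketch only: the function names are invented here, and `welch_df` uses the usual Welch-Satterthwaite approximation for $\nu'$, which is assumed to be the approximation the slide refers to. The illustrative inputs ($s^2 = 9.383$ with group sizes 4 and 5) are those of the pig feed example that follows.

```python
import math


def tukey_kramer_se(mse, n_a, n_b):
    """SE for a pairwise comparison with unequal sample sizes (pooled variance)."""
    return math.sqrt((mse / 2.0) * (1.0 / n_a + 1.0 / n_b))


def games_howell_se(var_a, n_a, var_b, n_b):
    """SE for a pairwise comparison when the group variances differ."""
    return math.sqrt(0.5 * (var_a / n_a + var_b / n_b))


def welch_df(var_a, n_a, var_b, n_b):
    """Welch-Satterthwaite approximate degrees of freedom (assumed form of nu')."""
    a, b = var_a / n_a, var_b / n_b
    return (a + b) ** 2 / (a ** 2 / (n_a - 1) + b ** 2 / (n_b - 1))


print(round(tukey_kramer_se(9.383, 4, 5), 3))
print(round(tukey_kramer_se(9.383, 5, 5), 3))
```

With equal sample sizes $n_A = n_B = n$, the Tukey-Kramer expression reduces to $\sqrt{s^2/n}$, the standard error of the ordinary Tukey test.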


Example 1a, b, c, Zar 205-210
(from Chapter10)

Comparing four different feeds on the weight of pigs

Do the different feeds lead to different weights of pigs?

[Figure: research cycle diagram]

• We have seen that the ANOVA led to significant results.
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$
$H_1$: At least one $\mu_i$ is different

Critical value, based on a choice of α:
$F = 12.041 > F_{0.05(1);3;15} = 3.29$
⇒ Reject H0 at a 5% level
Conclusion: There is at least one feed that leads to a different mean weight.

Now perform a MCP.


Example 1a, b, c, Zar 205-210
(from Chapter10)
Source of variation                      SS        DF   MS        F ratio
Feed                                     338.938    3   112.979   12.041
Residual (factors other than feed)       140.750   15     9.383
Total                                    479.688   18

Level of factor   Rank   Group mean
Feed A            2      64.62
Feed B            3      71.3
Feed C            4      73.35
Feed D            1      63.24

$q_{0.05;15;4} = 4.076$

Standard errors (Tukey-Kramer):

$$SE_{4 \text{ vs } 1,2,3} = \sqrt{\frac{9.383}{2}\left(\frac{1}{4} + \frac{1}{5}\right)} = 1.453, \qquad SE_{\text{other pairs}} = \sqrt{\frac{9.383}{2}\left(\frac{1}{5} + \frac{1}{5}\right)} = 1.370$$

Comparison   Difference   SE      q                       Decision
4 vs 1       10.11        1.453   10.11/1.453 = 6.958     Reject H0
4 vs 2       8.73         1.453   8.73/1.453 = 6.008      Reject H0
4 vs 3       2.05         1.453   2.05/1.453 = 1.411      Do not reject H0
3 vs 1       8.06         1.370   8.06/1.370 = 5.883      Reject H0
3 vs 2       6.68         1.370   6.68/1.370 = 4.876      Reject H0
2 vs 1       1.38         1.370   1.38/1.370 = 1.007      Do not reject H0

Spans of overlap (means in brackets are joined by a common line): [ $\bar{X}_1$  $\bar{X}_2$ ]   [ $\bar{X}_3$  $\bar{X}_4$ ]

Level of factor   Group mean   Symbol
Feed A            64.62        a
Feed B            71.3         b
Feed C            73.35        b
Feed D            63.24        a

Conclusion: From the MCP we see that Feed A does not differ significantly in mean weight from Feed D, but differs from Feeds B and C. Feeds B and C do not differ significantly in mean weight.
Confidence intervals for
group means

Confidence intervals for group population means: types of feed


Feed A:

$$\bar{X}_A \pm t_{\alpha(2);\nu}\sqrt{\frac{s^2}{n}}$$

$$64.62 \pm 2.131\sqrt{\frac{9.383}{5}}$$

$$(61.701 \; ; \; 67.539)$$

Similar for B, C and D.

Conclusion: From the MCP we see that Feed A has the same mean weight as D (confidence intervals overlap), but a different mean from B and C (confidence intervals do not overlap). Feeds B and C have the same mean weight (confidence intervals overlap).

[Figure: confidence intervals for the mean weights of Feeds A, B, C and D]
Report the differences
using a Tukey confidence interval
Level of factor   Rank   Group mean
Feed A            2      64.62
Feed B            3      71.3
Feed C            4      73.35
Feed D            1      63.24

Feed D vs Feed A:

$$-1.38 \pm 4.076\sqrt{\frac{9.383}{2}\left(\frac{1}{5} + \frac{1}{5}\right)}$$

$$(-6.964 \; ; \; 4.204)$$

Similar for the other differences.

Conclusion: From the MCP we see that Feed A has the same mean weight as D (confidence interval contains zero), but a different mean from B and C (confidence intervals do not contain zero). Feeds B and C have the same mean weight (confidence interval contains zero). D and B, and D and C have different means (confidence intervals do not contain zero).
Treatments vs a control

• Comparing a pre-determined treatment mean with all the other treatment means
• Comparing a standard treatment with new treatments
• Comparing a placebo (medicine without an active ingredient) with a treatment (medicine with
active ingredients)
Dunnett test for comparison with a control
Tukey test could be performed, but would not be as powerful as the Dunnett test in this particular design. The
Dunnett test can be one- or two-sided, but the Tukey test is always two-sided.

This procedure is similar to Tukey, except for the following:

Test statistic:

$$q' = \frac{\bar{X}_{control} - \bar{X}_A}{SE}$$

Standard error:

$$SE = \sqrt{\frac{2s^2}{n}}$$

and when sample sizes are unequal

$$SE = \sqrt{s^2\left(\frac{1}{n_A} + \frac{1}{n_{control}}\right)}$$

If the variances are unequal then use

$$SE = \sqrt{\frac{s_A^2}{n_A} + \frac{s_{control}^2}{n_{control}}}$$

with $\nu'$ using the Behrens-Fisher approach (not for assessments in BMT 242).
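
The Dunnett standard error and test statistic for the pooled-variance case can be sketched as follows. The function names are illustrative, and the check uses the wheat values already met ($s^2 = 1.116$, $n = 6$, group means 25.667 and 29.550), treating Variety A as a hypothetical control purely to exercise the formulas.

```python
import math


def dunnett_se(mse, n_treatment, n_control):
    """Dunnett standard error with a pooled variance.

    With equal sample sizes n this equals sqrt(2 * mse / n).
    """
    return math.sqrt(mse * (1.0 / n_treatment + 1.0 / n_control))


def dunnett_q(mean_control, mean_treatment, se):
    """Dunnett test statistic q' = (control mean - treatment mean) / SE."""
    return (mean_control - mean_treatment) / se


se_equal_n = dunnett_se(1.116, 6, 6)
print(round(se_equal_n, 3))
print(round(dunnett_q(25.667, 29.550, se_equal_n), 3))
```

Because only comparisons against the control are made, the family of tests is smaller than in the Tukey test, which is where the gain in power comes from.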
Example 4, Zar 250
Comparing different fertilisers with a standard fertiliser for potato yield

Are the four new fertilisers better than the standard fertiliser?

[Figure: research cycle diagram]

• The different fertilisers were applied on 80 plots of land. Each plot randomly received one of the fertilisers. The manufacturer wishes to promote at least one new fertiliser.
• Confounding factors? Environment, climate, organisms, etc.
• Use experimental units with the same general characteristics.
• We need to make sure that our observations are independent.
• Measure the potato yield in metric tons per hectare. What is the experimental unit?
• ANOVA has already been performed.
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$
$H_1$: At least one $\mu_i$ is different
Example 4, Zar 250

Assume a control group with $n = 24$ and treatment groups with $n = 14$ each.

Group 2 is the Control (standard fertiliser). THIS MCP is ONE-SIDED. We are assessing if the new fertilisers are BETTER than the standard one.

$H_0: \mu_2 \geq \mu_A$ vs. $H_1: \mu_2 < \mu_A$

The following information is known:

Level of factor   Rank   Group mean
1                 1      17.3
2 (control)       2      21.7
3                 3      22.1
4                 4      23.6
5                 5      27.8

$s^2 = 10.42$, $\nu = 4 \times 14 + 24 - 5 = 75$

$$SE = \sqrt{10.42\left(\frac{1}{14} + \frac{1}{24}\right)} = 1.086$$

Linear interpolation between $\nu = 60$ and $\nu = 120$, with $p = \frac{75 - 60}{120 - 60} = 0.25$:

$$q'_{0.05(1),5,75} = 2.18 + (1 - 0.25)(2.21 - 2.18) = 2.203 \rightarrow -2.203$$
Example 4, Zar 250
$H_0: \mu_2 \geq \mu_A$ vs. $H_1: \mu_2 < \mu_A$

$q'_{0.05(1),5,75} = -2.203$

Level of factor   Rank   Group mean
1                 1      17.3
2 (control)       2      21.7
3                 3      22.1
4                 4      23.6
5                 5      27.8

Comparison   Difference   SE      q'                       Decision
2 vs. 5      −6.1         1.086   −6.1/1.086 = −5.617      Reject H0
2 vs. 4      −1.9         1.086   −1.9/1.086 = −1.750      Do not reject H0
2 vs. 3      Do not test
2 vs. 1      4.4          1.086   4.4/1.086 = 4.052        Do not reject H0

Conclusion: From the MCP we see that only fertiliser 5 yields better results. The mean yield for fertiliser 5 is higher than that of the control (standard fertiliser).
Example 4, Zar 250 (suppose that
the MCP is two-sided)

$H_0: \mu_2 = \mu_A$ vs. $H_1: \mu_2 \neq \mu_A$

$$p = \frac{75 - 60}{120 - 60} = 0.25, \qquad q'_{0.05(2),5,75} = 2.47 + (1 - 0.25)(2.51 - 2.47) = 2.5$$

Level of factor   Rank   Group mean
1                 1      17.3
2 (control)       2      21.7
3                 3      22.1
4                 4      23.6
5                 5      27.8

Comparison   Difference   SE      q'                       Decision
2 vs. 5      −6.1         1.086   −6.1/1.086 = −5.617      Reject H0
2 vs. 4      −1.9         1.086   −1.9/1.086 = −1.750      Do not reject H0
2 vs. 3      Do not test
2 vs. 1      4.4          1.086   4.4/1.086 = 4.052        Reject H0

Conclusion: From the MCP we see that fertilisers 1 and 5 have different mean yields than the control.
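
The interpolation of the critical values and the one- and two-sided decisions for the fertiliser example can be sketched together. The tabulated values at $\nu = 60$ and $\nu = 120$ are those implied by the interpolation formulas on the slides; the function names are illustrative. Unlike the slide tables, the sketch tests group 3 directly instead of skipping it, which leads to the same conclusion.

```python
import math


def interpolate_critical(nu, nu_low, nu_high, crit_low, crit_high):
    """Linear interpolation of a tabulated critical value between two df."""
    p = (nu - nu_low) / (nu_high - nu_low)
    return crit_low + p * (crit_high - crit_low)


def dunnett_decisions(control_mean, treatment_means, se, crit, one_sided):
    """Return {group: (q', reject?)} for comparisons against a control.

    one_sided=True tests H1: control mean < treatment mean,
    so H0 is rejected when q' <= -crit.
    """
    decisions = {}
    for group, mean in treatment_means.items():
        q = (control_mean - mean) / se
        reject = (q <= -crit) if one_sided else (abs(q) >= crit)
        decisions[group] = (q, reject)
    return decisions


nu = 4 * 14 + 24 - 5  # 75 error degrees of freedom
crit_one_sided = interpolate_critical(nu, 60, 120, 2.21, 2.18)
crit_two_sided = interpolate_critical(nu, 60, 120, 2.51, 2.47)

se = math.sqrt(10.42 * (1 / 14 + 1 / 24))
treatments = {1: 17.3, 3: 22.1, 4: 23.6, 5: 27.8}  # group 2 is the control

one_sided = dunnett_decisions(21.7, treatments, se, crit_one_sided, one_sided=True)
two_sided = dunnett_decisions(21.7, treatments, se, crit_two_sided, one_sided=False)

print(round(crit_one_sided, 4), round(crit_two_sided, 4))
print({g: reject for g, (q, reject) in one_sided.items()})
print({g: reject for g, (q, reject) in two_sided.items()})
```

Fertiliser 1 is flagged only in the two-sided test, because its mean lies below the control and the one-sided alternative looks only for improvements.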
Example 3, Zar 215-216
(from Chapter 10)

Comparing potassium content in different varieties of wheat

Do different varieties of wheat have different contents of potassium?

[Figure: research cycle diagram]

• The ANOVA led to the following results:
$F = 20.97 > F_{0.05(1);2;15} = 3.68$
⇒ Reject H0 at a 5% level
$s^2 = 1.116$

Perform the Dunnett test and assume that Variety A is the control. Do varieties G and L lead to higher potassium content?
$H_0: \mu_1 \geq \mu_X$
$H_1: \mu_1 < \mu_X$
Where $X$ indicates G and L and 1 indicates A.
Example 3, Zar 215-216
(from Chapter 10)

Level of factor   Rank   Group mean
Variety A         1      25.667
Variety G         2      26.983
Variety L         3      29.550

$$q'_{0.05(1),3,15} = 2.07 \rightarrow -2.07, \qquad SE = \sqrt{\frac{2(1.116)}{6}} = 0.610$$

Comparison   Difference   SE      q'                          Decision
1 vs. 3      −3.883       0.610   −3.883/0.610 = −6.366       Reject H0
1 vs. 2      −1.316       0.610   −1.316/0.610 = −2.157       Reject H0

Conclusion: From the MCP we see that both varieties G and L have significantly higher mean potassium content than the control, Variety A.
Fisher’s LSD
To test for the difference in means:
𝐻0 : 𝜇𝐴 − 𝜇𝐵 = 0 vs. 𝐻1 : 𝜇𝐴 − 𝜇𝐵 ≠ 0
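We would start by calculating the following:

$$\bar{X}_A - \bar{X}_B, \qquad s_{\bar{X}_A - \bar{X}_B} = \sqrt{\frac{s^2}{n_A} + \frac{s^2}{n_B}}, \qquad t = \frac{\bar{X}_A - \bar{X}_B - 0}{s_{\bar{X}_A - \bar{X}_B}}$$

then $H_0$ will be rejected if $|t| \geq t_{\nu;\alpha(2)}$.

The formula for the LSD originates from the $t$ test:

$$|t| \geq t_{\nu;\alpha(2)}$$

as

$$\frac{|\bar{X}_A - \bar{X}_B - 0|}{s_{\bar{X}_A - \bar{X}_B}} \geq t_{\nu;\alpha(2)}$$

multiply by $s_{\bar{X}_A - \bar{X}_B}$:

$$|\bar{X}_A - \bar{X}_B| \geq t_{\nu;\alpha(2)} \times s_{\bar{X}_A - \bar{X}_B}$$

Then, $t_{\nu;\alpha(2)} \times s_{\bar{X}_A - \bar{X}_B}$ is called the LSD value (Least Significant Difference)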

→ smallest difference that is considered to be significant


→ any mean difference ≥ 𝑳𝑺𝑫 is a significant difference

MAY ONLY BE USED IF THE ANOVA RESULT WAS SIGNIFICANT!


Example 3, Zar 215-216
(from Chapter 10)

Comparing potassium content in different varieties of wheat

Do different varieties of wheat have different contents of potassium?

[Figure: research cycle diagram]

• The ANOVA led to the following results:
$F = 20.97 > F_{0.05(1);2;15} = 3.68$
⇒ Reject H0 at a 5% level
$s_p^2 = 1.116$

Assumptions of LSD
• $k$ samples are drawn from normally distributed populations
• However, the LSD test (like ANOVA) is robust against deviations from this assumption.
• Severe deviations from normality can be addressed by data transformations or using non-parametric alternative tests.
• Homoscedasticity of variances
• LSD is sensitive to deviations from homoscedasticity, more than ANOVA.
Example 3, Zar 215-216
(from Chapter 10)

$$LSD_{5\%} = t_{15;0.05(2)} \times \sqrt{\frac{1.116}{6} + \frac{1.116}{6}} = 2.131 \times 0.610 = 1.3$$

Any treatment difference greater than or equal to 1.3 is regarded as a significant difference.

Differences between group means:

Level of factor   Group mean   vs Variety G   vs Variety A
Variety L         29.550       2.567          3.883
Variety G         26.983                      1.316
Variety A         25.667

Conclusion: From the MCP we see that all varieties have different mean potassium contents, because all the differences are greater than 1.3.

The Tukey test did not report a significant difference between Variety G and A. This
shows that the Tukey test is more conservative than Fisher’s LSD.
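
The LSD value and the pairwise decisions for the wheat example can be reproduced as follows. This is a sketch using the slide values ($s^2 = 1.116$, $n = 6$ per variety, $t_{15;0.05(2)} = 2.131$); the function name `lsd_value` is illustrative.

```python
import math
from itertools import combinations


def lsd_value(mse, n_a, n_b, t_crit):
    """Least significant difference: t critical value times SE of the difference."""
    return t_crit * math.sqrt(mse / n_a + mse / n_b)


wheat_means = {"A": 25.667, "G": 26.983, "L": 29.550}
lsd = lsd_value(mse=1.116, n_a=6, n_b=6, t_crit=2.131)
print("LSD:", round(lsd, 3))

lsd_decisions = {}
for a, b in combinations(wheat_means, 2):
    difference = abs(wheat_means[a] - wheat_means[b])
    lsd_decisions[(a, b)] = difference >= lsd
    print(a, b, round(difference, 3), "significant" if difference >= lsd else "not significant")
```

The A versus G difference of 1.316 only just exceeds the LSD, whereas the Tukey test did not reject $H_0$ for this pair, which is the sense in which the Tukey test is the more conservative of the two.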
The Bonferroni correction

• The Bonferroni correction is an adjustment made to P values when several dependent or independent statistical tests are being performed simultaneously on a single data set. To perform a Bonferroni correction, divide the level of significance (α) by the number of comparisons being made. For example, if 10 hypotheses are being tested, the new level of significance would be α/10.
• The Bonferroni correction is used to reduce the chances of obtaining false-positive results (Type I errors) when multiple pairwise tests are performed on a single set of data. Put simply, the probability of identifying at least one significant result due to chance increases as more hypotheses are tested.
• The Bonferroni correction itself has no additional assumptions (the assumptions of the individual tests being corrected still apply).
Example 3, Zar 215-216
(from Chapter 10)

Comparing potassium content in different varieties of wheat

Do different varieties of wheat have different contents of potassium?

[Figure: research cycle diagram]

• The ANOVA led to the following results:
$F = 20.97 > F_{0.05(1);2;15} = 3.68$
⇒ Reject H0 at a 5% level
$s_p^2 = 1.116$

Use the P values from the various two-sample t tests with the Bonferroni adjustment.

$$\frac{k(k-1)}{2} = {}_kC_2 = {}_3C_2 = 3$$

different two-sample t tests. If $\alpha = 0.05$, then compare the P values with $\frac{0.05}{3} = 0.017$, so if $P < 0.017$ reject H0.
Example 3, Zar 215-216
(from Chapter 10)

The following three two-sample t tests were performed.


𝐻0 : 𝜇𝐺 = 𝜇𝐴 𝐻0 : 𝜇𝐺 = 𝜇𝐿 𝐻0 : 𝜇𝐴 = 𝜇𝐿
𝐻1 : 𝜇𝐺 ≠ 𝜇𝐴 𝐻1 : 𝜇𝐺 ≠ 𝜇𝐿 𝐻1 : 𝜇𝐴 ≠ 𝜇𝐿
and delivered these P values:
𝑃 = 0.036, 𝑃 = 0.001, 𝑃 = 0.0002 [obtained with Microsoft Excel Data Analysis Toolpak]

Compare the P values with $\frac{0.05}{3} = 0.017$.

Conclusion: From the MCP we see that Varieties G and L are different (0.001 <
0.017) and Varieties A and L are different (0.0002 < 0.017). Varieties G and A are
not different (0.036 > 0.017).

The Tukey test also did not report a significant difference between Variety G and A.
This shows that the Bonferroni correction produced similar results.
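
The Bonferroni comparison can be sketched in a few lines. The P values are the three reported on the slide; the function name `bonferroni_decisions` and the dictionary keys are illustrative.

```python
def bonferroni_decisions(p_values, alpha=0.05):
    """Compare each P value with alpha divided by the number of tests.

    p_values: dict mapping comparison label -> unadjusted P value
    Returns (adjusted significance level, {label: reject H0?}).
    """
    adjusted_alpha = alpha / len(p_values)
    return adjusted_alpha, {label: p < adjusted_alpha for label, p in p_values.items()}


wheat_p_values = {"G vs A": 0.036, "G vs L": 0.001, "A vs L": 0.0002}
adjusted_alpha, bonferroni_result = bonferroni_decisions(wheat_p_values, alpha=0.05)

print(round(adjusted_alpha, 3))
for label, reject in bonferroni_result.items():
    print(label, "reject H0" if reject else "do not reject H0")
```

An equivalent formulation multiplies each P value by the number of tests (capped at 1) and compares the result with the original α.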
