Radford University | Virginia | Best in the Southeast



Multiple Comparison Handout

I. Multiple Comparison

A. What is it?

When your ANOVA has more than 2 groups and it is significant, it only tells you that two or more groups are significantly different from one another, but not which ones.

- In order to determine which means are different we have to compare pairs of means to see which pairs are significantly different. We have to do this for multiple pairs (e.g. X1 vs X2, X1 vs X3, X2 vs X3), so it is called a Multiple Comparison.

B. Why do we do it?

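When we want to test the differences between a large number of groups we could just use a series of t-tests and not do the ANOVA. However, each additional test adds its own Type I error rate to the total, so the chance of at least one false positive across the whole family of tests (the FW error rate) grows with the number of tests.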
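The accumulation of error described above can be sketched as follows. This is a minimal illustration, not part of the handout: the function name is hypothetical, and the "independent tests" figure assumes the comparisons are independent, which pairwise t-tests on shared groups are not exactly. The additive figure is simply the sum of the per-test rates, which acts as an upper bound.

```python
def familywise_error(k_groups: int, alpha: float = 0.05):
    """Return (comparisons, additive bound, rate under independence).

    Illustrative helper; the name and signature are assumptions.
    """
    # Number of pairwise comparisons among K groups: K(K-1)/2
    c = k_groups * (k_groups - 1) // 2
    # "Adding together" the per-test error rates (capped at 1)
    additive = min(1.0, c * alpha)
    # Chance of at least one Type I error if the c tests were independent
    independent = 1 - (1 - alpha) ** c
    return c, additive, independent


for k in (3, 4, 5):
    c, additive, independent = familywise_error(k)
    print(f"K={k}: c={c}, additive bound={additive:.3f}, "
          f"independent-tests rate={independent:.3f}")
```

With K = 3 there are 3 comparisons, so testing each at .05 gives an additive bound of .15.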

- For example: If I compare 3 groups I will have K(K-1)/2 comparisons to do (3 in this case). If each comparison is tested at the p [...]

[...] X1, p < .05. X4 = X1, p = ns. Can’t test the difference between X1 and X3, X1 and X2, or X2 and X3. Can test the difference between X2 and X5 if the difference between the means exceeds the difference between the means of X1 and X5.

The critical value of this test is dependent on the df error (n-K) and the number of steps between means being compared (e.g. there are 5 steps between means 1 and 5, but only 2 steps between 1 and 2).

- This test sets alpha using a scaled down FW error rate: Alpha = [pic]

E.g. K = 5, c = 10

r = 1, Alpha = .05

r = 3, Alpha = .025

r = 6, Alpha = .00851

r = 10, Alpha = .00512
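A minimal sketch for checking the values above, assuming the scaled-down rate is 1 - (1 - FW)^(1/r) with FW = .05. That formula is an assumption inferred from the listed numbers (it reproduces the entries for r = 1, 6, and 10), not something stated in the text, and the function name is illustrative. Under this assumption r = 3 comes out near .017 rather than the listed .025, so treat the output as a check against the table, not a correction of it.

```python
def per_comparison_alpha(fw_alpha: float, r: int) -> float:
    """Assumed formula (not confirmed by the handout): 1 - (1 - FW)^(1/r)."""
    return 1 - (1 - fw_alpha) ** (1 / r)


# Alpha values as listed in the handout for K = 5
listed = {1: 0.05, 3: 0.025, 6: 0.00851, 10: 0.00512}

for r, handout_value in listed.items():
    computed = per_comparison_alpha(0.05, r)
    print(f"r={r:2d}: computed={computed:.5f}, listed={handout_value}")
```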

- Newman-Keuls is perhaps one of the most common Post Hoc tests, but it is a rather controversial test. The major problem with this test is that when there is more than one true Null Hypothesis in a set of means, the actual FW error rate is inflated above the level the test is supposed to hold.

- In general we would use this when the number of comparisons we are making is larger than K-1 and we don’t want to be as conservative as the Dunn’s test is.

E. Tukey’s HSD

- Tukey’s HSD (Honestly Significant Difference) is essentially like the Newman-Keuls, but every pair of means is tested against the critical value that is set for the test of the means that are furthest apart (rmax). E.g. if there are 5 means, we use the critical value determined for the test of X1 and X5.

- This Method corrects for the problem found in the Newman-Keuls where the FW is inflated when there is more than one True Null Hypothesis in a set of means.

- This test buys protection against Type I error, but again at the cost of Power.

- This test sets alpha using the FW error rate: Alpha = [pic]

K = 2, rmax = 1, Alpha = .05

K = 3, rmax = 3, Alpha = .025

K = 4, rmax = 6, Alpha = .00851

K = 5, rmax = 10, Alpha = .00512

- This tends to be the most common and preferred test because it is very conservative with respect to Type I error when the Null Hypothesis is true. In general, HSD is preferred when you will make all the possible comparisons between a large set of means (six or more means).

F. Tukey’s WSD

- Tukey’s WSD (Wholly Significant Difference) is sometimes referred to as Tukey’s b test. This test is a compromise between the Newman-Keuls and the more conservative HSD. Here the alpha for each test is the average of the Newman-Keuls alpha and the HSD alpha.

Alpha WSD = (Alpha NK + Alpha rmax) / 2 Where [pic]

E.g. K = 5, c = 10

r = 1, Alpha NK = .05, Alpha rmax = .00512, Alpha WSD = .02756

r = 3, Alpha NK = .025, Alpha rmax = .00512, Alpha WSD = .01506

r = 6, Alpha NK = .00851, Alpha rmax = .00512, Alpha WSD = .00682

r = 10, Alpha NK = .00512, Alpha rmax = .00512, Alpha WSD = .00512
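The averaging in the table above can be reproduced with a short sketch. It uses only the Alpha NK and Alpha rmax values listed in the handout for K = 5; the function and variable names are illustrative, not the handout's.

```python
# Newman-Keuls alphas listed in the handout for K = 5, keyed by r
alpha_nk = {1: 0.05, 3: 0.025, 6: 0.00851, 10: 0.00512}
# HSD alpha (rmax = 10) listed in the handout for K = 5
alpha_rmax = 0.00512


def wsd_alpha(nk: float, hsd: float) -> float:
    """Average of the Newman-Keuls alpha and the HSD alpha."""
    return (nk + hsd) / 2


for r, nk in alpha_nk.items():
    print(f"r={r:2d}: Alpha WSD = {wsd_alpha(nk, alpha_rmax):.5f}")
```

At r = 10 the two alphas are the same, so WSD and HSD coincide for the pair of means that are furthest apart.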

- Thus the WSD is better than Newman-Keuls at preventing Type I error when more than one Null Hypothesis is true for your set of means, but its protection is not as complete as that of the HSD. However, with WSD you do not lose as much power as you do with the HSD.

- The WSD is best to use when you are making more than K-1 comparisons, you need more control of Type I error than Newman-Keuls gives, and you are testing fewer than K(K-1)/2 comparisons.

G. Scheffé

- The Scheffé Test is designed to protect against a Type I error when all possible complex and simple comparisons are made. That is, we are not just looking at the possible comparisons between pairs of means; we are also looking at the possible comparisons between groups of means. Thus Scheffé is the most conservative of all tests.

- Because this test does give us the capacity to look at complex comparisons, it essentially uses the same statistic as the Linear Contrasts tests. However, Scheffé uses a different critical value (or at least it makes an adjustment to the critical value of F).

- Scheffé sets a more conservative F critical to create an Effective FW error rate.

- First, for each comparison find F critical at Alpha = .10 (so we start off more liberal).

df btw = 1 df error = K-1

- Second, multiply the F critical by K-1 and use the product as the critical value for all comparisons (both simple and complex) in that family of means.

- This test has less power than the HSD when you are making Pairwise (simple) comparisons, but it has more power than HSD when you are making Complex comparisons.

- In general, only use this when you want to make many Post Hoc complex comparisons (e.g. more than K-1).

H. q, the studentized range statistic

The Newman-Keuls, HSD and WSD all use the q statistic, which is based on the studentized range (q is often referred to as the studentized range statistic). When finding the critical q, you will need two pieces of information. First, you need the df. In the case of post hoc testing use the df error (n-K) from the ANOVA. Second, you will need r, the number of steps between the means you are testing (e.g. there are 5 steps between means 1 and 5, but only 2 steps between 1 and 2).

- The Formula for q

[pic] It is similar to [pic]

Thus [pic] And [pic]
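For reference, the usual textbook form of q and its relation to t can be written as below. This is a sketch under stated assumptions, not the handout's own notation: it assumes equal group sizes n and MS error taken from the ANOVA, and the symbols for the larger and smaller of the two means being compared are introduced here for illustration.

```latex
q = \frac{\bar{X}_{l} - \bar{X}_{s}}{\sqrt{MS_{error}/n}}
\qquad
t = \frac{\bar{X}_{l} - \bar{X}_{s}}{\sqrt{2\,MS_{error}/n}}
\qquad\Longrightarrow\qquad
q = t\sqrt{2}, \quad t = \frac{q}{\sqrt{2}}
```

The only difference between the two statistics in this form is the factor of 2 under the square root, which is why q and t differ by a factor of the square root of 2.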
