Wednesday, May 28, 2014

Excel Post Hoc Tukey’s HSD Test For 2-Factor ANOVA w/Replication

This is one of the following seven articles on Two-Factor ANOVA With Replication in Excel

Two-Factor ANOVA With Replication in 5 Steps in Excel 2010 and Excel 2013

Variance Tests: Levene’s and Brown-Forsythe For 2-Factor ANOVA in Excel 2010 and Excel 2013

Shapiro-Wilk Normality Test in Excel For 2-Factor ANOVA With Replication

2-Factor ANOVA With Replication Effect Size in Excel 2010 and Excel 2013

Excel Post Hoc Tukey’s HSD Test For 2-Factor ANOVA With Replication

2-Factor ANOVA With Replication – Test Power With G-Power Utility

Scheirer-Ray-Hare Test Alternative For 2-Factor ANOVA With Replication

 

Excel Post-Hoc Test: Tukey’s HSD for Two-Factor ANOVA With Replication

The F-test in ANOVA is classified as an omnibus test. An omnibus test is one that tests the overall significance of a model to determine whether a difference exists but not exactly where the difference is. The F Test of ANOVA tests the Null Hypothesis that states that all of the group means in that F Test are the same. When a significant result from the F Test (the p value is smaller than alpha) causes the Null Hypothesis to be rejected, further testing must be performed to determine which pairs of means are significantly different. That type of testing is called post hoc testing.

Post hoc testing is a pairwise comparison. Group means are compared two at a time to determine whether the difference between each pair of means is significant.

Post-Hoc Tests Used When Group Variances Are Equal

SPSS lists the following post hoc tests or corrections as available when group variances are equal:

LSD

Bonferroni

Sidak

Scheffe

REGWF

REGWQ

S-N-K

Tukey (Tukey’s HSD or Tukey-Kramer)

Tukey’s b

Duncan

Hochberg’s GT2

Gabriel

Waller-Duncan

Dunnett

Of all of the post hoc tests available when group variances are found to be similar, Tukey’s HSD test is used much more often than the others. Tukey’s HSD can only be used when group sizes are exactly the same, which is the case for balanced two-factor ANOVA with replication.

Tukey’s HSD (Honestly Significant Difference) Test – Used When Group Sizes and Group Variances Are Equal

Tukey’s HSD test compares the difference between each pair of group means to determine which differences are large enough to be considered significant.

Tukey’s HSD test is very similar to a t-test except that it makes a correction for the experiment-wide error rate that a t-test doesn’t. The experiment-wide error rate is the increased probability of type 1 errors (false positives – detecting a difference where none exists) when multiple comparisons are made.
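The growth of the experiment-wide error rate can be illustrated with a minimal Python sketch. It assumes the comparisons are independent, which is a simplification, and the function name is hypothetical rather than anything used in the Excel workbook.

```python
def experiment_wide_error_rate(alpha, comparisons):
    """Probability of at least one Type 1 error across independent comparisons."""
    return 1 - (1 - alpha) ** comparisons

# Three pairwise comparisons, each run at alpha = 0.05
rate = experiment_wide_error_rate(0.05, 3)
print(round(rate, 4))  # prints 0.1426
```

With three uncorrected comparisons the chance of at least one false positive is roughly 14 percent rather than 5 percent, which is the inflation that Tukey’s HSD test corrects for.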

Tukey’s HSD test can be summarized as follows:

The means of all groups are arranged into as many unique pair combinations as possible. The pair combination with the largest difference between the two means is tested first. A test statistic for this pair of means is calculated as follows:

q = (Max Group Mean – Min Group Mean) / SE

where

SE = SQRT(MSWithin / n)

n = number of samples in any group (all groups must be of equal size for Tukey’s HSD Post-Hoc test)

This test statistic is compared to q Critical. The critical q values are found on the Studentized Range q table. The table lists a critical q value for each unique combination of level of significance (usually set at 0.05), degrees of freedom, and total number of groups in the ANOVA analysis.

Tukey’s test calculates degrees of freedom as follows:

df = Degrees of freedom = (total number of samples in all groups combined) – (total number of groups in that F test)

The difference between the two means is designated as significant if its test statistic q is larger than the critical q value from the table.

If the difference between the means with the largest difference is found to be significant, the next inside pair of means is tested. This step is repeated until an innermost pair is found to have a difference that is not significant. Once an inner pair of means is found to have a difference that is not large enough to be significant, no further testing needs to be done because all untested pairs will be inside this one and have even smaller differences between the means.

The Tukey HSD test calculates Test Statistic q for each pair of means. This Test Statistic is compared to q Critical. In Excel, the critical q values can be retrieved from the Studentized Range q table with the lookup function INDEX(array, row number, column number).

Tukey’s HSD Test Statistic

The Test Statistic q is calculated as follows:

q = (Max Group Mean – Min Group Mean) / SE

df = Degrees of freedom = (total number of samples) – (total number of groups in that F Test)
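The two formulas above can be sketched in Python as follows. This is a minimal illustration, not the article's worksheet: the function names are hypothetical, the standard error is assumed to be the usual Tukey HSD form SQRT(MSWithin / n), and the input values are placeholders rather than figures from the ANOVA output.

```python
import math

def tukey_q(mean_a, mean_b, ms_within, n):
    """Test statistic q for one pair of group means with equal group size n."""
    # Assumption: SE = sqrt(MS_within / n), the usual Tukey HSD standard error
    se = math.sqrt(ms_within / n)
    return abs(mean_a - mean_b) / se

def degrees_of_freedom(total_samples, groups):
    """df = total number of samples minus number of groups in the F test."""
    return total_samples - groups

# PLACEHOLDER inputs for illustration only (not taken from the ANOVA output)
print(tukey_q(mean_a=50.0, mean_b=42.0, ms_within=128.0, n=8))  # prints 2.0
print(degrees_of_freedom(total_samples=24, groups=3))           # prints 21
```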


Tukey’s HSD Test in Excel For the Main Effects F Test For Factor 1

Tukey’s HSD Test should be performed for the Factor 1 Main Effects F Test but not for the Factor 2 Main Effects F Test. The purpose of Tukey’s HSD post hoc test is to determine which differences between means are significant. Factor 1 has three levels and therefore three pairwise differences between the means of the three level groups. The significant result of this F Test indicates that at least one of the level group means differs from the others. The F Test is an omnibus test, meaning that it does not tell where that difference lies. Tukey’s HSD test will indicate whether the difference between each pair of the three means is significant.

Post hoc testing does not need to be performed on the two level groups of Factor 2’s Main Effects test. Tukey’s HSD test does not need to be performed when an F Test is run on only two groups. There is only one difference between the two group means. If the F Test indicates that there is a significant difference between the means of the two groups, there is no need to determine which difference is significant because there is only one difference.

The first step when performing the Tukey HSD test is to list all unique mean pairs and the differences between the means. All of this information can be found from the Excel ANOVA output as follows:

[Image: Excel ANOVA output showing the level group means]

The total number of combinations of pairs of n objects can be found by the following Excel formula:

=COMBIN(n,2)

If there are three level group means of Factor 1 (n = 3), the total number of combination pairs of these means is three, as a result of the following Excel formula:

COMBIN(3,2) = 3

From the Excel output, the three level group means of Factor 1 are the following:

Factor 1 Level 1 group mean = 76.125

Factor 1 Level 2 group mean = 49.625

Factor 1 Level 3 group mean = 85.375

Three unique group pairings exist: (1,2), (1,3), and (2,3)

The absolute differences in means of each pair are as follows:

Pair (1,2) Mean Difference = ABS(76.125-49.625) = 26.50

Pair (1,3) Mean Difference = ABS(76.125-85.375) = 9.25

Pair (2,3) Mean Difference = ABS(49.625-85.375) = 35.75

The differences between these means in descending order are as follows:

Largest difference = Pair (2,3) Mean Difference = 35.75

2nd largest difference = Pair (1,2) Mean Difference = 26.50

Smallest difference = Pair (1,3) Mean Difference = 9.25
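The pairing and ranking above can be reproduced with a short Python sketch. The means are the ones reported in the Excel output; the variable names are hypothetical, and `math.comb` plays the role of Excel's COMBIN().

```python
from itertools import combinations
from math import comb

# Factor 1 level group means reported in the Excel output
means = {1: 76.125, 2: 49.625, 3: 85.375}

# Every unique pair of levels with the absolute difference of its means
differences = {
    pair: abs(means[pair[0]] - means[pair[1]])
    for pair in combinations(means, 2)
}
assert len(differences) == comb(len(means), 2)  # same count as COMBIN(3,2)

# Largest difference first, matching the order in which pairs are tested
ranked = sorted(differences.items(), key=lambda item: item[1], reverse=True)
for pair, diff in ranked:
    print(pair, diff)
```

The loop prints pair (2, 3) with 35.75, then pair (1, 2) with 26.5, then pair (1, 3) with 9.25.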

Calculating q and q Critical for each difference requires MSWithin and dfWithin from the following section of the Excel ANOVA output.

[Image: section of the Excel ANOVA output showing MSWithin and dfWithin]

Differences between group means are checked for significance starting with the largest difference and working down to the smallest difference. As soon as one difference is found to be not significant, no further differences need to be checked because all smaller differences will also be not significant.
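This stopping rule can be sketched in Python as follows. It is only an illustration under stated assumptions: the function name is hypothetical, the standard error is assumed to be SQRT(MSWithin / n), and the values passed for MSWithin, n, and q Critical are placeholders, so the printed result is not the article's conclusion.

```python
import math

def step_down(ranked_differences, ms_within, n, q_critical):
    """Test differences from largest to smallest, stopping at the first
    one that is not significant. Returns the list of significant pairs."""
    se = math.sqrt(ms_within / n)  # assumed standard error, sqrt(MS_within / n)
    significant = []
    for pair, diff in ranked_differences:
        if diff / se <= q_critical:
            break  # every remaining difference is smaller still
        significant.append(pair)
    return significant

# Mean differences for Factor 1, already sorted in descending order
ranked = [((2, 3), 35.75), ((1, 2), 26.50), ((1, 3), 9.25)]

# PLACEHOLDER inputs: ms_within, n and q_critical are hypothetical and must
# be replaced by the values from the ANOVA output and the q table
print(step_down(ranked, ms_within=128.0, n=8, q_critical=3.61))
```

Because every pair shares the same standard error when group sizes are equal, a smaller mean difference always gives a smaller q, which is why the loop can stop at the first non-significant pair.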

Calculating q and q Critical for the largest difference between the means of factor 1 level groups is done as follows:

[Image: calculation of q and q Critical for the largest difference]

The q Critical value for α = 0.05 can be looked up on the critical value table for the specific k and df as follows:

[Image: q Critical lookup on the critical value table]

Calculating q and q Critical for the 2nd largest difference between the means of factor 1 level groups is done as follows:

[Image: calculation of q and q Critical for the 2nd largest difference]

Calculating q and q Critical for the smallest difference between the means of factor 1 level groups is done as follows:

[Image: calculation of q and q Critical for the smallest difference]

Looking Up qCritical on the Studentized Range q Table With the Excel INDEX() Function

The Studentized Range q table and the Excel INDEX() function appear as follows:

=INDEX( array, relative row number, relative column number )

A relative address is the address relative to the cell in the upper left corner of the array, which counts as row 1, column 1. If the INDEX() function is to locate a value in the cell in the third row and third column of the array, the relative row number equals 3 and the relative column number equals 3.

The array is the absolute address of the array. This is given by:
(upper left corner cell:lower right corner cell). In this case it would be (D5:K103)
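The relative addressing used by INDEX() can be mimicked with a small Python sketch. The function name and the sample grid are hypothetical; the grid entries simply name their own position instead of holding real critical q values.

```python
def excel_index(array, row, column):
    """Mimic Excel's INDEX(array, row_num, column_num) with 1-based positions."""
    return array[row - 1][column - 1]

# Hypothetical 4 x 4 range; each entry names its own relative position
table = [[f"r{r}c{c}" for c in range(1, 5)] for r in range(1, 5)]

print(excel_index(table, 1, 1))  # upper left cell of the range
print(excel_index(table, 3, 3))  # third row down, third column over
```

Positions are counted from the upper left corner of the range, not from cell A1 of the worksheet, which is why the same row and column numbers keep working if the table is moved.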

[Images: Studentized Range q table and the INDEX() lookup of q Critical]

Determining Where the Strongest Interactions Between Factor 1 and Factor 2 Occur

The first step is to calculate the mean of each treatment cell as follows:

[Image: treatment cell means]

The second step is to plot the treatment cell means on a scatterplot chart, with a separate line graph for each level of one of the factors. In this case each level of Factor 2 is given its own line graph as follows.

[Image: scatterplot of treatment cell means with one line graph per level of Factor 2]

The preceding scatterplot shows separate line graphs for each of the two levels of Factor 2 at successive levels of Factor 1. Interaction occurs in two-way ANOVA when the differences between levels of one factor vary across the levels of the other factor.

The interaction of the two factors between various levels is indicated by the slopes of adjacent line segments. Adjacent line segments that are parallel show no interactions between the levels of the factors at the endpoints of the line segments. The more that the slopes differ, the greater is the interaction of the two factors between the levels at the endpoints of the adjacent line segments.

The relative degree of interaction between the two factors across all combinations of their levels can be determined by calculating the absolute difference in the slopes of adjacent line segments. The adjacent line segments that have the greatest absolute difference in slopes display the greatest degree of interaction between factor levels at the endpoints of the adjacent line segments.
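The slope comparison can be sketched in Python as follows. The function names are hypothetical and the cell means are made-up placeholders, not the article's data; the levels of Factor 1 are assumed to be one unit apart on the horizontal axis.

```python
def slopes(cell_means):
    """Slope of each line segment between successive Factor 1 levels
    (levels are one unit apart, so slope = change in mean)."""
    return [b - a for a, b in zip(cell_means, cell_means[1:])]

def interaction_by_segment(line_a, line_b):
    """Absolute difference in slope between the two lines, per segment."""
    return [abs(sa - sb) for sa, sb in zip(slopes(line_a), slopes(line_b))]

# HYPOTHETICAL treatment cell means for Factor 1 levels 1..3,
# one list per level of Factor 2 (not the article's data)
factor2_level1 = [10.0, 12.0, 20.0]
factor2_level2 = [10.0, 12.0, 14.0]

print(interaction_by_segment(factor2_level1, factor2_level2))  # prints [0.0, 6.0]
```

In this made-up example the two lines are parallel between levels 1 and 2 of Factor 1 (difference 0.0) and diverge between levels 2 and 3 (difference 6.0), so the interaction is concentrated at the higher levels.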

The calculations below indicate that there is considerably greater interaction between Factors 1 and 2 at higher levels of Factor 1 than at lower levels of Factor 1.

[Image: calculation of the absolute differences in slopes of adjacent line segments]

 
