How to do a Tukey HSD test with the Anova command (car package)
Question
I'm dealing with an unbalanced design/sample and originally learned aov()
. I know now that for my ANOVA tests I need to use the Type III Sum of Squares which involves using fitting using lm()
rather than using aov()
.
The problem is getting post-hoc tests (specifically Tukey's HSD) using lm()
. All the research I've done has said that using simint
in the multcomp
package would work, but now that it's updated that command seems to not be available. It also seems to rely upon going through aov()
to calculate.
Essentially all of the Tukey HSD tests I've found for R assume that you use aov()
for the comparison rather than lm()
. To get the Type III Sum of Squares I need for the unbalanced design I have to use:
mod<-lm(Snavg~StudentEthnicity*StudentGender)
Anova(mod, type="III")
How do I use a Tukey HSD test with my mod using lm()
? Or conversely, calculate my ANOVA using Type III and still be able to run a Tukey HSD test?
Thanks!
No correct solution
OTHER TIPS
Try HSD.test
in agricolae
library(agricolae)
data(sweetpotato)
model<-lm(yield~virus, data=sweetpotato)
comparison <- HSD.test(model,"virus", group=TRUE,
main="Yield of sweetpotato\nDealt with different virus")
Output
Study: Yield of sweetpotato
Dealt with different virus
HSD Test for yield
Mean Square Error: 22.48917
virus, means
yield std.err replication
cc 24.40000 2.084067 3
fc 12.86667 1.246774 3
ff 36.33333 4.233727 3
oo 36.90000 2.482606 3
alpha: 0.05 ; Df Error: 8
Critical Value of Studentized Range: 4.52881
Honestly Significant Difference: 12.39967
Means with the same letter are not significantly different.
Groups, Treatments and means
a oo 36.9
ab ff 36.33333
bc cc 24.4
c fc 12.86667
As an initial note, unless it's been changed, to get the correct results for type iii sum of squares, you need to set the contrast coding for the factor variables. This can be done inside the lm
call or with options
. The example below uses options
.
I would be cautious about using HSD.test
and similar functions with unbalanced designs unless the documentation addresses their use in these situations. The documentation for TukeyHSD
mentions that it adjusts for "mildly unbalanced" designs. I don't know if HSD.test
handles things differently. You'd have to check additional documentation for the package or the original reference cited for the function.
As a side note, enclosing the whole HSD.test
function in parentheses will cause it to print the results. See example below.
In general I would recommend abandoning Tukey.HSD
and similar functions, and using the flexible emmeans
(née lsmeans
) or multcomp
packages for all your post-hoc comparison needs. emmeans
is particularly useful for doing mean separations on interactions or for examining contrasts among treatments.
With an unbalanced design, you may want to report the E.M. (or L.S.) means instead of the arithmetic means. See SAEPER: What are least square means?. Note in the example below that the marginal means reported by emmeans
are different than those reported by HSD.test
.
Also note that the "Tukey" in glht
has nothing to do with Tukey HSD or Tukey-adjusted comparisons; it just sets up the contrasts for all pairwise tests, as the output says.
However, the adjust="tukey"
in emmeans
functions does mean to use Tukey-adjusted comparisons, as the output says.
The following example is partially adapted from ARCHBS: One-way Anova.
if(!require(car)){install.packages("car")}
library(car)
data(mtcars)
mtcars$cyl.f = factor(mtcars$cyl)
mtcars$carb.f = factor(mtcars$carb)
options(contrasts = c("contr.sum", "contr.poly"))
model = lm(mpg ~ cyl.f + carb.f, data=mtcars)
library(car)
Anova(model, type="III")
if(!require(agricolae)){install.packages("agricolae")}
library(agricolae)
(HSD.test(model, "cyl")$groups)
if(!require(emmeans)){install.packages("emmeans")}
if(!require(multcompView)){install.packages("multcompView")}
library(emmeans)
library(multcompView)
marginal = emmeans(model,
~ cyl.f)
pairs(marginal, adjust="tukey")
cld(marginal, adjust="tukey", Letters=letters)
if(!require(multcomp)){install.packages("multcomp")}
library(multcomp)
mc = glht(model,
mcp(cyl.f = "Tukey"))
summary(mc, test=adjusted("single-step"))
multcomp::cld(mc)
I found HSD.test()
also to be very meticulous about the way you have built either the lm()
or aov()
model that you're using for it.
There was no output from HSD.test()
with my data when I had used following idea of coding for lm()
:
model<-lm(sweetpotato$yield ~ sweetpotato$virus)
out <- HSD.test(model,"virus", group=TRUE, console=TRUE)
Output was only:
Name: virus
sweetpotato$virus
The output was equally bad when using the same logic for aov()
model<-aov(sweetpotato$yield ~ sweetpotato$virus)
To get the output for HSD.test()
the lm()
(or also if using aov()
for the model )
must be constructed strictly using the logic presented in the MYaseen208 answer:
model <- lm(yield~virus, data=sweetpotato)
Hope this helps someone who's not getting a proper output from HSD.test()
.
I was stuck with the same problem of the HSD.test printing out nothing. You need to put console=TRUE
inside the function, so it prints out automatically.
For example:
HSD.test(alturacrit.anova, "fator", console=TRUE).
Hope it helps!