Question

Is there a function or package that searches for the best (or one of the best) variable transformations, so that a model's residuals are as close to normal as possible?


For example:

frml = formula(some_transformation(A) ~ B + I(B^2) + B:C + C)
model = aov(frml, data=data)
shapiro.test(residuals(model))

Is there a function that identifies which some_transformation() makes the residuals closest to normal?


Solution

You mean like the Box-Cox transformation?

library(car)
m0 <- lm(cycles ~ len + amp + load, Wool)
plot(m0, which=2)

[Image: normal Q-Q plot of the residuals of m0]

# Box Cox Method, univariate
summary(p1 <- powerTransform(m0))
# bcPower Transformation to Normality 
# 
#    Est.Power Std.Err. Wald Lower Bound Wald Upper Bound
# Y1   -0.0592   0.0611          -0.1789           0.0606
# 
# Likelihood ratio tests about transformation parameters
#                              LRT df      pval
# LR test, lambda = (0)  0.9213384  1 0.3371238
# LR test, lambda = (1) 84.0756559  1 0.0000000


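# rounded power: the Wald interval above contains 0, so the estimate
# rounds to 0, which bcPower() treats as a log transformation
coef(p1, round=TRUE)
# fit linear model with transformed response: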
summary(m1 <- lm(bcPower(cycles, p1$roundlam) ~ len + amp + load, Wool))
plot(m1, which=2)

[Image: normal Q-Q plot of the residuals of m1]
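To connect this to the model in the question, here is a minimal base-R sketch of the same idea: pick the Box-Cox power that maximizes the profile log-likelihood. The data are simulated and the helper names (bc, boxcox_loglik, sim) are made up for illustration; this is not what powerTransform() runs internally, only the textbook calculation under the assumption that the response is strictly positive.

```r
# Box-Cox transform of a strictly positive response (lambda = 0 means log)
bc <- function(y, lambda) {
  if (abs(lambda) < 1e-8) log(y) else (y^lambda - 1) / lambda
}

# Profile log-likelihood of lambda for a linear model with design matrix X
boxcox_loglik <- function(lambda, y, X) {
  n <- length(y)
  rss <- sum(lm.fit(X, bc(y, lambda))$residuals^2)
  -n / 2 * log(rss / n) + (lambda - 1) * sum(log(y))
}

# Simulated data (assumption): A is log-normal given B and C,
# so the "right" power should be close to 0
set.seed(42)
n <- 200
sim <- data.frame(B = runif(n, 0, 4), C = runif(n, 0, 2))
sim$A <- exp(1 + 0.5 * sim$B + 0.3 * sim$C + rnorm(n, sd = 0.2))

# Same right-hand side as the formula in the question
X <- model.matrix(~ B + I(B^2) + B:C + C, data = sim)

# One-dimensional search for the power that maximizes the likelihood
opt <- optimize(boxcox_loglik, interval = c(-2, 2),
                y = sim$A, X = X, maximum = TRUE)
lambda_hat <- opt$maximum
print(lambda_hat)

# Refit with the transformed response and inspect the residuals
sim$A_bc <- bc(sim$A, lambda_hat)
fit <- lm(A_bc ~ B + I(B^2) + B:C + C, data = sim)
print(shapiro.test(residuals(fit)))
```

In practice you would usually round lambda_hat to a convenient value (0, 0.5, 1, ...) as coef(p1, round=TRUE) does above, since the transformed model is then easier to interpret.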

Other advice

Unfortunately, this is not a solved problem in statistics. What user @statquant has suggested is pretty much the best you can do; however, it is not without its own pitfalls.

One important thing to note is that tests for normality, like shapiro.test, become very sensitive to small departures from normality once you reach reasonable sample sizes (i.e. in the hundreds), so you should not rely on them blindly.
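A quick way to see this for yourself is the simulation sketch below. It assumes a t distribution with 5 degrees of freedom as the "not quite normal" data (my choice for illustration, not something from the question) and runs shapiro.test on samples of increasing size. Note that shapiro.test only accepts between 3 and 5000 observations.

```r
set.seed(1)

# Sample sizes to compare (5000 is the maximum shapiro.test accepts)
sizes <- c(50, 500, 5000)

# p-value of the Shapiro-Wilk test on heavy-tailed t(5) samples of each size
p_by_n <- sapply(sizes, function(n) shapiro.test(rt(n, df = 5))$p.value)
names(p_by_n) <- paste0("n=", sizes)

print(p_by_n)
```

The distribution is the same in all three cases; only the sample size changes. A Q-Q plot (plot(model, which=2)) alongside the test tells you whether a rejection reflects a departure large enough to matter.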

Personally, I've put this problem in the too-hard basket. If the data don't look at least approximately normally distributed, I would try to find a non-parametric version of the statistic you want to run on the data.
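As a minimal sketch of that fallback, here is the rank-based Kruskal-Wallis test next to a one-way aov, using the built-in PlantGrowth data set. This assumes a plain one-way layout, which is simpler than the model in the question; I'm not aware of a drop-in rank-based replacement for a model with quadratic and interaction terms.

```r
# Parametric one-way ANOVA: relies on approximately normal residuals
fit_aov <- aov(weight ~ group, data = PlantGrowth)
print(summary(fit_aov))

# Rank-based alternative: no normality assumption on the residuals
kw <- kruskal.test(weight ~ group, data = PlantGrowth)
print(kw)
```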

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow