I'm trying to write an automatic Box-Cox transform (which should be generally useful to anyone normalizing data), but I'm having trouble phrasing the optimization in a way that R's optim accepts. It generally works, but I don't understand why it fails on variables with extreme skew.
The idea is to choose the Box-Cox parameter lambda that minimizes the absolute value of the skewness of the transformed data.
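For reference, the objective can be written out as follows. This uses the standard one-parameter Box-Cox definition for strictly positive x; the symbol \gamma_1 for sample skewness is notation assumed here, not something taken from the car or moments documentation.

```latex
y_i^{(\lambda)} =
\begin{cases}
  \dfrac{x_i^{\lambda} - 1}{\lambda}, & \lambda \neq 0 \\[1ex]
  \log x_i, & \lambda = 0
\end{cases}
\qquad
\hat{\lambda} = \arg\min_{\lambda} \left| \gamma_1\!\left( y^{(\lambda)} \right) \right|
```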
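library(car)      # bcPower()
library(moments)  # skewness()

# Objective: absolute skewness of the Box-Cox-transformed data
xskew <- function(data, par) {
  abs(skewness(bcPower(data, lambda = par[1]), na.rm = TRUE))
}

boxit <- function(x) {
  # Find the lambda that minimizes xskew; the data are shifted by 1 first
  res <- optim(par = c(-5, 5), xskew, data = x + 1)
  print(res$par)
  return(bcPower(x + 1, lambda = res$par[1]))
}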
This generally works quite well, for example:
> skewness(rbeta(1000,12,3))
[1] -0.6439532
becomes
> skewness(boxit(rbeta(1000,12,3)))
[1] -5.980757e-08
which is almost zero skew.
But on one extremely skewed variable, I'm getting:
Error in optim(par = c(-5, 5), xskew, data = x + 1) (from #2) :
function cannot be evaluated at initial parameters
My guess is that I'm probably doing one of the following:
- Not accounting for how the bcPower function handles values near zero or infinity.
- Misusing optim.
- Doing something even sillier, such as framing a problem that can't possibly converge.
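To make the second and third suspicions concrete, here is a minimal sketch of a one-dimensional variant. It assumes that a single scalar lambda is what is being searched for, so it uses base R's optimize() on a bounded interval instead of optim() with a two-element starting vector. The names abs_skew and boxit_1d, the interval c(-10, 10), and the penalty value 1e10 are hypothetical choices for illustration, and whether this avoids the error on the extremely skewed variable is untested.

```r
library(car)      # bcPower()
library(moments)  # skewness()

# Absolute skewness after a Box-Cox transform with a single scalar lambda.
# Non-finite results (e.g. overflow, or a constant transformed vector) are
# replaced by a large finite penalty so the optimizer always gets a number.
abs_skew <- function(lambda, data) {
  s <- abs(skewness(bcPower(data, lambda = lambda), na.rm = TRUE))
  if (is.finite(s)) s else 1e10  # penalty value is an arbitrary assumption
}

# Hypothetical variant of boxit(): bounded 1-D search instead of optim().
boxit_1d <- function(x, interval = c(-10, 10)) {
  shifted <- x + 1  # same shift as boxit()
  # bcPower() requires strictly positive input, so check it up front
  stopifnot(all(shifted > 0, na.rm = TRUE))
  res <- optimize(abs_skew, interval = interval, data = shifted)
  bcPower(shifted, lambda = res$minimum)
}

set.seed(1)
x <- rbeta(1000, 12, 3)
before <- skewness(x)
after  <- skewness(boxit_1d(x))
```

Comparing abs(after) with abs(before) shows whether the search helped on a given variable. Evaluating abs_skew at a few lambda values on the problematic variable would also show where the objective stops being finite.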