Question

I'm trying to write an automatic Box-Cox transform (which should be generally useful to anyone normalizing data), but I'm having trouble phrasing the optimization in a way that R's optim accepts. It generally works, but I'm unclear on what causes it to fail on variables with extreme skew.

The idea is to choose the Box-Cox parameter lambda that minimizes the absolute value of the skewness of the transformed data.

library(car)
library(moments)

xskew <- function(data, par) {
    abs(skewness(bcPower(data, lambda = par[1]), na.rm = TRUE))  # minimize abs(skew)
}

boxit <- function(x) {
    res <- optim(par = c(-5, 5), xskew, data = x + 1)  # find argmin(^) lambda
    print(res$par)
    return(bcPower(x + 1, lambda = res$par[1]))
}

This generally works quite well, for example:

> skewness(rbeta(1000,12,3))
[1] -0.6439532

becomes

> skewness(boxit(rbeta(1000,12,3)))
[1] -5.980757e-08

which is almost zero skew.

But on one extremely skewed variable, I'm getting:

Error in optim(par = c(-5, 5), xskew, data = x + 1) (from #2) : 
  function cannot be evaluated at initial parameters

My guess is that I'm probably:

  1. Not catching how the bcPower function deals with values near zero or infinity.
  2. Misusing optim.
  3. Perhaps doing something even sillier, in that I'm framing a problem that can't possibly converge.

Solution

Oops, I was using a two-parameter solver instead of a one-parameter solver with explicit lower and upper bounds. The optim call I needed was:

optim(par=-2, xskew, x=x, method="Brent", lower=-20, upper=20)

This also needs a slight redefinition of the xskew function, so that it accepts the data through an argument named x.
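
For reference, here is a minimal sketch of what the redefined functions might look like. The answer does not show the new xskew, so its signature here (parameter first, data passed as x) is an assumption inferred from the optim call above. Keeping the x + 1 shift from the question is also an assumption, and it only makes sense when every value of x is greater than -1, because bcPower needs strictly positive input.

```r
library(car)      # provides bcPower
library(moments)  # provides skewness

# Objective: absolute skewness of the Box-Cox transformed data.
# optim() passes the value being optimized as the first argument,
# so the parameter comes first and the data is passed by name as x.
# (Assumed signature; the original answer does not show it.)
xskew <- function(par, x) {
    abs(skewness(bcPower(x, lambda = par), na.rm = TRUE))
}

boxit <- function(x) {
    shifted <- x + 1  # same shift as in the question; assumes all x > -1
    # One-dimensional search over lambda; method "Brent" requires
    # finite lower and upper bounds.
    res <- optim(par = -2, fn = xskew, x = shifted,
                 method = "Brent", lower = -20, upper = 20)
    bcPower(shifted, lambda = res$par)
}

# Usage, with the same kind of data as in the question:
set.seed(1)
skewness(boxit(rbeta(1000, 12, 3)))
```

The bounds of -20 and 20 are taken from the call above. For very heavily skewed data the objective may still return non-finite values near the ends of that interval, in which case narrowing the bounds is worth trying.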

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow