Question

I'm using the rpart package to fit some models, like this:

fitmodel = function(formula, data, w) {

    fit = rpart(formula, data, weights = w)
}

Call the custom function

fit = fitmodel(y ~ x1 + x2, data, w)

This causes the error:

Error in eval(expr, envir, enclos) : object 'w' not found

Then I decided to use:

fitmodel = function(formula, data, w) {

    data$w = w
    fit = rpart(formula, data, weights = w)
}

This works, but there's another problem:

This will work

fit = fitmodel(y ~ x1 + x2, data, w)

This does not work

fit = fitmodel(y ~ ., data, w)

Error in eval(expr, envir, enclos) : object 'w' not found

What's the correct way to apply weights inside a custom function? Thanks!

Solution

Hopefully someone else gives a more complete answer. The reason rpart can't find w is that rpart looks up data, weights, etc. in the environment the formula was defined in. The formula is created in some environment, most likely the GlobalEnv, while w is created inside some other function. Changing the environment of the formula to the environment where w is created, using parent.frame(), fixes that. rpart can still find the data, since the search path always continues to the GlobalEnv. I'm not sure why sys.frame(sys.nframe()) works, since the environments aren't the same, but apparently w is still somewhere on the search path.

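edit: sys.frame(sys.nframe()) seems to be the same as setting the environment of the formula to the environment of the function rpart is called in (foo3 in this example). In that case, rpart looks for w, data, etc. in foo3's own frame first, where w is an argument, and then in foo3's enclosing environment. Because R scoping is lexical and foo3 is defined at top level, that enclosing environment is the GlobalEnv; bar3's frame is not on that path.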
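To see the lookup rule in isolation, here is a minimal sketch using base R's model.frame(), which, as far as I can tell, is what rpart calls internally to evaluate the weights argument. The names make_frame_fails, make_frame_works and the toy data frame d are made up for illustration.

```r
# Toy data; `d` and both function names are hypothetical.
d <- data.frame(y = c(1, 2, 3, 4), x = c(2, 1, 4, 3))

make_frame_fails <- function(formula, data, w) {
  # `w` exists only in this function's frame, but the formula still points
  # at the environment where it was written, so `weights = w` is not found
  # (assuming no object called `w` exists there).
  model.frame(formula, data = data, weights = w)
}

make_frame_works <- function(formula, data, w) {
  # Re-home the formula in this frame, where `w` is an argument.
  environment(formula) <- environment()
  model.frame(formula, data = data, weights = w)
}

# Capture the error message instead of stopping the script.
res_fail <- tryCatch(make_frame_fails(y ~ x, d, c(1, 2, 1, 2)),
                     error = function(e) conditionMessage(e))
res_ok <- make_frame_works(y ~ x, d, c(1, 2, 1, 2))
```

Weights are looked up in data first and then in environment(formula), never in the frame of the function that happens to call the fitting routine, which is why only the second version succeeds.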

library(rpart)
data(iris)

bar <- function(formula, data) {
  w <- rpois(nrow(data), 1)  # one weight per row of the data passed in
  print(environment())
  foo(formula, data, w)
}

foo <- function(formula, data, w) {
  print(environment(formula))
  fit <- rpart(formula, data, weights = w)
  return(fit)
}


bar(I(Species == "versicolor") ~ ., data = iris)
## <environment: 0x1045b1a78>
## <environment: R_GlobalEnv>
## Error in eval(expr, envir, enclos) (from #2) : object 'w' not found


bar2 <- function(formula, data) {
  w <- rpois(nrow(data), 1)  # one weight per row of the data passed in
  print(environment())
  foo2(formula, data, w)
}

foo2 <- function(formula, data, w) {
  print(environment(formula))
  environment(formula) <- parent.frame()
  print(environment(formula))
  fit <- rpart(formula, data, weights = w)
  return(fit)
}

bar2(I(Species == "versicolor") ~ ., data = iris)
## <environment: 0x100bf5910>
## <environment: R_GlobalEnv>
## <environment: 0x100bf5910>


bar3 <- function(formula, data) {
  w <- rpois(nrow(data), 1)  # one weight per row of the data passed in
  print(environment())
  foo3(formula, data, w)
}

foo3 <- function(formula, data, w) {
  print(environment(formula))
  environment(formula) <- environment() ## seems to be the same as sys.frame(sys.nframe())
  print(environment(formula))
  print(environment())
  fit <- rpart(formula, data, weights = w)
  return(fit)
}

bar3(I(Species == "versicolor") ~ ., data = iris)
## <environment: 0x104e11bb8>
## <environment: R_GlobalEnv>
## <environment: 0x104b4ff78>
## <environment: 0x104b4ff78>
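Applied to the fitmodel() function from the question, the same idea might look like the sketch below. This is only one possible rewrite, not the canonical fix; the iris call and the case_w vector are assumptions chosen for illustration, and it requires the rpart package to be installed.

```r
library(rpart)

# Hypothetical rewrite of the question's fitmodel().
fitmodel <- function(formula, data, w) {
  # Point the formula at this frame so `weights = w` resolves to the argument.
  # Caveat: anything the formula refers to that is neither a column of `data`
  # nor visible from this function will no longer be found.
  environment(formula) <- environment()
  rpart(formula, data, weights = w)
}

set.seed(1)
case_w <- runif(nrow(iris))  # arbitrary positive weights, not named `w` globally

# `.` still expands to the columns of `data` only, because no weights
# column was added to the data frame.
fit <- fitmodel(Species ~ ., iris, case_w)
```

Since w never becomes a column of data, this avoids having `y ~ .` pick up the weights as a predictor.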

OTHER TIPS

According to the rpart documentation (March 12, 2017, page 23, section 6.1), "Weights are not yet supported, and will be ignored if present."

https://cran.r-project.org/web/packages/rpart/vignettes/longintro.pdf

I've managed to solve this using the code below, but I'm sure there's a better way:

The weak learner

fitmodel = function(formula, data, w) {

    # just paste the weights into the data frame
    data$w = w
    rpart(formula, data, weights = w, control = rpart.control(maxdepth = 1))
}

The algorithm

ada.boost = function(formula, data, wl.FUN = fitmodel, test.data = NULL, M = 100) {

    # Just rewrite the formula and get rid of any '.'
    dep.var = all.vars(formula)[1]
    vars = attr(terms(formula, data = data), "term.labels")
    formula = as.formula(paste(dep.var, "~", paste(vars, collapse = "+")))


    # ...more code
}

Now everything works!
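The formula rewrite can be tried on its own with base R. The sketch below is a rough illustration of that step rather than the exact code used above: expand_dot and the toy data frame d are hypothetical names, and it assumes the response is a plain variable name (something like I(y == 1) would not survive all.vars()).

```r
# Hypothetical helper: replace '.' with the columns it stands for in `data`.
expand_dot <- function(formula, data) {
  dep_var <- all.vars(formula)[1]
  vars <- attr(terms(formula, data = data), "term.labels")
  expanded <- reformulate(vars, response = dep_var)
  # Keep the caller's formula environment rather than this helper's frame.
  environment(expanded) <- environment(formula)
  expanded
}

d <- data.frame(y = c(0, 1, 0, 1), x1 = 1:4, x2 = c(2.5, 1.0, 3.5, 0.5))

# Expand while the weights column does not exist yet.
f <- expand_dot(y ~ ., d)

# Once the weights are pasted into the data frame, '.' would pick them up.
d$w <- c(1, 1, 2, 2)
dot_after <- attr(terms(y ~ ., data = d), "term.labels")
```

The order matters: the formula has to be expanded before data$w is added, otherwise w ends up among the predictors.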

Licensed under: CC-BY-SA with attribution