Question

Here is a kind of minimal example for a problem I stumbled upon:

mylm <- function(formula,data,subset=NULL){
  mysubset <- subset # some other clever manipulation
  lm(formula,data,mysubset)
}
mydata <- data.frame(x=rnorm(10),y=rnorm(10))
mylm(y~x,mydata) # this fails!

The last line fails because lm contains a call to model.frame that is evaluated in the parent.frame, i.e. lm contains the line of code

mf <- eval(mf, parent.frame())

Here the mf on the right-hand side is a cleverly constructed call to model.frame. I am passing on mysubset, but eval looks for it (I believe, but correct me if I am wrong) in the global environment and doesn't find it. I know that I could probably use lm.fit, but is there a way to make the environment inside mylm the parent.frame for lm?

Solution

In this case, you are right, the call to model.frame (actually, model.frame.default) is looking for mysubset in the .GlobalEnv. However, a better generalization would be to say that it is trying to evaluate various objects either in the object passed to data or, if they are not there, in the environment of the formula that you pass to it. And that environment is the .GlobalEnv.
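To make the point about formula environments concrete, here is a small sketch. The names make_formula and local_only are hypothetical and only serve the illustration. It shows that a formula remembers the environment in which it was created, which is why a formula typed at the top level points at the .GlobalEnv and not at the inside of mylm.

```r
# Hypothetical helper: builds a formula inside a function and also
# returns that function's own environment for comparison.
make_formula <- function() {
  local_only <- 1  # exists only in this function's frame
  list(f = y ~ x, env = environment())
}

res <- make_formula()

# The formula carries the environment in which it was created ...
same_env <- identical(environment(res$f), res$env)

# ... so objects local to that function are visible through the formula.
sees_local <- exists("local_only", envir = environment(res$f), inherits = FALSE)
```

Both same_env and sees_local should come out TRUE. By the same logic, a formula written in the call mylm(y~x, mydata) is created outside mylm and knows nothing about mysubset.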

So model.frame.default calls

eval(substitute(subset), data, env)

That translates to "evaluate the expression passed as subset (here, mysubset) in data or, if it is not found there, in env", where env is environment(formula).
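The lookup order can be tried in isolation. The following is only a sketch with made-up names (d, keep, make_env), not the actual code path inside model.frame.default: eval first looks in the data frame and then falls back to the enclosure passed as its third argument.

```r
# Hypothetical objects for illustration only.
d <- data.frame(x = 1:3)

make_env <- function() {
  keep <- c(TRUE, FALSE, TRUE)  # lives only in this function's frame
  environment()
}
env <- make_env()

# `x` is a column of `d`, so it is found in the data.
in_data <- eval(quote(x), d, env)

# `keep` is not a column of `d`, so eval() falls back to `env`.
in_env <- eval(quote(keep), d, env)

# With an enclosure that does not contain `keep`, the lookup fails.
failed <- inherits(
  try(eval(quote(keep), d, emptyenv()), silent = TRUE),
  "try-error"
)
```

The third case mirrors the situation in the question: mysubset is neither a column of data nor visible from the formula's environment, so the evaluation stops with an "object not found" error.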

One way to get around this is to recreate your formula inside your function, so that its environment is the one created when your function is called, which is where mysubset exists:

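mylm <- function(formula, data, subset = NULL) {
  mysubset <- subset # some other clever manipulation
  # Rebuild the formula from its text so that its environment becomes
  # this function's frame, where mysubset is defined.
  lm(formula(deparse(formula)), data, subset = mysubset)
}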

That way, model.frame.default should be able to find mysubset.
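A possible variation on the same idea, offered as a sketch and not as the method from the answer: instead of rebuilding the formula from its text, reassign its environment directly. The name mylm2 is hypothetical. Like the version that rebuilds the formula, this assumes that every variable in the formula is either a column of data or reachable from the function's frame; variables that live only in some other caller's local environment would no longer be found.

```r
# Hypothetical variant of mylm that re-points the formula's environment.
mylm2 <- function(formula, data, subset = NULL) {
  mysubset <- subset  # stand-in for some other clever manipulation
  # Make this function's frame the formula's environment, so that
  # model.frame.default can find `mysubset` there.
  environment(formula) <- environment()
  lm(formula, data, subset = mysubset)
}

set.seed(1)
mydata <- data.frame(x = rnorm(10), y = rnorm(10))

fit_all <- mylm2(y ~ x, mydata)                # uses all 10 rows
fit_sub <- mylm2(y ~ x, mydata, subset = 1:5)  # uses the first 5 rows
```

The assignment only changes the local copy of the formula inside mylm2, so the formula object held by the caller is left untouched.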

Licensed under: CC-BY-SA with attribution