Question

I am trying to wrap a function around the 'cast' function from the reshape package which runs some checks on my data before casting it.

cast2 <- function(data, formula = ... ~ variable, fun.aggregate = NULL, 
        ..., margins = FALSE, subset = TRUE, df = FALSE, fill = NULL, 
        add.missing = FALSE, value = guess_value(data)) {

    #RunChecksOnData()

    return(cast(data, formula = formula, fun.aggregate = fun.aggregate, ..., margins = margins, subset = subset, df = df, fill = fill, add.missing = add.missing, value = value))
}

If there are no checks, I would hope that this function 'cast2' would return the same result as cast. However, when I take one of the featured examples

names(airquality) <- tolower(names(airquality))
aqm <- melt(airquality, id=c("month", "day"), na.rm=TRUE)

and run:

cast2(aqm, day ~ month, mean, subset=variable=="ozone")

this results in an error "Error in eval(expr, envir, enclos) : object 'variable' not found"

I suspect this has to do with the way the formula gets passed through the function, but I can't figure it out. (I realise I could technically solve the problem by replicating all the cast function code inside cast2, but I'm sure there must be a cleaner way).


Solution

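The problem is that cast captures its subset argument using non-standard evaluation, so you cannot evaluate that argument yourself and pass the result along. The variable named in the error comes from the expression subset=variable=="ozone": when cast2 forwards subset=subset, that expression ends up being evaluated outside the data, where no object called variable exists. By using match.call() you avoid evaluating subset in the wrapper.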
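To see the mechanism in isolation, here is a minimal base-R sketch. nse_filter is a hypothetical stand-in, not part of reshape; it is assumed to capture subset the way cast does (substitute() followed by eval() inside the data). The wrapper names and the toy data frame are likewise made up for illustration.

```r
# Hypothetical stand-in for cast: captures `subset` unevaluated and
# evaluates it with the data frame's columns in scope.
nse_filter <- function(data, subset = TRUE) {
  rows <- eval(substitute(subset), data, parent.frame())
  data[rows, , drop = FALSE]
}

# Naive wrapper: forwards the argument by name, as cast2 in the question does.
wrap_naive <- function(data, subset = TRUE) {
  nse_filter(data, subset = subset)
}

# Wrapper that rewrites its own call instead of forwarding values.
wrap_call <- function(data, subset = TRUE) {
  cl <- match.call()              # wrap_call(data = df, subset = variable == "ozone")
  cl[[1]] <- quote(nse_filter)    # same arguments, different function
  eval(cl, parent.frame())        # evaluate where the user made the call
}

df <- data.frame(variable = c("ozone", "wind", "ozone"),
                 value = c(41, 7.4, 36))

# The naive wrapper fails; the error message is captured as a string.
naive_result <- tryCatch(wrap_naive(df, subset = variable == "ozone"),
                         error = function(e) conditionMessage(e))

# The match.call() wrapper returns the two "ozone" rows.
call_result <- wrap_call(df, subset = variable == "ozone")
```

In the naive wrapper, the inner function only ever sees the symbol subset, so the original expression is forced in the caller's environment, where the data frame's columns are not visible.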

Here is a solution. If you only need to check the data and not modify it, this is trivial: check your variables and, if they are okay, change the call so that it calls cast, then evaluate it. If you need to modify the variables, it is a bit trickier, but not by much: you need to make sure that the call refers to the modified variables in the environment of the function. I do this below:

cast2 <- function(data, formula = ... ~ variable, fun.aggregate = NULL, 
                  margins = FALSE, subset = TRUE, df = FALSE, fill = NULL, 
                  add.missing = FALSE, value = guess_value(data)) {

  #RunChecksOnData()
  cast.call <- match.call()
  cast.call[[1]] <- quote(cast)       # update call
  data$value <- data$value * 100      # modify data
  cast.call[["data"]] <- quote(data)  # update call for modified data (otherwise this refers to aqm)
  eval(cast.call)
}

EDIT: if you use this, you need to be careful to avoid conflicts between your function's environment and the environment in which the call was made. The code above works as long as your function doesn't contain any variables with the same name as one referenced in the call. If you cannot guarantee that, you need to be careful about how you evaluate the call. I would create a new environment whose parent is the parent.frame of the function, assign the modified objects (e.g. data in this case) there, and then evaluate cast.call in that environment. This complication only applies if you need to modify arguments. If you only need to check the arguments, you can change the last line to eval(cast.call, parent.frame()) and you should be fine.
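A rough sketch of that new-environment approach, written against base R only so it runs without reshape. nse_filter is a hypothetical stand-in assumed to capture subset the way cast does, and the names wrap_modify, eval_env and .modified_data are illustrative choices, not anything from the package.

```r
# Hypothetical stand-in for cast: evaluates `subset` inside the data.
nse_filter <- function(data, subset = TRUE) {
  rows <- eval(substitute(subset), data, parent.frame())
  data[rows, , drop = FALSE]
}

wrap_modify <- function(data, subset = TRUE) {
  cl <- match.call()
  cl[[1]] <- quote(nse_filter)

  data$value <- data$value * 100   # modify the data inside the wrapper

  # Child of the caller's environment: names used in the call still resolve
  # in the caller, and only the single binding added here can shadow anything.
  eval_env <- new.env(parent = parent.frame())
  assign(".modified_data", data, envir = eval_env)
  cl[["data"]] <- quote(.modified_data)

  eval(cl, eval_env)
}

target <- "ozone"   # lives in the caller's environment, not in the wrapper
df <- data.frame(variable = c("ozone", "wind", "ozone"),
                 value = c(41, 7.4, 36))

# `variable` is found in the data, `target` in the caller's environment.
res <- wrap_modify(df, subset = variable == target)
```

Giving the modified object an unlikely name such as .modified_data keeps the chance of masking one of the caller's own variables small; none of the wrapper's local variables are visible during evaluation.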

OTHER TIPS

cast is obsolete. You should use reshape2 instead of reshape; it provides two functions, dcast and acast. One way to implement what you want is to quote the subset expression with .() from plyr. Here is an example:

library(plyr)
library(reshape2)
dcast2 <- function(subset = NULL){
  # subset: an expression quoted with plyr's .(), or NULL for no subsetting
  names(airquality) <- tolower(names(airquality))
  aqm <- melt(airquality, id=c("month", "day"), na.rm=TRUE)
  dcast(aqm, variable ~ month, mean, subset = subset)
}

Then you call it like this:

dcast2(.(variable=="ozone"))
  variable        5        6        7        8        9
1    ozone 23.61538 29.44444 59.11538 59.96154 31.44828
Licensed under: CC-BY-SA with attribution