ddply error when the aggregation function is defined within another function

StackOverflow https://stackoverflow.com/questions/23228666

  •  07-07-2023

Question

The following near-minimal code calls ddply inside a function f, where the aggregation function passed to ddply (helper) is itself defined inside f.

Unfortunately, I don't understand why sourcing the entire snippet produces Error in eval(expr, envir, enclos) : could not find function "helper". The code works when helper is defined at the top level, outside f. When I replace the ddply call with the commented-out call to by, the code also runs without error.

Can you explain the error and provide a solution or a workaround? [Tested with plyr 1.8.1 and R 3.0.3]

rm (list = ls())
library(plyr)

f <- function() {

  dfx <- data.frame(
    group = c(rep('A', 8), rep('B', 15), rep('C', 6)),
    sex = sample(c("M", "F"), size = 29, replace = TRUE),
    age = runif(n = 29, min = 18, max = 54)
  )

  helper <- function(x) {
    return(max(x))
  }

  result <- ddply(dfx, .(group, sex), summarize, max_age = helper(age))
  #result <- by(dfx$age, dfx[,c("group", "sex")], helper)

  return(result)
}

print(f())

Solution

Try:

result <- ddply(dfx, .(group, sex), here(summarize), max_age = helper(age))

From the help page for here:

This function captures the current context, making it easier to use **ply with functions that do special evaluation and need access to the environment where ddply was called from.

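summarize is one such special function: it evaluates max_age = helper(age) against the columns of each data-frame piece, and when ddply calls it, that evaluation does not see the local variables of f. Wrapping it in here() makes the environment of f, where helper is defined, available again.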
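As a minimal sketch of the fix in context, the snippet below applies here(summarize) inside a function and compares it with a second workaround that avoids summarize altogether by passing an ordinary anonymous function, which finds helper through normal lexical scoping. The names make_data, f_here and f_closure, and the fixed seed, are hypothetical and only serve this illustration; the data frame mirrors the one in the question.

```r
library(plyr)

# Hypothetical helper that rebuilds the data frame from the question.
make_data <- function() {
  set.seed(42)  # assumed seed, only to make the sketch reproducible
  data.frame(
    group = c(rep("A", 8), rep("B", 15), rep("C", 6)),
    sex = sample(c("M", "F"), size = 29, replace = TRUE),
    age = runif(n = 29, min = 18, max = 54)
  )
}

# Workaround 1: here() captures the calling environment, so summarize
# can find helper, which is local to f_here.
f_here <- function(dfx) {
  helper <- function(x) max(x)
  ddply(dfx, .(group, sex), here(summarize), max_age = helper(age))
}

# Workaround 2: an anonymous function is a closure over f_closure's
# environment, so helper is found without any special evaluation.
f_closure <- function(dfx) {
  helper <- function(x) max(x)
  ddply(dfx, .(group, sex), function(d) data.frame(max_age = helper(d$age)))
}

dfx <- make_data()
res_here <- f_here(dfx)
res_closure <- f_closure(dfx)
print(res_here)
```

Both variants should return one row per group/sex combination with the same max_age values; the second one trades the compact summarize syntax for not depending on here() at all.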

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow