Question

I'm new to plyr and want to compute the weighted mean of several variables within each class, collapsing a data frame to one row per class. With the following code I can do this for a single variable, such as x2:

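library(plyr)  # provides ddply() and .()

set.seed(123)
# sample() draws only 5 labels here; data.frame() recycles them across the 20 rows
frame <- data.frame(class = sample(LETTERS[1:5], replace = TRUE),
                    x = rnorm(20), x2 = rnorm(20), weights = rnorm(20))
# Weighted mean of x2 within each class
ddply(frame, .(class), function(x) data.frame(weighted.mean(x$x2, x$weights)))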

However, I would like the code to produce a new data frame covering both x and x2 (and any number of other variables in the frame). Does anybody know how to do this? Thanks

Solution

You might find what you want in the ?summarise function. I can replicate your code with summarise as follows:

library(plyr)
set.seed(123)
frame <- data.frame(class=sample(LETTERS[1:5], replace = TRUE), x=rnorm(20), 
                    x2 = rnorm(20), weights=rnorm(20))
ddply(frame, .(class), summarise, 
      x2 = weighted.mean(x2, weights)) 

To do this for x as well, pass one more expression to summarise:

ddply(frame, .(class), summarise, 
      x = weighted.mean(x, weights),
      x2 = weighted.mean(x2, weights)) 
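
As a sanity check on what summarise computes here, weighted.mean(x, w) amounts to sum(x * w) / sum(w) evaluated within each class. Below is a minimal base R sketch of that check. It uses a small made-up data frame (toy is hypothetical, not the frame from the question) so the numbers can be verified by hand:

```r
# Hypothetical toy data, chosen so the results are easy to check by hand
toy <- data.frame(class   = c("A", "A", "B", "B"),
                  x       = c(1, 3, 10, 20),
                  x2      = c(2, 4, 6, 8),
                  weights = c(1, 3, 1, 1))

# One data frame per class
by_class <- split(toy, toy$class)

# Weighted mean written out by hand: sum(x * w) / sum(w)
manual <- t(sapply(by_class, function(d) c(
  x  = sum(d$x  * d$weights) / sum(d$weights),
  x2 = sum(d$x2 * d$weights) / sum(d$weights))))

# The same thing via the built-in weighted.mean()
builtin <- t(sapply(by_class, function(d) c(
  x  = weighted.mean(d$x,  d$weights),
  x2 = weighted.mean(d$x2, d$weights))))

stopifnot(isTRUE(all.equal(manual, builtin)))
manual
# A: x = 2.5, x2 = 3.5
# B: x = 15,  x2 = 7
```

One thing to keep in mind with the question's data: weights drawn from rnorm can be negative, so the weighted means there may fall outside the range of the underlying values.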

Edit: If you want to apply an operation over many columns, use colwise or numcolwise instead of summarise, or run summarise on a data frame melted with the reshape2 package and then cast the result back to its original form. Here's an example using colwise:

wmean.vars <- c("x", "x2")

ddply(frame, .(class), function(x)
      colwise(weighted.mean, w = x$weights)(x[wmean.vars]))

Finally, if you don't like having to specify wmean.vars, you can also do:

ddply(frame, .(class), function(x)
      numcolwise(weighted.mean, w = x$weights)(x[!colnames(x) %in% "weights"]))

which will compute a weighted average for every numeric column, excluding the weights themselves.
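
The melt-and-cast route mentioned in the edit above could look roughly like the following. This is only a sketch: it assumes both plyr and reshape2 are installed, and it uses a small hypothetical data frame (toy) rather than the frame from the question so that the result is easy to verify:

```r
library(plyr)
library(reshape2)

# Hypothetical toy data, not the frame from the question
toy <- data.frame(class   = c("A", "A", "B", "B"),
                  x       = c(1, 3, 10, 20),
                  x2      = c(2, 4, 6, 8),
                  weights = c(1, 3, 1, 1))

# Long format: one row per (class, weights, variable, value)
long <- melt(toy, id.vars = c("class", "weights"))

# Weighted mean per class and per original variable
wm <- ddply(long, .(class, variable), summarise,
            wmean = weighted.mean(value, weights))

# Cast back to one row per class, one column per variable
wide <- dcast(wm, class ~ variable, value.var = "wmean")
wide
# class A: x = 2.5, x2 = 3.5
# class B: x = 15,  x2 = 7
```

Keeping weights as an id variable means every melted row carries its own weight, so no column names need to be listed beyond the identifiers.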

Other tips

A data.table answer for fun, which also doesn't require specifying all the variables individually.

library(data.table)
frame <- as.data.table(frame)
keynames <- setdiff(names(frame),c("class","weights"))
frame[, lapply(.SD,weighted.mean,w=weights), by=class, .SDcols=keynames]

Result:

   class          x         x2
1:     B  0.1390808 -1.7605032
2:     D  1.3585759 -0.1493795
3:     C -0.6502627  0.2530720
4:     E  2.6657227 -3.7607866
License: CC-BY-SA with attribution
Not affiliated with StackOverflow