Question

I'm trying to use the dplyr package to apply a function to every column of a data.frame that is not used for grouping, which I would do with aggregate() like this:

aggregate(. ~ Species, data = iris, mean)

where mean is applied to all columns not used for grouping. (Yes, I know I can use aggregate, but I'm trying to understand dplyr.)

I can use summarize like this:

library(dplyr)

species <- group_by(iris, Species)
summarize(species,
          Sepal.Length = mean(Sepal.Length),
          Sepal.Width = mean(Sepal.Width))

But is there a way to have mean() applied to all columns that are not grouped, similar to the . ~ notation of aggregate()? I have a data.frame with 30 columns that I want to aggregate, so writing out the individual statements is not ideal.


Solution

If you're willing to install a development version of dplyr, you can try the new (and still experimental) summarise_each():

devtools::install_github("hadley/dplyr", ref = "colwise")

library(dplyr)
iris %.%
  group_by(Species) %.%
  summarise_each(funs(mean))
## Source: local data frame [3 x 5]
## 
##      Species Sepal.Length Sepal.Width Petal.Length Petal.Width
## 1     setosa        5.006       3.428        1.462       0.246
## 2 versicolor        5.936       2.770        4.260       1.326
## 3  virginica        6.588       2.974        5.552       2.026

iris %.%
  group_by(Species) %.%
  summarise_each(funs(min, max))
## Source: local data frame [3 x 9]
## 
##      Species Sepal.Length_min Sepal.Width_min Petal.Length_min
## 1     setosa              4.3             2.3              1.0
## 2 versicolor              4.9             2.0              3.0
## 3  virginica              4.9             2.2              4.5
## Variables not shown: Petal.Width_min (dbl), Sepal.Length_max (dbl),
##   Sepal.Width_max (dbl), Petal.Length_max (dbl), Petal.Width_max (dbl)

Feedback much appreciated!

This will appear in dplyr 0.2.
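
For readers on a recent dplyr, here is a minimal sketch of the same two summaries. It assumes dplyr 1.0.0 or later, where across() is the documented replacement for summarise_each() and %>% replaces the old %.% pipe; the names means and ranges are only illustrative.

```r
library(dplyr)

# Mean of every non-grouping column, one row per species.
# Inside a grouped summarise(), across(everything(), ...) skips the
# grouping column (Species) on its own.
means <- iris %>%
  group_by(Species) %>%
  summarise(across(everything(), mean))

# Several functions at once: a named list yields suffixed column names
# such as Sepal.Length_min and Sepal.Length_max.
ranges <- iris %>%
  group_by(Species) %>%
  summarise(across(everything(), list(min = min, max = max)))
```

With 30 columns the call stays the same length, which is the point of the question; to restrict it to some columns, swap everything() for a selection such as where(is.numeric).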

OTHER TIPS

This gets you almost all the way there in dplyr, using do() to run a function on each group:

h = iris %.%
  group_by(Species) %.%
  do(function(d){
    sapply(Filter(is.numeric, d), mean)  
  })

as.data.frame(h)
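
To make the per-group step explicit, the same split-apply-combine idea can be sketched in base R alone. This is an illustration of what the function passed to do() computes for each group, not the dplyr implementation; per_group and result are hypothetical names.

```r
# Split iris into one data frame per species, average the numeric
# columns of each piece, then stack the per-group vectors into a matrix
# with one row per species.
per_group <- lapply(
  split(iris, iris$Species),
  function(d) sapply(Filter(is.numeric, d), mean)
)
result <- do.call(rbind, per_group)
```

Filter(is.numeric, d) drops the Species factor before mean() is applied, which is why the grouping column never reaches the summary function.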
Licensed under: CC-BY-SA with attribution