質問

I'm wondering if there is a way of doing this:

iris %.% group_by(Species) %.% 
  mutate(v1=Sepal.Length / mean(Sepal.Length)) %.% 
  filter(v1 > 1.15) %.% select(Species:v1)

While skipping the select bit. I thought the following should work (but doesn't, for many reasons):

iris %.% group_by(Species) %.% 
  select(Species, v1=Sepal.Length / mean(Sepal.Length)) %.% 
  filter(v1 > 1.15)

Note in this example I replaced mutate with select in the hopes that alone would do it. This also doesn't work because summarize expects expressions to return 1 value:

iris %.% 
  group_by(Species) %.% 
  summarise(Sepal.Length / mean(Sepal.Length)) %.% 
  filter(v1 > 1.15)

Clearly, not a huge deal, but wondering if there is a simpler way of replicating default data.table behavior:

data.table(iris)[, Sepal.Length / mean(Sepal.Length), by=Species][V1 > 1.15]

Which produces just the by columns and the computed value:

      Species       V1
1:     setosa 1.158610
2: versicolor 1.179245
3: versicolor 1.162399
4:  virginica 1.153613
5:  virginica 1.168792
6:  virginica 1.168792
7:  virginica 1.168792
8:  virginica 1.199150
9:  virginica 1.168792
役に立ちましたか?

解決

This can now be simplified with dplyr's new transmute function which drops any columns except for the grouping variable and cumputed variables (V1 in this case).

require(dplyr) # >= 0.3.0.2
iris %>% 
  group_by(Species) %>% 
  transmute(v1 = Sepal.Length / mean(Sepal.Length)) %>% 
  filter(v1 > 1.15)

#Source: local data frame [9 x 2]
#Groups: Species
#
#     Species       v1
#1     setosa 1.158610
#2 versicolor 1.179245
#3 versicolor 1.162399
#4  virginica 1.153613
#5  virginica 1.168792
#6  virginica 1.168792
#7  virginica 1.168792
#8  virginica 1.199150
#9  virginica 1.168792
ライセンス: CC-BY-SA帰属
所属していません StackOverflow
scroll top