Question

I'm going to use the diamonds data set that comes standard with the ggplot2 package to illustrate what I'm looking for.

I want to build a graph that is like this:

library(ggplot2)
ggplot(diamonds, aes(clarity, fill=cut)) + geom_bar(position="dodge")

However, instead of a count, I would like to plot the mean of a continuous variable: the mean carat for each combination of cut and color. If I put in this code:

ggplot(diamonds, aes(carat, fill=cut)) + geom_bar(position="dodge")

My output is a count of diamonds at each carat value, filled by cut, rather than a mean.

Anyone know how to do this?


Solution

You can get a new data frame with mean(carat) grouped by cut and color and then plot:

library(plyr)
data <- ddply(diamonds, .(cut, color), summarise, mean_carat = mean(carat))
ggplot(data, aes(color, mean_carat,fill=cut))+geom_bar(stat="identity", position="dodge")
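If you would rather not build an intermediate data frame, ggplot2 can compute the mean per group itself. This is only a sketch of an alternative, not part of the approach above: it assumes ggplot2 3.3.0 or later, where the argument is called `fun` (older versions used `fun.y`).

```r
library(ggplot2)

# Map the raw continuous variable (carat) to y and let stat_summary()
# reduce each color/cut group to its mean before drawing the bars.
# Assumption: ggplot2 >= 3.3.0 (`fun`); older releases used `fun.y`.
p <- ggplot(diamonds, aes(color, carat, fill = cut)) +
  stat_summary(fun = mean, geom = "bar", position = "dodge")

print(p)
```

This should give the same bars as the pre-aggregated version, at the cost of recomputing the means every time the plot is drawn.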


If you want a faster solution, you can use either dplyr or data.table.

With dplyr:

library(dplyr)
data <- group_by(diamonds, cut, color) %>% summarise(mean_carat = mean(carat))

With data.table:

library(data.table)
data <- data.table(diamonds)[,list(mean_carat=mean(carat)), by=c('cut', 'color')]

The code for the plot is the same for both.
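For reference, here is what a complete run might look like with dplyr, from summary to plot. Treat it as a minimal sketch under some assumptions: the name `mean_carat_by_group` is just illustrative, `.groups = "drop"` needs dplyr 1.0.0 or later, and `geom_col()` needs ggplot2 2.2.0 or later.

```r
library(ggplot2)
library(dplyr)

# Mean carat for every cut/color combination.
# `.groups = "drop"` returns an ungrouped result (assumes dplyr >= 1.0.0).
mean_carat_by_group <- diamonds %>%
  group_by(cut, color) %>%
  summarise(mean_carat = mean(carat), .groups = "drop")

# geom_col() is shorthand for geom_bar(stat = "identity").
p <- ggplot(mean_carat_by_group, aes(color, mean_carat, fill = cut)) +
  geom_col(position = "dodge")

print(p)
```

The data.table result can be passed to the same `ggplot()` call, since it has the same three columns.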

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow