Question

I would like to draw the mean of each facet subset (on both the x and y axes) with ggplot. However, I get the mean of the whole data set rather than the mean of each subset. I don't know how to solve this issue.

library(ggplot2)

hsb2 <- read.table("http://www.ats.ucla.edu/stat/data/hsb2.csv", sep = ",", header = TRUE)
head(hsb2)
hsb2$gender <- as.factor(hsb2$female)

ggplot() +
  geom_point(aes(y = read,x = write,colour = gender),data=hsb2,size = 2.2,alpha = 0.9) +
  scale_colour_brewer(guide = guide_legend(),palette = 'Set1') +
  stat_smooth(aes(x = write,y = read),data=hsb2,colour = '#000000',size = 0.8,method = lm,formula = 'y ~ x') +
  geom_vline(aes(xintercept = mean(write)),data=hsb2,linetype = 3) +
  geom_hline(aes(yintercept = mean(read)),data=hsb2,linetype = 3) +
  facet_wrap(facets = ~gender)



Solution

One way is to calculate the x and y means for each gender explicitly and store them as new columns in the original data frame. When faceting splits the data by gender, each panel then draws its lines at the means of its own subset.

Using tapply

#compute the read and write means for each gender 
read_means <- tapply(hsb2$read, hsb2$gender, mean)
write_means <- tapply(hsb2$write, hsb2$gender, mean)

#store it in the data frame
hsb2$read_mean <- ifelse(hsb2$gender==0, read_means[1], read_means[2])
hsb2$write_mean <- ifelse(hsb2$gender==0, write_means[1], write_means[2])

An alternative to the lines above is to use ddply.

Using ddply from the plyr package

The new columns can be created in a single call.

library(plyr)
# assign the result back, otherwise the new columns are discarded
hsb2 <- ddply(hsb2, "gender", transform,
              read_mean  = mean(read),
              write_mean = mean(write))
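
If plyr is not installed, base R's ave() may be a convenient substitute: it returns a vector as long as its input in which every row carries the mean of its own group, so it also works for more than two groups. The sketch below runs on a small made-up data frame (df and its values are hypothetical stand-ins for hsb2, used only so the example is self-contained).

```r
# Toy stand-in for hsb2 (hypothetical values, not the real data)
df <- data.frame(
  gender = factor(c(0, 0, 1, 1, 1)),
  read   = c(50, 60, 40, 50, 60),
  write  = c(45, 55, 50, 60, 70)
)

# ave() computes FUN within each level of gender and repeats the
# result on every row belonging to that level
df$read_mean  <- ave(df$read,  df$gender, FUN = mean)
df$write_mean <- ave(df$write, df$gender, FUN = mean)
```

Applied to hsb2 instead of df, the same two ave() calls should add the read_mean and write_mean columns used in the plot.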

Now pass the two new mean columns to the geom_vline and geom_hline calls in ggplot.

ggplot() +
  geom_point(aes(y = read,x = write,colour = gender),data=hsb2,size = 2.2,alpha = 0.9) +
  scale_colour_brewer(guide = guide_legend(),palette = 'Set1') +
  stat_smooth(aes(x = write,y = read),data=hsb2,colour = '#000000',
              size = 0.8,method = lm,formula = 'y ~ x') +
  geom_vline(aes(xintercept = write_mean),data=hsb2,linetype = 3) +
  geom_hline(aes(yintercept = read_mean),data=hsb2,linetype = 3) +
  facet_wrap(facets = ~gender)

Produces: [image: the faceted plot with per-gender mean lines]
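
A possible variation, not part of the answer above: because the mean columns are repeated on every row, geom_vline and geom_hline draw one identical line per observation in each panel. A separate summary data frame with one row per gender avoids that. The sketch below assumes ggplot2 is installed and again uses a made-up data frame (df, means and p are hypothetical names for illustration).

```r
library(ggplot2)

# Toy stand-in for hsb2 (hypothetical values, not the real data)
df <- data.frame(
  gender = factor(c(0, 0, 1, 1, 1)),
  read   = c(50, 60, 40, 50, 60),
  write  = c(45, 55, 50, 60, 70)
)

# One row per gender holding that group's mean of read and write
means <- aggregate(cbind(read, write) ~ gender, data = df, FUN = mean)
names(means) <- c("gender", "read_mean", "write_mean")

# The summary frame shares the gender column with df, so facet_wrap
# places each mean line in the matching panel
p <- ggplot(df, aes(x = write, y = read)) +
  geom_point(aes(colour = gender)) +
  geom_vline(aes(xintercept = write_mean), data = means, linetype = 3) +
  geom_hline(aes(yintercept = read_mean), data = means, linetype = 3) +
  facet_wrap(~gender)
```

Printing p draws the plot. With hsb2 in place of df, the summary has two rows, so each panel gets exactly one vertical and one horizontal line.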

Licensed under: CC-BY-SA with attribution