Question

I assumed this would be a straightforward task, but despite searching Stack Overflow, the documentation, and the R-help archives, I have been unable to find the answer. I need to color the boxes of a boxplot differently based on the values of a factor.

The following simplified example shows the desired result:

df<-data.frame("Grp" = rep(LETTERS[1:5],each=20),"V" = rnorm(100),"F" = c(rep("a",80),rep("b",20)))
boxplot(V~Grp,df,col=c("red","red","red","red","blue"))

What I need to do is replace the col=c(...) with something that says the equivalent of "The colors of boxes having F="a" will be red, and the colors of boxes having F="b" will be blue". In real data, of course, there are several factors, there are many more Grps, and so on.

Any ideas will be appreciated.

Thank you.


Solution 2

# F must be a factor here: since R 4.0, data.frame() no longer converts strings to factors
df<-data.frame("Grp" = factor(rep(LETTERS[1:5],each=20)),"V" = rnorm(100),"F" = factor(c(rep("a",80),rep("b",20))))
boxplot(V~Grp,df,col=c("red","red","red","red","blue"))

# these are the combinations of group and F, the color-by variable    
unique(df[c('Grp','F')])
#    Grp F
# 1    A a
# 21   B a
# 41   C a
# 61   D a
# 81   E b

## boxplot needs one color per box, i.e. one per level of the grouping variable
(colors <- c('red','blue')[unique(df[c('Grp','F')])$F])
# [1] "red"  "red"  "red"  "red"  "blue"

# plot with a legend for F
boxplot(V ~ Grp, df, boxfill = colors)
legend('top', horiz = TRUE, fill = unique(colors), legend = levels(df$F), bty = 'n')
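
The same lookup can be written so that the colors are tied to the values of F by name rather than by position, which may be easier to extend to more groups. This is only a sketch: it assumes every Grp has exactly one value of F, and the names palette_f, grp_to_f and box_cols are made up for illustration.

```r
set.seed(1)  # only to make the example reproducible
df <- data.frame(
  Grp = factor(rep(LETTERS[1:5], each = 20)),
  V   = rnorm(100),
  F   = factor(c(rep("a", 80), rep("b", 20)))
)

# Named palette: one color per value of F (hypothetical name)
palette_f <- c(a = "red", b = "blue")

# One F value per group, in the order of levels(df$Grp), which is the order
# in which boxplot() draws the boxes. Stops if a group mixes values of F.
grp_to_f <- vapply(
  split(as.character(df$F), df$Grp),
  function(x) {
    u <- unique(x)
    stopifnot(length(u) == 1)
    u
  },
  character(1)
)

# Look up each group's color by the name of its F value
box_cols <- unname(palette_f[grp_to_f])
box_cols
# [1] "red"  "red"  "red"  "red"  "blue"

boxplot(V ~ Grp, data = df, col = box_cols)
legend("top", horiz = TRUE, fill = palette_f, legend = names(palette_f), bty = "n")
```

Adding a level to F then only means adding an entry to palette_f; the rest stays unchanged.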


OTHER TIPS

This is easy to do with ggplot2. I can't think of a straightforward way to do it with base graphics.

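The idea is to create a color variable in your dataset that depends on the factor, then map it to the color aesthetic. Because that variable already holds literal color names, add scale_color_identity() so that ggplot uses them as-is instead of treating them as category labels and picking its own palette:

df<-data.frame("Grp" = rep(LETTERS[1:5],each=20),"V" = rnorm(100),"F" = c(rep("a",80),rep("b",20)))
df$boxcolor <- with(df, ifelse(F == "a", "red", "blue"))

library(ggplot2)
ggplot(df, aes(x = Grp, y = V, color = boxcolor)) + geom_boxplot() + scale_color_identity()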

In your simple example, you can pass the variable F directly as the color variable and let ggplot choose the colors for you. I don't know if this will scale up to your more complex problem.

ggplot(df, aes(x = Grp, y = V, color = F)) + geom_boxplot()    
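
If you want to choose the colors yourself while still mapping F directly, one possible approach is a manual scale with a named vector. The sketch below also uses fill rather than color, on the assumption that the box interiors (not just the outlines) should be colored, as in the base boxplot call in the question; f_colors is a made-up name.

```r
library(ggplot2)

df <- data.frame(
  Grp = rep(LETTERS[1:5], each = 20),
  V   = rnorm(100),
  F   = c(rep("a", 80), rep("b", 20))
)

# Names are the values of F, entries are the colors wanted (hypothetical name)
f_colors <- c(a = "red", b = "blue")

p <- ggplot(df, aes(x = Grp, y = V, fill = F)) +
  geom_boxplot() +
  scale_fill_manual(values = f_colors)  # matched to F by name
print(p)
```

This also gives a legend for F without extra work. If several factors determine the color, one option may be to map fill to interaction(F, G) for a hypothetical second factor G and supply one named color per combination.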
Licensed under: CC-BY-SA with attribution