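# F is made a factor explicitly: since R 4.0.0, data.frame() no longer converts strings to factors,
# and both the color lookup and levels(df$F) below rely on F being a factor
df <- data.frame("Grp" = factor(rep(LETTERS[1:5],each=20)),"V" = rnorm(100),"F" = factor(c(rep("a",80),rep("b",20))))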
boxplot(V~Grp,df,col=c("red","red","red","red","blue"))
# these are the combinations of group and F, the color-by variable
unique(df[c('Grp','F')])
# Grp F
# 1 A a
# 21 B a
# 41 C a
# 61 D a
# 81 E b
## boxplot needs one color per box, i.e. a color vector as long as the number of levels of the grouping variable
(colors <- c('red','blue')[unique(df[c('Grp','F')])$F])
# [1] "red" "red" "red" "red" "blue"
# plot with a legend for F
boxplot(V ~ Grp, df, boxfill = colors)
legend('top', horiz = TRUE, fill = unique(colors), legend = levels(df$F), bty = 'n')
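The lookup above relies on the integer codes of F. A minimal sketch of the same idea with a named palette follows, which may be easier to extend when F has more levels. The names `palette_by_F`, `f_per_group` and `box_colors` are hypothetical, and the sketch assumes F takes a single value within each level of Grp.

```r
# Sketch: pick one color per box by looking up each group's F value in a named palette.
set.seed(1)
df <- data.frame(
  Grp = factor(rep(LETTERS[1:5], each = 20)),
  V   = rnorm(100),
  F   = factor(c(rep("a", 80), rep("b", 20)))
)

# Hypothetical palette: one color per level of F
palette_by_F <- c(a = "red", b = "blue")

# F value of the first row of each group, in the order boxplot() draws the boxes
# (assumption: F is constant within each level of Grp)
f_per_group <- df$F[match(levels(df$Grp), df$Grp)]
box_colors  <- unname(palette_by_F[as.character(f_per_group)])

boxplot(V ~ Grp, data = df, col = box_colors)
legend("top", horiz = TRUE, fill = palette_by_F,
       legend = names(palette_by_F), bty = "n")
```

Looking colors up by name rather than by position means the palette does not depend on the order of the factor levels.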
Color boxes of boxplot differently by factor levels
Question
I assumed this would be a straightforward task, but despite searching Stack Overflow, the documentation and the R-help archives, I have been unable to find the answer. I need to color the boxes of a boxplot differently based on factor values.
The following simplified example shows the desired result:
df<-data.frame("Grp" = rep(LETTERS[1:5],each=20),"V" = rnorm(100),"F" = c(rep("a",80),rep("b",20)))
boxplot(V~Grp,df,col=c("red","red","red","red","blue"))
What I need to do is replace the col=c(...) with something that says the equivalent of "The colors of boxes having F="a" will be red, and the colors of boxes having F="b" will be blue".
In the real data, of course, there are several factors, many more Grps, and so on.
Any ideas will be appreciated.
Thank you.
Solution 2
This is easy to do with ggplot2. I can't think of a straightforward way to do it with base graphics.
The idea is to create a color variable in your dataset that depends on the factor. Then pass it to ggplot as a color attribute:
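df <- data.frame("Grp" = rep(LETTERS[1:5],each=20),"V" = rnorm(100),"F" = c(rep("a",80),rep("b",20)))
df$boxcolor <- with(df, ifelse(F == "a", "red", "blue"))
library(ggplot2)
# scale_color_identity() tells ggplot to use the values of boxcolor as literal color names;
# without it, "red" and "blue" are treated as category labels and drawn in the default palette
ggplot(df, aes(x = Grp, y = V, color = boxcolor)) + geom_boxplot() + scale_color_identity()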
In your simple example, you can pass the variable F directly as the color variable and let ggplot choose the colors for you. I don't know if this will scale up to your more complex problem.
ggplot(df, aes(x = Grp, y = V, color = F)) + geom_boxplot()
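If specific colors are wanted while still mapping F directly, one possible approach is a manual scale. The sketch below is an illustration, not part of the original answer: it maps F to the box fill rather than the outline and supplies a hypothetical named palette `f_colors` through `scale_fill_manual()`.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(
  Grp = rep(LETTERS[1:5], each = 20),
  V   = rnorm(100),
  F   = c(rep("a", 80), rep("b", 20))
)

# Hypothetical palette; the names must match the values of F
f_colors <- c(a = "red", b = "blue")

# Map F to the fill of each box and supply the colors explicitly
p <- ggplot(df, aes(x = Grp, y = V, fill = F)) +
  geom_boxplot() +
  scale_fill_manual(values = f_colors)
print(p)
```

Because the palette is keyed by the values of F, adding a level only requires adding one entry to `f_colors`, and the legend is generated automatically.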