Question

I'm plotting a boxplot of y for the interactions between two variables, x1 and x2. The problem is that boxplot still reserves an empty slot for combinations of levels that have no data.

How can I easily avoid the blank space? In reality I have many more than two factor levels. Also, I would like to avoid ggplot2-based solutions.

Example:

> set.seed(0)
> t <- data.frame(y =rnorm(60),
                  x1 = rep(c("a","a","b"), each=20),
                  x2 = rep(c("c","d","d"), each=20))
> boxplot(y~x1+x2, t)
> points(aggregate(y~x1+x2, t, mean)$y, col="red")

The points call that plots the means does not know about the missing interaction b.c, so the red points do not line up with the boxes they belong to.

I could work it out from the output of boxplot(y~x1+x2, t, plot=F), but I don't know how to easily plot the modified object.

> b <- boxplot(y~x1+x2, t, plot=F)
> i <- complete.cases(t(b$stats))
> b$stats <- b$stats[,i]
> b$n <- b$n[i]
> b$conf <- b$conf[,i]
> b$names <- b$names[i]
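
For reference, a list in the format returned by boxplot(..., plot=FALSE) can be drawn with bxp() from the graphics package. The following is only a sketch under a few assumptions: the data frame is named d here instead of t, empty groups are detected through b$n rather than complete.cases, and the group component (which records the box each outlier belongs to) is renumbered so that it still matches the remaining columns.

```r
set.seed(0)
d <- data.frame(y  = rnorm(60),
                x1 = rep(c("a", "a", "b"), each = 20),
                x2 = rep(c("c", "d", "d"), each = 20))

# Compute the boxplot statistics without drawing anything
b <- boxplot(y ~ x1 + x2, d, plot = FALSE)

# Keep only the groups that actually contain observations
keep <- b$n > 0
b$stats <- b$stats[, keep, drop = FALSE]
b$conf  <- b$conf[, keep, drop = FALSE]
b$n     <- b$n[keep]
b$names <- b$names[keep]

# Outliers store the index of their original column; map it to the new position
b$group <- match(b$group, which(keep))

# Draw the modified object
bxp(b)
```

This works, but every component of the list has to be kept in sync by hand.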

Solution

You can combine the two variables into a single factor with interaction(), then drop the unused levels with droplevels():

boxplot(y ~ droplevels(interaction(x1, x2)), t)

points(aggregate(y ~ droplevels(interaction(x1, x2)), t, mean)$y, col="red")
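
A variant of the same idea, as a sketch: compute the combined factor once and store it, so that the boxes and the means are guaranteed to use the same levels in the same order. The data frame name d, the column name g, and the use of tapply instead of aggregate are choices made here for illustration, not part of the answer above.

```r
set.seed(0)
d <- data.frame(y  = rnorm(60),
                x1 = rep(c("a", "a", "b"), each = 20),
                x2 = rep(c("c", "d", "d"), each = 20))

# One factor for the interaction, with the empty combination b.c dropped
d$g <- droplevels(interaction(d$x1, d$x2))

# Boxes are drawn in the order of levels(d$g)
boxplot(y ~ g, d)

# tapply returns one mean per level, in the same order as the boxes
means <- tapply(d$y, d$g, mean)
points(seq_along(means), means, col = "red", pch = 19)
```

Writing interaction(d$x1, d$x2, drop = TRUE) should give the same factor without the separate droplevels call.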


Licensed under: CC-BY-SA with attribution