Question

I have this plot which is too crowded to be useful:

ggplot(data = meandist.SG, aes(x = starttime,y = meandist)) +  #set main plot variables
    geom_ribbon(aes(ymin=meandist-se, ymax=meandist+se, fill=mapped), alpha=0.1) + #add standard error 
    geom_line(aes(colour = mapped),alpha = 1) +  #add a line for each group 
    labs(title = "Comparison of Groups", x = "time (s)", y = "mean distance (mm)") #set title, and axis labels

[Image: the resulting plot, unreadable]

I can make a plot for each pair of groups by wrapping the following in mlply and passing in the possible group pairs. But this means I can't easily see all the plots at the same time.

ggplot(data = subset(meandist.SG, mapped %in% c('a', 'f')) ,aes(x = starttime,y = meandist)) +  #set main plot variables
    geom_ribbon(aes(ymin=meandist-se, ymax=meandist+se, fill=mapped), alpha=0.1) + #add standard error to main plot
    geom_line(aes(colour = mapped),alpha = 1,size = 1) + #plot a line on main plot for each group
    labs(title = 'GroupA and GroupB, Distance over Time', x = "time (s)", y = "mean distance (mm)")

[Image: the plot for a single pair of groups]

What I'd like to do is create a single image with the paired group plots arranged like a pairplot with the mapped factor as the diagonal.

The data looks like this:

> str(meandist.SG)
'data.frame':   2400 obs. of  4 variables:
 $ starttime: num  0 0 0 0 0 0 0 0 60 60 ...
 $ mapped   : Factor w/ 8 levels "rowA","rowB",..: 1 2 3 4 5 6 7 8 1 2 ...
 $ meandist : num  123.2 115 91.9 112.8 108.6 ...
 $ se       : num  8.95 9.54 9.57 9.86 11.96 ...

> head(meandist.SG)
  starttime mapped meandist        se
1         0   rowA 123.1739  8.952757
2         0   rowB 114.9875  9.544961
3         0   rowC  91.8875  9.571005
4         0   rowD 112.7583  9.861424
5         0   rowE 108.5826 11.962127
6         0   rowF 126.4917  9.331622

I'm thinking I should use the GGally package, but I can't figure out how to use the levels of a factor as the diagonal. Ideas?


Solution

If I understand you correctly, here is a solution using facets. I had to generate a demo dataset because your sample is not nearly sufficient.

library(ggplot2)
library(data.table)
library(plyr)
# this generates the demo dataset - you have this already
set.seed(1)
df <- do.call(rbind,lapply(1:8,function(i){
  data.frame(starttime=seq(0,20000,100),
        mapped=LETTERS[i],
        meandist=100*i+rnorm(201,0,20),
        se=50)
}))
# you start here...
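dt <- data.table(df)
setnames(dt,c("starttime","mapped","meandist","se"),c("x","H","y.H","se.H"))
setkey(dt,x)
# second copy of the data with the group columns renamed to V, keyed by x
gg <- dt[,list(V=H,y.V=y.H,se.V=se.H),keyby=x]
# join the two copies on x: every H row is paired with every V row at the same x
gg <- dt[gg, allow.cartesian=TRUE]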
ggp <- ggplot(gg,aes(x=x))
ggp <- ggp + geom_line(aes(y=y.H, color=H))
ggp <- ggp + geom_line(data=gg[H!=V], aes(y=y.V, color=V))  # layers no longer accept subset=; filter the data instead
ggp <- ggp + geom_ribbon(aes(ymin=y.H-se.H, ymax=y.H+se.H, fill=H), alpha=0.1)
ggp <- ggp + geom_ribbon(aes(ymin=y.V-se.V, ymax=y.V+se.V, fill=V), alpha=0.1)
ggp <- ggp + facet_grid(V~H, scales="free")
ggp <- ggp + guides(fill=guide_legend("mapped"),color=guide_legend("mapped"))
ggp <- ggp + theme(axis.text.x=element_text(angle=-90,vjust=.2, hjust=0))
ggp <- ggp + labs(x="Start Time",y="Mean Distance")
print(ggp)

This creates a faceted pairwise plot of meandist vs. starttime for each pair of groups (the levels of `mapped`). Note that you get two copies of each plot, one above and one below the diagonal.
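If the duplicate panels are unwanted, one possible approach is to filter the joined table down to one triangle before plotting. The sketch below is not part of the answer's code: `grp` and `panel_pairs` are hypothetical stand-ins for the group names and for the joined table `gg`, and it assumes the groups are ordered the same way on both axes.

```r
library(data.table)

# Hypothetical stand-in for the joined table: one row per (H, V) panel
grp <- c("A", "B", "C")
panel_pairs <- CJ(H = grp, V = grp)   # all 9 ordered pairs

# Fix the group order explicitly so "below the diagonal" is well defined
panel_pairs[, H := factor(H, levels = grp)]
panel_pairs[, V := factor(V, levels = grp)]

# With facet_grid(V ~ H), V indexes rows and H indexes columns, so
# column index <= row index keeps the diagonal and the panels below it
lower <- panel_pairs[as.integer(H) <= as.integer(V)]
nrow(lower)   # 3 diagonal panels + 3 below the diagonal
```

Applied to the real joined table, the same row filter should leave the panels above the diagonal empty, since `facet_grid` still lays out every row and column combination.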

This approach basically creates two copies of the dataset and does a Cartesian join on the x-variable (starttime). I use data tables because the join is much more efficient and the code is more compact. I renamed the columns for convenience.
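To make the join step concrete, here is a minimal sketch on a toy table. The values are made up for illustration, and it uses `on = "x"` rather than keys, which should give the same pairing as the keyed join in the answer.

```r
library(data.table)

# Toy data: 2 time points x 3 groups (values are made up for illustration)
toy <- data.table(
  x   = rep(c(0, 60), each = 3),
  H   = rep(c("A", "B", "C"), times = 2),
  y.H = c(10, 20, 30, 11, 21, 31)
)

# Second copy with the group columns renamed, so both copies can sit side by side
copy <- toy[, list(x, V = H, y.V = y.H)]

# Join on x: each row of `copy` matches every row of `toy` with the same x.
# allow.cartesian = TRUE is needed because the result has more rows than
# the two inputs combined.
pairs <- toy[copy, on = "x", allow.cartesian = TRUE]

nrow(pairs)                   # 2 time points * 3 * 3 group pairs
unique(pairs[, list(H, V)])   # every ordered (H, V) combination
```

Each row of `pairs` now carries the values for one horizontal group (H) and one vertical group (V) at the same x, which is what lets `facet_grid(V~H)` draw both lines in the same panel.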

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow