R: Combining lattice dotplots and ordering data
Question
To create my dotplot I am using the following text file:
## filename difference RMSD
1bso.pdb 1.0 0.5645
1cj51.9.pdb 2.0 3.5596
1cj51.1.pdb 3.0 3.5573
3qzj.pdb 3.0 0.8302
1bsy.pdb 4.0 0.5387
1cj51.5.pdb 8.0 3.9864
2gj5.pdb 10.0 0.8446
1cj51.10.pdb 11.0 3.5914
1uz2.pdb 12.0 1.7741
2blg.pdb 12.0 0.5449
The first column is the file name, the second column is the difference, and the third is the RMSD. The data are ordered so that the difference is ascending.
I can create individual dot plots using the following commands:
# This plots the difference
library(lattice)
data <- read.table("~/Documents/Beta_test_area/pa.txt", header=F, sep="\t")
dotplot(V1~V2, xlim=c(0, 150), xlab="CCS Difference", data=data)
# This plots the RMSD
dotplot(V1~V3, xlim=c(0, 5), xlab="RMSD", data=data)
On the graph, the Y axis is ordered by file name rather than in the order used in the text file. How can I order the Y axis to mirror the order in the data file?
The other problem I am having is combining the plots. How can I arrange the plots in one row over two columns, with the difference plot on the left and the RMSD plot on the right?
Solution
@Roman's part #1 is correct -- here's a slightly slick way to get the order the way you want it.
dat <- read.table(textConnection("
filename diff RMSD
1bso.pdb 1.0 0.5645
1cj51.9.pdb 2.0 3.5596
1cj51.1.pdb 3.0 3.5573
3qzj.pdb 3.0 0.8302
1bsy.pdb 4.0 0.5387
1cj51.5.pdb 8.0 3.9864
2gj5.pdb 10.0 0.8446
1cj51.10.pdb 11.0 3.5914
1uz2.pdb 12.0 1.7741
2blg.pdb 12.0 0.5449"),
header=TRUE)
dat <- transform(dat,filename=factor(as.character(filename),
levels=filename))
The grid.arrange function from the gridExtra package is handy for arranging lattice plots:
library(lattice)
d1 <- dotplot(filename~diff, xlim=c(0, 150), xlab="CCS Difference", data=dat)
# This plots the RMSD
d2 <- dotplot(filename~RMSD, xlim=c(0, 5), xlab="RMSD", data=dat)
library(gridExtra)
grid.arrange(d1,d2,nrow=1)
Or (from @Aaron):
library(latticeExtra)
c(d1,d2)
Alternatively, as @Roman suggested, you can create small multiples.
library(reshape)
m <- melt(dat)
dotplot(filename~value|variable,
scales=list(x=list(relation="free")), xlim=list(c(0,150), c(0,5)),
data=m)
Or
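library(ggplot2)
# theme(panel.spacing=...) is the current form of the older opts(panel.margin=...)
g1 <- ggplot(m, aes(value, filename)) + geom_point() +
facet_grid(.~variable, scales="free") + theme_bw() +
theme(panel.spacing=grid::unit(0,"lines"))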
although here I really don't know how to set the x axis limits panel-by-panel, other than doing something nasty like trying to add invisible points appropriately.
edit: panel-by-panel scaling from Josh O'Brien, latticeExtra from Aaron
OTHER TIPS
I think your first question is related to the ordering of factors. It's a common problem, but once you learn the trick that factors use, it becomes a (nice) feature. This has been discussed a number of times, at least here and here.
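As a minimal sketch of the factor trick mentioned above (the vector names are made up for illustration): by default, factor() sorts its levels, and plotting functions follow the level order rather than the row order. Passing the levels explicitly keeps the order from the file.

```r
# File names in the order they appear in the data file
fn <- c("1bso.pdb", "1cj51.9.pdb", "3qzj.pdb", "1bsy.pdb")

# Default: levels are sorted, so a plot axis would be alphabetical
f_default <- factor(fn)
levels(f_default)

# Explicit levels: the axis follows the order in the file;
# unique() guards against repeated names
f_ordered <- factor(fn, levels = unique(fn))
levels(f_ordered)
```

Comparing the two levels() results shows why the plot ignores the row order unless the levels are set by hand.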
I'm not sure I understand your second question down to all the details, but generally there are two strategies. In base graphics, you can use the par argument mfrow to open a device with defined rows/columns into which you plot your graphics, e.g. par(mfrow = c(2, 1)), which will plot two plots in two rows and one column. par(mfrow = c(2,2)) will give you graphs laid out in a 2x2 grid. You can also consider the alternatives layout and split.screen.
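For the base graphics route, a rough sketch of a one-row, two-column layout might look like the following. It assumes dotchart() as the base analogue of lattice's dotplot() and uses only the first four rows of the question's data; the object names are illustrative.

```r
# First four rows of the question's data, kept in file order
dat <- data.frame(
  filename = c("1bso.pdb", "1cj51.9.pdb", "1cj51.1.pdb", "3qzj.pdb"),
  diff     = c(1, 2, 3, 3),
  RMSD     = c(0.5645, 3.5596, 3.5573, 0.8302)
)

# One row, two columns; keep the old settings so they can be restored
op <- par(mfrow = c(1, 2))
dotchart(dat$diff, labels = dat$filename, xlim = c(0, 150),
         xlab = "CCS Difference")
dotchart(dat$RMSD, labels = dat$filename, xlim = c(0, 5),
         xlab = "RMSD")
par(op)
```

Note that par(mfrow = ...) only affects base graphics, so it will not arrange lattice or ggplot2 objects.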
In grid graphics (think lattice and ggplot2), the approach is different. You can plot a number of graphs in a grid, using | or facet_grid for lattice and ggplot2, respectively.