Question

I have a dataframe mdata which looks like:

>head(mdata)

          ID  variable    value
 SJ5444_MAXGT   coding   4.241920
 SJ5426_MAXGT   coding   4.254331
 HR1383_MAXGT   coding   4.244994
 HR5522_MAXGT   missense 4.250347
CH30041_MAXGT   missense 4.303174
 SJ5438_MAXGT   utr.3    4.242218

and I am trying to plot a violin plot like this:

x1<- mdata$value[mdata$variable=='coding']
x2<- mdata$value[mdata$variable=='missense']
x3<- mdata$value[mdata$variable=='utr.3']

vioplot(x1, x2, x3, names=as.character(unique(mdata$variable)), col="red")
title("Violin Plot: Log10 values")

But I have another dataframe ndata which looks like:

>head(ndata)

           ID variable   value
 SJ5444_MAXGT   coding   17455
 SJ5426_MAXGT   coding   17961
 HR1383_MAXGT   coding   17579
 HR5522_MAXGT   missense 17797
CH30041_MAXGT   missense 20099
 SJ5438_MAXGT   utr.3    17467

Basically mdata$value is:

mdata$value = log10(ndata$value)

So I can make the violin plot without trouble, but I need the Y-axis labels to show the values from ndata$value rather than mdata$value. That is, I am plotting mdata$value but want the axis labels taken from ndata$value. Just FYI, this is a subset of the actual data; the min and max values in the actual data are 12 and 36937. I know how to do this on a boxplot using:

axis(side=2,labels=round(10^(seq(log10(min(ndata$value)),log10(max(ndata$value)),len=5))),at=seq(log10(min(ndata$value)),log10(max(ndata$value)),len=5))

But I cannot plot the Y-axis labels to match ndata$value in the Violin plot. Any suggestions?

P.S. I could not find a tag vioplot or violinplot so I couldn't tag it.

Solution

vioplot isn't very flexible -- it doesn't allow you to turn off the axis labels or modify them -- but you can create your own empty plot first, then add the violin plot to it with vioplot(...,add=TRUE), then add the labels manually, as follows:

## make up data
set.seed(101)
x1 <- rlnorm(1000,meanlog=3,sdlog=1)
x2 <- rlnorm(1000,meanlog=3,sdlog=2)
x3 <- rlnorm(1000,meanlog=2,sdlog=2)

Now create the plot:

library(vioplot)
par(las=1,bty="l")  ## my preferred setting
## set up empty plot
plot(0:1,0:1,type="n",xlim=c(0.5,3.5),ylim=range(log10(c(x1,x2,x3))),
     axes=FALSE,ann=FALSE)
vioplot(log10(x1),log10(x2),log10(x3),add=TRUE)
axis(side=1,at=1:3,labels=c("first","second","third"))
axis(side=2,at=-2:4,labels=10^(-2:4))
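Applied to data shaped like the question's, the same approach might look roughly like the sketch below. The ndata frame here is made up to mirror the question's columns, and log_axis_ticks() is a hypothetical helper written for this example, not part of vioplot. It places five evenly spaced ticks on the log10 scale and labels them with the corresponding raw values, as in the question's boxplot axis() call.

```r
library(vioplot)

## made-up stand-in for the question's ndata (raw, untransformed counts)
set.seed(101)
ndata <- data.frame(
  variable = rep(c("coding", "missense", "utr.3"), each = 200),
  value    = round(rlnorm(600, meanlog = 8, sdlog = 1)) + 12
)

## hypothetical helper: n evenly spaced ticks on the log10 scale,
## labelled with the corresponding raw values
log_axis_ticks <- function(raw, n = 5) {
  at <- seq(log10(min(raw)), log10(max(raw)), length.out = n)
  list(at = at, labels = round(10^at))
}

ticks  <- log_axis_ticks(ndata$value)
groups <- split(log10(ndata$value), ndata$variable)

## empty plot on the log10 scale, then add the violins and custom axes
plot(0:1, 0:1, type = "n", xlim = c(0.5, length(groups) + 0.5),
     ylim = range(ticks$at), axes = FALSE, ann = FALSE)
vioplot(groups[[1]], groups[[2]], groups[[3]], col = "red", add = TRUE)
axis(side = 1, at = seq_along(groups), labels = names(groups))
axis(side = 2, at = ticks$at, labels = ticks$labels)
title("Violin Plot: Log10 values")
```

The violins are still drawn from the log10 values; only the tick labels are converted back to the original units.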


Alternately, you could use ggplot2::geom_violin() along with scale_y_log10() (I think).
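A minimal sketch of that ggplot2 route, assuming a data frame with the same columns as the question's ndata (the data below are made up). The raw values are mapped to y, and scale_y_log10() transforms the axis, so the tick labels stay in the original units without any manual back-conversion.

```r
library(ggplot2)

## made-up stand-in for the question's ndata (raw, untransformed values)
set.seed(101)
ndata <- data.frame(
  variable = rep(c("coding", "missense", "utr.3"), each = 200),
  value    = rlnorm(600, meanlog = 8, sdlog = 1)
)

## map the raw values; the scale applies the log10 transformation
## before the violin densities are computed
p <- ggplot(ndata, aes(x = variable, y = value)) +
  geom_violin(colour = "black", fill = "red") +
  scale_y_log10() +
  labs(title = "Violin plot on a log10 axis", x = NULL, y = "value")
print(p)
```

With this approach the breaks are chosen by ggplot2; pass breaks= to scale_y_log10() if specific tick positions are needed.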

OTHER TIPS

Based on Ben Bolker's suggestion, I used ggplot2::geom_violin() and achieved what I wanted, plotting log10(value) but labeling 'value' as such on the Y-axis using:

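library(ggplot2)
## use the raw values (ndata in the question); y is log10-transformed,
## so compute the breaks on the log10 scale and label them in raw units
brks <- seq(log10(min(ndata$value)), log10(max(ndata$value)), length.out = 5)
ggplot(ndata, aes(variable, log10(value))) +
  geom_violin(colour = "black", fill = "red") +
  scale_y_continuous(breaks = brks, labels = round(10^brks))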
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow