Question

I have a series of data that indicates how long ago a certain type of DNA element was active in the genome. It might look something like this:

data.df <- data.frame(name=c("type1", "type1", "type1", "type2", "type2", "type2"),
                      active=c(9,11,10,21,21,18))

So there are three type1 elements that were active approximately 10 years ago and three type2 elements that were active approximately 20 years ago.

I've created a stacked density plot using ggplot2 to get a distribution of when each element was active, something like this:

ggplot(data.df, aes(x=active)) + geom_density(position="stack", aes(fill=name))

[Image: stacked sample plot]

I have information on the relative abundances of these elements, and I would like to multiply the height of each element's density by that number. This would give me the actual abundance of activity of these elements in the genome, rather than just a distribution of their activity.

So my question boils down to: how do I multiply the height of each element type's density by some factor that depends on the group? For example, if I had 1000 type1 elements in the genome and only 3 type2 elements, the stacked density plot would be dominated by type1, and you'd hardly see the curve associated with type2.

I hope this makes sense. Thanks in advance!


Solution

I am not sure if I have understood your question correctly, but is this what you want?

ggplot(data.df) +
  geom_density(aes(x = active, y = ..scaled.., fill = name), position = "stack")

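ggplot2's help under stat_density says that scaled gives the "density estimate, scaled to maximum of 1". Recent ggplot2 versions write this as after_stat(scaled); the ..scaled.. form is the older notation.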
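To see what that scaling does, here is a rough base-R sketch of the scaled variable next to count, which the same help page describes as "density * number of points". It calls density() directly rather than going through ggplot2, so the evaluation grid may differ from what geom_density would use; treat it as an illustration only.

```r
# Illustration only: mimic stat_density's "scaled" and "count" with base R.
x <- c(9, 11, 10)            # the type1 values from the question
d <- density(x)              # default Gaussian kernel, bandwidth bw.nrd0

scaled <- d$y / max(d$y)     # peak is always 1, whatever the group size
count  <- d$y * length(x)    # area under the curve is about length(x)

dx <- diff(d$x[1:2])         # spacing of the evaluation grid
cat("max(scaled):", max(scaled), "\n")
cat("area under count:", sum(count) * dx, "\n")
```

Because scaled forces every group's peak to 1, it cannot show that one type is far more abundant than another. count scales with the number of rows in each group, so it only reflects abundance if the data frame has one row per element.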

Alternatively, you could add a weight column (e.g., wght) to your data.frame, map it to the weight aesthetic in geom_density, and ignore the warning message:

data.df <- data.frame(name = c(rep("type1", 6), rep("type2", 3)),
                      active = c(1.1, 1, 1, 1, 1, 1, 17.1, 17, 17),
                      stringsAsFactors = FALSE)
data.df$wght <- c(rep(1/6, 6), rep(4/9, 3))

ggplot(data.df) +
  geom_density(aes(x = active, y = ..density.., fill = name, weight = wght),
               position = "stack")

However, I do not exactly know how geom_density handles weights that do not sum up to 1.
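One way to sidestep the question of how geom_density treats the weights is to compute each group's density yourself with density(), multiply it by the group's abundance, and stack the results with geom_area. The sketch below is only that, a sketch: the abundance vector is hypothetical (it holds the 1000 and 3 from the question's example), and evaluating all groups on one shared x grid is my own choice so that the curves can be stacked point by point.

```r
# Hypothetical abundances per element type (values from the question's example).
abundance <- c(type1 = 1000, type2 = 3)

data.df <- data.frame(
  name   = c("type1", "type1", "type1", "type2", "type2", "type2"),
  active = c(9, 11, 10, 21, 21, 18)
)

# One numeric vector of activity times per element type.
groups <- split(data.df$active, data.df$name)

# Shared x range: pad the data range by 3 times the largest default bandwidth.
pad    <- 3 * max(sapply(groups, bw.nrd0))
x_from <- min(data.df$active) - pad
x_to   <- max(data.df$active) + pad

# Density per group on the same grid, multiplied by that group's abundance.
scaled.df <- do.call(rbind, lapply(names(groups), function(nm) {
  d <- density(groups[[nm]], from = x_from, to = x_to, n = 512)
  data.frame(name = nm, x = d$x, y = d$y * abundance[[nm]])
}))

# Plot only if ggplot2 is installed; geom_area stacks the pre-computed curves.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(scaled.df, aes(x = x, y = y, fill = name)) +
    geom_area(position = "stack") +
    labs(x = "active", y = "abundance-weighted density")
  print(p)
}
```

With these numbers the area under each curve is roughly that type's abundance, so type1 dominates the plot and type2 is barely visible, which is the behaviour the question describes.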

Licensed under: CC-BY-SA with attribution