Question

I'd like to make a density plot in rpy2 (using ggplot2) that has y-values representing fractional counts, such that the y axis can be interpreted as "fraction of data points" that have a particular value. My code is:

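import numpy as np
import pandas
from rpy2.robjects import r, Formula
from rpy2.robjects.lib import ggplot2
from rpy2.robjects.lib.ggplot2 import aes_string

df = pandas.melt(pandas.DataFrame({"x": np.random.rand(1000),
                                   "y": list(np.random.rand(20)) + [np.nan] * 980}))
# pandas dataframe to R (make_r_df is a conversion helper, not shown here)
r_df = make_r_df(df)
r.pdf("plot.pdf")
p = ggplot2.ggplot(r_df) + \
    ggplot2.geom_density(aes_string(x="value",
                                    y="..count../..sum..(..count..)")) + \
    ggplot2.facet_wrap(Formula("~ variable"))
p.plot()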

x has more points than y, and the resulting plot shows that the density for y is uniformly lower. This does not make sense if the y axis is normalized to the number of points. It seems like y=..count../..sum..(..count..) is somehow not interpreted. How can I get this to work? Thanks.

Solution

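I think it should be sum(), not ..sum..(), i.e. y="..count../sum(..count..)". The ..name.. syntax refers to variables computed by the stat, such as ..count.., whereas sum is an ordinary R function. (A quick search to verify points toward a similar question on SO.)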
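To see what the count / sum(count) arithmetic is meant to achieve, here is a minimal NumPy-only sketch. It does not call ggplot2 or rpy2: density_counts is a hypothetical helper that hand-rolls a Gaussian kernel density estimate, and the bandwidth, grid, and random seed are arbitrary choices made for illustration. It also assumes the normalization is applied separately to each variable, which may or may not match how ggplot2 evaluates sum() across facets.

```python
import numpy as np


def density_counts(values, grid, bandwidth=0.1):
    """Gaussian KDE evaluated on `grid`, scaled by n (roughly like ..count..)."""
    values = np.asarray(values, dtype=float)
    values = values[~np.isnan(values)]  # drop the NaN padding
    z = (grid[:, None] - values[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z ** 2) / (bandwidth * np.sqrt(2 * np.pi))
    density = kernel.mean(axis=1)  # density estimate at each grid point
    return density * values.size  # density * number of observations


rng = np.random.default_rng(0)
grid = np.linspace(-0.5, 1.5, 512)

# Same shape of data as in the question: 1000 points for x, 20 for y.
x_count = density_counts(rng.random(1000), grid)
y_count = density_counts(list(rng.random(20)) + [np.nan] * 980, grid)

# Dividing by the sum puts both curves on the same fractional scale.
x_frac = x_count / x_count.sum()
y_frac = y_count / y_count.sum()

print("max count:", x_count.max(), y_count.max())
print("sum of fractions:", x_frac.sum(), y_frac.sum())
```

The raw counts differ by the ratio of sample sizes, which is why the y curve sits uniformly lower when the normalization is not applied. After dividing by the sum, each curve's values add up to 1 over the grid, independent of how many points went into it.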

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow