Controlling z-order of points and combining legends from multiple layers in ggplot2/rpy2

StackOverflow https://stackoverflow.com/questions/16880159

  •  30-05-2022

Question

I have a dense scatter plot and want to control the order in which the points are drawn, since using alpha does not help: there is a minority of points that I want plotted "on top". It sounds like the only way to control this "z order" of points in ggplot2 is to use layers. I have the following:

library(ggplot2)
df <- data.frame(x=rnorm(10000), y=rnorm(10000))
df$g <- "1"
df$g[500] <- "special"
df$g[600] <- "special2"
ggplot(df) + geom_point(aes(x=x, y=y, colour=g))

This displays a nice legend for the values in the g column. Here the points with the values "special" and "special2" happen to be plotted on top of the rest, but often that is not the case. So I tried splitting the data frame and plotting each piece as its own layer, to guarantee the order of the points:

df1 <- df[df$g == "special",]
df2 <- df[df$g == "special2",]
df3 <- df[df$g == "1",]
# plot layer by layer
p <- ggplot() + geom_point(aes(x=x, y=y), data=df1, color="red")
p <- p + geom_point(aes(x=x, y=y), data=df2, color="green")
p <- p + geom_point(aes(x=x, y=y), data=df3, color="blue")

This works, but no legend is shown, unlike in the first case where df is plotted with just colour=g. How can I get the legend to show?

Also, is there a simpler way to do this? The original call is perfect except for the layering issue. Is there no way to tell ggplot the order, e.g.

ggplot(df) + geom_point(aes(x=x, y=y, colour=g), order=c("1", "special", "special2"))

to denote that it should first plot all the points with g value 1, then all the points with g value special, etc.? Breaking up the data frame into separate ones, using layers, and then specifying the legend manually is a huge pain just to control the order in which points are plotted. In the simple first call, ggplot also takes care of the colors for you, which is very nice. Thanks.

P.S. I am using rpy2 to make these plots, but I showed R since the translation to rpy2 is (usually) straightforward. rpy2 solutions would be even better.

Solution

geom_point draws points in the order of the rows in your data. In other words, rows at the bottom of the data frame are plotted last, and therefore end up on top.

This points to a simple solution: sort your data frame before plotting, so that the rows you want on top come last:

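# "1" sorts before "special" and "special2", so those rows are drawn last
df <- df[order(df$g), ]
# TRUE for the highlighted points, used here to draw them larger
df$size <- df$g != "1"
ggplot(df) + geom_point(aes(x=x, y=y, colour=g, size=size))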
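
Sorting on the raw strings only works because "1" happens to sort before the other values. If you want to spell out the drawing order yourself, as the hypothetical order= argument in the question would, one option is to turn g into a factor with explicit levels and sort on that. This is a minimal sketch rather than a built-in ggplot2 feature; draw_order is a name introduced here for illustration.

```r
library(ggplot2)

set.seed(1)  # reproducible example data
df <- data.frame(x = rnorm(10000), y = rnorm(10000), g = "1",
                 stringsAsFactors = FALSE)
df$g[500] <- "special"
df$g[600] <- "special2"

# Desired drawing order, first to last (the last group ends up on top).
# 'draw_order' is an illustrative name, not a ggplot2 argument.
draw_order <- c("1", "special", "special2")

# order() on a factor sorts by level position, not alphabetically.
df$g <- factor(df$g, levels = draw_order)
df <- df[order(df$g), ]

# Single layer, so the automatic colours and legend are kept.
p <- ggplot(df) + geom_point(aes(x = x, y = y, colour = g))
```

The legend entries follow the factor levels too, so the drawing order and the legend order stay in sync.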

However, in many cases it will simply be easier to use different layers, as you suggested in your question. In this case, you can still map colour inside aes() in each layer, so that ggplot2 builds a shared colour scale and legend:

ggplot() + 
  geom_point(data=df[df$g==1, ], aes(x=x, y=y, colour=g), size=1) +
  geom_point(data=df[df$g!=1, ], aes(x=x, y=y, colour=g), size=5)
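
If you also want to fix the colours, as the question did with red, green and blue, a manual colour scale can be added on top of the layered version. The sketch below assumes those colour assignments; background, highlight and group_colours are illustrative names.

```r
library(ggplot2)

set.seed(1)  # reproducible example data
df <- data.frame(x = rnorm(10000), y = rnorm(10000), g = "1",
                 stringsAsFactors = FALSE)
df$g[500] <- "special"
df$g[600] <- "special2"

# Split into the bulk of the points and the few to draw on top.
background <- df[df$g == "1", ]
highlight  <- df[df$g != "1", ]

# Colours taken from the question's layered attempt (an assumption here).
group_colours <- c("1" = "blue", "special" = "red", "special2" = "green")

# Layers are drawn in the order they are added: background first, highlights last.
# Because colour is mapped inside aes(), both layers share one legend.
p <- ggplot() +
  geom_point(data = background, aes(x = x, y = y, colour = g), size = 1) +
  geom_point(data = highlight,  aes(x = x, y = y, colour = g), size = 5) +
  scale_colour_manual(values = group_colours)
```

The key difference from the attempt in the question is that colour is an aesthetic mapping rather than a fixed color= argument; only mapped aesthetics get a legend.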

Licensed under: CC-BY-SA with attribution