Question

I have a dataset with one categorical variable and two continuous variables, and I am trying to make a scatterplot with confidence ellipses. However, one of the ellipses looks like Pac-Man, for lack of a better description. I am also not certain that the ellipses I am drawing actually represent confidence intervals.

Here is my data file on Dropbox: https://www.dropbox.com/s/fal6x9jzk5kvafl/cv12.csv

Here is my code:

qplot(data = cv12, x = x, y = y, colour = taxa) + 
  stat_ellipse(geom = "polygon", alpha = 1/2, aes(fill = taxa)) + 
  coord_fixed() + 
  xlim(-5,5) + 
  ylim(-4.5,4.5) + 
  scale_fill_manual (values=c("blue2","gray16","red2","#a65628","purple2")) + 
  scale_colour_manual (values=c("blue2","gray16","red2","#a65628","purple2"))

The purple ellipse is shaped like Pac-Man, but everything else looks okay. I can't figure out what I am doing wrong.

Please let me know if you have any questions.

Solution

As baptiste noted, you probably wanted to do this:

# Zoom with coord_fixed() limits instead of xlim()/ylim(), so no data are dropped
qplot(data = cv12, x = x, y = y, colour = taxa) +
    stat_ellipse(geom = "polygon", alpha = 1/2, aes(fill = taxa)) +
    coord_fixed(xlim = c(-5, 5), ylim = c(-4.5, 4.5)) +
    scale_fill_manual(values = c("blue2", "gray16", "red2", "#a65628", "purple2")) +
    scale_colour_manual(values = c("blue2", "gray16", "red2", "#a65628", "purple2"))

xlim() and ylim() set scale limits: observations outside them are dropped up front (replaced with NA) and never reach any layer. Hence the weird-looking ellipse: values are actually missing when it is computed and drawn.

Setting xlim and ylim in coord_fixed() (or, more commonly, coord_cartesian()) keeps all the data and only clips the plot after the fact.
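If you want to see the difference for yourself, here is a minimal sketch with simulated data. The data frame dat and its two groups are made up for illustration and are not your cv12 file; I have also written it with ggplot() rather than qplot(). It builds the same plot twice, once with scale limits and once with coordinate limits, and then counts how many ellipse vertices end up missing in each built plot.

```r
library(ggplot2)

# Hypothetical stand-in for cv12: group "b" deliberately spills past x = 5.
set.seed(1)
dat <- data.frame(
  x    = c(rnorm(50, mean = 0, sd = 1), rnorm(50, mean = 3.5, sd = 1.5)),
  y    = c(rnorm(50, mean = 0, sd = 1), rnorm(50, mean = 0, sd = 1.5)),
  taxa = rep(c("a", "b"), each = 50)
)

# Shared layers: points first, ellipse polygons second.
base <- ggplot(dat, aes(x, y, colour = taxa, fill = taxa)) +
  geom_point() +
  stat_ellipse(geom = "polygon", alpha = 1/2)

# Scale limits: values outside the range are replaced with NA.
p_scale <- base + coord_fixed() + xlim(-5, 5) + ylim(-4.5, 4.5)

# Coordinate limits: everything is computed on the full data, then the view is zoomed.
p_coord <- base + coord_fixed(xlim = c(-5, 5), ylim = c(-4.5, 4.5))

# Number of ellipse vertices with a missing coordinate (layer 2 is stat_ellipse).
missing_scale <- sum(!complete.cases(layer_data(p_scale, 2)[, c("x", "y")]))
missing_coord <- sum(!complete.cases(layer_data(p_coord, 2)[, c("x", "y")]))
missing_scale
missing_coord
```

Printing p_scale and p_coord side by side should make the effect visible. Building p_scale will normally also emit warnings about removed rows, which is a useful hint that xlim() or ylim() is discarding data.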

Licensed under: CC-BY-SA with attribution