Question

This is related to the question here, but the proposed solutions don't work in my case.

I've already posted a question regarding my large data.frame here, but if you just want to download it (>2000 rows) please do so via this link.

The following code is used to create a simple geom_bar barplot in ggplot2:

gghist<-ggplot(ARCTP53_SOExample[ARCTP53_SOExample$Structural_motif=="NDBL/beta-sheets",],
aes(x=p53_IHC))
gghist+geom_bar()

This produces this plot: [Image: a barplot with lots of "NA's"]

As you can see, the NA's are plotted as well. I've tried various options to remove the NA's including the following:

gghist<-ggplot(ARCTP53_SOExample[ARCTP53_SOExample$Structural_motif=="NDBL/beta-sheets",], aes(x=p53_IHC), drop=TRUE)
gghist+geom_bar()
gghist<-ggplot(ARCTP53_SOExample[ARCTP53_SOExample$Structural_motif=="NDBL/beta-sheets",], aes(x=p53_IHC), na.rm=TRUE)
gghist+geom_bar()
gghist<-ggplot(ARCTP53_SOExample[ARCTP53_SOExample$Structural_motif=="NDBL/beta-sheets",], aes(x=p53_IHC, na.rm=TRUE))
gghist+geom_bar()
gghist<-ggplot(ARCTP53_SOExample[ARCTP53_SOExample$Structural_motif=="NDBL/beta-sheets",], aes(x=factor(p53_IHC), na.rm=TRUE))
gghist+geom_bar()
gghist<-ggplot(ARCTP53_SOExample[ARCTP53_SOExample$Structural_motif=="NDBL/beta-sheets",], aes(x=factor(p53_IHC)))
gghist+geom_bar(na.rm=TRUE)

And then I tried this:

gghist_2<-ggplot(na.omit(ARCTP53_SOExample[ARCTP53_EsoMutClean$Structural_motif=="NDBL/beta-sheets",]), aes(x=p53_IHC))
gghist_2+geom_bar()

Which gives me this error:

Error in as.environment(where) : 'where' is missing

I also tried this, which gives me the following error:

datasub<-ARCTP53_SOExample[!is.na(ARCTP53_SOExample)]
gghist_3<-ggplot(datasub[datasub$Structural_motif=="NDBL/beta-sheets",])
Error in datasub$Structural_motif : 
  $ operator is invalid for atomic vectors

And this code doesn't work either:

gghist_3<-ggplot(datasub, aes(x=p53_IHC))
Error: ggplot2 doesn't know how to deal with data of class character

So, does anyone know an easy solution to this? The data.frame is big and inherently has a lot of missing data depending on which column I'm looking at, but the missing values are not in the same rows for every column, so deleting every row that contains a single "NA" would defeat the point.

Help is much appreciated.

Kind regards,

Oliver

EDIT: Following Daniel's comments below, this is what I get with the code below. As you can see, the problem is still not solved. Sorry.

gghist<-ggplot(ARCTP53_SOExample[ARCTP53_SOExample$Structural_motif=="NDBL/beta-sheets" & is.na(ARCTP53_SOExample$p53_IHC) == F,], aes(x=p53_IHC))
gghist+geom_bar()

UPDATE: I re-ran the code and for some reason it works now. I got the plot below:

[Image: It works]

[Image: NAs reduced but still there]


Solution

I couldn't download your data, but this should work:

gghist<-ggplot(ARCTP53_SOExample[ARCTP53_SOExample$Structural_motif=="NDBL/beta-sheets" & !is.na(ARCTP53_SOExample$p53_IHC),], aes(x=p53_IHC))
gghist+geom_bar()
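
Since I can't test against the real data, here is a small sketch of the same idea on a made-up data.frame. The `toy` data and its values are my own invention (only the column names `Structural_motif` and `p53_IHC` come from the question), so treat it as an illustration rather than a verified fix. It filters on the plotted column only, so NAs in other columns don't cost you any rows. It also wraps the condition in `which()`: if `Structural_motif` itself contains NAs, the comparison returns NA for those rows, and logical indexing with NA gives back all-NA rows, which may be one reason NAs can still show up after filtering. The last lines show why the `datasub` attempt fails: indexing a data.frame with the logical matrix from `!is.na()` returns a plain vector, not a data.frame.

```r
library(ggplot2)

# Hypothetical stand-in for ARCTP53_SOExample (values invented for illustration)
toy <- data.frame(
  Structural_motif = c("NDBL/beta-sheets", "NDBL/beta-sheets", "NDBL/beta-sheets",
                       "L2/L3", NA, "NDBL/beta-sheets"),
  p53_IHC          = c("Positive", NA, "Negative", "Positive", "Negative", "Positive"),
  other_column     = c(NA, 1, 2, NA, 3, 4),
  stringsAsFactors = FALSE
)

# Keep rows that match the motif AND have a non-missing p53_IHC.
# which() discards rows where the condition evaluates to NA
# (here: the row with a missing Structural_motif).
keep    <- which(toy$Structural_motif == "NDBL/beta-sheets" & !is.na(toy$p53_IHC))
toy_sub <- toy[keep, ]

# Row 1 is kept even though other_column is NA there
p <- ggplot(toy_sub, aes(x = p53_IHC)) + geom_bar()

# Why the datasub attempt fails: this returns a character vector of all
# non-missing cells, so $ and ggplot() no longer see a data.frame.
flat <- toy[!is.na(toy)]
is.data.frame(flat)
```

Printing `p` draws the barplot. Unlike `na.omit()` on the whole data.frame, this only drops rows that are missing the value you actually plot.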
Licensed under: CC-BY-SA with attribution