Question

I am trying to figure out a way to display only the top three bars of a data set. To keep things simple, I'm using the diamonds data set to illustrate what I'd like to do. First, I reordered the levels of cut from most to least frequent.

library(ggplot2)     ## provides ggplot() and the diamonds data set
library(data.table)
diamonds <- data.table(diamonds)
diamonds1 <- within(diamonds, cut <- factor(cut, levels=names(sort(table(cut), decreasing=TRUE))))

Then, I plotted.

ggplot(diamonds1, aes(cut, fill=cut)) + geom_bar(position="dodge") + guides(fill="none") + ylab("Count") + xlab("Cut")

And I got this:

[Image: My plot]

But instead of seeing all of the bars, I just want to see the top three. Additionally, I want this to be repeatable, so if the data set changes and there is a different top three, I can use the same code to create the correct top three. Is there any way to do this?

Solution

Sure, you can restrict the x-axis to the levels you want with xlim(). Add this to the plot:

+ xlim('Ideal', 'Premium', 'Very Good')
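
The xlim() call above hard-codes the three levels, so it would not follow a change in the data. A minimal sketch of a repeatable variant, assuming the counts come from table() and the limits are set with scale_x_discrete() (the names top3 and p are introduced here for illustration only):

```r
library(ggplot2)

# Count each cut and keep the names of the three most frequent levels.
# `top3` is an illustrative name, not part of the original answer.
top3 <- names(sort(table(diamonds$cut), decreasing = TRUE))[1:3]

# Restrict the discrete x-axis to those levels; rows with other cuts are dropped
# from the plot (ggplot2 warns about the removed rows).
p <- ggplot(diamonds, aes(cut, fill = cut)) +
  geom_bar() +
  scale_x_discrete(limits = top3) +
  guides(fill = "none") +
  ylab("Count") + xlab("Cut")
```

Printing p draws the chart. Because top3 is recomputed from the data each time, the same code should pick up a different top three if the data set changes.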

Edit after @Arun's comments: a more direct approach is to subset the data before you feed it to ggplot(). data.table makes this very fast:

setkey(diamonds, cut)  ## needed for fast subsetting and grouping
tt <- diamonds[, list(count=.N), by=cut]  ## same as table(diamonds$cut) but faster
cut.values <- tt[order(-count), cut][1:3]  ## sort by count, descending, and select the top 3 cut values
ggplot(diamonds[J(cut.values)], ...)       ## run the same plot commands on subset of data
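
For reference, the subsetting idea can be written out end to end. This is only a sketch under a few assumptions: it filters with %in% rather than the keyed J() join used above, it re-levels cut so the bars appear in order of frequency, and the names dt, counts, top_cuts, top_dt and p are illustrative.

```r
library(data.table)
library(ggplot2)

# Work on a data.table copy of the diamonds data set shipped with ggplot2.
dt <- as.data.table(ggplot2::diamonds)

# Count rows per cut, then sort descending so the largest groups come first.
counts <- dt[, .(count = .N), by = cut]
top_cuts <- head(counts[order(-count), as.character(cut)], 3)

# Keep only the rows whose cut is in the top three (plain filter, no key needed).
top_dt <- dt[cut %in% top_cuts]

# Re-level the factor so the bars are drawn from most to least frequent.
top_dt[, cut := factor(as.character(cut), levels = top_cuts)]

p <- ggplot(top_dt, aes(cut, fill = cut)) +
  geom_bar() +
  guides(fill = "none") +
  ylab("Count") + xlab("Cut")
```

Using head(..., 3) instead of [1:3] avoids NA entries if the data ever has fewer than three distinct cuts.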
Licensed under: CC-BY-SA with attribution