Question

I am working with a Danish dataset on immigrants by country of origin and age group. I transformed the data so I can see the top countries of origin for each age group. I am plotting it using facet_wrap. What I would like to do is, since different age groups come from quite different areas, to show a different set of values for one axis in each facet. For example, those that are between 0 and 10 years old come from countries x,y and z, while those 10-20 years of age come from countries q, r, z and so on.

In my current version, every facet shows the entire set of values, including countries that are not in that facet's top 10. I would like to show just the top ten countries of origin for each facet, in effect having different axis labels for each (and, if possible, sorted from high to low within each facet). Here is what I have so far:

library(ggplot2)
library(reshape)
###load and inspect data
load(url('http://dl.dropbox.com/u/7446674/dk_census.rda'))
head(dk_census)

###reshape for plotting--keep just a few age groups
dk_census.m <- melt(dk_census[dk_census$Age %in% c('0-9 år', '10-19 år','20-29 år','30-39 år'),c(1,2,4)])

###get top 10 observations for each age group, store in data frame
top10 <- by(dk_census.m[order(dk_census.m$Age, -dk_census.m$value),], dk_census.m$Age, head, n=10)
top10.df <- do.call("rbind", as.list(top10))
top10.df

###plot
ggplot(data=top10.df, aes(x=as.factor(Country), y=value)) +
  geom_bar(stat="identity")+
  coord_flip() +
  facet_wrap(~Age)+
  labs(title="Immigrants By Country by Age",x="Country of Origin",y="Population")

[Image: immigrant chart]


Solution

One option (that I actually strongly suspect you won't be happy with) is this:

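library(plyr)       # provides dlply() and .()
library(gridExtra)  # provides grid.arrange()

###base plot; its data is swapped out per age group below
p <- ggplot(data=top10.df, aes(x=Country, y=value)) +
  geom_bar(stat="identity")+
  coord_flip() +
  facet_wrap(~Age)+
  labs(title="Immigrants By Country by Age",x="Country of Origin",y="Population")

###one plot per age group, with Country reordered by value within the group
pp <- dlply(.data=top10.df,.(Age),function(x) {x$Country <- reorder(x$Country,x$value); p %+% x})

###arrange the separate plots on one page
do.call(grid.arrange,pp)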

(Edited to sort each graph.)

Keep in mind that the only reason faceting exists is to plot multiple panels that share a common scale. So when you start asking to facet on some variable, but have the scales be different (oh, and also sort them separately on each panel as well) what you're doing is really no longer faceting. It's just making four different plots and arranging them together.
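To make the "separate plots arranged together" idea concrete, here is a minimal sketch that uses only base R's `split()` and `lapply()` instead of plyr. The `toy` data frame and the `plot_one()` helper are hypothetical stand-ins (the real `top10.df` has the same column names), and `grid.arrange(grobs = ...)` assumes a reasonably recent gridExtra.

```r
library(ggplot2)
library(gridExtra)

# Hypothetical toy data standing in for top10.df (same column names).
toy <- data.frame(
  Age     = rep(c("0-9", "10-19"), each = 3),
  Country = c("A", "B", "C", "C", "D", "E"),
  value   = c(30, 10, 20, 5, 25, 15),
  stringsAsFactors = FALSE
)

# Hypothetical helper: build one standalone plot for a single age group,
# with Country reordered by value inside that group only.
plot_one <- function(d) {
  d$Country <- reorder(factor(d$Country), d$value)
  ggplot(d, aes(x = Country, y = value)) +
    geom_bar(stat = "identity") +
    coord_flip() +
    labs(title = unique(d$Age), x = "Country of Origin", y = "Population")
}

# One independent plot per age group; each carries only its own countries.
plots <- lapply(split(toy, toy$Age), plot_one)

# Arrange the independent plots side by side.
grid.arrange(grobs = plots, ncol = 2)
```

Because each element of `plots` is a complete plot with its own scales, nothing is shared between panels, which is exactly why this is no longer faceting in the strict sense.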

OTHER TIPS

Using lattice (here I use `latticeExtra` for the ggplot2 theme), you can set `relation = 'free'` between panels. Here I am using `abbreviate = TRUE` to shorten long labels.

library(latticeExtra)

barchart(value~ Country|Age,data=top10.df,layout=c(2,2),
         horizontal=TRUE,
         par.strip.text =list(cex=2),
         scales=list(y=list(relation='free',cex=1.5,abbreviate=TRUE,
                            labels=levels(factor(top10.df$Country)))),
         par.settings = ggplot2like(),axis=axis.grid,
         main="Immigrants By Country by Age",
         ylab="Country of Origin",
         xlab="Population")
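For comparison, ggplot2's `facet_wrap()` has a `scales` argument that plays a similar role to lattice's `relation = 'free'`. The sketch below is not part of the answers above: it assumes ggplot2 3.3.0 or later (so bars can be drawn horizontally without `coord_flip()`), and the `toy` data frame is a hypothetical stand-in for `top10.df`.

```r
library(ggplot2)

# Hypothetical toy data standing in for top10.df (same column names).
toy <- data.frame(
  Age     = rep(c("0-9", "10-19"), each = 3),
  Country = c("A", "B", "C", "C", "D", "E"),
  value   = c(30, 10, 20, 5, 25, 15)
)

# Country is mapped directly to y (no coord_flip), so the free y scale
# applies to the categorical axis. Assumes ggplot2 >= 3.3.0.
p_free <- ggplot(toy, aes(x = value, y = Country)) +
  geom_col() +
  facet_wrap(~Age, scales = "free_y") +
  labs(title = "Immigrants By Country by Age",
       x = "Population", y = "Country of Origin")

print(p_free)
```

With a free y scale, each panel should list only the countries that occur in it. The bars still follow one overall ordering of `Country`, so this does not by itself sort each panel from high to low.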


Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow