Question

I want to create ggplot2::geom_area plots from summary datasets that have some data categories missing for some of the data periods (month), for instance:

require(ggplot2)
set.seed(1)
d = data.frame(x = rep(1:10,each=4), y = rnorm(40,10), cat=rep(c('A','B','C','D'), 10))
(d = d[-sample(1:40,10),]) # remove some rows
ggplot(d, aes(x, y, fill=cat)) + geom_area()


ggplot2's stacked area plot does not handle missing values well, so it seems we need to add zero entries to the data.frame. The best way I can think of (better suggestions welcome) is to reshape2::dcast it, convert the NAs to zeros, and reshape back. But I can't figure out the right formula. I'd be grateful for help from someone who understands reshape(2).

require(reshape2)
dcast(d, x ~ cat)  # right direction but missing the data
    x    A    B    C    D
1   1    A    B    C    D
2   2 <NA>    B    C <NA>
3   3    A    B    C    D
4   4 <NA>    B    C <NA>
5   5    A <NA>    C    D
6   6    A    B    C    D
7   7 <NA>    B    C <NA>
8   8    A    B    C    D
9   9 <NA>    B <NA>    D
10 10    A    B <NA>    D

Solution

# Expand the data.frame so that every x/cat combination has a row;
# combinations missing from d get y = NA
p.data <- merge(d,
                expand.grid(x=unique(d$x),
                            cat=unique(d$cat),
                            stringsAsFactors=FALSE),
                all.y=TRUE)

# Fill NA values with zeros
p.data$y[is.na(p.data$y)] <- 0

# Plot the graph
ggplot(p.data, aes(x, y, fill=cat)) +
    geom_area()
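
For completeness, the reshape2 route the question asks about can be sketched as follows. The call dcast(d, x ~ cat) shows category labels instead of numbers because dcast guesses the last column (cat) as the value column when value.var is not given. This is only a sketch under the assumption that each x/cat pair appears at most once, so no aggregation function is needed; the names wide and p.data2 are made up for illustration.

```r
library(reshape2)

# Same example data as in the question
set.seed(1)
d <- data.frame(x = rep(1:10, each = 4), y = rnorm(40, 10),
                cat = rep(c('A', 'B', 'C', 'D'), 10))
d <- d[-sample(1:40, 10), ]  # remove some rows

# Cast to wide form with y as the value column;
# x/cat combinations absent from d are filled with 0
wide <- dcast(d, x ~ cat, value.var = "y", fill = 0)

# Melt back to long form with the original column names
p.data2 <- melt(wide, id.vars = "x", variable.name = "cat", value.name = "y")
```

p.data2 can then be passed to the same ggplot(..., aes(x, y, fill=cat)) + geom_area() call as above.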


Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow