Question

I have a set of UNIX timestamps and URIs and I'm trying to plot the cumulative count of requests for each URI. I managed to do that for one URI at a time using a dummy column:

x.df$count <- 1 # Dummy column of ones for cumsum
x.df <- x.df[order(x.df$time, decreasing=FALSE),] # Sort
ggplot(x.df, aes(x=time, y=cumsum(count))) + geom_line()

However, that would make roughly 30 plots in my case.

ggplot2 does allow you to plot multiple lines into one plot (I copied this piece of code from here):

ggplot(data=test_data_long, aes(x=date, y=value, colour=variable)) +
    geom_line()

The problem is that, this way, cumsum() keeps accumulating across all rows instead of restarting at zero for each URI.

Does anybody have an idea?


Solution

Here's some test data. The example uses plyr's ddply together with transform to calculate the cumulative sum within each group first, and then plots the result with ggplot2:

set.seed(45)
DF <- data.frame(grp = factor(rep(1:5, each=10)), x=rep(1:10, 5))
DF <- transform(DF, y=runif(nrow(DF)))

# use plyr to calculate cumsum of y within each group (grp)
library(plyr)
DF.t <- ddply(DF, .(grp), transform, cy = cumsum(y))

# plot
library(ggplot2)
ggplot(DF.t, aes(x=x, y=cy, colour=grp, group=grp)) + geom_line()
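Applied to the data in the question, the same idea could look roughly like the sketch below. It swaps plyr for base R's ave(), which applies cumsum within each URI. The column names time and uri, the example URIs, and the generated timestamps are assumptions made up for illustration, since the question does not show the real data.

```r
library(ggplot2)

# Hypothetical stand-in for the asker's data: UNIX timestamps and URIs
set.seed(1)
x.df <- data.frame(
  time = sample(1360000000:1360003600, 60),
  uri  = sample(c("/index", "/about", "/contact"), 60, replace = TRUE)
)

# Sort by time so the running count follows request order
x.df <- x.df[order(x.df$time), ]

# Running count of requests, restarted for each URI
x.df$cum_count <- ave(rep(1, nrow(x.df)), x.df$uri, FUN = cumsum)

# One line per URI in a single plot
p <- ggplot(x.df, aes(x = time, y = cum_count, colour = uri, group = uri)) +
  geom_line()
print(p)
```

Because the count is computed per URI before plotting, each line ends at the total number of requests for that URI rather than at the overall total.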


Licensed under: CC-BY-SA with attribution