Plot time data in R to various resolutions (to the minute, to the hour, to the second, etc.)

StackOverflow https://stackoverflow.com/questions/1256347

  •  12-09-2019

Question

I have some data in CSV like:

"Timestamp", "Count"
"2009-07-20 16:30:45", 10
"2009-07-20 16:30:45", 15
"2009-07-20 16:30:46", 8
"2009-07-20 16:30:46", 6
"2009-07-20 16:30:46", 8
"2009-07-20 16:30:47", 20

I can read it into R using read.csv. I'd like to plot:

  1. Number of entries per second, so:
    "2009-07-20 16:30:45", 2
    "2009-07-20 16:30:46", 3
    "2009-07-20 16:30:47", 1
    
  2. Average value per second:
    "2009-07-20 16:30:45", 12.5
    "2009-07-20 16:30:46", 7.333
    "2009-07-20 16:30:47", 20
    
  3. Same as 1 & 2 but then by Minute and then by Hour.

Is there some way to do this (collect by second/min/etc & plot) in R?

Solution

Read your data, and convert it into a zoo object:

R> library(zoo)
R> X <- read.csv("/tmp/so.csv")
R> X <- zoo(X$Count, order.by=as.POSIXct(as.character(X[,1])))

Note that zoo() will warn here because the timestamps are not unique; the aggregate calls below still work on the result.

Task 1 using aggregate with length to count (force returns the index unchanged, so entries are grouped by their exact timestamp):

R> aggregate(X, force, length)
2009-07-20 16:30:45 2009-07-20 16:30:46 2009-07-20 16:30:47 
                  2                   3                   1 

Task 2 using aggregate with mean:

R> aggregate(X, force, mean)
2009-07-20 16:30:45 2009-07-20 16:30:46 2009-07-20 16:30:47 
             12.500               7.333              20.000 

Task 3 can be done the same way by aggregating on a coarser index: in place of force, pass a function that maps each timestamp to its minute or hour. You can call plot on the result from aggregate:

plot(aggregate(X, force, mean))
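
As a rough sketch of Task 3, the index can be truncated to the minute or hour before aggregating. The helper names to_minute and to_hour are made up for this illustration, the sample data is inlined from the question instead of being read from a file, and the UTC time zone is an assumption made only to keep the result reproducible.

```r
library(zoo)

# Sample rows from the question, inlined so the sketch runs on its own.
dat <- data.frame(
  Timestamp = c("2009-07-20 16:30:45", "2009-07-20 16:30:45",
                "2009-07-20 16:30:46", "2009-07-20 16:30:46",
                "2009-07-20 16:30:46", "2009-07-20 16:30:47"),
  Count = c(10, 15, 8, 6, 8, 20),
  stringsAsFactors = FALSE
)

# zoo() warns about the duplicated timestamps; the warning is silenced here.
X <- suppressWarnings(
  zoo(dat$Count, order.by = as.POSIXct(dat$Timestamp, tz = "UTC"))
)

# Hypothetical helpers: drop the seconds (or minutes and seconds) from the
# index. trunc() returns POSIXlt, so convert back to POSIXct.
to_minute <- function(t) as.POSIXct(trunc(t, units = "mins"))
to_hour   <- function(t) as.POSIXct(trunc(t, units = "hours"))

per_minute_n    <- aggregate(X, to_minute, length)  # entries per minute
per_minute_mean <- aggregate(X, to_minute, mean)    # average per minute
per_hour_n      <- aggregate(X, to_hour, length)    # entries per hour
per_hour_mean   <- aggregate(X, to_hour, mean)      # average per hour
```

Each result is again a zoo object, so it can be passed to plot like the per-second aggregate. With the six sample rows all falling in the same minute, each of these series contains a single point.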

OTHER TIPS

Averaging the data is easy with the plyr package (dataset below is the data frame read from the CSV):

library(plyr)
Second <- ddply(dataset, "Timestamp", function(x){
    c(Average = mean(x$Count), N = nrow(x))
})

To do the same thing by minute or hour, you need to add columns holding that information.

library(chron)
dataset$Minute <- minutes(dataset$Timestamp)
dataset$Hour <- hours(dataset$Timestamp)
dataset$Day <- dates(dataset$Timestamp)
#aggregate by hour
Hour <- ddply(dataset, c("Day", "Hour"), function(x){
    c(Average = mean(x$Count), N = nrow(x))
})
#aggregate by minute
Minute <- ddply(dataset, c("Day", "Hour", "Minute"), function(x){
    c(Average = mean(x$Count), N = nrow(x))
})
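
If plyr and chron are not at hand, a similar summary can be sketched in base R by formatting each timestamp into a bucket label and grouping on that label. This is an alternative to the code above, not the same method: summarise_by and the Second, Minute and Hour key columns are hypothetical names, the sample data is inlined from the question, and the UTC time zone is an assumption.

```r
# Sample rows from the question, with the timestamps parsed as POSIXct.
dataset <- data.frame(
  Timestamp = as.POSIXct(c("2009-07-20 16:30:45", "2009-07-20 16:30:45",
                           "2009-07-20 16:30:46", "2009-07-20 16:30:46",
                           "2009-07-20 16:30:46", "2009-07-20 16:30:47"),
                         tz = "UTC"),
  Count = c(10, 15, 8, 6, 8, 20)
)

# Bucket labels at each resolution; the date is kept in every label so that
# the same hour on different days is not merged.
dataset$Second <- format(dataset$Timestamp, "%Y-%m-%d %H:%M:%S")
dataset$Minute <- format(dataset$Timestamp, "%Y-%m-%d %H:%M")
dataset$Hour   <- format(dataset$Timestamp, "%Y-%m-%d %H")

# Hypothetical helper: average and row count of Count per bucket label.
summarise_by <- function(data, key) {
  groups <- split(data$Count, data[[key]])
  data.frame(
    Bucket  = names(groups),
    Average = unname(vapply(groups, mean, numeric(1))),
    N       = unname(vapply(groups, length, integer(1))),
    stringsAsFactors = FALSE
  )
}

by_second <- summarise_by(dataset, "Second")
by_minute <- summarise_by(dataset, "Minute")
by_hour   <- summarise_by(dataset, "Hour")
```

Each result is an ordinary data frame with one row per bucket, so the Average or N column can be plotted against Bucket after converting the labels back to times if needed.
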
Licensed under: CC-BY-SA with attribution