Question

I'm a newbie in R.

I've got timestamps in a list in this style:

 [1] "2011-10-04 17:23:28 CEST" "2011-10-04 17:26:13 CEST" "2011-10-05 16:17:34 CEST" "2011-10-07 09:59:37 CEST"

Now I want to plot a graph that shows how many events occur in January, February and so on.

Every timestamp represents one event, and there may be months without an event (these should be shown as 0).


Solution

I would put the vector of times into a column of a data.frame. The example data below has a thousand timestamps, each at a random time between one day and two years from now.

dat = data.frame(timestamp = Sys.time() + sort(round(runif(1000, (24*3600), (2*365*24*3600)))))
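
If the timestamps are still character strings like the ones in the question, they would need to be converted to date-times first. The following is a minimal sketch, not part of the original answer. It assumes the trailing zone label can simply be dropped and that CEST corresponds to the "Europe/Berlin" time zone, which the question does not state. The names raw, clean, ts and dat_q are made up for illustration.

```r
# Timestamps as they appear in the question (character strings)
raw <- c("2011-10-04 17:23:28 CEST", "2011-10-04 17:26:13 CEST",
         "2011-10-05 16:17:34 CEST", "2011-10-07 09:59:37 CEST")

# Drop the trailing zone abbreviation so the format string matches exactly
clean <- sub(" [A-Z]+$", "", raw)

# Parse into POSIXct; tz = "Europe/Berlin" is an assumption about what CEST means here
ts <- as.POSIXct(clean, format = "%Y-%m-%d %H:%M:%S", tz = "Europe/Berlin")

# Same layout as the example data: one column called timestamp
dat_q <- data.frame(timestamp = ts)
```

With the data in this shape, the remaining steps should apply to dat_q in the same way as to dat.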

The next step is to create new columns that identify the month and year of each timestamp:

dat$month = strftime(dat$timestamp, "%b")
dat$year = strftime(dat$timestamp, "%Y")

Now we can count the timestamps per month for each year using count from the plyr package.

library(plyr)
timestamps_month = count(dat, vars = c("month","year"))
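
Note that counting this way only returns month/year combinations that actually occur, so a month without events is missing rather than 0. One possible way to get explicit zeros, sketched here in base R rather than plyr, is to build a factor whose levels cover every month in the observed range and tabulate it. The sample timestamps and the names ts, all_months, counts and counts_df are illustrative assumptions.

```r
# Small illustrative sample: two events in October, none in November, one in December
ts <- as.POSIXct(c("2011-10-04 17:23:28", "2011-10-05 16:17:34",
                   "2011-12-07 09:59:37"), tz = "UTC")

# First day of the earliest and latest month present in the data
first_month <- as.Date(format(min(ts), "%Y-%m-01"))
last_month  <- as.Date(format(max(ts), "%Y-%m-01"))

# Every month in that range, labelled "YYYY-MM"
all_months <- format(seq(first_month, last_month, by = "month"), "%Y-%m")

# A factor with all months as levels makes table() report empty months as 0
counts <- table(factor(format(ts, "%Y-%m"), levels = all_months))

# Data frame form, with a freq column like the one plyr::count produces
counts_df <- as.data.frame(counts, responseName = "freq")
names(counts_df)[1] <- "month"
```

Here counts_df has one row per month in the range, including the empty one, so it could be passed to geom_bar with stat = "identity" in the same way as timestamps_month.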

Then create the bar chart with ggplot2:

library(ggplot2)
ggplot(data = timestamps_month) + geom_bar(aes(x = month, y = freq, fill = year), stat="identity", position = "dodge")
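
Because month is a character column, ggplot2 places the bars in alphabetical order (Apr, Aug, Dec, ...). A possible fix, shown as a sketch under the assumption that English month abbreviations are wanted, is to derive the month from its number and turn it into a factor with calendar-ordered levels. The sample timestamps and the name month are illustrative.

```r
# Illustrative timestamps in April, January and October
ts <- as.POSIXct(c("2011-04-04 17:23:28", "2011-01-05 16:17:34",
                   "2011-10-07 09:59:37"), tz = "UTC")

# "%m" yields "01".."12" regardless of locale; the labels come from the built-in month.abb
month <- factor(format(ts, "%m"),
                levels = sprintf("%02d", 1:12),
                labels = month.abb)

# Levels are Jan..Dec in calendar order, and table() lists unused months with count 0
levels(month)
table(month)
```

Using a factor like this for the x aesthetic should keep the months in calendar order in the plot.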

See this SO post for an example of what the resulting plot looks like:

How to create histogram in R with CSV time data?
