Question

I want to count the occurrences that fall in each period defined by a vector of dates. Each date in the vector marks the first day of a period. The result should be a vector of the same length as the input vector, holding the number of occurrences in each period.

I came up with a "looped" solution that is terribly inefficient (see below). I wonder if there is any way to process the same task more quickly.

events <- c("2000-01-05", "2000-02-08", "2000-04-09", "2000-02-08", "2000-03-13", "2000-03-13")

# Create vector of dates (in this case 52 periods of 7 days each)
week_vector <- as.Date("2000-01-01")
i <- 1; N <- 51
while (i <= N) {
  week_vector <- append(week_vector, as.Date(week_vector[i] + 7))
  i <- i + 1
}

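# Count the events falling in each 7-day period
events <- as.Date(events)
occurrences_by_week <- integer(length(week_vector))
i <- 1; N <- length(week_vector)
while (i <= N) {
  occurrences_by_week[i] <- sum(events >= week_vector[i] & events < week_vector[i] + 7)
  i <- i + 1
}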

I originally came up with this solution (using rollapply from the zoo package). But with rollapply I cannot choose the day on which the grouping of occurrences starts:

library(zoo)

frequency <- as.data.frame(table(as.Date(events)))

frequency.zoo <- read.zoo(frequency)

frequency.zoo.week <- rollapply(frequency.zoo, 7, sum, by = 7)

Solution

Something like this?

events <- as.Date(c("2000-01-05", "2000-02-08", "2000-04-09", "2000-02-08",
            "2000-03-13", "2000-03-13"))

week_vector <- seq(from = as.Date("2000-01-01"), to = as.Date("2000-12-23"), by = 7)
# or arguments more similar to the wording in the question, "52 [dates], 7 days periods":
week_vector <- seq(from = as.Date("2000-01-01"), length.out = 52, by = 7)

events2 <- cut(events, breaks = week_vector)

table(events2)

# 2000-01-01 2000-01-08 2000-01-15 2000-01-22 2000-01-29 2000-02-05 2000-02-12 2000-02-19 
# 1          0          0          0          0          2          0          0 
# 2000-02-26 2000-03-04 2000-03-11 2000-03-18 2000-03-25 2000-04-01 2000-04-08 2000-04-15 
# 0          0          2          0          0          0          1          0 
# 2000-04-22 2000-04-29 2000-05-06 2000-05-13 2000-05-20 2000-05-27 2000-06-03 2000-06-10 
# 0          0          0          0          0          0          0          0 
# 2000-06-17 2000-06-24 2000-07-01 2000-07-08 2000-07-15 2000-07-22 2000-07-29 2000-08-05 
# 0          0          0          0          0          0          0          0 
# 2000-08-12 2000-08-19 2000-08-26 2000-09-02 2000-09-09 2000-09-16 2000-09-23 2000-09-30 
# 0          0          0          0          0          0          0          0 
# 2000-10-07 2000-10-14 2000-10-21 2000-10-28 2000-11-04 2000-11-11 2000-11-18 2000-11-25 
# 0          0          0          0          0          0          0          0 
# 2000-12-02 2000-12-09 2000-12-16 
# 0          0          0
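Note that with 52 break dates, cut() produces 51 intervals (the last date only acts as an upper bound), while the question asks for a result as long as the input vector. One possible way to get 52 counts, sketched here with base R's findInterval() and tabulate() rather than cut(), is to append an assumed closing date seven days after the last period start. The names `breaks`, `idx` and `counts` are only illustrative.

```r
events <- as.Date(c("2000-01-05", "2000-02-08", "2000-04-09", "2000-02-08",
                    "2000-03-13", "2000-03-13"))
week_vector <- seq(from = as.Date("2000-01-01"), length.out = 52, by = 7)

# Assumption: the last period is also 7 days long, so close it explicitly
breaks <- c(week_vector, week_vector[length(week_vector)] + 7)

# Index of the period each event falls in (0 = before the first period,
# 53 = on or after the closing date)
idx <- findInterval(as.numeric(events), as.numeric(breaks))

# tabulate() ignores indices outside 1..nbins, so out-of-range events are dropped
counts <- tabulate(idx, nbins = length(week_vector))
names(counts) <- format(week_vector)

length(counts)      # same length as week_vector
counts[counts > 0]  # only the periods that contain events
```

Because the result is a plain integer vector aligned with week_vector, it can be bound to the dates directly, for example with data.frame(week = week_vector, n = counts).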

OTHER TIPS

Using cut and table:

events <- c("2000-01-05", "2000-02-08", "2000-04-09", "2000-02-08", "2000-03-13", "2000-03-13")
events <- as.Date(events)
events_week <- cut(events, breaks = "week")
table(events_week)
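With breaks = "week", the weeks are anchored to a weekday rather than to a date of your choosing: by default they start on Monday, and the start.on.monday argument of cut() for dates switches this to Sunday. A small sketch of the difference (the names `weeks_mon` and `weeks_sun` are only illustrative):

```r
events <- as.Date(c("2000-01-05", "2000-02-08", "2000-04-09", "2000-02-08",
                    "2000-03-13", "2000-03-13"))

# Default: weeks start on Monday
weeks_mon <- cut(events, breaks = "week")

# Weeks starting on Sunday instead
weeks_sun <- cut(events, breaks = "week", start.on.monday = FALSE)

levels(weeks_mon)[1]  # first week begins on the Monday on or before the earliest event
levels(weeks_sun)[1]  # first week begins on the Sunday on or before the earliest event
```

If the periods have to start on an arbitrary date, such as 2000-01-01 in the question, an explicit vector of break dates is probably the simpler route.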

With custom breaks:

breaks_custom <- c("2000-01-01", "2000-02-01", "2000-03-01", "2000-05-01")
breaks_custom <- as.Date(breaks_custom)
events_cut <- cut(events, breaks = breaks_custom)
table(events_cut)
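One thing to watch with custom breaks: events outside the range of the breaks are not counted, because cut() turns them into NA and table() drops NA by default. The sketch below adds a hypothetical extra event ("2000-06-01", not part of the original data) to show this, and uses table()'s useNA argument to make the uncounted events visible.

```r
# "2000-06-01" is a made-up event that lies after the last break
events <- as.Date(c("2000-01-05", "2000-02-08", "2000-04-09", "2000-02-08",
                    "2000-03-13", "2000-03-13", "2000-06-01"))
breaks_custom <- as.Date(c("2000-01-01", "2000-02-01", "2000-03-01", "2000-05-01"))

events_cut <- cut(events, breaks = breaks_custom)

# Default table() silently omits the event outside the breaks
table(events_cut)

# useNA = "ifany" adds an <NA> column counting the events that matched no interval
table(events_cut, useNA = "ifany")
```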
Licensed under: CC-BY-SA with attribution