Question

I have some data in R stored in a data.frame that looks like this:

time     value
53       5
55       5
59       7
61       9
79       6
118      11
200      5

I would like to bucket my data by time, in buckets of 60 seconds (the time is in seconds), and build a new data.frame that keeps only the first and the last entry of each bucket. I understand I could do this with a loop, but my problem is how to tell R to find the first and the last element of a bucket.


Solution

data.table makes this convenient. Here, columns are added for the first and last value in each minute:

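library(data.table)
xt <- data.table(x)
# time %/% 60 (integer division) assigns each row to its one-minute bucket
xt[, first := value[1], by = time %/% 60]
xt[, last := value[.N], by = time %/% 60]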
xt
##    time value first last
## 1:   53     5     5    7
## 2:   55     5     5    7
## 3:   59     7     5    7
## 4:   61     9     9   11
## 5:   79     6     9   11
## 6:  118    11     9   11
## 7:  200     5     5    5

Here is one easy way to trim this to the minute buckets. Modify the time column so that it indicates the head of the minute, remove the value column, and pass to unique:

xt$time <- 60 * xt$time %/% 60
xt$value <- NULL
unique(xt)
##    time first last
## 1:    0     5    7
## 2:   60     9   11
## 3:  180     5    5
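Since the question asks for the first and last rows themselves rather than extra columns, here is a minimal sketch of selecting those rows directly with data.table. The grouping column name bucket is a hypothetical choice, and the sample data is retyped from the question; treat this as one possible approach rather than the only one.

```r
library(data.table)

# Sample data retyped from the question
x <- data.frame(time  = c(53, 55, 59, 61, 79, 118, 200),
                value = c(5, 5, 7, 9, 6, 11, 5))
xt <- as.data.table(x)

# Within each 60-second bucket keep row 1 and row .N (the last row).
# unique() stops a single-row bucket from being returned twice.
# "bucket" is a hypothetical name for the grouping column.
first_last <- xt[, .SD[unique(c(1L, .N))], by = .(bucket = time %/% 60)]
first_last
```

For the sample data this should return five rows: two for each of the first two buckets and a single row for the bucket that contains only time 200.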

To get the times and values for the first and last rows in each minute, base R's aggregate (used here through its formula interface) works well, but you need two passes.

First values:

aggregate(cbind(time, value) ~ time %/% 60, data=x, FUN=head, 1)
##   time%/%60 time value
## 1         0   53     5
## 2         1   61     9
## 3         3  200     5

Last values:

aggregate(cbind(time, value) ~ time %/% 60, data=x, FUN=tail, 1)
##   time%/%60 time value
## 1         0   59     7
## 2         1  118    11
## 3         3  200     5

These may then be combined into the desired output.
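One way the two results might be combined is sketched below. It stacks the first and last rows, drops the duplicate that appears when a bucket holds a single row, and sorts by time. The names first, last and both are hypothetical, and unique() is only safe here on the assumption that no two distinct rows share the same time and value.

```r
# Sample data retyped from the question
x <- data.frame(time  = c(53, 55, 59, 61, 79, 118, 200),
                value = c(5, 5, 7, 9, 6, 11, 5))

# The two passes from the answer, stored under hypothetical names
first <- aggregate(cbind(time, value) ~ time %/% 60, data = x, FUN = head, 1)
last  <- aggregate(cbind(time, value) ~ time %/% 60, data = x, FUN = tail, 1)

# Stack both results, keeping only the original columns
both <- rbind(first[c("time", "value")], last[c("time", "value")])

# A single-row bucket appears in both passes, so drop exact duplicates
both <- unique(both)

# Restore time order and tidy the row names
both <- both[order(both$time), ]
rownames(both) <- NULL
both
```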

OTHER TIPS

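Assuming the times are ascending and unique, the first and last rows of each 60-second bucket are the ones with the smallest and largest time in that bucket, so this gives the subset of rows which are first or last in their bucket: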

subset(DF, time %in% unlist(tapply(time, time %/% 60 * 60, range)))

giving:

  time value
1   53     5
3   59     7
4   61     9
6  118    11
7  200     5
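
If the times are not guaranteed to be unique, a variant that relies on row order instead of matching time values may be safer. The following is a minimal base R sketch using duplicated(); the names bucket and keep are hypothetical, and it assumes the rows are already in the order that defines "first" and "last".

```r
# Sample data retyped from the question
DF <- data.frame(time  = c(53, 55, 59, 61, 79, 118, 200),
                 value = c(5, 5, 7, 9, 6, 11, 5))

# Bucket index for each row (integer division by 60 seconds)
bucket <- DF$time %/% 60

# A row is the first of its bucket if the bucket has not been seen before,
# and the last of its bucket if it has not been seen when scanning from the end
keep <- !duplicated(bucket) | !duplicated(bucket, fromLast = TRUE)

DF[keep, ]
```

Because the two conditions are combined with a logical OR, a bucket with a single row is returned once, not twice.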
Licensed under: CC-BY-SA with attribution