Two interpretations come to mind.
Interpretation 1: It's the combination of "Date" + "Time" that matters, not the consecutive repetition. In this case, just use aggregate (or your favorite aggregating function or package, like "data.table").
aggregate(value ~ Date + Time, mydf, median)
#    Date Time value
# 1     A    1   4.5
# 2     B    1   6.0
# 3     A    2   4.0
# 4     B    2   4.0
# 5     A    3   3.0
# 6     B    3   2.0
# 7     A    4   2.0
# 8     A    5   7.0
# 9     B    5   3.0
# 10    B    6   4.0
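Since "data.table" is mentioned as an alternative, here is a minimal sketch of the same grouping with that package. It assumes data.table is installed; the names dt and res are only illustrative, and the data is the same mydf given at the end of this answer, rebuilt here so the snippet runs on its own.

```r
library(data.table)  # assumption: the data.table package is installed

# Same data as mydf at the end of the answer
mydf <- data.frame(
  Date  = c("A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B", "B"),
  Time  = c(1L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 2L, 2L, 3L, 5L, 6L),
  value = c(3L, 6L, 4L, 3L, 2L, 7L, 6L, 5L, 3L, 4L, 2L, 3L, 4L)
)

dt <- as.data.table(mydf)

# One median per Date + Time combination.
# as.numeric() keeps the result a double in every group.
res <- dt[, .(value = median(as.numeric(value))), by = .(Date, Time)]
res
```

The medians are the same as with aggregate, but the rows come back in order of first appearance of each group rather than sorted by Time.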
Interpretation 2: The consecutive repetitions are important. In this case, you need another "grouping" variable that gives each run of consecutive identical "Time" values its own ID. For this, we can use rle. After that, the aggregation step is pretty much the same.
RLE <- rle(DF$Time)$lengths
RLE <- rep(seq_along(RLE), RLE)
aggregate(value ~ Date + Time + RLE, DF, median)
#    Date Time RLE value
# 1     A    1   1   4.5
# 2     A    2   2   4.0
# 3     A    3   3   3.0
# 4     A    4   4   2.0
# 5     A    5   5   7.0
# 6     B    1   6   6.0
# 7     B    2   7   4.0
# 8     B    3   8   2.0
# 9     B    5   9   3.0
# 10    B    6  10   4.0
# 11    A    1  11   3.0
# 12    B    3  12   2.0
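To make the rle step less opaque, here is a small sketch on a toy vector (x, runs, run_id and run_id2 are illustrative names, not part of the data above). It shows how the run lengths turn into one ID per run, so that a value which reappears later gets a new ID instead of being merged with its earlier run.

```r
x <- c(1, 1, 2, 3, 3, 1)   # toy vector; the value 1 occurs in two separate runs

runs <- rle(x)$lengths                 # length of each consecutive run: 2 1 2 1
run_id <- rep(seq_along(runs), runs)   # repeat each run's index by its length
run_id
# [1] 1 1 2 3 3 4

# An equivalent way to build the IDs for a numeric vector:
# start a new group whenever the value changes
run_id2 <- cumsum(c(TRUE, diff(x) != 0))
identical(run_id, run_id2)
# [1] TRUE
```

The last element gets ID 4 rather than 1, which is exactly why rows 11 and 12 in the output above are kept separate from rows 1 and 8.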
For the benefit of others, here's some reproducible data: mydf and DF. (DF is just mydf with rows 1 and 11 each appended twice at the end.)
mydf <- structure(list(Date = c("A", "A", "A", "A", "A", "A", "B", "B",
"B", "B", "B", "B", "B"), Time = c(1L, 1L, 2L, 3L, 4L, 5L, 1L,
2L, 2L, 2L, 3L, 5L, 6L), value = c(3L, 6L, 4L, 3L, 2L, 7L, 6L,
5L, 3L, 4L, 2L, 3L, 4L)), .Names = c("Date", "Time", "value"),
class = "data.frame", row.names = c(NA, -13L))
DF <- rbind(mydf, mydf[c(1, 1, 11, 11), ])