Question

I am trying to calculate the time differences between consecutive observations in a data.frame called temp. I have both the date and the time:

           id version       date     time
872169 261986       0 2012-01-13 24:24:34
872170 262026       0 2012-01-13 24:26:11
872171 262037       0 2012-01-13 00:02:46
872172 262053       0 2012-01-14 00:10:28
872173 262074       0 2012-01-14 00:28:42
872174 262090       0 2012-01-15 14:29:31

The class of the time vector is of course times. Now I can create a vector containing the differences:

count <- as.difftime(temp[,6], units="mins")

But how do I account for the days? I tried several things: I combined the date and the time vector:

as.difftime(paste(temp[,4], temp[,6]), unit="min")

but this only gives me NA's.

Also

as.difftime(strptime( paste(temp[,4], temp[,6]), "%Y-%m-%d %H:%M:%S"), unit="mins")

didn't work.

difftime() also doesn't work, since the two timestamps to compare aren't in two distinct vectors. I could copy the date vector and shift it up by one, so that the first value of the copy is the second element of the original. But there must be something smarter.

Thanks in advance!


Solution

Use both columns as input:

> temp <- read.table(text="           id version       date     time
+ 872169 261986       0 2012-01-13 24:24:34
+ 872170 262026       0 2012-01-13 24:26:11
+ 872171 262037       0 2012-01-13 00:02:46
+ 872172 262053       0 2012-01-14 00:10:28
+ 872173 262074       0 2012-01-14 00:28:42
+ 872174 262090       0 2012-01-15 14:29:31", header=TRUE, stringsAsFactors=FALSE)

# as.character() isn't needed here (stringsAsFactors=FALSE), but your own columns are probably factors

> temp$tm <- as.POSIXct( paste(as.character(temp[[3]]), as.character(temp[[4]]) ) )
> temp$count <- c(NA, as.numeric(diff( temp$tm , units="min"))/60 )
> temp
           id version       date     time         tm count
872169 261986       0 2012-01-13 24:24:34 2012-01-13    NA
872170 262026       0 2012-01-13 24:26:11 2012-01-13     0
872171 262037       0 2012-01-13 00:02:46 2012-01-13     0
872172 262053       0 2012-01-14 00:10:28 2012-01-14  1440
872173 262074       0 2012-01-14 00:28:42 2012-01-14     0
872174 262090       0 2012-01-15 14:29:31 2012-01-15  1440

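The tm column shows only dates because the input has malformed date-times: it uses both "24" and "00" as hours, which makes no sense. as.POSIXct() cannot parse a time like "24:24:34", so it falls back to parsing the date alone and every row becomes midnight; that is why count is either 0 or 1440 (one day in minutes). If we change the 24s to 23 it works as expected. Note that count is now in hours: 0.0269 is the 97 seconds between the first two rows.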

> temp$tm <- as.POSIXct( paste(as.character(temp[['date']]), as.character(temp[['time']]) ) )
> temp$count <- c(NA, as.numeric(diff( temp$tm , units="min"))/60 )
> temp
           id version       date     time                  tm        count
872169 261986       0 2012-01-13 23:24:34 2012-01-13 23:24:34           NA
872170 262026       0 2012-01-13 23:26:11 2012-01-13 23:26:11   0.02694444
872171 262037       0 2012-01-13 00:02:46 2012-01-13 00:02:46 -23.39027778
872172 262053       0 2012-01-14 00:10:28 2012-01-14 00:10:28  24.12833333
872173 262074       0 2012-01-14 00:28:42 2012-01-14 00:28:42   0.30388889
872174 262090       0 2012-01-15 14:29:31 2012-01-15 14:29:31  38.01361111
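
The two tables above end up in different units (minutes in the first, hours in the second), because the unit that diff() returns can depend on the data and on the R version. A minimal sketch of one way to avoid that, assuming the same temp layout as in the question: pass an explicit format so that unparseable times become NA instead of silently turning into midnight, and call difftime() with a fixed unit. The tz = "UTC" setting and the helper name bad are assumptions for illustration, not part of the original answer.

```r
# Sample data from the question, with the invalid 24:xx times left in.
temp <- read.table(text = "id version date time
261986 0 2012-01-13 24:24:34
262026 0 2012-01-13 24:26:11
262037 0 2012-01-13 00:02:46
262053 0 2012-01-14 00:10:28
262074 0 2012-01-14 00:28:42
262090 0 2012-01-15 14:29:31", header = TRUE, stringsAsFactors = FALSE)

# With an explicit format, rows that do not parse become NA
# rather than falling back to a date-only parse.
# tz = "UTC" is an assumption; use the zone the data was recorded in.
temp$tm <- as.POSIXct(paste(temp$date, temp$time),
                      format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Flag the rows whose timestamp could not be parsed.
bad <- is.na(temp$tm)

# Difference to the previous row, always in minutes.
n <- nrow(temp)
temp$count <- c(NA, as.numeric(difftime(temp$tm[-1], temp$tm[-n],
                                        units = "mins")))
```

Here bad marks the two 24:xx rows, and any difference that involves one of them is NA, so the malformed input is visible instead of producing plausible-looking zeros. Once the times are corrected, the same code gives every difference in minutes.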
Licensed under: CC-BY-SA with attribution