By default, a time difference in pandas has nanosecond resolution, i.e. timedelta64[ns], so one way to convert it into seconds/minutes/hours/etc. is to divide its nanosecond integer representation by 10**9 for seconds, by 60*10**9 for minutes, and so on. This method is at least 3 times faster than the other methods suggested on this page. [1]
df['diff_in_seconds'] = df['from_date'].sub(df['to_date']).view('int64') // 10**9
df['diff_in_minutes'] = df['from_date'].sub(df['to_date']).view('int64') // (60*10**9)
df['diff_in_hours'] = df['from_date'].sub(df['to_date']).view('int64') // (3600*10**9)
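On recent pandas releases, Series.view is deprecated (as far as I know, since 2.2), and a timedelta column is not guaranteed to be in nanoseconds. Below is a minimal sketch of the same idea using astype instead of view. It is not the benchmarked code; the column names follow the snippet above, and the two sample rows are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data, same column names as above.
df = pd.DataFrame({
    'to_date': pd.to_datetime(['2014-01-24 13:03:12', '2014-01-27 11:57:18']),
    'from_date': pd.to_datetime(['2014-01-26 23:41:21', '2014-01-27 15:38:22']),
})

# Force nanosecond resolution first, so the integers really are nanoseconds,
# then reinterpret the timedeltas as plain int64 values.
diff_ns = (df['from_date'] - df['to_date']).astype('timedelta64[ns]').astype('int64')

df['diff_in_seconds'] = diff_ns // 10**9
df['diff_in_minutes'] = diff_ns // (60 * 10**9)
df['diff_in_hours'] = diff_ns // (3600 * 10**9)
```

The explicit cast to timedelta64[ns] is only a safeguard: if the column is already in nanoseconds it changes nothing.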
PS: The above code assumes that you want the difference in whole seconds, minutes, hours, etc., so it uses integer division (//). If you want the fractions as well, use true division (/) instead. That said, if you want the exact difference, then instead of fractional seconds/minutes/hours, consider converting the difference into a higher resolution (milliseconds/microseconds/etc.).
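To make the difference between the two divisions concrete, here is a small sketch. The two timedelta values are made up for illustration, and astype is used in place of view so that it also runs on newer pandas versions; treat it as one possible way to write this, not the benchmarked code.

```python
import pandas as pd

# Hypothetical differences with a sub-second part.
diff = pd.Series(pd.to_timedelta(['2 days 10:38:09.820000', '0 days 03:41:04.300000']))

# Nanosecond integer representation of each difference.
ns = diff.astype('timedelta64[ns]').astype('int64')

whole_hours = ns // (3600 * 10**9)  # integer division: remainder is dropped
frac_hours = ns / (3600 * 10**9)    # true division: result is a float
millis = ns // 10**6                # higher resolution: exact integer milliseconds

print(pd.DataFrame({'whole_hours': whole_hours,
                    'frac_hours': frac_hours,
                    'millis': millis}))
```

The millisecond column stays an exact integer, whereas the fractional hours are floats and so subject to rounding.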
[1] Some benchmarks using Trenton McKinney's setup:
import pandas as pd
data = {'to_date': [pd.Timestamp('2014-01-24 13:03:12.050000'), pd.Timestamp('2014-01-27 11:57:18.240000')]*1000000,
        'from_date': [pd.Timestamp('2014-01-26 23:41:21.870000'), pd.Timestamp('2014-01-27 15:38:22.540000')]*1000000}
df = pd.DataFrame(data)
df['Diff'] = df['from_date'] - df['to_date']
%timeit df['Diff'].view('int64') // (3600*10**9)
# 11 ms ± 271 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit df['Diff'] // pd.Timedelta(hours=1)
# 36.7 ms ± 2.99 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit df['Diff'].astype('timedelta64[h]')
# 46.5 ms ± 865 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit df['Diff'].dt.total_seconds() // 3600
# 169 ms ± 7.71 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)