One way would be to use groupby and transform. max - min is also called peak-to-peak, or ptp for short, so ptp here is basically shorthand for lambda x: x.max() - x.min().
>>> import pandas as pd
>>> df = pd.read_csv("eye.csv", sep=r"\s+")
>>> df["duration"] = df.dropna().groupby("event")["time"].transform("ptp")
>>> df
     time  event  duration
49  44295    NaN       NaN
50  44311    NaN       NaN
51  44328    NaN       NaN
52  44345      2        66
53  44361      2        66
54  44378      2        66
55  44395      2        66
56  44411      2        66
57  44428      3        50
58  44445      3        50
59  44461      3        50
60  44478      3        50
61  44495    NaN       NaN
62  44511    NaN       NaN
63  44528    NaN       NaN
64  44544    NaN       NaN
65  44561    NaN       NaN
66  44578    NaN       NaN
67  44594    NaN       NaN
68  44611      4        33
69  44628      4        33
70  44644      4        33
71  44661    NaN       NaN
72  44678    NaN       NaN
The dropna call is there to prevent each NaN value in the event column from being considered its own event. (There's also something weird going on in how ptp works when the key is NaN, but that's a separate issue.)
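
In case the "ptp" string alias is not accepted by your pandas version, here is a minimal sketch of the same idea with the max - min spelled out as a lambda. The inline data is only a stand-in for eye.csv (a subset of the rows shown above, with the index column left out), and the name labelled is just for illustration:

```python
import io

import pandas as pd

# Stand-in for eye.csv: a subset of the rows above, without the index column.
raw = """time event
44328 NaN
44345 2
44361 2
44378 2
44395 2
44411 2
44428 3
44445 3
44461 3
44478 3
44495 NaN
44611 4
44628 4
44644 4
44661 NaN
"""
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")

# Keep only the rows that belong to an event, then broadcast each group's
# max - min back onto those rows. Assigning the result aligns on the index,
# so rows without an event end up with NaN in the new column.
labelled = df.dropna(subset=["event"])
df["duration"] = labelled.groupby("event")["time"].transform(
    lambda x: x.max() - x.min()
)
print(df)
```

Using dropna(subset=["event"]) rather than a bare dropna() only drops rows where the event itself is missing, which should be safer if other columns can also contain NaN.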