This is pretty straightforward when you only want a couple of specific columns, e.g. the max of a and the min of b:
In [65]: df = DataFrame(randn(100,4),columns=list('abcd'),
index=date_range('20130101 16:00',periods=100,freq='T'))
In [66]: df.head(20)
Out[66]:
a b c d
2013-01-01 16:00:00 0.404056 0.115774 -0.202356 0.998315
2013-01-01 16:01:00 -0.231966 0.262609 1.192302 -0.702163
2013-01-01 16:02:00 -0.467005 0.744724 -0.871782 -0.308637
2013-01-01 16:03:00 -0.175704 0.036244 1.404604 -0.106320
2013-01-01 16:04:00 0.046306 -0.098140 0.535573 -0.306300
2013-01-01 16:05:00 -0.115620 -1.069991 0.790965 -0.504283
2013-01-01 16:06:00 1.496555 0.373582 1.028092 -0.816990
2013-01-01 16:07:00 0.432081 0.182106 0.115107 1.239192
2013-01-01 16:08:00 -0.245789 -2.030840 0.118330 -1.922616
2013-01-01 16:09:00 -0.358188 -0.121750 1.768505 -2.096908
2013-01-01 16:10:00 -1.634722 -0.808355 -0.773417 0.095078
2013-01-01 16:11:00 -0.396295 0.168568 -0.901945 -0.073811
2013-01-01 16:12:00 -1.364391 2.052481 -0.175291 0.927363
2013-01-01 16:13:00 -0.523331 0.042475 0.361593 -2.239468
2013-01-01 16:14:00 1.573967 -0.709043 0.551812 0.452311
2013-01-01 16:15:00 0.180578 0.846856 -2.304107 -1.283507
2013-01-01 16:16:00 0.065386 0.356015 -0.174369 1.167562
2013-01-01 16:17:00 -1.747416 1.279114 0.559075 0.200927
2013-01-01 16:18:00 -2.041764 -0.085398 2.032789 0.195671
2013-01-01 16:19:00 -0.639329 0.268832 0.394621 -0.271260
Rolling functions label each result with the end point of its window, so we time-shift the series (which only changes the index) so that each value aligns with the start point of its window rather than the end point:
In [67]: df['max_a'] = pd.rolling_max(df['a'].tshift(-14),15)
In [68]: df['min_b'] = pd.rolling_min(df['b'].tshift(-14),15)
In [69]: df.head(20)
Out[69]:
a b c d max_a min_b
2013-01-01 16:00:00 0.404056 0.115774 -0.202356 0.998315 1.573967 -2.030840
2013-01-01 16:01:00 -0.231966 0.262609 1.192302 -0.702163 1.573967 -2.030840
2013-01-01 16:02:00 -0.467005 0.744724 -0.871782 -0.308637 1.573967 -2.030840
2013-01-01 16:03:00 -0.175704 0.036244 1.404604 -0.106320 1.573967 -2.030840
2013-01-01 16:04:00 0.046306 -0.098140 0.535573 -0.306300 1.573967 -2.030840
2013-01-01 16:05:00 -0.115620 -1.069991 0.790965 -0.504283 1.573967 -2.030840
2013-01-01 16:06:00 1.496555 0.373582 1.028092 -0.816990 1.573967 -2.030840
2013-01-01 16:07:00 0.432081 0.182106 0.115107 1.239192 1.573967 -2.030840
2013-01-01 16:08:00 -0.245789 -2.030840 0.118330 -1.922616 1.573967 -2.030840
2013-01-01 16:09:00 -0.358188 -0.121750 1.768505 -2.096908 1.573967 -1.185540
2013-01-01 16:10:00 -1.634722 -0.808355 -0.773417 0.095078 1.573967 -1.185540
2013-01-01 16:11:00 -0.396295 0.168568 -0.901945 -0.073811 1.573967 -1.185540
2013-01-01 16:12:00 -1.364391 2.052481 -0.175291 0.927363 1.573967 -1.185540
2013-01-01 16:13:00 -0.523331 0.042475 0.361593 -2.239468 1.573967 -1.185540
2013-01-01 16:14:00 1.573967 -0.709043 0.551812 0.452311 1.573967 -1.185540
2013-01-01 16:15:00 0.180578 0.846856 -2.304107 -1.283507 1.266667 -1.185540
2013-01-01 16:16:00 0.065386 0.356015 -0.174369 1.167562 1.266667 -1.563288
2013-01-01 16:17:00 -1.747416 1.279114 0.559075 0.200927 1.266667 -1.563288
2013-01-01 16:18:00 -2.041764 -0.085398 2.032789 0.195671 1.266667 -1.810085
2013-01-01 16:19:00 -0.639329 0.268832 0.394621 -0.271260 1.266667 -1.810085
The high-low difference is then just:
df['max_a'] - df['min_b']
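pd.rolling_max, pd.rolling_min and tshift have since been removed from pandas, so the session above will not run on a current release. A rough equivalent, assuming a recent pandas and a regularly spaced index, is to use .rolling() and then shift the result up by window - 1 rows. The seed, the window variable and the hi_low column name are only illustrative choices and are not part of the original session:

```python
import numpy as np
import pandas as pd

# Illustrative data with the same shape as the session above (values will differ).
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.standard_normal((100, 4)),
    columns=list("abcd"),
    index=pd.date_range("2013-01-01 16:00", periods=100, freq="min"),
)

window = 15

# rolling() labels each window by its last row; shifting the result up by
# window - 1 rows re-labels it by the first row of the window instead.
df["max_a"] = df["a"].rolling(window).max().shift(-(window - 1))
df["min_b"] = df["b"].rolling(window).min().shift(-(window - 1))

# High-low difference over the forward-looking window.
df["hi_low"] = df["max_a"] - df["min_b"]
```

The last window - 1 rows come out as NaN because there is no full window ahead of them. Note that shift moves rows by position, so this only matches the tshift version when the index has no gaps.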
It seems you have gaps in your series; use asfreq to put it back on a regular frequency:
In [16]: df = DataFrame(randn(10,2),columns=list('ab'),index=date_range('20130101 9:00',freq='T',periods=10))
In [17]: df
Out[17]:
a b
2013-01-01 09:00:00 0.516518 -1.497564
2013-01-01 09:01:00 1.747399 1.100530
2013-01-01 09:02:00 -0.223476 -0.682712
2013-01-01 09:03:00 0.343172 -0.341965
2013-01-01 09:04:00 -1.380057 -1.565732
2013-01-01 09:05:00 -2.156675 1.043532
2013-01-01 09:06:00 -1.237155 -0.219086
2013-01-01 09:07:00 1.626510 -0.596204
2013-01-01 09:08:00 -0.767588 0.496110
2013-01-01 09:09:00 -0.014556 0.012049
In [18]: df.index
Out[18]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-01-01 09:00:00, ..., 2013-01-01 09:09:00]
Length: 10, Freq: T, Timezone: None
In [19]: df.append(Series(name=[Timestamp('20130101 09:15')]))
Out[19]:
a b
2013-01-01 09:00:00 0.516518 -1.497564
2013-01-01 09:01:00 1.747399 1.100530
2013-01-01 09:02:00 -0.223476 -0.682712
2013-01-01 09:03:00 0.343172 -0.341965
2013-01-01 09:04:00 -1.380057 -1.565732
2013-01-01 09:05:00 -2.156675 1.043532
2013-01-01 09:06:00 -1.237155 -0.219086
2013-01-01 09:07:00 1.626510 -0.596204
2013-01-01 09:08:00 -0.767588 0.496110
2013-01-01 09:09:00 -0.014556 0.012049
2013-01-01 09:15:00 NaN NaN
In [20]: df.append(Series(name=[Timestamp('20130101 09:15')])).index
Out[20]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-01-01 09:00:00, ..., 2013-01-01 09:15:00]
Length: 11, Freq: None, Timezone: None
In [21]: df.append(Series(name=[Timestamp('20130101 09:15')])).asfreq('T')
Out[21]:
a b
2013-01-01 09:00:00 0.516518 -1.497564
2013-01-01 09:01:00 1.747399 1.100530
2013-01-01 09:02:00 -0.223476 -0.682712
2013-01-01 09:03:00 0.343172 -0.341965
2013-01-01 09:04:00 -1.380057 -1.565732
2013-01-01 09:05:00 -2.156675 1.043532
2013-01-01 09:06:00 -1.237155 -0.219086
2013-01-01 09:07:00 1.626510 -0.596204
2013-01-01 09:08:00 -0.767588 0.496110
2013-01-01 09:09:00 -0.014556 0.012049
2013-01-01 09:10:00 NaN NaN
2013-01-01 09:11:00 NaN NaN
2013-01-01 09:12:00 NaN NaN
2013-01-01 09:13:00 NaN NaN
2013-01-01 09:14:00 NaN NaN
2013-01-01 09:15:00 NaN NaN
In [22]: df.append(Series(name=[Timestamp('20130101 09:15')])).asfreq('T').index
Out[22]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-01-01 09:00:00, ..., 2013-01-01 09:15:00]
Length: 16, Freq: T, Timezone: None
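DataFrame.append was removed in pandas 2.0, so here is a sketch of the same gap-filling idea for a current release. It uses reindex to add the stray 09:15 row (just one way to create the gap for the demo) and then asfreq to restore the one-minute frequency. The names gappy and regular are made up for this example:

```python
import numpy as np
import pandas as pd

# Illustrative data with the same shape as the session above (values will differ).
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.standard_normal((10, 2)),
    columns=list("ab"),
    index=pd.date_range("2013-01-01 09:00", periods=10, freq="min"),
)

# Add an all-NaN row at 09:15, leaving a gap from 09:10 to 09:14.
new_index = df.index.append(pd.DatetimeIndex(["2013-01-01 09:15"]))
gappy = df.reindex(new_index)

# asfreq inserts the missing minutes as NaN rows and restores the frequency.
regular = gappy.asfreq("min")

print(len(gappy), gappy.index.freq)
print(len(regular), regular.index.freq)
```

Once the index is regular again, the rolling computation can be applied as before; the inserted rows are NaN, so you may want to pass min_periods to rolling or fill the values first, depending on how gaps should be treated.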