More efficient way to check values of neighboring rows in a pandas DataFrame for stock backtesting

https://stackoverflow.com/questions/22138002

19-10-2022

Question

I have a large dataframe of financial data. I'd like to create a 'buy' signal when the price of the stock is above the upper rolling mean for at least X days and then below the rolling mean for at least X days. I've done a naive implementation that checks whether the stock is above for at least 2 days before and below for 2 days after. I'm wondering: (1) is there a way to make this more general, so that I can use any X number of days without putting in a bunch of if statements, and (2) is there a more efficient or effective way to write the code below?

data2 is simply the dataframe. data2.AboveUpper is a column that is True when the stock is above the upper rolling mean and False when the stock is below the upper rolling mean.

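# Index labels of the rows where the stock is above the upper rolling mean.
# The loop below assumes data2 has the default 0..n-1 integer index.
upperlist = data2[data2.AboveUpper].index

for val in upperlist:
    # Skip rows that do not have two following rows.
    if val + 2 >= len(data2):
        continue
    # Still above on the next day: no signal yet.
    if data2['AboveUpper'].iloc[val + 1]:
        continue
    # Below on both of the next two days: flag a buy on the second one.
    if not data2['AboveUpper'].iloc[val + 2]:
        data2.loc[val + 2, 'buy'] = True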

Solution

If data2.AboveUpper is True when the stock is above the upper rolling mean, then

data2.AboveUpper.rolling(window=X).sum() >= X

is True wherever the stock has been above the upper rolling mean for at least X consecutive days (in older pandas versions this was written pd.rolling_sum(data2.AboveUpper, window=X)), so all you need is:

data2['buy'] = data2.AboveUpper.rolling(window=X).sum() >= X

as an example:

>>> ts = pd.Series([True, True, False, False, True, True, True, False, True])
>>> ts.rolling(window=2).sum() >= 2
0    False
1     True
2    False
3    False
4    False
5     True
6     True
7    False
8    False
dtype: bool
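
The question also asks for the stock to then be below the rolling mean for X days. One possible way to cover that part, sketched here under assumptions rather than taken from the answer above, is to compute a second rolling sum over the negated column and compare it with the "above" run shifted forward by X rows. The function name buy_signal and its parameters are hypothetical, and the sketch assumes one row per trading day with no gaps.

```python
import pandas as pd


def buy_signal(above: pd.Series, x: int) -> pd.Series:
    """Flag rows that end x 'below' days directly preceded by x 'above' days.

    `above` is a boolean Series such as data2.AboveUpper; `buy_signal` and
    `x` are illustrative names, not part of the original answer.
    """
    above = above.astype(bool)
    # True where this row and the x - 1 rows before it were all above.
    above_run = above.astype(int).rolling(window=x).sum() >= x
    # True where this row and the x - 1 rows before it were all below.
    below_run = (~above).astype(int).rolling(window=x).sum() >= x
    # The above-run must have ended exactly x rows before the current row.
    return below_run & above_run.shift(x, fill_value=False)


ts = pd.Series([True, True, False, False, True, True, True, False, True])
buy = buy_signal(ts, 2)
# Rows 0-1 are above and rows 2-3 are below, so buy is True only at index 3.
print(buy)
```

With this shape, changing X only changes the window and the shift, so no extra if statements are needed. Whether the buy should be placed on the last "below" day or the day after is a design choice the question leaves open.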

Licensed under: CC-BY-SA with attribution
