Question

I know how to combine criteria with AND in pandas HDFStore.select, but how can I combine them with OR?

For example, I have the following code:

import pandas as pd
from numpy.random import randn
df1 = pd.DataFrame({'A': randn(100),
                'B': randn(100),
                'C': randn(100).cumsum()},
                index=pd.bdate_range(end=pd.Timestamp('20131031 23:59:00'), periods=100))
df1.to_hdf('testHDF.h5', 'testVar1', format='table', data_columns=True, append=True)

Then I can use the following to load part of this dataset:

store = pd.HDFStore('testHDF.h5')
store.select('testVar1', [pd.Term('index', '>=', pd.Timestamp('20131017')), 'A > 0'])

or

store.select('testVar1', where=('A > 0', 'B > 0', 'index >= 20131017'))

Both forms combine all the criteria I provide with AND, i.e. ('A > 0' AND 'B > 0' AND 'index >= 20131017').

My question is: how can I use OR instead, so that the returned result satisfies ('A > 0' OR 'B > 0')?

Thanks for any help

Solution

In 0.12, you have to run one select per criterion and concat the results (keeping in mind that a row matching more than one criterion will show up as a duplicate):

In [9]: pd.concat([store.select('testVar1', where=('A > 0', 'index >= 20131017')),
                   store.select('testVar1', where=('B > 0', 'index >= 20131017'))]).drop_duplicates().sort_index()
Out[9]: 
                   A         B          C
2013-10-17  0.156248  0.085911  10.238636
2013-10-22 -0.125369  0.335910  10.865678
2013-10-23 -2.531444  0.690332  12.335883
2013-10-24 -0.266777  0.501257  13.529781
2013-10-25  0.815413 -0.629418  14.690554
2013-10-28  0.383213 -0.587026  13.589094
2013-10-31  1.897674  0.361764  14.595062

[7 rows x 3 columns]
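
One caveat with the concat approach: drop_duplicates() compares row values, not index labels. A minimal sketch of deduplicating on the index instead, assuming the index labels in the stored table are unique. The two filtered frames here are in-memory stand-ins for the two store.select calls above, and the names part_a, part_b and recent are only illustrative:

```python
import numpy as np
import pandas as pd

# Same shape of data as in the question, seeded so the run is repeatable.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {"A": rng.standard_normal(100),
     "B": rng.standard_normal(100),
     "C": rng.standard_normal(100).cumsum()},
    index=pd.bdate_range(end=pd.Timestamp("2013-10-31"), periods=100),
)

recent = df.index >= pd.Timestamp("2013-10-17")

# In-memory stand-ins for the two store.select(...) results.
part_a = df[(df["A"] > 0) & recent]
part_b = df[(df["B"] > 0) & recent]

# Rows matching both criteria appear twice after concat.
combined = pd.concat([part_a, part_b])

# Keep the first occurrence of each index label, then restore date order.
result = combined[~combined.index.duplicated(keep="first")].sort_index()

# Reference answer: a plain boolean OR on the full frame.
expected = df[((df["A"] > 0) | (df["B"] > 0)) & recent]
```

If the stored table can legitimately contain repeated index labels, this would drop real rows, so drop_duplicates() (or deduplicating on index plus values) is the safer choice in that case.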

In 0.13/master (0.13rc1 is out!), you can just do a very natural query

In [10]: store.select('testVar1', where='(A > 0 | B > 0) & index >= 20131017')
Out[10]: 
                   A         B          C
2013-10-17  0.156248  0.085911  10.238636
2013-10-22 -0.125369  0.335910  10.865678
2013-10-23 -2.531444  0.690332  12.335883
2013-10-24 -0.266777  0.501257  13.529781
2013-10-25  0.815413 -0.629418  14.690554
2013-10-28  0.383213 -0.587026  13.589094
2013-10-31  1.897674  0.361764  14.595062

[7 rows x 3 columns]
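
For anyone who wants to check this end to end, here is a small round-trip sketch. It assumes pandas 0.13 or later with PyTables installed, writes to a temporary file rather than the testHDF.h5 from the question, and uses seeded random data, so the rows will differ from the output above:

```python
import os
import tempfile

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(
    {"A": rng.standard_normal(100),
     "B": rng.standard_normal(100),
     "C": rng.standard_normal(100).cumsum()},
    index=pd.bdate_range(end=pd.Timestamp("2013-10-31"), periods=100),
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "testHDF.h5")
    with pd.HDFStore(path, mode="w") as store:
        # data_columns=True makes A, B and C queryable in `where`.
        store.put("testVar1", df, format="table", data_columns=True)
        # Single query: | is OR, & is AND, parentheses group the OR.
        either = store.select(
            "testVar1",
            where="(A > 0 | B > 0) & index >= 20131017",
        )

# The same filter applied in memory, for comparison.
mask = ((df["A"] > 0) | (df["B"] > 0)) & (df.index >= pd.Timestamp("2013-10-17"))
expected = df[mask]
```

The parentheses matter here: without them the date condition would not be applied to both sides of the OR.
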
Licensed under: CC-BY-SA with attribution