Question

I love the query method in Pandas. It's fast and highly expressive and less verbose than the regular selection methods.

Given a query, is it possible to get the True/False mask that corresponds to the rows returned by query?

For example, say I have:

my_query = 'values >= {0} and values <= {1}'.format(Q1, Q2)
inliers  = df.query(my_query)

inliers will hold the data that satisfied the query, but can I also get the mask of this query?

Getting the mask can be useful, for example, to quickly negate the query, or to get a result the same size as the original dataframe.


Solution

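Use df.eval(). df.query(expr) is essentially df[df.eval(expr)], so calling eval with the same expression returns the boolean mask that query uses to select rows.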

In [31]: from pandas import DataFrame

In [32]: df = DataFrame(dict(A=range(5)))

In [33]: df
Out[33]: 
   A
0  0
1  1
2  2
3  3
4  4

In [34]: df.query('A>3')
Out[34]: 
   A
4  4

In [36]: df.eval('A>3')
Out[36]: 
0    False
1    False
2    False
3    False
4     True
dtype: bool
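
To connect this back to the two uses mentioned in the question (negating the query and keeping the original size), here is a minimal sketch. The bounds low and high are hypothetical stand-ins for Q1 and Q2, and using df.where for the same-size result is one possible choice, not the only one.

```python
import pandas as pd

df = pd.DataFrame({"A": range(5)})

# Hypothetical bounds standing in for Q1 and Q2 from the question.
low, high = 1, 3
expr = f"A >= {low} and A <= {high}"

# Boolean Series aligned with df.index.
mask = df.eval(expr)

inliers = df[mask]           # same rows that df.query(expr) returns
outliers = df[~mask]         # negated query
same_shape = df.where(mask)  # original shape, NaN where the mask is False
```

Because the mask shares the index of df, it can be reused for any later selection without re-evaluating the expression.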
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow