I have the following program in Python:
# input
import pandas as pd
import numpy as np
data = pd.DataFrame({'a':pd.Series([1.,2.,3.]), 'b':pd.Series([4.,np.nan,6.])})
Here is the data:
In: print data
   a   b
0  1   4
1  2 NaN
2  3   6
Now I want an isnull column indicating whether the row contains any NaN:
# create the isnull column, then flag rows that contain a NaN
data['isnull'] = np.zeros(len(data))
data['isnull'][pd.isnull(data).any(axis=1)] = 1
The output is not correct (isnull in the second row should be 1):
In: print data
   a   b  isnull
0  1   4       0
1  2 NaN       0
2  3   6       0
However, if I execute the exact command again, the output will be correct:
data['isnull'][pd.isnull(data).any(axis=1)] = 1
print data
   a   b  isnull
0  1   4       0
1  2 NaN       1
2  3   6       0
Is this a bug with pandas or am I missing something obvious?
My Python version is 2.7.6, pandas is 0.12.0, and numpy is 1.8.0.
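For comparison, here is a minimal sketch that sets the same column with a single indexing operation through .loc, rather than the chained data['isnull'][...] assignment shown earlier. It assumes the chained indexing is what makes the first attempt unreliable, which the outputs in this post suggest but do not confirm. The name mask is introduced only for illustration, and print(...) is used so the snippet also runs on Python 3.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({'a': [1., 2., 3.], 'b': [4., np.nan, 6.]})

# Boolean Series: True for every row that has at least one NaN in a or b.
# 'mask' is an illustrative name, not something from the original program.
mask = data[['a', 'b']].isnull().any(axis=1)

# Create the column, then set the flagged rows in one .loc call,
# so the assignment targets the DataFrame itself and not an intermediate object.
data['isnull'] = 0.0
data.loc[mask, 'isnull'] = 1.0

print(data)
```

If this version produces the expected flags on the first run, that would point at the chained assignment rather than at the NaN detection itself.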