Question

Suppose, for example, I have a NumPy ndarray with shape (10, 10):

import numpy as np
a = np.linspace(-1,1,100).reshape(10,10)

I'd like to perform a calculation on the first element of each row if and only if the first element is smaller than zero. To do this, I've been thinking of using a masked array:

a = np.ma.MaskedArray(a,mask=(np.ones_like(a)*(a[:,0]<0)).T)

>>> (np.ones_like(a)*(a[:,0]<0)).T
array([[ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]])

This will allow me to perform calculations only on the rows in which the first element is less than zero (it just so happens that in this example the other elements of these rows are also negative, but I've tested the case where only the first elements are negative and the others are positive). I have a few questions at this point:

1) Should I add an additional mask to cover all columns except the first before performing my calculation? (To make the example concrete: I'd like to add 1000 to the first element of each row where that element is less than zero.)

2) Is masking an array permanent? Is there an unmask method?

3) Is this the easiest way to perform this type of calculation?

Any suggestions would be appreciated. Thanks!


Solution

In my opinion, a masked array is overkill for something this simple. I would use NumPy's fancy indexing instead:

# Get the indices of the rows whose first element is negative
rowsToUpdate = np.nonzero(a[:, 0] < 0)[0]
# Increment the first element of those rows by 1000
a[rowsToUpdate, 0] += 1000
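
As a rough sketch of how this relates to the masked-array approach in the question: the same update also works with a boolean mask directly, and a masked array can be unmasked again afterwards. The names `negative_first`, `updated`, `hide`, `m` and `plain` are only illustrative, and the masked-array part assumes the goal is to add 1000 to the first column of the negative-first rows only.

```python
import numpy as np

a = np.linspace(-1, 1, 100).reshape(10, 10)

# Plain boolean indexing: no index array and no masked array needed.
negative_first = a[:, 0] < 0           # one boolean per row
updated = a.copy()
updated[negative_first, 0] += 1000     # touches only column 0 of those rows

# Masked-array variant. In np.ma, True means "hidden", so to compute only on
# the first element of the negative-first rows, the mask covers everything else.
hide = np.ones(a.shape, dtype=bool)
hide[negative_first, 0] = False
m = np.ma.MaskedArray(a.copy(), mask=hide)
m += 1000                              # only unmasked entries change

# Masking is not permanent: the values are still stored underneath.
m.mask = np.ma.nomask                  # clear the mask on every entry
plain = m.data                         # ordinary ndarray view of the values
```

Note that the mask in the question is `True` on the rows with a negative first element, which hides exactly the rows it is meant to select; the sketch inverts that. Both routes should end with the same values, which is why the boolean index is usually the shorter option.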

OTHER TIPS

You could do the following using pandas:

import numpy as np
from pandas import DataFrame  # DataFrame is the workhorse of pandas

a = DataFrame(np.linspace(-1, 1, 100).reshape(10, 10))
mask = a[0] < 0 # a[0] is the 0th column of a
suba = a[mask]

# do some calcs with suba ... make sure the index remains the same

a[mask] = suba
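
For the concrete case in the question (add 1000 to the first element of each row where it is negative), a shorter pandas sketch is possible with `.loc`, which selects rows by the boolean mask and the column by label in one step. The name `df` is only illustrative, and this assumes the default integer column labels.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.linspace(-1, 1, 100).reshape(10, 10))
before = df.copy()

# Boolean Series: True for rows whose first column is negative
mask = df[0] < 0

# Update column 0 of the selected rows only; no intermediate copy to write back
df.loc[mask, 0] += 1000
```

This avoids the write-back step, so there is no separate sub-frame whose index has to stay aligned.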
Licensed under: CC-BY-SA with attribution