How would I replace the missing values in the 'b' array below with the corresponding row averages in 'c'?

a=numpy.arange(24).reshape(4,-1)
b=numpy.ma.masked_where(numpy.remainder(a,5)==0,a);b
Out[46]: 
 masked_array(data =
 [[-- 1 2 3 4 --]
 [6 7 8 9 -- 11]
 [12 13 14 -- 16 17]
 [18 19 -- 21 22 23]],
         mask =
 [[ True False False False False  True]
 [False False False False  True False]
 [False False False  True False False]
 [False False  True False False False]],
       fill_value = 999999)

c=b.mean(axis=1);c
Out[47]: 
masked_array(data = [2.5 8.2 14.4 20.6],
         mask = [False False False False],
   fill_value = 1e+20)

Solution

Try this:

np.copyto(b, c[...,None], where=b.mask)

You have to add the extra axis to c so that each row mean broadcasts across its own row. (If np.mean had a keepdims option like np.sum, this wouldn't be necessary; newer NumPy releases do provide one.)
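To make the role of the extra axis concrete, here is a small sketch of the shapes involved. The names row_means, column and expanded are only illustrative, and np.broadcast_to is used purely to display what np.copyto would see at each position.

```python
import numpy as np

a = np.arange(24, dtype=float).reshape(4, -1)
b = np.ma.masked_where(np.remainder(a, 5) == 0, a)
c = b.mean(axis=1)

row_means = c.data                  # plain ndarray, shape (4,)
column = row_means[..., None]       # shape (4, 1): one mean per row

# A (4, 1) array broadcasts against (4, 6) by repeating each row mean
# across all six columns; a (4,) array would not line up with the rows.
expanded = np.broadcast_to(column, b.shape)

print(row_means.shape, column.shape, expanded.shape)
```

Without the extra axis, a shape of (4,) cannot be broadcast against (4, 6), because broadcasting aligns trailing dimensions first.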

import numpy as np

a = np.arange(24).reshape(4,-1).astype(float)   # I changed your example to be a float
b = np.ma.masked_where(np.remainder(a,5)==0,a)
c = b.mean(1)

np.copyto(b, c[...,None], where=b.mask)

In [189]: b.data
Out[189]: 
array([[  2.5,   1. ,   2. ,   3. ,   4. ,   2.5],
       [  6. ,   7. ,   8. ,   9. ,   8.2,  11. ],
       [ 12. ,  13. ,  14. ,  14.4,  16. ,  17. ],
       [ 18. ,  19. ,  20.6,  21. ,  22. ,  23. ]])

This is faster than creating an inds array:

In [169]: %%timeit
   .....: inds = np.where(b.mask)
   .....: b[inds] = np.take(c, inds[0])
   .....: 
10000 loops, best of 3: 81.2 µs per loop


In [173]: %%timeit
   .....: np.copyto(b, c[...,None], where=b.mask)
   .....: 
10000 loops, best of 3: 45.1 µs per loop

Another advantage is that it raises an error on the dtype mismatch instead of silently truncating the values:

a = np.arange(24).reshape(4,-1)    # still an int
b = np.ma.masked_where(np.remainder(a,5)==0,a)
c = b.mean(1)

In [193]: np.copyto(b, c[...,None], where=b.mask)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-193-edc7f01f3f89> in <module>()
----> 1 np.copyto(b, c[...,None], where=b.mask)

TypeError: Can not cast scalar from dtype('float64') to dtype('int64') according to the rule 'same_kind'
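As a rough, self-contained sketch of this behaviour, the snippet below works on plain ndarrays (mask, row_means, int_target and float_target are illustrative names, not from the answer above). It assumes the default casting='same_kind' of np.copyto, which refuses a float to int copy.

```python
import numpy as np

a = np.arange(24).reshape(4, -1)                 # integer dtype
mask = np.remainder(a, 5) == 0
# Row means over the unmasked entries; the result is float64.
row_means = np.ma.masked_array(a, mask=mask).mean(axis=1).data

# Copying floats into an integer array is rejected under 'same_kind'.
int_target = a.copy()
try:
    np.copyto(int_target, row_means[:, None], where=mask)
    raised = False
except TypeError:
    raised = True

# Converting the destination to float first lets the copy go through.
float_target = a.astype(float)
np.copyto(float_target, row_means[:, None], where=mask)

print(raised)
print(float_target)
```

The integer array is left untouched when the error is raised, so nothing is lost by trying the copy first.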

By the way, there is a set of functions for such a task, depending on what different source formats you have, such as

np.put
sequentially puts the input array into the output array in locations given by indices and would work like @Ophion's answer.

np.place
sequentially assigns each element from the input (list or 1d array) into places in the output array wherever the mask is true, (not aligned with the input array, as their shapes don't have to match).

np.copyto
will always put a value from the input array into the same (broadcasted) location in the output array. Shapes must match (or be broadcastable). It effectively replaces the older function np.putmask.
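The difference between the three functions is easier to see on a tiny array. The following is only an illustrative sketch with made-up data (mask, row_values and the out_* arrays are not from the question).

```python
import numpy as np

mask = np.array([[True, False, False],
                 [False, True, True]])
row_values = np.array([10.0, 20.0])          # one value per row

# np.put: write values at explicit flat indices.
out_put = np.zeros((2, 3))
flat_idx = np.flatnonzero(mask)              # flat positions of True entries
rows = np.nonzero(mask)[0]                   # row index of each True entry
np.put(out_put, flat_idx, row_values[rows])

# np.place: consume the values one after another wherever mask is True,
# regardless of where they sit in the source.
out_place = np.zeros((2, 3))
np.place(out_place, mask, [1.0, 2.0, 3.0])

# np.copyto: broadcast the source to the output shape, then copy
# position by position where mask is True.
out_copyto = np.zeros((2, 3))
np.copyto(out_copyto, row_values[:, None], where=mask)

print(out_put)
print(out_place)
print(out_copyto)
```

Here np.put and np.copyto give the same result, but np.put needs the index bookkeeping done by hand, while np.place ignores alignment and simply fills the True positions in order.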

Other tips

You can use where and take:

inds = np.where(b.mask)

b[inds] = np.take(c,inds[0])

b
masked_array(data =
 [[2 1 2 3 4 2]
 [6 7 8 9 8 11]
 [12 13 14 14 16 17]
 [18 19 20 21 22 23]],
             mask =
 [[False False False False False False]
 [False False False False False False]
 [False False False False False False]
 [False False False False False False]],
       fill_value = 999999)

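In this particular example you have issues with the dtype of a: because a is an integer array, the row averages are truncated when they are assigned, as the output above shows. If you add a = a.astype(float) before the creation of b it works just fine. There may be a faster way to create the indices than np.where.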
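A minimal sketch of that truncation, using plain ndarrays instead of the masked array so the effect is isolated (mask, row_means, int_filled and float_filled are illustrative names):

```python
import numpy as np

a = np.arange(24).reshape(4, -1)                 # integer dtype
mask = np.remainder(a, 5) == 0
row_means = np.ma.masked_array(a, mask=mask).mean(axis=1).data

inds = np.where(mask)                            # (row indices, column indices)

# Indexed assignment into an integer array drops the fractional part.
int_filled = a.copy()
int_filled[inds] = np.take(row_means, inds[0])

# The same assignment into a float copy keeps the averages intact.
float_filled = a.astype(float)
float_filled[inds] = np.take(row_means, inds[0])

print(int_filled[0])      # first row, averages truncated
print(float_filled[0])    # first row, averages preserved
```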

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow