Question

I am hoping someone can explain the following behavior that I observe with a NumPy array:

>>> import numpy as np
>>> data_block=np.zeros((26,480,1000))
>>> indices=np.arange(1000)
>>> indices.shape
(1000,)
>>> data_block[0,:,:].shape
(480, 1000)            #fine and dandy
>>> data_block[0,:,indices].shape
(1000, 480)            #what happened????  why the transpose????
>>> ind_slice=np.arange(300)    # this is more what I really want.
>>> data_block[0,:,ind_slice].shape
(300, 480)     # transpose again!   arghhh!

I don't understand this transposing behavior and it is very inconvenient for what I want to do. Could anyone explain it to me? An alternative method for getting that subset of data_block would be a great bonus.

Solution

You can achieve your desired result this way:

>>> data_block[0,:,:][:,ind_slice].shape
(480, 300)

I confess I don't have a complete understanding of how complicated numpy indexing works, but the documentation seems to hint at the trouble you're having:

Basic slicing with more than one non-: entry in the slicing tuple, acts like repeated application of slicing using a single non-: entry, where the non-: entries are successively taken (with all other non-: entries replaced by :). Thus, x[ind1,...,ind2,:] acts like x[ind1][...,ind2,:] under basic slicing.

Warning: The above is not true for advanced slicing.

and:

Advanced indexing is triggered when the selection object, obj, is a non-tuple sequence object, an ndarray (of data type integer or bool), or a tuple with at least one sequence object or ndarray (of data type integer or bool).

Thus you are triggering that behavior by indexing with your ind_slice array instead of a regular slice.
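To make the difference concrete, here is a minimal sketch using the shapes from the question. It compares the all-basic form, the mixed form that produces the surprising order, and the two-step form from this answer. The variable names `basic`, `advanced`, and `two_step` are introduced here for illustration only.

```python
import numpy as np

data_block = np.zeros((26, 480, 1000))
ind_slice = np.arange(300)

# Basic indexing: only an integer and slices, so the axes keep their order.
basic = data_block[0, :, :300]

# Advanced indexing: the integer 0 and the index array are separated by a
# slice, so the broadcast index dimension (300) is placed first.
advanced = data_block[0, :, ind_slice]

# Indexing in two steps: the second step has a single array index,
# which stays in its own position.
two_step = data_block[0][:, ind_slice]

print(basic.shape)     # (480, 300)
print(advanced.shape)  # (300, 480)
print(two_step.shape)  # (480, 300)
```

If the mixed form is already in use, transposing its result (`advanced.T`) should give the same layout as the two-step form.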

The documentation itself says that this kind of indexing "can be somewhat mind-boggling to understand", so it's not surprising we both have trouble with this :-).

Other tips

There really is not much to be surprised about once you understand how fancy indexing works. If you have lists or arrays as indices, they must all have the same shape, or be broadcastable to a common shape. That common shape is the base shape of the returned array. If some of the indices are slices, then every entry of that base-shaped array is itself an array rather than a scalar, so the base shape gets extended with one extra dimension per slice. In your case the integer 0 is treated as an array index alongside indices, and because the two are separated by a slice, NumPy puts the broadcast dimension first and the sliced dimension after it, which gives (1000, 480). While this may seem a weird choice, it really is the only one consistent with multidimensional fancy indexing. As an example, try to figure out what return shape you would expect from the following:

>>> ind_slice=np.arange(16).reshape(4, 4)
>>> data_block[ind_slice, :, ind_slice].shape
(4, 4, 480) # No, (4, 4, 480, 4, 4) is not a better option
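The placement of the broadcast dimensions depends on whether the array indices sit next to each other. The sketch below illustrates this with a hypothetical 4-D array `x` and index array `idx` (neither is from the question), chosen so that every axis has a different length and the result shapes are easy to read.

```python
import numpy as np

x = np.zeros((6, 7, 8, 9))
idx = np.arange(3)

# Array indices next to each other: the broadcast dimension (3)
# replaces them in place.
adjacent = x[:, idx, idx, :]

# Array indices separated by a slice: the broadcast dimension (3)
# goes first, followed by the sliced dimensions in order.
separated = x[:, idx, :, idx]

# An integer counts as an array index here, so this is also the
# "separated" case, as in data_block[0, :, ind_slice].
int_separated = x[0, :, idx]

# Integer and array next to each other: the dimension stays in place.
int_adjacent = x[:, 0, idx]

print(adjacent.shape)       # (6, 3, 9)
print(separated.shape)      # (3, 6, 8)
print(int_separated.shape)  # (3, 7, 9)
print(int_adjacent.shape)   # (6, 3, 9)
```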

There are several ways to get what you are after. For the particular case in your question, the most obvious would be to not use fancy indexing, as you can get what you ask with slices:

>>> data_block[0, :, :300].shape
(480, 300)

If you do need fancy indexing, you can replace slices with broadcastable arrays:

>>> data_block[0, np.arange(480)[:, None], ind_slice].shape
(480, 300)
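As a sanity check, the following sketch compares three ways of selecting the same sub-block and confirms that they agree element by element. It uses a small stand-in array `data` with distinct values and an arbitrary column selection `cols`, both made up for this example. `np.ix_` is a NumPy helper that builds the broadcastable index arrays for you.

```python
import numpy as np

# Small stand-in for data_block, filled with distinct values.
data = np.arange(2 * 4 * 6).reshape(2, 4, 6)
cols = np.array([5, 0, 3])  # arbitrary columns, in arbitrary order

# A column vector of row indices broadcasts against the 1-D column indices.
rows = np.arange(data.shape[1])[:, None]  # shape (4, 1)
by_broadcast = data[0, rows, cols]

# np.ix_ builds the same open mesh of index arrays.
by_ix = data[0][np.ix_(np.arange(data.shape[1]), cols)]

# Transposing the result of the mixed slice/array index also works.
by_transpose = data[0, :, cols].T

print(by_broadcast.shape)                         # (4, 3)
print(np.array_equal(by_broadcast, by_ix))        # True
print(np.array_equal(by_broadcast, by_transpose)) # True
```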

You may want to take a look at np.ogrid and np.mgrid if you need to replace more complicated slices with arrays.
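A rough sketch of how that might look: `np.ogrid` turns slice notation into open (broadcastable) index arrays, and `np.mgrid` produces the dense equivalent. The small array `data` and the particular slices are assumptions chosen for illustration.

```python
import numpy as np

# Small stand-in for data_block, filled with distinct values.
data = np.arange(2 * 6 * 10).reshape(2, 6, 10)

# Open grid: rows has shape (3, 1), cols has shape (1, 5).
rows, cols = np.ogrid[0:6:2, 3:8]
picked_open = data[0, rows, cols]

# Dense grid: both index arrays have the full shape (3, 5).
rows_d, cols_d = np.mgrid[0:6:2, 3:8]
picked_dense = data[0, rows_d, cols_d]

# Both match the plain slice they were built from.
expected = data[0, 0:6:2, 3:8]

print(rows.shape, cols.shape)                    # (3, 1) (1, 5)
print(picked_open.shape)                         # (3, 5)
print(np.array_equal(picked_open, expected))     # True
print(np.array_equal(picked_dense, expected))    # True
```

The open grid is usually the cheaper choice, since it stores one small array per axis and lets broadcasting do the rest.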

Licensed under: CC-BY-SA with attribution