Question

Given a length-n array A of indices in 0 ... k-1 (e.g. A = [0, 0, 1, 2, 1, ...]), what is the most efficient way to form a new array B of shape (n, k) such that B[i,j] = 1 if A[i] == j and B[i,j] = 0 otherwise?

For example, with A = [0, 0, 1, 2, 1, ...] (k=3), we would get

B = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0], ...]

Is there a way to do this without an explicit for loop?


Solution

Given the sparsity of the array that you build (exactly one nonzero entry per row), you might want to use SciPy's sparse matrices, which have the advantage of a small memory footprint:

import numpy
from scipy import sparse

A = numpy.array([0, 0, 1, 2, 1])
k = 3
B = sparse.coo_matrix((numpy.full(len(A), 1, dtype=int), (numpy.arange(len(A)), A)), shape=(len(A), k))

coo_matrix() is described in SciPy's documentation. This gives the intended result:

>>> B.todense()
matrix([[1, 0, 0],
        [1, 0, 0],
        [0, 1, 0],
        [0, 0, 1],
        [0, 1, 0]])

but with a small memory footprint: only the n nonzero entries and their row and column indices are stored, so the saving over a dense (n, k) array appears once k is larger than a few units. To save even more memory, the dtype above could be made smaller (depending on your exact needs), with dtype=numpy.int8 or even dtype=bool.
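As a rough sketch of the smaller-dtype suggestion, the construction can be wrapped in a helper. The function name one_hot_sparse and its dtype default are illustrative assumptions, not part of the original answer:

```python
import numpy as np
from scipy import sparse


def one_hot_sparse(indices, k, dtype=np.int8):
    """Hypothetical helper: build an (n, k) sparse one-hot matrix."""
    indices = np.asarray(indices)
    n = len(indices)
    data = np.ones(n, dtype=dtype)   # one stored value per row
    rows = np.arange(n)              # row i holds its single 1 ...
    cols = indices                   # ... in column A[i]
    return sparse.coo_matrix((data, (rows, cols)), shape=(n, k))


A = np.array([0, 0, 1, 2, 1])
B = one_hot_sparse(A, k=3)
dense = B.toarray()  # convert to a regular ndarray only when needed
print(dense)
```

Only the n stored values and their coordinates are kept, so the sparse form stays small even when k is large.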

Other tips

import numpy as np

A = np.array([0, 0, 1, 2, 1])

B = np.zeros((len(A), 3), dtype=int)  # k = 3 columns; plain int replaces the removed np.int alias

B[np.arange(len(A)), A] = 1

Result:

>>> B
array([[1, 0, 0],
       [1, 0, 0],
       [0, 1, 0],
       [0, 0, 1],
       [0, 1, 0]])
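
A related variant, not taken from the answers here, indexes the rows of a k x k identity matrix with A. This is only a sketch: it allocates the full identity matrix first, so it is mainly convenient when k is small.

```python
import numpy as np

A = np.array([0, 0, 1, 2, 1])
k = 3

# Row j of the identity matrix is the one-hot vector for index j,
# so fancy-indexing with A picks the matching row for every entry.
B = np.eye(k, dtype=int)[A]
print(B)
```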
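
Alternatively, set the ones in a flat array of length n*k and reshape it:

A = np.array([0, 0, 1, 2, 1])
n = 5
k = 3
B = np.zeros(n * k, dtype=int)
B[np.arange(n) * k + A] = 1  # entry (i, A[i]) sits at flat index i*k + A[i]
B.reshape((n, k))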

Result:

array([[ 1,  0,  0],
       [ 1,  0,  0],
       [ 0,  1,  0],
       [ 0,  0,  1],
       [ 0,  1,  0]])
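
One caveat with flat indexing: an index equal to k or larger does not necessarily raise an error, because i*k + A[i] can land in a later row. A minimal sketch with a range check follows; the helper name one_hot_flat and the check are assumptions added for illustration:

```python
import numpy as np


def one_hot_flat(indices, k, dtype=int):
    """Hypothetical helper: one-hot encode via a flat array, with validation."""
    indices = np.asarray(indices)
    # Without this check, an out-of-range index could silently set a cell
    # in a neighbouring row instead of failing.
    if indices.size and (indices.min() < 0 or indices.max() >= k):
        raise ValueError("indices must lie in 0 ... k-1")
    n = len(indices)
    flat = np.zeros(n * k, dtype=dtype)
    flat[np.arange(n) * k + indices] = 1  # flat position of (i, indices[i])
    return flat.reshape(n, k)


B = one_hot_flat([0, 0, 1, 2, 1], k=3)
print(B)
```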
License: CC-BY-SA with attribution
Not affiliated with StackOverflow