Question

Given a length-n array of indices in 0 ... k-1 (e.g. A = [0, 0, 1, 2, 1, ...]), what is the most efficient way to form a new array B of shape (n, k), such that B[i,j] = 1 if A[i] == j and B[i,j] = 0 otherwise?

I.e., for the example A = [0, 0, 1, 2, 1, ...] (k=3), we would get

B = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0], ...]

Is there a way to do this without an explicit for loop?

Solution

Given how sparse the resulting array is, you might want to use SciPy's sparse matrices, which have a small memory footprint:

import numpy
from scipy import sparse

A = numpy.array([0, 0, 1, 2, 1])
k = 3
B = sparse.coo_matrix((numpy.full(len(A), 1, dtype=int), (numpy.arange(len(A)), A)), shape=(len(A), k))

(coo_matrix() is described in SciPy's documentation.) This gives the intended result:

>>> B.todense()
matrix([[1, 0, 0],
        [1, 0, 0],
        [0, 1, 0],
        [0, 0, 1],
        [0, 1, 0]])

but with a small memory footprint, provided k is large enough (more than a handful of columns). To save even more memory, the dtype above could be made smaller, depending on your exact needs, with dtype=numpy.int8 or even dtype=bool.
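To make the memory remark concrete, here is a minimal sketch that compares the bytes held by the COO representation with those of the equivalent dense array. The names B_sparse, sparse_bytes and dense_bytes are illustrative, and the exact sparse size depends on the index dtype SciPy picks, so treat the numbers as indicative only.

```python
import numpy
from scipy import sparse

A = numpy.array([0, 0, 1, 2, 1])
k = 3
n = len(A)

# Same construction as above, but with a 1-byte dtype for the stored values.
B_sparse = sparse.coo_matrix(
    (numpy.ones(n, dtype=numpy.int8), (numpy.arange(n), A)),
    shape=(n, k),
)

# COO keeps three arrays of length n: values, row indices, column indices.
sparse_bytes = B_sparse.data.nbytes + B_sparse.row.nbytes + B_sparse.col.nbytes

# A dense int8 array of the same shape needs n * k bytes.
dense_bytes = B_sparse.toarray().nbytes

print(sparse_bytes, dense_bytes)
```

The sparse size grows with n only, while the dense size grows with n * k, which is why the sparse form only pays off once k is more than a few columns.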

Other suggestions

import numpy as np

A = np.array([0, 0, 1, 2, 1])

B = np.zeros((len(A), 3), dtype=int)  # np.int was removed in NumPy 1.24; the builtin int works

B[np.arange(len(A)), A] = 1

Result:

>>> B
array([[1, 0, 0],
       [1, 0, 0],
       [0, 1, 0],
       [0, 0, 1],
       [0, 1, 0]])
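The same indexing trick can be wrapped in a small helper so that k is a parameter rather than a hard-coded 3. This is only a sketch; the function name one_hot and its dtype argument are hypothetical, not part of NumPy.

```python
import numpy as np

def one_hot(indices, k, dtype=int):
    """Return an (n, k) array with a single 1 per row, at column indices[i]."""
    indices = np.asarray(indices)
    n = len(indices)
    out = np.zeros((n, k), dtype=dtype)
    # Fancy indexing pairs row i with column indices[i].
    out[np.arange(n), indices] = 1
    return out

B = one_hot([0, 0, 1, 2, 1], k=3)
print(B)
```

Note that an index of k or more raises an IndexError, while a negative index silently wraps around to count from the last column, so inputs may need validating first.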
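
Another option is to fill a flat array and reshape it:

import numpy as np

A = np.array([0, 0, 1, 2, 1])
n = 5
k = 3
# Row i, column A[i] sits at flat offset i*k + A[i] in row-major order.
B = np.zeros(n * k, dtype=int)
B[np.arange(n) * k + A] = 1
B = B.reshape((n, k))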

Result:

array([[1, 0, 0],
       [1, 0, 0],
       [0, 1, 0],
       [0, 0, 1],
       [0, 1, 0]])
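The flat offsets i*k + A[i] used in the flat-array approach are the row-major positions of the pairs (i, A[i]). As a sketch, NumPy's np.ravel_multi_index can compute the same offsets, which avoids writing the arithmetic by hand; the variable names flat_manual and flat_numpy are illustrative.

```python
import numpy as np

A = np.array([0, 0, 1, 2, 1])
n, k = len(A), 3

# Hand-computed flat offsets: row i, column A[i] in a row-major (n, k) array.
flat_manual = np.arange(n) * k + A

# The same offsets derived by NumPy from the (row, column) pairs.
flat_numpy = np.ravel_multi_index((np.arange(n), A), (n, k))

B = np.zeros(n * k, dtype=int)
B[flat_numpy] = 1
B = B.reshape(n, k)
print(B)
```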
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow