Question

I have a numpy array such as:

array = [0.2, 0.3, 0.4]

(the real vector is dense with 300k elements; I'm just illustrating with a simple example)

and a sparse symmetric matrix created with SciPy, such as:

M = [[0, 1, 2]  
     [1, 0, 1]  
     [2, 1, 0]]

(represented as dense just to illustrate; in my real problem it's a (300k x 300k) sparse matrix)

Is it possible to multiply each row by the corresponding element of array, and then do the same for each column?

This would first give:

M = [[0 * 0.2, 1 * 0.2, 2 * 0.2]
     [1 * 0.3, 0 * 0.3, 1 * 0.3]
     [2 * 0.4, 1 * 0.4, 0 * 0.4]]

(rows are being multiplied by the elements in array)

M = [[0, 0.2, 0.4]
     [0.3, 0, 0.3]
     [0.8, 0.4, 0]]

And then the columns are multiplied:

M = [[0 * 0.2, 0.2 * 0.3, 0.4 * 0.4]
     [0.3 * 0.2, 0 * 0.3, 0.3 * 0.4]
     [0.8 * 0.2, 0.4 * 0.3, 0 * 0.4]]

Resulting finally in:

M = [[0, 0.06, 0.16]
     [0.06, 0, 0.12]
     [0.16, 0.12, 0]]

I've tried applying the solution I found in this thread, but it didn't work. I multiplied the data of M by the elements in array as suggested, then transposed the matrix and applied the same operation, but the result wasn't correct, and I still couldn't understand why.

Just to point this out, the matrix I'll be running this operation on is fairly big: it has 20 million non-zero elements, so efficiency is very important!

I appreciate your help!

Edit:

Bitwise's solution worked very well. The operation took 1.72 s here, which is fine for our work. Thanks!


Solution

In general you want to avoid loops and use matrix operations for speed and efficiency. In this case the solution is simple linear algebra, or more specifically matrix multiplication.

To multiply the columns of M by the array A, multiply M*diag(A). To multiply the rows of M by A, multiply diag(A)*M. To do both: diag(A)*M*diag(A), which can be accomplished by:

d = scipy.sparse.diags(a)  # sparse matrix with a on its diagonal
result = d.dot(m).dot(d)   # diag(a) * m * diag(a)

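diag(A) here is a matrix that is all zeros except for A on its diagonal. numpy.diag() builds it as a dense array and scipy.sparse.diags() builds it as a sparse matrix; at the 300k x 300k size in the question, only the sparse version is practical, since a dense float64 array of that shape would need about 720 GB.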

I expect this to run very fast.
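As a minimal sketch of the diag(A)*M*diag(A) idea on the 3 x 3 example from the question (the names a, M, D and scaled are just illustrative, and the CSR format is an assumption, since the question does not say which sparse format is used):

```python
import numpy as np
from scipy import sparse

# Small stand-ins for the 300k vector and the 300k x 300k sparse matrix.
a = np.array([0.2, 0.3, 0.4])
M = sparse.csr_matrix(np.array([[0.0, 1.0, 2.0],
                                [1.0, 0.0, 1.0],
                                [2.0, 1.0, 0.0]]))

# Sparse diagonal matrix with a on its main diagonal.
D = sparse.diags(a)

# D @ M scales row i by a[i]; (...) @ D scales column j by a[j].
scaled = D @ M @ D

scaled_dense = scaled.toarray()  # only sensible for a small example
print(scaled_dense)
```

Both products stay sparse, so the cost should grow with the number of stored entries rather than with the full matrix size.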

Other tips

The following should work:

[[x*array[i]*array[j] for j, x in enumerate(row)] for i, row in enumerate(M)]

Example:

>>> array = [0.2, 0.3, 0.4]
>>> M = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
>>> [[x*array[i]*array[j] for j, x in enumerate(row)] for i, row in enumerate(M)]
[[0.0, 0.059999999999999998, 0.16000000000000003], [0.059999999999999998, 0.0, 0.12], [0.16000000000000003, 0.12, 0.0]]

Values are slightly off due to the limitations of floating-point arithmetic. Use the decimal module if the rounding error is unacceptable.
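Note that the list comprehension visits every entry, zeros included, which is about 9e10 entries for a 300k x 300k matrix. A possible vectorized equivalent that only touches the stored entries is sketched below; it is not taken from the answer above, and scale_rows_and_cols is a hypothetical helper name:

```python
import numpy as np
from scipy import sparse

def scale_rows_and_cols(matrix, vector):
    """Return a new CSR matrix whose entry (i, j) is matrix[i, j] * vector[i] * vector[j]."""
    vector = np.asarray(vector, dtype=float)
    coo = matrix.tocoo()  # COO exposes the row and column index of every stored entry
    data = coo.data * vector[coo.row] * vector[coo.col]
    return sparse.coo_matrix((data, (coo.row, coo.col)), shape=coo.shape).tocsr()

array = np.array([0.2, 0.3, 0.4])
M = sparse.csr_matrix(np.array([[0, 1, 2], [1, 0, 1], [2, 1, 0]]))
result = scale_rows_and_cols(M, array)
print(result.toarray())
```

The input matrix is left unchanged; the scaled values go into a new matrix.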

I use this combination:

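import numpy as np

def multiply(matrix, vector, axis):
    # Assumes matrix is a scipy.sparse CSR matrix with float data.
    if axis == 1:
        # Scale rows in place: repeat each factor once per stored entry in its row.
        val = np.repeat(vector, matrix.getnnz(axis=1))
        matrix.data *= val
    else:
        # Scale columns: a 1-D vector broadcasts so column j is multiplied by vector[j].
        matrix = matrix.multiply(vector)
    return matrix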

When axis is 1 (scale the rows), I replicate the second approach of this solution; when axis is 0 (scale the columns), I use the matrix's multiply method.

The axis=1 branch modifies the matrix in place, so it avoids allocating a new matrix and is the more efficient of the two.
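For illustration, here is a self-contained sketch of how such a helper could be applied twice to get the result asked for in the question. It assumes a CSR matrix with float data (the row branch relies on CSR storing entries row by row), uses vector in both branches, and the name multiply_sketch and the final tocsr() call are my own additions:

```python
import numpy as np
from scipy import sparse

def multiply_sketch(matrix, vector, axis):
    """Scale rows (axis=1, in place) or columns (axis=0, new matrix) of a CSR matrix."""
    if axis == 1:
        # One factor per stored entry, in CSR (row-major) order.
        matrix.data *= np.repeat(vector, matrix.getnnz(axis=1))
        return matrix
    # Column scaling; convert back to CSR since the result format may differ.
    return matrix.multiply(vector).tocsr()

a = np.array([0.2, 0.3, 0.4])
M = sparse.csr_matrix(np.array([[0.0, 1.0, 2.0],
                                [1.0, 0.0, 1.0],
                                [2.0, 1.0, 0.0]]))

rows_scaled = multiply_sketch(M, a, axis=1)            # M itself is modified here
both_scaled = multiply_sketch(rows_scaled, a, axis=0)  # returns a new matrix
print(both_scaled.toarray())
```

Because the axis=1 call mutates its argument, pass a copy (for example M.copy()) if the original matrix is still needed.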

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow