I have a large m-by-n sparse matrix Y. I would like to normalize each row of Y so that each row has zero mean.

I first tried this, where Ymean holds the row means. But it also subtracts the row mean from the zero entries, which is not what I want.

Ynorm = bsxfun(@minus, Y, Ymean); 

Then I tried this.

[m, n] = size(Y);
nonZeroNum = nnz(Y); 
Ynorm = spalloc(m,n,nonZeroNum); 
for i = 1:m
    Ynorm(i, :) = spfun(@(x)(x - Ymean(i)), Y(i, :));
end

However, this non-vectorized solution is too slow.

I've also thought of combining bsxfun and spfun, but couldn't get it to work.

Does anyone have a vectorized solution?


Solution

Easy, peasy.

A random sparse matrix.

A = sprand(100,100,.05);

Get the row means, averaging over the non-zero elements only. If a row has no non-zero elements, its mean comes out as 0/0 = NaN, but that row is never touched by the next step.

rowmeans = sum(A,2)./sum(A~=0,2);
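To see what that line produces, here is a minimal sketch on a tiny hand-made matrix (the 3-by-3 values are my own illustration, not from the question). Row 2 is entirely zero, so its mean is NaN.

```matlab
% Illustrative 3-by-3 sparse matrix; row 2 has no non-zero entries.
A = sparse([1 0 3; 0 0 0; 2 4 0]);

rowsum = sum(A, 2);         % row sums: 4, 0, 6
rowcnt = sum(A ~= 0, 2);    % non-zeros per row: 2, 0, 2

% Mean over the non-zero entries only; 0/0 gives NaN for the empty row.
rowmeans = full(rowsum ./ rowcnt);
disp(rowmeans)              % 2, NaN, 3
```

The NaN is harmless here because an empty row contributes no indices to find, so rowmeans is never read at that position.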

Extract the non-zeros.

[i,j,a] = find(A);

And restore the array, mean subtracted.

[n,m] = size(A);
B = sparse(i,j,a - rowmeans(i),n,m);
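Putting the steps together on a small matrix makes the effect easy to check by eye. This is only a sketch: the 3-by-3 values are made up for illustration, and the names Y and Ynorm are borrowed from the question.

```matlab
% Illustrative input; row 2 is all zeros and must stay that way.
Y = sparse([1 0 3; 0 0 0; 2 4 0]);

% Row means over the non-zero entries only (NaN for the empty row).
rowmeans = full(sum(Y, 2) ./ sum(Y ~= 0, 2));

% Row index, column index and value of every non-zero entry.
[i, j, a] = find(Y);

% Rebuild the matrix, subtracting each entry's own row mean.
[m, n] = size(Y);
Ynorm = sparse(i, j, a - rowmeans(i), m, n);

disp(full(Ynorm))   % [-1 0 1; 0 0 0; -1 1 0]
```

One thing to be aware of: sparse ignores zero values, so an entry that happens to equal its row mean exactly (for instance the only non-zero in a row) is no longer stored in the result.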

Now, test it. Don't forget that floating point arithmetic applies here, so the row means will not be exactly zero, only on the order of eps.

min(mean(B,2))
ans =
   (1,1)     -1.5543e-17

max(mean(B,2))
ans =
   (1,1)      1.1657e-17
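If you prefer a pass/fail check over reading the numbers, something like the following sketch could work. The tolerance 1e-12 is an arbitrary choice of mine, and the pattern check builds a nearly full logical matrix, so it is only sensible for small test cases like this one.

```matlab
% Rebuild the example so this check runs on its own.
A = sprand(100, 100, .05);
rowmeans = full(sum(A, 2) ./ sum(A ~= 0, 2));
[i, j, a] = find(A);
[n, m] = size(A);
B = sparse(i, j, a - rowmeans(i), n, m);

% Each row of B should sum to zero up to rounding error.
tol = 1e-12;                       % assumed tolerance, not from the answer
rowsums = full(sum(B, 2));
assert(all(abs(rowsums) < tol));

% B must not have a non-zero anywhere A had a zero.
assert(nnz((B ~= 0) & (A == 0)) == 0);
```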

Seems about right, and fully vectorized. To convince you that the result truly is sparse and that the zero elements have not been corrupted, here is the result of spy.

spy(B)

[Figure: spyplot.jpg, the output of spy(B)]

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow