Question

Suppose, in MATLAB, that I have a matrix, A, whose elements are either 0 or 1.

How do I get a vector of the index of the last non-zero element of each column in a faster, vectorized way?

I could do

[B, I] = max(cumsum(A));

and use I, but is there a faster way? (I'm assuming cumsum would cost a bit of time even suming 0's and 1's).

Edit: I guess that I vectorized even more than I need fast - Mr. Fooz' loop is great but each loop in MATLAB seems to cost me a lot in debugging time even if it is fast.

Was it helpful?

Solution

As shown by Mr Fooz, for loops can be pretty fast now with newer versions of MATLAB. However, if you really want to have compact vectorized code, I would suggest trying this:

[B,I] = max(flipud(A));
I = size(A,1)-I+1;

This is faster than your CUMSUM-based answer, but still not quite as fast as Mr Fooz's looping options.

Two additional things to consider:

  • What results do you want to get for a column that has no ones in it at all? With the above option I gave you, I believe you will get an index of size(A,1) (i.e. the number of rows in A) in such a case. For your option, I believe you will get a 1 in such a case, while the nested-for-loops option from Mr Fooz will give you a 0.

  • The relative speed of these different options will likely vary based on the size of A and the number of non-zeroes you expect it to have.

OTHER TIPS

Fast is what you should worry about, not necessarily full vectorization. Recent versions of Matlab are much smarter about handling loops efficiently. If there's a compact vectorized way of expressing something, it's usually faster, but loops should not (always) be feared like they used to be.

clc

A = rand(5000)>0.5;
A(1,find(sum(A,1)==0)) = 1; % make sure there is at least one match

% Slow because it is doing too much work
tic;[B,I1]=max(cumsum(A));toc

% Fast because FIND is fast and it runs the inner loop
tic;
I3=zeros(1,5000);
for i=1:5000
  I3(i) = find(A(:,i),1,'last');
end
toc;
assert(all(I1==I3));

% Even faster because the JIT in Matlab is smart enough now
tic;
I2=zeros(1,5000);
for i=1:5000
  I2(i) = 0;
  for j=5000:-1:1
    if A(j,i)
      I2(i) = j;
      break;
    end
  end
end
toc;
assert(all(I1==I2));

On R2008a, Windows, x64, the cumsum version takes 0.9 seconds. The loop and find version takes 0.02 seconds. The double loop version takes a mere 0.001 seconds.

EDIT: Which one is fastest depends on the actual data. The double-loop takes 0.05 seconds when you change the 0.5 to 0.999 (because it takes longer to hit the break; on average). cumsum and the loop&find implementation have more consistent speeds.

EDIT 2: gnovice's flipud solution is clever. Unfortunately, on my test machine it takes 0.1 seconds, so it's much faster than cumsum, but slower than the looped versions.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top