indices of occurence of each row in MATLAB

Question 1

Well, the best answer for this would require knowledge of how A is filled. If A is sparse, that is, if it has few columns values and B is quite large, then I think the best way for memory saving may be using a sparse matrix instead of a cell.

% No fancy stuff, just fast and furious 
bMax = numel(B);
nRows = size(A,1);

cLogical = sparse(nRows,bMax);

for curRow = 1:nRows
  curIdx = A(curRow,:);
  cLogical(curRow,curIdx) = 1;
end

Answer:

cLogical =

   (2,1)        1
   (3,1)        1
   (4,1)        1
   (4,2)        1
   (1,3)        1
   (2,3)        1
   (3,3)        1
   (1,4)        1
   (3,4)        1
   (4,4)        1
   (1,5)        1
   (2,5)        1

How to read the answer. For each column the rows show the indexes that the column index appears in A. That is 1 appears in rows [2 3 4], 2 appear in row [4], 3 rows [1 2 3], 4 row [1 3 4], 5 in row [1 2].

Then you can use cLogical instead of a cell as an indexing matrix in the future for your needs.

Another way would be to allocate C with the expected value for how many times an index should appear in C.

% Fancier solution using some assumed knowledge of A
bMax = numel(B);
nRows = size(A,1);
nColumns = size(A,2);

% Pre-allocating with the expected value, an attempt to reduce re-allocations.
% tic; for rep=1:10000; C = mat2cell(zeros(bMax,nColumns),ones(1,bMax),nColumns); end; toc 
% Elapsed time is 1.364558 seconds.
% tic; for rep=1:10000; C = repmat({zeros(1,nColumns)},bMax,1); end; toc
% Elapsed time is 0.606266 seconds.
% So we keep the not fancy repmat solution
C = repmat({zeros(1,nColumns)},bMax,1);
for curRow = 1:nRows
  curIdxMsk = A(curRow,:);
  for curCol = 1:nColumns
    curIdx = curIdxMsk(curCol);
    fillIdx = ~C{curIdx};
    if any(fillIdx) 
      fillIdx = find(fillIdx,1);
    else
      fillIdx = numel(fillIdx)+1;
    end
    C{curIdx}(fillIdx) = curRow;
  end
end

% Squeeze empty indexes:
for curRow = 1:bMax
  C{curRow}(~C{curRow}) = [];
end

Answer:

>> C{:}

ans =

     2     3     4


ans =

     4


ans =

     1     2     3


ans =

     1     3     4


ans =

     1     2

Which solution will performs best? You do a performance test in your code because it depends on how big is A, bMax, the memory size of your computer and so on. Yet, I'm still curious with solutions other people can do for this x). I liked chappjc's solution although it has the cons that he has pointed out.

For the given example (10k times):

Solution 1: Elapsed time is 0.516647 seconds. 
Solution 2: Elapsed time is 4.201409 seconds (seems that solution 2 is a bad idea hahaha, but since it was created to the specific issue of A having many rows it has to be tested in those conditions).
chappjc' solution: Elapsed time is 2.405341 seconds.

Question 2

We can do it without making any assumptions about B. Try this use of bsxfun and mat2cell:

M = squeeze(any(bsxfun(@eq,A,permute(B,[3 2 1])),2)); % 4x3x1 @eq 1x1x5 => 4x3x5
R = sum(M); % 4x5 -> 1x5
[ii,jj] = find(M);
C = mat2cell(ii,R)

The cells in C above will be column vectors rather than rows as in your example. To make the cells contain row vectors, use C = mat2cell(ii',1,R)' instead.

My only concern is that mat2cell could be slow for millions of values of R, but if you want your output in a cell, I'm not sure how much better you can do. EDIT: If you can deal with a sparse matrix like in Werner's first solution with the loop, replace the last line of the above with the following:

>> Cs = sparse(ii,jj,1)
Cs =
   (2,1)        1
   (3,1)        1
   (4,1)        1
   (4,2)        1
   (1,3)        1
   (2,3)        1
   (3,3)        1
   (1,4)        1
   (3,4)        1
   (4,4)        1
   (1,5)        1
   (2,5)        1

Unfortunately, bsxfun will probably run out of memory if both size(A,1) and numel(B) are large! You may have to loop over the elements of A or B if memory becomes an issue. Here's one way to do it by looping over your vertexes in B:

for i=1:numel(B), C{i} = find(any(A==B(i),2)); end

Yup, that easy. Cell array growing is extremely fast in MATLAB as it similar to a sequence container that stores contiguous references to the data, rather than keeping the data itself contiguous. Perhaps ismember was the bottleneck in your test.