Compute mean on matrix by groups of indices matlab

Question 1

Let's do what you say in the question step by step:

For each unique value in A(:, 2):
```
[U, ia, iu] = unique(A(:, 2));
```
Take the corresponding values in A(:, 1) and look for their value in C:
```
[tf, loc] = ismember(A(:, 1), C);
```
It's also recommended to make sure, just in case, that all values are actually found in C:
```
assert(all(tf))
```

Then take the relevant rows in B and compute their mean:

[X, Y] = meshgrid(1:size(B, 2), iu);
result = accumarray([Y(:), X(:)], reshape(B(loc, :), 1, []), [], @mean);

Hope this helps! :)

Example

%// Sample input
A = [1 2 ; 3 2; 4 7; 10 2; 6 7; 10 9];
B = [1 2 3; 4 4 9; 1 8 0; 3 7 9; 3 6 8];
C = [4; 10; 6; 3; 1];

%// Compute means
[U, ia, iu] = unique(A(:, 2));
[tf, loc] = ismember(A(:, 1), C);
[X, Y] = meshgrid(1:size(B, 2), iu);
result = accumarray([Y(:), X(:)], reshape(B(loc, :), [], 1), [], @mean);

The result is:

result = 
   3.3333   5.6667   8.6667
   1.0000   5.0000   1.5000
   4.0000   4.0000   9.0000

Question 2

Here is another solution without arrayfun and accumarray using good old-fashion matrix multiplication:

r = bsxfun(@eq, A(:,1), C')*(1:numel(C))'; 
[~,m,n] = unique(A(:,2));
f=histc(n, 1:numel(m));
result = diag(1./f)*bsxfun(@eq, 1:numel(m), n).'*B(r,:);

I ran a benchmark against other two solutions and it appears to be faster than both. For 1000 repetitions:

This method takes 0.205650 seconds.
Eitan T's solution takes 0.546976 seconds.
matlabit's solution takes 1.619039 seconds.

Here is the benchmark code:

N = 1e3;

tic
for k=1:N,
    r = bsxfun(@eq, A(:,1), C')*(1:numel(C))'; % faster than [~,r] = ismember(A(:,1), C)
    [~,m,n] = unique(A(:,2));
    f=histc(n, 1:numel(m));
    result2 = diag(1./f)*bsxfun(@eq, 1:numel(m), n).'*B(r,:);
end
toc

tic
for k=1:N,
    [U, ia, iu] = unique(A(:, 2));
    [tf, loc] = ismember(A(:, 1), C);
    [X, Y] = meshgrid(1:size(B, 2), iu);
    result1 = accumarray([Y(:), X(:)], reshape(B(loc, :), [], 1), [], @mean);
end
toc

tic
for k=1:N,
    D = [arrayfun(@(x) find(C == x,1,'first'), A(:,1) ), A(:,2)];
    data = [B(D(:,1),:), D(:,2)];
    st = grpstats(data(:,1:3),data(:,4:4),{'mean'});
end
toc

Question 3

Thanks, I also thought of:

D = [arrayfun(@(x) find(C == x,1,'first'), A(:,1) ), A(:,2)];
data = [B(D(:,1),:), D(:,2)];
st = grpstats(data(:,1:3),data(:,4:4),{'mean'});