This question is related to this question and probably to this other as well.
Suppose you have two matrices A and B. A is M-by-N and B is N-by-K. I want to obtain an M-by-K matrix C such that C(i, j) = 1 - prod(1 - A(i, :)' .* B(:, j))
. I have tried some solutions in Matlab - I am here comparing their computation performance.
% Size of matrices:
M = 4e3;
N = 5e2;
K = 5e1;
GG = 50; % GG instances
rntm1 = zeros(GG, 1); % running time of first algorithm
rntm2 = zeros(GG, 1); % running time of second algorithm
rntm3 = zeros(GG, 1); % running time of third algorithm
rntm4 = zeros(GG, 1); % running time of fourth algorithm
rntm5 = zeros(GG, 1); % running time of fifth algorithm
for gg = 1:GG
A = rand(M, N); % M-by-N matrix of random numbers
A = A ./ repmat(sum(A, 2), 1, N); % M-by-N matrix of probabilities (?)
B = rand(N, K); % N-by-K matrix of random numbers
B = B ./ repmat(sum(B), N, 1); % N-by-K matrix of probabilities (?)
%% First solution
% One-liner solution:
tic
C = squeeze(1 - prod(1 - repmat(A, [1 1 K]) .* permute(repmat(B, [1 1 M]), [3 1 2]), 2));
rntm1(gg) = toc;
%% Second solution
% Full vectorization, using meshgrid, arrayfun and reshape (from Luis Mendo, second link above)
tic
[ii jj] = meshgrid(1:size(A, 1), 1:size(B, 2));
D = arrayfun(@(n) 1 - prod(1 - A(ii(n), :)' .* B(:, jj(n))), 1:numel(ii));
D = reshape(D, size(B, 2), size(A, 1)).';
rntm2(gg) = toc;
clear ii jj
%% Third solution
% Partial vectorization 1
tic
E = zeros(M, K);
for hh = 1:M
tmp = repmat(A(hh, :)', 1, K);
E(hh, :) = 1 - prod((1 - tmp .* B), 1);
end
rntm3(gg) = toc;
clear tmp hh
%% Fourth solution
% Partial vectorization 2
tic
F = zeros(M, K);
for hh = 1:M
for ii = 1:K
F(hh, ii) = 1 - prod(1 - A(hh, :)' .* B(:, ii));
end
end
rntm4(gg) = toc;
clear hh ii
%% Fifth solution
% No vectorization at all
tic
G = ones(M, K);
for hh = 1:M
for ii = 1:K
for jj = 1:N
G(hh, ii) = G(hh, ii) * prod(1 - A(hh, jj) .* B(jj, ii));
end
G(hh, ii) = 1 - G(hh, ii);
end
end
rntm5(gg) = toc;
clear hh ii jj C D E F G
end
prctile([rntm1 rntm2 rntm3 rntm4 rntm5], [2.5 25 50 75 97.5])
% 3.6519 3.5261 0.5912 1.9508 2.7576
% 5.3449 6.8688 1.1973 3.3744 3.9940
% 8.1094 8.7016 1.4116 4.9678 7.0312
% 8.8124 10.5170 1.9874 6.1656 8.8227
% 9.5881 12.0150 2.1529 6.6445 9.5115
mean([rntm1 rntm2 rntm3 rntm4 rntm5])
% 7.2420 8.3068 1.4522 4.5865 6.4423
std([rntm1 rntm2 rntm3 rntm4 rntm5])
% 2.1070 2.5868 0.5261 1.6122 2.4900
The solutions are equivalent but the algorithms with partial vectorization are way more efficient in terms of memory and execution time. Even the triple loop seems to perform better than arrayfun! Is there any approach that is actually better than the third, only partially vectorized solution?
EDIT: Dan's solutions are the best so far. Let rntm6, rntm7 and rntm8 be the runtime of his first, second and third solution. Then:
prctile(rntm6, [2.5 25 50 75 97.5])
% 0.6337 0.6377 0.6480 0.7110 1.2932
mean(rntm6)
% 0.7440
std(rntm6)
% 0.1970
prctile(rntm7, [2.5 25 50 75 97.5])
% 0.6898 0.7130 0.9050 1.1505 1.4041
mean(rntm7)
% 0.9313
std(rntm7)
% 0.2276
prctile(rntm8, [2.5 25 50 75 97.5])
% 0.5949 0.6005 0.6036 0.6370 1.3529
mean(rntm8)
% 0.6753
std(rntm8)
% 0.1890