Question

In matlab, I want a 3D matrix with random numbers that are unique along the 3th dimension, as this code does:

M = 2;  N = 10;  L = 5;  K = 100;
mat = zeros([M N L]);
for ii=1:M
    for jj=1:N
        mat(ii,jj,:) = randperm(K,L);
    end
end

However, when the matrix is larger, the computation time rises a lot. Thus, I would like to delete the loop's with any vectorization. I couldn't figure it out any way of doing so, is it possible?

Thank you for the help.


Edit: I have run all the methods for several matrix sizes in this script, and these are the results:

enter image description here

Besides, the distribution of numbers is:

enter image description here

Thus, the implementation of @Luis Mendo is the one that scales better for low L values, which is my case. But, @Rody Oldenhuis optimized proposal is fast interdependently of the L value. Hence, a combined solution could be:

function mat = assignPermMatrix_comb(M,N,L,K)
    R = M*N;
    mat = zeros([L R]);
    if L<K*0.15
        ind = true(1,R); 
        while R
            mat(:,ind) = randi(K, L, R); 
            ind = any(diff(sort(mat))==0);
            R = nnz(ind);
        end
    else
        for ii=1:R
            mat(:,ii) = randperm(K,L);
        end
    end
    mat = reshape(mat.', [M N L]);
end

I am very thankful for all the effort that you put in your answers.

Was it helpful?

Solution

A rejection approach may be faster, depending on the values of L and K.

The idea is to generate all entries with randi without regard to repetition, detect third-dim-lines which have repetitions, and generate those again, until no repetitions exist. It's easier to work with the first two dimensions collapsed into one, and reshape at the end.

Of course, running time is random with this approach.

ind = true(1,M*N); %// lines that need generaring. Initially all of them
R = M*N; %// number of third-dim-lines that need to be generated
while R
    output(:,ind) = randi(K, L, R); %// (re)generate random values where needed 
    ind = any(diff(sort(output))==0); %// detect repetitions, for next iteration
    R = nnz(ind);
end
output = output.';
output = reshape(output, [M N L]);

OTHER TIPS

Partially unrolling your loop will certainly help:

mat = zeros(L,M*N);
for ii=1:M*N        
    mat(:,ii) = randperm(K,L);
end
mat = reshape(mat.', [M N L]);

But I think the main issue is that you're using randperm with large K and small L. I'm not sure how randperm is implemented on newer versions of MATLAB (which you seem to have), but if it's anything like the version I have, it physically creates a randomized sorting of the integers 1 through K, and then extracts the first L from that array. So, if K is relatively large and L relatively small, this means you're doing a lot of unnecessary work at each loop iteration. Luis' solution is then better.

To test that theory, consider the following simple test:

M = 20;  N = 100;  
L = 5;   K = 1000;

%// Original
tic
mat = zeros([M N L]);
for ii=1:M
    for jj=1:N   
        [~,P] = sort(rand(K,1)); %// Note: I don't have the 
        mat(ii,jj,:) = P(1:L);   %// newer randperm
    end
end
toc

%// Optimized version
tic
mat = zeros(L, M*N);
for ii=1:M*N
    [~,P] = sort(rand(K,1));
    mat(:,ii) = P(1:L);
end
mat = reshape(mat.', [M N L]);
toc

%// Avoid doing so much useless work
tic
ints = 1:K;
mat = zeros(L, M*N);
for ii=1:M*N
    mat(:,ii) = inds(randi(K,L,1));
end
mat = reshape(mat.', [M N L]);
toc

Results:

Elapsed time is 0.233492 seconds. %// original
Elapsed time is 0.231393 seconds. %// optimized
Elapsed time is 0.007062 seconds. %// oh...wow.

Note that the last test is not yet a valid solution, because I'm not yet checking for uniqueness. Nevertheless, it goes to show that this is likely still how the newer randperm is doing its job.

So, the final version:

ints = 1:K;
mat = zeros(L, M*N);
for ii=1:M*N
    inds = randi(K,L,1);
    while any(diff(sort(inds))==0)
        inds = randi(K,L,1); end
    mat(:,ii) = inds();
end
mat = reshape(mat.', [M N L]);

Test results for M = 100; N = 200; L = 5; K = 100;:

Elapsed time is 0.315532 seconds.
Elapsed time is 0.297795 seconds.
Elapsed time is 0.189210 seconds.

Test results for M = 100; N = 200; L = 5; K = 100;:

Elapsed time is 10.818245 seconds.
Elapsed time is 10.733220 seconds.
Elapsed time is 0.788050 seconds.

However, test results for M = 10; N = 10; L = 40; K = 50;:

Elapsed time is 0.001326 seconds.
Elapsed time is 0.001108 seconds.
Elapsed time is 238.300146 seconds.  %// wait, WHAT?!

So, it would seem we have to come up with something smarter...

So, after a bit of soulsearching, I came up with the following:

%// This uses a form of the Fisher/Yates shuffle algorithm
mat  = zeros(L, M*N);
ints = 1:K;
inds = randi(K,M*N,L);
L1   = 1:L;

for ii = 1:M*N

    tmp = ints(L1);
    ints(L1) = ints(inds(ii,:));
    ints(inds(ii,:)) = tmp;

    mat(:,ii) = ints(L1);

end

mat = reshape(mat.', [M N L]);

Results for M = 250; N = 250; L = 150; K = 250;

Elapsed time is 2.332690 seconds.
Elapsed time is 2.140191 seconds.
Elapsed time is 1.512606 seconds.

Results for M = 250; N = 250; L = 15; K = 100;

Elapsed time is 1.021733 seconds.
Elapsed time is 0.956033 seconds.
Elapsed time is 0.445112 seconds.

Still quite disappointing really...But oh well, certainly better than it was.

This should execute quicker:

s = repmat(L, [M*N 1]);
P = arrayfun(@(x)(randperm(K, x)), s, 'UniformOutput', false);
Q = cell2mat(P);
mat = reshape(Q, [M N L]);

NOTE: The randperm I have only accepts one parameter, so I couldn't try your code, this approach works for me with the anonymous function @(x)(randperm(x)) in arrayfun.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top