Question

I set up the following minimal example:

rng(0);

randseedoffset = random('unid', 10^5) + 1;

t = cell(10,1);
for i = 1:10
    rng(randseedoffset+i);
    t{i} = random('unid', 1000);
end

disp(t);

This will generate 10 random numbers and store them in t. It will always produce the same random numbers reliably because I set the seed with rng in the for loop.

If I now change for to parfor, I get different results, although they are also reproducible from run to run.

I want to accelerate my code with parfor and still obtain exactly the same random numbers as with for...

Solution

Ok, I just found the reason:

MATLAB supports several random number generation algorithms. On the client, the default in the current version is the Mersenne Twister. Inside a parfor loop the workers use a different default, the combined multiple recursive generator ('combRecursive').

The problem can be fixed by explicitly setting the generator type to 'twister' in the loop:

parfor i = 1:10
    rng(randseedoffset+i, 'twister');
    t{i} = random('unid', 1000);
end
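
One way to check this fix is to run the serial and parallel versions side by side and compare the results. The following is only a sketch: it assumes the Parallel Computing Toolbox and the Statistics and Machine Learning Toolbox (for random) are available, and the names tFor and tPar are introduced here for illustration. Numeric arrays are used instead of the cell array so that the comparison is a single call.

```matlab
% Sketch: compare for and parfor when the generator type is pinned.
% tFor and tPar are illustrative names, not part of the original post.
rng(0);
randseedoffset = random('unid', 10^5) + 1;

% Serial reference run; 'twister' is set explicitly here as well
tFor = zeros(10, 1);
for i = 1:10
    rng(randseedoffset + i, 'twister');
    tFor(i) = random('unid', 1000);
end

% Parallel run with the same seeds and the same generator type
tPar = zeros(10, 1);
parfor i = 1:10
    rng(randseedoffset + i, 'twister');
    tPar(i) = random('unid', 1000);
end

% Display whether both runs produced identical values
disp(isequal(tFor, tPar));
```

If the generator type was the only difference between the two loops, isequal should report that the vectors match. Note that calling rng inside the serial loop also changes the global generator state on the client.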

Other tips

Try this:

p = gcp; % Get or open a pool

numWork = p.NumWorkers; % Get the number of workers

stream = RandStream('mrg32k3a','seed',mydata.seed);
RandStream.setGlobalStream(stream);

% s = RandStream.create('mrg32k3a','NumStreams',numWork,'CellOutput',true,'Seed',mydata.seed); % create numWork independent streams

n = 200; % number of values to generate on each worker
spmd
    RandStream.setGlobalStream(stream);
    x = rand(1,n);
end

I feel the need to elaborate on this. Do not reset the seed inside a parfor loop, and do not use the Mersenne Twister algorithm in parallel: streams seeded that way are not guaranteed to be statistically independent of each other.

You get different results because the workers use a different algorithm, chosen for the statistical properties that random numbers generated in parallel should maintain. In a parallel pool, MATLAB sets the algorithm to 'combRecursive' and sets a different substream on each worker, so for random numbers you are good to go. Furthermore, the parfor loop does not guarantee:

  • the order in which the iterations proceed,
  • which workers will be executing each piece, or
  • how many of the iterations are performed on each worker.

Therefore, generating random numbers in a parfor loop will generally not return the same random numbers, even with the same state on each worker. Instead, create a set of independent streams with the combRecursive algorithm (one per worker), set the global stream on each worker in an spmd block, and generate the numbers on each worker in that same block:

p = gcp; % Get or open a pool

numWork = p.NumWorkers; % Get the number of workers

s = RandStream.create('mrg32k3a','NumStreams',numWork,...
    'CellOutput',true); % create numWork independent streams

n = 200; % number of values to generate on each worker
spmd
    RandStream.setGlobalStream(s{labindex});
    x = rand(1,n);
end

% I generate row vectors as the Composite matrix x will return a 
% comma-separated list using the syntax, x{:}, which can then be 
% concatenated into a single vector:
randVals2 = [x{:}]'; 

Each worker in a cluster working on the same job has an independent random number generator stream. By default, therefore, each worker in a pool, and each iteration in a parfor-loop has a unique, independent set of random numbers. Subsequent runs of the parfor-loop generate different numbers.

In a parfor-loop, you cannot control what sequence the iterations execute in, nor can you control which worker runs which iterations. So even if you reset the random number generators, the parfor-loop can generate the same values in a different sequence.

To reproduce the same set of random numbers in a parfor-loop each time the loop runs, you must control random generation by assigning a particular substream for each iteration.

First, create the stream you want to use, using a generator that supports substreams. Creating the stream as a parallel.pool.Constant allows all workers to access the stream.

sc = parallel.pool.Constant(RandStream('Threefry'))

Inside the parfor-loop, you can set the substream index by the loop index. This ensures that each iteration uses its particular set of random numbers, regardless of which worker runs that iteration or what sequence iterations run in.

r = zeros(1,16);
parfor i = 1:16
    stream = sc.Value;        % Extract the stream from the Constant
    stream.Substream = i;
    r(i) = rand(stream);
end
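
To tie this back to the original question (matching for and parfor), the same substream approach can be run serially and compared. This is a sketch under assumptions: the explicit 'Seed' value of 0 is an arbitrary choice made here so that both streams start from the same state, and the names rPar, rFor and s are illustrative.

```matlab
% Sketch: one substream per iteration, in parallel and in serial.
% Assumes Parallel Computing Toolbox; the seed 0 is an arbitrary choice.
sc = parallel.pool.Constant(RandStream('Threefry', 'Seed', 0));

rPar = zeros(1, 16);
parfor i = 1:16
    stream = sc.Value;        % Extract the stream from the Constant
    stream.Substream = i;     % Jump to the substream for this iteration
    rPar(i) = rand(stream);
end

% Serial counterpart: same generator, same seed, same substream per index
s = RandStream('Threefry', 'Seed', 0);
rFor = zeros(1, 16);
for i = 1:16
    s.Substream = i;
    rFor(i) = rand(s);
end

% Display whether the parallel and serial runs agree
disp(isequal(rPar, rFor));
```

Because each value depends only on the generator, the seed and the substream index, the result should not depend on which worker ran which iteration or in what order.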

https://www.mathworks.com/help/parallel-computing/repeat-random-numbers-in-parfor-loops.html

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow