Question

I need to execute two MATLAB functions in parallel. The problem is that retrieving their results takes much longer than computing them.

First method:

spmd
 if labindex == 1
  K = MatricaK(NP, NE, r, Kxx, N, h, 1); % K is a 1000x1000 matrix
 end
 if labindex == 2
  F = Apkrovos(NP, NE, N, r, Ta, h, 1); % F is a 1000x1 vector
 end
end
% This part is quite fast, around 0.17s.
K = K{1};
F = F{2};
% This part is very slow, around 1.15s.

Second method:

parfor i=1:2
 if i==1
  % Variant 1: K is not returned outside the loop, but very fast, around 0.15s for the whole loop
  K=MatricaK(NP, NE, r, Kxx, N, h, 1);
 % ..
  % Variant 2: this works, but slow, around 1.5s
  K{i}=MatricaK(NP, NE, r, Kxx, N, h, 1);
 % ..
  % Variant 3: also works, but slow, around 1.5s
  K = [K MatricaK(NP, NE, r, Kxx, N, h, 1)];
 % ...
end;

How can I make returning the results fast? I found the question "Parallel Programming on MATLAB to execute 3 different functions at the same time", but it says nothing about speed.

Solution

The problem is the overhead incurred in passing the results back from the workers to the client. I can't give you a concrete answer, since it depends on your situation and your version of MATLAB, but I can suggest some things to try.

  1. Try different methods of retrieving the data from the matlabpool labs. Some of them may allow MATLAB to do some behind-the-scenes optimisation. For this I would suggest looking at distributed arrays. Alternatively, if you don't need to do any subsequent processing on the data, you could save it straight to disk rather than pass it through memory.

  2. Try running it as a distributed batch job. More recent versions of MATLAB allow you to create batch jobs and run them on your local machine (rather than on a dedicated cluster). Since your two functions are completely independent, you could use this method, which may be quicker for retrieving the data.

  3. Try to find some way of having the results placed in shared memory. This is difficult, since MATLAB's parallel processing is based on MPI, which works by message passing; some modes are basically a MATLAB wrapper over MPI. Each worker is therefore a separate instance of MATLAB with its own memory space, so gathering the data means packing it into an MPI message, which has some overhead. Ways to avoid this have been discussed on Stack Overflow before.

  4. Take a look inside the functions. If you are able and allowed to modify the two functions you are calling, you may be able to parallelise them in a way that doesn't incur such a communication overhead. MATLAB is quite clever when it comes to parallel processing of vectors, so if any operations can be translated into vector/matrix operations, you could make a significant speed saving without any communication between workers.

  5. Try something other than MATLAB. If you have the time and coding skills, you could re-code the functions in a language that allows multi-threaded execution, such as C++ with OpenMP.
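
As one illustration of item 1, here is a minimal sketch of a different retrieval method: parfeval with fetchOutputs. This function is not named above; it needs the Parallel Computing Toolbox and R2013b or later. The handles makeK and makeF are hypothetical stand-ins for MatricaK and Apkrovos that only reproduce the output sizes from the question. Whether this beats spmd for results of this size is not guaranteed, so time it on your own setup.

```matlab
% Hypothetical stand-ins for MatricaK(...) and Apkrovos(...);
% they only mimic the output sizes given in the question.
makeK = @() rand(1000);       % 1000x1000 matrix
makeF = @() rand(1000, 1);    % 1000x1 vector

% Reuse the current pool; gcp starts one only if automatic
% pool creation is enabled in the parallel preferences.
pool = gcp();

tic;
futK = parfeval(pool, makeK, 1);   % request 1 output; returns immediately
futF = parfeval(pool, makeF, 1);   % second task is queued at once
K = fetchOutputs(futK);            % blocks until the worker has finished
F = fetchOutputs(futF);
fprintf('compute + transfer: %.3f s\n', toc);
```

Timing the whole block, rather than the computation alone, shows the cost that matters here: computation plus transfer back to the client.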

I have faced similar problems with parallel programming in MATLAB myself. In my case the problem was embarrassingly parallel despite the shared input data, so I ended up saving the output to disk straight from the worker nodes.
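
Saving from the workers could look roughly like the sketch below. It is not my original code: computeAndSave, the result_%d.mat file names and the use of tempdir are assumptions for illustration, and the rand calls stand in for MatricaK and Apkrovos. The save call sits inside a function because calling save directly in a parfor body violates its transparency rules. Local functions at the end of a script need R2016b or later; otherwise put computeAndSave in its own file.

```matlab
% Assumption: outDir is a folder that both the client and the workers
% can access (true for a local pool on a single machine).
outDir = tempdir;

parfor i = 1:2
    computeAndSave(i, outDir);   % each worker writes its own file
end

% Load the results on the client only when they are needed.
S = load(fullfile(outDir, 'result_1.mat'));  K = S.result;
S = load(fullfile(outDir, 'result_2.mat'));  F = S.result;

function computeAndSave(i, outDir)
    % Hypothetical helper: computes one result and saves it to disk.
    if i == 1
        result = rand(1000);      % stand-in for MatricaK(...)
    else
        result = rand(1000, 1);   % stand-in for Apkrovos(...)
    end
    save(fullfile(outDir, sprintf('result_%d.mat', i)), 'result');
end
```

This only pays off if the client does not need the data immediately, since loading the files has its own cost.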

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow