Generally, the best approach is the one that splits the work evenly, so that no cores sit idle. In your case static scheduling is probably not a good idea: 150 is not a multiple of 40, so in the last round only 30 of the 40 threads have work and you lose 25% of the computing power. So it might turn out to be better to put the `parallel` directive before the second loop. It all depends on the mode you choose and on how the work is really distributed within the loops. E.g., if `myfunction` does 99% of the work then it's a bad idea; if 99% of the work is within the 2 inner loops, it might be a good one.

Not really. There are 3 main scheduling modes, but none of them works in a way that blocks other threads. There is a pool of tasks (iterations) that is distributed among the threads, and the scheduling mode describes the strategy for assigning tasks to threads. When a thread finishes, it just gets the next task, with no waiting. The strategies are described in more detail here: http://en.wikipedia.org/wiki/OpenMP#Scheduling_clauses (I am not sure a blatant copy-paste from the wiki is a good idea, so I'll leave a link. It's good material.)
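To make the 150-over-40 arithmetic concrete, here is a minimal sketch. It assumes every iteration costs the same and that iterations are handed out in rounds of one per thread; the function name is made up for illustration.

```cpp
#include <cassert>

// Fraction of threads left idle in the final round when `iterations` equal-cost
// iterations are handed out to `threads` threads one round at a time.
// (Simplified model, not a description of any particular OpenMP runtime.)
double idle_fraction_last_round(int iterations, int threads) {
    assert(iterations >= 0 && threads > 0);
    int leftover = iterations % threads;   // iterations in the final, partial round
    if (leftover == 0) return 0.0;         // work divides evenly: nobody waits
    return static_cast<double>(threads - leftover) / threads;
}
```

For 150 iterations on 40 threads, `leftover` is 30, so 10 of 40 threads idle in the last round (0.25). With 160 iterations the result is 0.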
What may not be written there is that the modes are presented in order of the overhead they introduce: `static` is fastest, then `dynamic`, then `guided`. My advice on when to use which (not the exact optimum, but a good rule of thumb IMO):
- `static` if you know the tasks will be divided evenly among the threads and take the same amount of time
- `dynamic` if you know the tasks will not be divided evenly or their execution times are not even
- `guided` for rather long tasks about which you pretty much cannot tell anything
If your tasks are rather small, you can see an overhead even with static scheduling (see e.g. "why my OpenMP C++ code is slower than a serial code?"), but I think in your case `dynamic` should be fine and the best choice.
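A minimal sketch of what that could look like, assuming the outer loop is the one with 150 iterations. The body of `myfunction` and the inner loop bounds are placeholders I made up, since the real ones depend on your code.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for myfunction; the real body is whatever your code does.
double myfunction(int i, int j, int k) {
    return static_cast<double>(i) + 0.5 * j + 0.25 * k;
}

// Outer loop parallelised with dynamic scheduling. The inner bounds are invented
// and depend on i, so different outer iterations take different amounts of time.
std::vector<double> run(int outer = 150) {
    assert(outer >= 0);
    std::vector<double> result(outer, 0.0);
    #pragma omp parallel for schedule(dynamic)
    for (int i = 0; i < outer; ++i) {
        double acc = 0.0;
        for (int j = 0; j <= i; ++j)
            for (int k = 0; k < 4; ++k)
                acc += myfunction(i, j, k);
        result[i] = acc;  // each iteration writes only its own slot, so no race
    }
    return result;
}
```

If the per-iteration work turns out to be tiny, a chunk size such as `schedule(dynamic, 4)` reduces how often threads go back to the pool; the value 4 is only an example.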