Question

Suppose I have an array of ~10K elements and I need to process all elements of the array. I would like to process them in such a way that only K elements are processed in parallel.

I use Scala 2.9. I tried parallel collections (see below) but I saw more than K elements processed in parallel.

import collection.parallel.ForkJoinTasks.defaultForkJoinPool._
val old = getParallelism
setParallelism(K)
val result = myArray.par.map(...) // process the array in parallel
setParallelism(old)

How would you suggest processing an array in Scala 2.9 so that at most K elements are processed in parallel?

Solution

The setParallelism method sets the recommended number of parallel workers that the fork/join pool of the parallel collection is supposed to use. Those K workers may work on any part of the collection: it is up to the scheduler to decide which elements each worker is assigned.
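If the goal is to process every element while keeping at most K of them in flight, one option that does not depend on the parallel collections scheduler is a fixed-size thread pool from java.util.concurrent. The following is only a minimal sketch of that idea, not the approach described in this answer. The names BoundedParallelMap and boundedMap are hypothetical and introduced purely for illustration.

```scala
import java.util.concurrent.{Callable, Executors}

// Hypothetical helper, not part of the Scala standard library.
object BoundedParallelMap {
  // Applies f to every element of xs on a pool of exactly k threads,
  // so at most k elements are being processed at any moment.
  def boundedMap[A, B](xs: Seq[A], k: Int)(f: A => B): Seq[B] = {
    val pool = Executors.newFixedThreadPool(k)
    try {
      // Submit one task per element; tasks beyond k wait in the pool's queue.
      // toList forces strict evaluation so that all tasks are submitted up front.
      val futures = xs.toList.map { x =>
        pool.submit(new Callable[B] { def call(): B = f(x) })
      }
      // get() blocks until each result is ready (and rethrows failures wrapped
      // in an ExecutionException); results come back in input order.
      futures.map(_.get())
    } finally {
      // Let the worker threads terminate once the work is done.
      pool.shutdown()
    }
  }
}
```

With this sketch, a call such as BoundedParallelMap.boundedMap(myArray.toSeq, K)(process) would map over the whole array, where process stands for whatever function is applied to each element. The thread pool size, rather than a scheduler hint, is what caps the concurrency here.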

If you would like to include only the first K elements in the parallel operation, you should use the take method, followed by a map:

myArray.par.take(K).map(...)

Alternatively, you can use view.take(K).map(...).force, which creates a parallel view before doing the mapping.
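To make the effect of take concrete, here is a small sketch with made-up data. It assumes a Scala version where .par is part of the standard library (2.9 through 2.12). The names TakeExample, k, data and firstK are hypothetical.

```scala
// Assumes Scala 2.9 to 2.12, where .par is available without extra modules.
object TakeExample {
  val k = 3
  val data = Array(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

  // take(k) keeps only the first k elements; the remaining ones are never mapped.
  val firstK = data.par.take(k).map(_ * 10)
}
```

Note that the result contains only K elements. The rest of the array is dropped, not queued for later processing, so this limits which elements are processed and not how many run at the same time.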

Licensed under: CC-BY-SA with attribution