Question

I have a a list of ids in Scala and for each id I launch a future to perform a database operation like below:

import myPackage.myExecutionContext

def doDbOperation(ids: List[Long]) = ids.map { id =>
  Future(get(id))
}

I notice that the Futures are executed only after the traversal of list is completed (List is pretty large). How do I make the Future to be launched as and when possible (based on available threads) without waiting for the traversal to complete?

Was it helpful?

Solution

Take a look at Future.traverse. Docs say "This is useful for performing a parallel map. For example, to apply a function to all items of a list in parallel".

Future.traverse(ids)(id => Future(get(id)))

OTHER TIPS

I can not explain the observed behavior (Future body not executing until the map traversal has completed). The following will print Running future while mapping:

package org.example

import scala.concurrent.{ Future }
import scala.concurrent.ExecutionContext.Implicits.global

object Test extends App {

  @volatile var hasStarted = false

  val ids: List[Long] = Range.Long(0L, 100000L, 1L).toList

  val res = ids.map({ id =>
    val res = Future {
      if (!hasStarted) hasStarted = true
      id
    }
    if (hasStarted) println("Running future while mapping")
    res
  })
}

Which makes sense; Future.apply does not perform differently depending on whether it is called inside a map or a Future.traverse.

Re: Future.traverse The return types are different.

ids.map { id => Future(get(id)) } returns List[Future[DBObject]]

but

Future.traverse(ids)(id => Future(get(id))) returns Future[List[DBObject]]

Future[List[DBObject]] is almost always more useful but you are changing the semantics (and maybe you don't want to).

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top