Question

I'm trying to write a data module in Scala.

While loading entire data in parallel, some data depends on other data, so execution sequence has to be managed in efficient way.

For example in code, I keep a map with name of data and manifest

val dataManifestMap = Map(
  "foo" -> manifest[String],
  "bar" -> manifest[Int],
  "baz" -> manifest[Int],
  "foobar" -> manifest[Set[String]], // need to be executed after "foo" and "bar" is ready
  "foobarbaz" -> manifest[String], // need to be executed after "foobar" and "baz" is ready
)

These data will be stored in a mutable hash map

private var dataStorage = new mutable.HashMap[String, Future[Any]]()

There are some code that will load data

def loadAllData(): Future[Unit] = {
  Future.join(
    (dataManifestMap map {
      case (data, m) => loadData(data, m) } // function has all the string matching and loading stuff
    ).toSeq
  )    
}

def loadData[T](data: String, m: Manifest[T]): Future[Unit] = {
  val d = data match {
    case "foo" => Future.value("foo")
    case "bar" => Future.value(3)
    case "foobar" => // do something with dataStorage("foo") and dataStorage("bar")
    ... // and so forth (in a real example it would be much more complicated for sure)
  }

  d flatMap { 
    dVal => { this.synchronized { dataStorage(data) = dVal }; Future.value(Unit) }
  }
}

This way, I cannot make sure "foobar" is loaded when "foo" and "bar" is ready, and so forth.

How can I manage this in a "cool" way, since I might have hundreds of different data?

It would be "awesome" if I could have some kind of data structure that has all the info about something has to be loaded after something, and sequential execution can be handled by flatMap in a neat way.

Thanks for the help in advance.

Was it helpful?

Solution

All things being equal, I'd tend to use for comprehensions. For example:

def findBucket: Future[Bucket[Empty]] = ???
def fillBucket(bucket: Bucket[Empty]): Future[Bucket[Water]] = ???
def extinguishOvenFire(waterBucket: Bucket[Water]): Future[Oven] = ???
def makeBread(oven: Oven): Future[Bread] = ???
def makeSoup(oven: Oven): Future[Soup] = ???
def eatSoup(soup: Soup, bread: Bread): Unit = ???


def doLunch = {
  for (bucket <- findBucket;
       filledBucket <- fillBucket(bucket);
       oven <- extinguishOvenFire(filledBucket);
       soupFuture = makeSoup(oven);
       breadFuture = makeBread(oven);
       soup <- soupFuture;
       bread <- breadFuture) {
    eatSoup(soup, bread)
  }
}

This chains futures together, and calls the relevant methods once dependencies are satisfied. Note that we use = in the for comprehension to allow us to start two Futures at the same time. As it stands, doLunch returns Unit, but if you replace the last few lines with:

// ..snip..
       bread <- breadFuture) yield {
    eatSoup(soup, bread)
    oven
  }
}

Then it will return Future[Oven] - which might be useful if you want to use the oven for something else after lunch.

As for your code, my first though would be that you should consider Spray cache, as it looks like it might fit your requirements. If not, my next thought would be to replace the Stringly typed interface you've currently got and go with something based on typed method calls:

private def save[T](key: String)(value: Future[T]) = this.synchronized {
  dataStorage(key) = value
  value
}

def loadFoo = save("foo"){Future("foo")}
def loadBar = save("bar"){Future(3)}
def loadFooBar = save("foobar"){
  for (foo <- loadFoo;
       bar <- loadBar) yield foo + bar // Or whatever
}
def loadBaz = save("baz"){Future(200L)}
def loadAll = {
  val topLevelFutures = Seq(loadFooBar, loadBaz)
  // Use standard library function to combine futures
  Future.fold(topLevelFutures)(())((u,f) => ())
}

// I don't consider this method necessary, but if you've got a legacy API to support...
def loadData[T](key: String)(implicit manifest: Manifest[T]) = {
  val future = key match {
      case "foo" => loadFoo
      case "bar" => loadBar
      case "foobar" => loadFooBar
      case "baz" => loadBaz
      case "all" => loadAll
  }
  future.mapTo[T]
}
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top