Question

Imagine a straightforward supervision hierarchy. The child dies and the parent decides to Restart it. On a Restart, postRestart and the related hooks are called, but what if the parent had decided to Resume the child instead? Does the child actor know that it is being resumed? And, by the way, does the parent have access to the message that caused the exception in its child?

Solution

Resume means “nothing really happened, carry on” and in this spirit the child is not even informed. It is a directive which should rarely be used.

The parent only gets the failure itself (i.e. the Throwable), not the message which caused the problem, because that would invite you to entangle the logic of parent and child beyond what is healthy.
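A minimal sketch of what Resume looks like from both sides, assuming classic Akka actors. `Counter` and `Parent` are hypothetical names used only for illustration. The decider in the parent sees nothing but the Throwable, and after a Resume the child still holds the count it had accumulated before failing:

```scala
import akka.actor.{Actor, OneForOneStrategy, Props, SupervisorStrategy}
import akka.actor.SupervisorStrategy.Resume

// Hypothetical child that keeps a running count as internal state.
class Counter extends Actor {
  private var count = 0

  def receive: Receive = {
    case "boom" => throw new IllegalStateException("boom")
    case "get"  => sender() ! count
    case _      => count += 1
  }
}

// Hypothetical parent that resumes the child on IllegalStateException.
class Parent extends Actor {
  // The decider is a PartialFunction[Throwable, Directive]:
  // only the failure is visible here, not the message that caused it.
  override val supervisorStrategy: SupervisorStrategy =
    OneForOneStrategy() {
      case _: IllegalStateException => Resume // no lifecycle hook runs in the child
    }

  private val child = context.actorOf(Props[Counter](), "counter")

  def receive: Receive = {
    case msg => child.forward(msg)
  }
}
```

Sending any message, then "boom", then "get" to `Parent` should return a count of 1: the failing message is dropped, the state survives, and the child is never told that it was resumed.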

OTHER TIPS

The term resume means to continue processing messages, and the documentation uses it in two places.

The first is as a response to a failure. As per the Akka documentation:

 As described in Actor Systems, supervision describes a dependency relationship between actors: the supervisor delegates tasks to subordinates and therefore must respond to their failures. When a subordinate detects a failure (i.e. throws an exception), it suspends itself and all its subordinates and sends a message to its supervisor, signaling failure. Depending on the nature of the work to be supervised and the nature of the failure, the supervisor has a choice of the following four options:

Resume the subordinate, keeping its accumulated internal state
Restart the subordinate, clearing out its accumulated internal state
Terminate the subordinate permanently
Escalate the failure, thereby failing itself
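As a rough illustration, the four options correspond to the four directives in `akka.actor.SupervisorStrategy`. The mapping from exception types to directives below is an arbitrary assumption for the sketch, and `Directives` is a hypothetical holder object:

```scala
import akka.actor.SupervisorStrategy.{Decider, Escalate, Restart, Resume, Stop}

object Directives {
  // A Decider is a PartialFunction[Throwable, Directive].
  val decider: Decider = {
    case _: ArithmeticException      => Resume   // keep accumulated state, carry on
    case _: NullPointerException     => Restart  // fresh instance, state cleared
    case _: IllegalArgumentException => Stop     // terminate the subordinate permanently
    case _: Exception                => Escalate // fail the supervisor itself
  }
}
```

Such a decider can be passed to a strategy, for example `OneForOneStrategy()(Directives.decider)`.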

Note that a Restart actually discards the original actor instance and replaces it with a new one. The term resume is used again in the restart sequence, where it means to continue processing messages.

As per the akka documentation.

The precise sequence of events during a restart is the following:

1. suspend the actor (which means that it will not process normal messages until resumed), and recursively suspend all children
2. call the old instance’s preRestart hook (defaults to sending termination requests to all children and calling postStop)
3. wait for all children which were requested to terminate (using context.stop()) during preRestart to actually terminate; this, like all actor operations, is non-blocking, the termination notice from the last killed child will effect the progression to the next step
4. create new actor instance by invoking the originally provided factory again
5. invoke postRestart on the new instance (which by default also calls preStart)
6. send restart request to all children which were not killed in step 3; restarted children will follow the same process recursively, from step 2
7. resume the actor
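The hook order in the sequence above can be observed with a small tracing actor. This is only a sketch, assuming classic Akka and the default strategy of the user guardian (which restarts on a plain exception). `Traced` and its `events` queue are hypothetical names introduced for the illustration:

```scala
import java.util.concurrent.ConcurrentLinkedQueue
import akka.actor.Actor

object Traced {
  // Shared, thread-safe log of lifecycle events (illustration only).
  val events = new ConcurrentLinkedQueue[String]()
}

class Traced extends Actor {
  Traced.events.add("constructor") // runs once per instance, so again after a restart

  override def preStart(): Unit = Traced.events.add("preStart")
  override def postStop(): Unit = Traced.events.add("postStop")

  override def preRestart(reason: Throwable, message: Option[Any]): Unit = {
    Traced.events.add("preRestart") // called on the old instance
    super.preRestart(reason, message) // default: stop children, then call postStop()
  }

  override def postRestart(reason: Throwable): Unit = {
    Traced.events.add("postRestart") // called on the new instance
    super.postRestart(reason) // default: call preStart()
  }

  def receive: Receive = {
    case "fail" => throw new RuntimeException("fail")
    case other  => Traced.events.add(s"received $other")
  }
}
```

After the actor is created and sent "fail", the recorded events should read: constructor, preStart, preRestart, postStop, constructor, postRestart, preStart. A Resume, by contrast, would add nothing to the log.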

You can have the failure bubble up to the Supervisor if you properly set up that kind of behavior in the supervisorStrategy of the supervisor. A little example to show that behavior:

import akka.actor.Actor
import akka.actor.Props
import akka.actor.ActorSystem

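object SupervisorTest {
  def main(args: Array[String]): Unit = {
    val system = ActorSystem("test")
    val master = system.actorOf(Props[Master], "master")
    master ! "foo"     // Master creates a Worker child named "foo"
    Thread.sleep(500)  // crude wait until the child exists
    // actorFor is deprecated; actorSelection looks the child up by path
    val worker = system.actorSelection("/user/master/foo")
    worker ! "bar"     // makes the Worker throw
  }
}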

class Master extends Actor{
  import akka.actor.OneForOneStrategy
  import akka.actor.SupervisorStrategy._
  import scala.concurrent.duration._

  override val supervisorStrategy =
    OneForOneStrategy(maxNrOfRetries = 10, withinTimeRange = 1.minute) {
      case _: Exception => Escalate
    }
  override def preRestart(ex:Throwable, msg:Option[Any]) = {
    println("In master restart: " + msg)
  }

  def receive = {
    case msg:String =>
      context.actorOf(Props[Worker], msg)
  }
}

class Worker extends Actor{
  override def preRestart(ex:Throwable, msg:Option[Any]) = {
    println("In worker restart: " + msg)
  }
  def receive = {
    case _ =>
      throw new Exception("error!!")
  }  
}

You can see that in the Master actor (the supervisor in my example), I am choosing to Escalate a failure of type Exception. This will cause the failure to bubble up to the preRestart in the Master actor. I was expecting the msg param to preRestart to be the original offending message that went to the worker actor, but it wasn't. The only way I got that message to show was by also overriding the preRestart of the child actor. In my example, you will see the printouts from both the supervisor and the child, in that order.

Licensed under: CC-BY-SA with attribution