Question

I would like to know when the various methods on a Storm Spout are called.

I've looked at the ISpout javadoc, and that gives me the following mental model:

"instantiated" -- open(...) -----> "activated"
"activated"    -- deactivate() --> "deactivated"
"deactivated"  -- activate() ----> "activated"
"activated"    -- close() -------> "shutdown"
"deactivated"  -- close() -------> "shutdown"

But I am not sure when IComponent.declareOutputFields(...) is called. Before or after open(...)? When do the output streams and fields need to be declared? Within declareOutputFields(...)? Or is it OK to keep a reference to the OutputFieldsDeclarer and define them later on? If so, can it be on a separate thread?

I found this related question (Testing Storm Bolts and Spouts), but the answers don't seem to point at any design principle or specification.


Solution

The method IComponent.declareOutputFields(...) is called on the client machine when the client code calls createTopology() on the TopologyBuilder instance. See line 226 of TopologyBuilder.java, where this method is called on the Spout or Bolt component(s).

The callback method IComponent.declareOutputFields(...) is part of the topology life cycle rather than part of the Spout or Bolt life cycle. To answer your question, this method is called before the open() method.

The output fields should be declared in the declareOutputFields() method, so that they are already known when Storm serializes the Spout/Bolt object(s) together with their configuration. The serialized Spout/Bolt instances are then submitted to the Storm cluster, and only after that are the other life cycle methods (open(), activate(), etc.) of the Spout/Bolt called.
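The ordering described above can be illustrated with a small, self-contained sketch. It does not use the Storm API: Declarer and WordSpout are hypothetical stand-ins for OutputFieldsDeclarer and a spout, and plain Java serialization stands in for shipping the component to the cluster. It only models the sequence "declare on the client, serialize, then open() and activate() on the deserialized copy".

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LifecycleSketch {

    /** Hypothetical stand-in for Storm's OutputFieldsDeclarer (not the real API). */
    static class Declarer {
        final List<String> fields = new ArrayList<>();

        void declare(String... names) {
            fields.addAll(Arrays.asList(names));
        }
    }

    /** Hypothetical stand-in for a spout; it is Serializable so it can be "shipped". */
    static class WordSpout implements Serializable {
        private static final long serialVersionUID = 1L;

        // Records the order in which the callbacks were invoked.
        final List<String> events = new ArrayList<>();

        void declareOutputFields(Declarer declarer) {
            events.add("declareOutputFields");
            declarer.declare("word");
        }

        void open() {
            events.add("open");
        }

        void activate() {
            events.add("activate");
        }
    }

    /** Runs the modeled sequence and returns the recorded callback order. */
    static List<String> run() {
        try {
            // Client side: the topology is built and the outputs are declared.
            WordSpout spout = new WordSpout();
            spout.declareOutputFields(new Declarer());

            // The component is serialized (assumption: plain Java serialization here).
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(spout);
            }

            // Worker side: a deserialized copy receives the remaining callbacks.
            WordSpout onWorker;
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                onWorker = (WordSpout) in.readObject();
            }
            onWorker.open();
            onWorker.activate();
            return onWorker.events;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [declareOutputFields, open, activate]
    }
}
```

In this model the declarer is only in play on the client side, before serialization. A declaration made later from open() or from another thread would arrive after the topology had already been built, which is why the declarations belong inside declareOutputFields() itself.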

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow