Question

I have a Storm topology running in a distributed environment across 4 Unix nodes.

I have a JMSSpout that receives a message and forwards it to a ParseBolt, which parses the raw message and creates an object.

To help measure latency, my JMSSpout emits the current time as a value. When the ParseBolt receives the tuple, it gets the current time again and takes the difference as the latency.

Using this approach I am seeing latencies of 200+ ms, which doesn't sound right at all. Does anyone have an idea why this might be?
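For reference, the timestamp-based measurement described in this question can be sketched roughly as follows. This is only an illustration: `LatencyProbe`, `withEmitTime`, and `latencyMillis` are hypothetical names, and a plain `List<Object>` stands in for the values of a Storm tuple rather than using the Storm API. It also assumes the spout and bolt clocks agree; when they run on different nodes, any clock skew ends up in the measured number.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the timestamp hand-off between JMSSpout and
// ParseBolt. It does not use the Storm API.
class LatencyProbe {

    // Spout side: attach the emit time (epoch milliseconds) to the raw message.
    static List<Object> withEmitTime(String rawMessage, long emitMillis) {
        return Arrays.<Object>asList(rawMessage, emitMillis);
    }

    // Bolt side: latency is the receive time minus the emit time carried in
    // the values. Assumption: both times come from clocks that agree.
    static long latencyMillis(List<Object> values, long receiveMillis) {
        long emitMillis = (Long) values.get(1);
        return receiveMillis - emitMillis;
    }
}
```

In a real topology the two time arguments would presumably come from `System.currentTimeMillis()` in the spout and the bolt respectively.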


Solution

It's probably a threading issue. Storm uses the same thread for all of a spout's nextTuple() calls, and emitted tuples aren't processed until the nextTuple() call returns. Storm also calls nextTuple() repeatedly in a very tight loop, which can consume a lot of CPU cycles unless you put at least a short sleep in your nextTuple() implementation.

Try adding a sleep(10) to nextTuple() and emitting only one tuple per nextTuple() call.
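A minimal sketch of that shape, under some assumptions: `PacedSpoutSketch` is a hypothetical stand-in that mirrors a spout's nextTuple() method without using the Storm API, an in-memory queue stands in for the JMS listener's buffer, and the `emitted` list stands in for the collector. Sleeping only when no message is waiting is a choice made here; the suggestion above does not say where in nextTuple() the sleep should go.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical stand-in for a spout. It mirrors the shape of nextTuple()
// but does not use the Storm API.
class PacedSpoutSketch {
    // Messages handed over by the (assumed) JMS listener thread.
    private final Queue<String> pending = new ConcurrentLinkedQueue<>();
    // Stands in for the collector that would receive emitted tuples.
    final List<String> emitted = new ArrayList<>();
    private final long idleSleepMillis;

    PacedSpoutSketch(long idleSleepMillis) {
        this.idleSleepMillis = idleSleepMillis;
    }

    // Called when a message arrives.
    void onMessage(String raw) {
        pending.offer(raw);
    }

    // Emits at most one tuple per call and returns quickly.
    // Returns true if a tuple was emitted.
    boolean nextTuple() {
        String raw = pending.poll();
        if (raw == null) {
            // Nothing to emit: sleep briefly so the calling loop does not spin.
            try {
                Thread.sleep(idleSleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
            }
            return false;
        }
        emitted.add(raw); // exactly one emit, then return
        return true;
    }
}
```

With `new PacedSpoutSketch(10)`, each idle call sleeps for about 10 ms, while a call that finds a message emits it and returns immediately.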

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow