I have a standalone Java process that reads messages off a JMS durable topic and submits them to a thread pool for processing. I am doing it this way for the obvious concurrency reasons, but to maintain the order in which those messages are processed, I still submit them to a single-threaded pool. Here are my concerns relating to a JVM crash:

--Non Transactional
I am NOT reading and processing every message in a transactional context, which I avoided because it slows down my process. So I am accumulating messages in the blocking queue of the thread pool. But if the JVM crashes while, say, 10 messages are in the thread pool waiting to be processed, I will lose that data.

--Transactional
I believe that if I read and process each message in a transaction and something goes wrong, that message will be redelivered to the process when it comes back up.

Since this is a common problem for people working on low-latency systems, I am wondering how experienced people approach it. Thanks.

Solution

Peter's AFAIK is correct. One way to work around this, if the pattern is applicable to you, is to use some sort of demarcation to group messages into different queues. That is to say, this requirement frequently breaks down to mean something like "all messages for one account must be processed in order". So if you have something similar to this, you can create either:

  • One topic and have multiple subscribers each using an exclusive selector (or sub-topic patterns)
  • Multiple topics with each one having a single subscriber.

Then your publisher must determine:

  • The headers on the published message, so that the correct subscriber's selector matches it, or
  • The correct sub-topic to publish on, or
  • The correct topic to publish on

An easily maintainable pattern for doing this is to take one of your business fields (for example, an account number) and calculate a mod(x) on it, where x is the number of subscribers you want to share the workload. Hopefully your business key is numeric and will give you a decent distribution, but you can always use some other deterministic algorithm to generate this number, such as reversing keys and/or hashing their non-numeric values.
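As a rough illustration of the mod(x) idea, here is a minimal sketch in plain Java. The class name AccountPartitioner and the message property name "partition" are made up for this example and are not part of any JMS API. With JMS, the publisher would set the computed value as a message property (for instance via Message.setIntProperty) and each subscriber would use the matching selector string.

```java
// Hypothetical helper: maps a business key to one of N partitions (one per subscriber).
final class AccountPartitioner {
    private final int partitions;

    AccountPartitioner(int partitions) {
        if (partitions <= 0) {
            throw new IllegalArgumentException("partitions must be positive");
        }
        this.partitions = partitions;
    }

    /** Numeric keys: plain modulo. floorMod keeps the result non-negative even for negative keys. */
    int partitionFor(long accountNumber) {
        return (int) Math.floorMod(accountNumber, (long) partitions);
    }

    /** Non-numeric keys: hash first. String.hashCode() is specified, so it is stable across JVMs. */
    int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), partitions);
    }

    /** Selector a subscriber for this partition could use; the property name "partition" is an assumption. */
    static String selectorFor(int partition) {
        return "partition = " + partition;
    }
}
```

With 4 subscribers, account 1234567 always lands on partition 3, so every message for that account goes to the same subscriber and stays in order relative to the others for that account. Note that changing the number of partitions remaps keys, so it should be done while no messages are in flight.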

As an aside, your outline has much more of a Point-to-Point/JMS Queue feel to it than a Pub-Sub/JMS Topic one. Are you sure you want to use topics?

If you absolutely cannot lose data, then you should use transactional messages. If you use transactional messages, then you cannot delegate to a thread pool. A JMS message's transactional context (which is the session) is bound to the thread that the message is received on, so unless you enable some "funny-business" to transfer this context to another thread.....

I'm not even sure how to finish that sentence.

==== Update ====

Now that I think about it, if you can benefit from parallelizing the processing of messages, you could retrieve batches of messages, all within one transaction, delegate them all to an ExecutorService.invokeAll call, wait for completion, and commit the transaction once they have all completed. If the invokeAll times out, or one of the tasks throws an exception, then you would have to roll back the transaction or take some sort of compensating action.
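A minimal sketch of that batch idea follows, under some assumptions. BatchTransaction is a hypothetical stand-in for the transacted JMS session (javax.jms.Session does have commit() and rollback()), and BatchProcessor is an illustrative name. The receiving thread is assumed to have already pulled the batch off the session and wrapped each message in a Callable, and it stays the only thread that touches the session.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a transacted JMS session's commit()/rollback().
interface BatchTransaction {
    void commit() throws Exception;
    void rollback() throws Exception;
}

final class BatchProcessor {
    private final ExecutorService pool;
    private final long timeoutMillis;

    BatchProcessor(ExecutorService pool, long timeoutMillis) {
        this.pool = pool;
        this.timeoutMillis = timeoutMillis;
    }

    /** Runs the whole batch in parallel; returns true if committed, false if rolled back. */
    boolean processBatch(List<Callable<Void>> tasks, BatchTransaction tx) throws Exception {
        boolean allSucceeded = true;
        try {
            // Blocks until every task has finished or the timeout expires;
            // tasks still unfinished at the timeout come back cancelled.
            List<Future<Void>> results =
                    pool.invokeAll(tasks, timeoutMillis, TimeUnit.MILLISECONDS);
            for (Future<Void> result : results) {
                if (result.isCancelled()) {      // timed out
                    allSucceeded = false;
                    break;
                }
                try {
                    result.get();                // rethrows a task failure
                } catch (ExecutionException e) {
                    allSucceeded = false;
                    break;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            allSucceeded = false;
        }
        // Commit or roll back on the calling thread, which owns the transaction.
        if (allSucceeded) {
            tx.commit();
        } else {
            tx.rollback();
        }
        return allSucceeded;
    }
}
```

Because the tasks in a batch run concurrently, ordering within a batch is not preserved, so this only fits when the messages in a batch are independent of each other. After a rollback the whole batch is redelivered, so the processing should tolerate seeing a message twice.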

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow