Question

In our multithreaded Java app, we use a separate LinkedBlockingDeque instance for each consumer thread; assume consumer threads (c1, c2, .... c200).

Threads T1 and T2 receive data from a socket and add each object to the queue of the specific consumer (c1 to c200).

Each consumer runs an infinite loop inside run(), which calls LinkedBlockingDeque.take().
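A minimal sketch of the setup described above (class and field names are illustrative, not the app's real ones): one LinkedBlockingDeque per consumer, a dispatcher routing messages by consumer id, and consumers blocking in take():

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.atomic.AtomicInteger;

public class PerConsumerQueues {
    static final int CONSUMERS = 4;              // 200 in the real app
    static final String POISON = "POISON";       // shutdown marker for the sketch
    static final AtomicInteger processed = new AtomicInteger();

    public static void main(String[] args) throws InterruptedException {
        List<LinkedBlockingDeque<String>> queues = new ArrayList<>();
        List<Thread> consumers = new ArrayList<>();
        for (int i = 0; i < CONSUMERS; i++) {
            LinkedBlockingDeque<String> q = new LinkedBlockingDeque<>();
            queues.add(q);
            Thread t = new Thread(() -> {
                try {
                    while (true) {
                        String msg = q.take();   // blocks; does not spin
                        if (msg.equals(POISON)) return;
                        processed.incrementAndGet();  // stand-in for real processing
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.start();
            consumers.add(t);
        }
        // Dispatcher role (T1/T2 in the question): route each request
        // to the target consumer's own queue.
        for (int i = 0; i < CONSUMERS; i++) queues.get(i).put("req-" + i);
        for (LinkedBlockingDeque<String> q : queues) q.put(POISON);
        for (Thread t : consumers) t.join();
    }
}
```

Note that take() parks the thread while the queue is empty, so an idle consumer in this shape should consume essentially no CPU.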

Under load, the CPU usage of java.exe itself is 40%; when we sum in the other processes in the system, overall CPU usage reaches 90%.

Using Java VisualVM, run() is taking the most CPU, and we suspect LinkedBlockingDeque.take().

So we tried alternatives such as Object.wait()/notify() and Thread.sleep(0), but nothing changed.

Each consumer has its own queue for three reasons:

  1. there may be more than one request for consumer c1 from T1 or T2;
  2. if we dumped all requests into a single queue, the search time across c1 to c200 would grow and the search criteria would become more complex;
  3. each consumer can process its own requests from its own queue.

We are trying to reduce the CPU usage and need your inputs...

SD


Solution

  1. Do profiling and make sure that the queue methods actually take a relatively large share of CPU time. Is your message processing so simple that it is cheap compared to putting to / taking from the queue? How many messages are processed per second? How many CPUs are there? If each CPU is processing fewer than 100K messages per second, then the cause is likely not the access to the queues but the message handling itself.

  2. Putting into a LinkedBlockingDeque creates an instance of a helper node object, and I suspect each new message is also allocated from the heap, so that is two allocations per message. Try using a pool of preallocated messages and circular buffers.

  3. 200 threads is way too many: it means too many context switches. Try to use actor libraries and thread pools, for example https://github.com/rfqu/df4j (yes, it's mine).

  4. Check whether http://code.google.com/p/disruptor/ would fit your needs.
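Point 2 above can be sketched with a simple free-list of preallocated messages; the `Message` class and pool capacity here are illustrative assumptions, not your app's real types:

```java
import java.util.concurrent.ArrayBlockingQueue;

// Reuse preallocated message objects instead of allocating a new one
// per socket read. Producers acquire(), consumers release() when done.
public class MessagePool {
    static final class Message {
        byte[] payload = new byte[256]; // reused buffer
        int length;
    }

    private final ArrayBlockingQueue<Message> free;

    MessagePool(int capacity) {
        free = new ArrayBlockingQueue<>(capacity);
        for (int i = 0; i < capacity; i++) free.add(new Message());
    }

    Message acquire() throws InterruptedException {
        return free.take();      // blocks if all messages are in flight
    }

    void release(Message m) {
        m.length = 0;            // reset before returning to the pool
        free.offer(m);
    }
}
```

With this shape, steady-state traffic causes no per-message heap allocation in the application code itself.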
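One hedged way to apply point 3 while keeping per-consumer ordering: instead of 200 consumer threads, use a small set of single-threaded workers and route each consumer id to a fixed worker, so all messages for one consumer still run in order. The class and method names are illustrative:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PooledConsumers {
    private final ExecutorService[] workers;

    PooledConsumers(int poolSize) {
        workers = new ExecutorService[poolSize];
        for (int i = 0; i < poolSize; i++)
            workers[i] = Executors.newSingleThreadExecutor();
    }

    /** All tasks for a given consumerId run on the same worker, in order. */
    void submit(int consumerId, Runnable task) {
        workers[consumerId % workers.length].execute(task);
    }

    void shutdown() throws InterruptedException {
        for (ExecutorService w : workers) w.shutdown();
        for (ExecutorService w : workers) w.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

A pool size close to the number of CPU cores usually keeps context switching low while still using every core; 200 consumers then share, say, 8 worker threads instead of 200.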

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow