Question

One day our Java web application went up to 100% CPU usage. A restart resolved the incident but not the underlying problem, because it came back a few hours later. We suspected an infinite loop introduced by a new version, but we had not changed anything in the code or on the server.

We managed to find the problem by taking several thread dumps with kill -QUIT and comparing the details of every thread. One thread's call stack appeared in all of the thread dumps. After analysis, it turned out that a while loop condition never became false for some data that was regularly updated in the database.
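The manual comparison described above can be roughly approximated in code. The following is only a sketch, assuming it runs inside the affected JVM (for example from an admin servlet); the class name `DumpComparer`, the method `alwaysRunnable` and the demo thread are hypothetical names for illustration. It takes several stack snapshots with `Thread.getAllStackTraces()` and keeps the threads that were RUNNABLE in every one of them. Threads blocked in native I/O also report RUNNABLE, so the result is a list of suspects, not a verdict.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class DumpComparer {

    /** Names of threads that were RUNNABLE (with a non-empty stack) in every sample. */
    static Set<String> alwaysRunnable(int samples, long pauseMillis) {
        Map<String, Integer> runnableCount = new HashMap<>();
        for (int i = 0; i < samples; i++) {
            for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
                Thread t = e.getKey();
                boolean isSampler = (t == Thread.currentThread());
                if (!isSampler && t.getState() == Thread.State.RUNNABLE && e.getValue().length > 0) {
                    runnableCount.merge(t.getName(), 1, Integer::sum);
                }
            }
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt(); // keep the interrupt flag and stop sampling
                break;
            }
        }
        Set<String> suspects = new TreeSet<>();
        for (Map.Entry<String, Integer> e : runnableCount.entrySet()) {
            if (e.getValue() == samples) {
                suspects.add(e.getKey());
            }
        }
        return suspects;
    }

    /** Hypothetical stand-in for the looping request thread, used only for the demo. */
    static Thread startDemoLoop(String name) {
        Thread t = new Thread(() -> {
            long n = 0;
            while (!Thread.currentThread().isInterrupted()) {
                n++;
            }
        }, name);
        t.setDaemon(true);
        t.start();
        return t;
    }

    public static void main(String[] args) {
        startDemoLoop("demo-looping-thread");
        // Five snapshots, 200 ms apart; JVM housekeeping threads may show up as well.
        System.out.println("Suspects: " + alwaysRunnable(5, 200));
    }
}
```

Thread names are used as keys here for readability; if names are not unique in your application, keying on the thread id would be safer.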

Analyzing several thread dumps of a web application is really tedious.

So do you know any better way or tool to find such an issue in a production environment?

Solution

After some searching, I found an answer in Monitoring and Managing Java SE 6 Platform Applications:

You can diagnose a looping thread with JTop, a tool provided with the JDK that shows the CPU time each thread is using.

With the thread name, you can find the stack trace of this thread in the “Threads” tab or by making a thread dump with kill -QUIT.

You can now focus on the code that causes the infinite loop.
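For illustration, here is a minimal sketch of the kind of measurement JTop displays, using the standard `java.lang.management.ThreadMXBean` API. It is not JTop's actual code. It assumes it runs inside the JVM being inspected (monitoring a remote JVM would need a JMX connection instead), and the class name `BusiestThread`, its methods and the demo thread are hypothetical. It samples per-thread CPU time twice and reports the thread whose CPU time grew the most in between.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.HashMap;
import java.util.Map;

class BusiestThread {

    /** Id of the thread that consumed the most CPU during the interval, or -1 if unknown. */
    static long findBusiestThread(long intervalMillis) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        if (!threads.isThreadCpuTimeSupported()) {
            return -1;
        }
        threads.setThreadCpuTimeEnabled(true);

        // First sample: CPU time in nanoseconds per live thread.
        Map<Long, Long> before = new HashMap<>();
        for (long id : threads.getAllThreadIds()) {
            before.put(id, threads.getThreadCpuTime(id));
        }
        try {
            Thread.sleep(intervalMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        }

        // Second sample: keep the thread with the largest increase.
        long busiest = -1;
        long maxDelta = -1;
        for (long id : threads.getAllThreadIds()) {
            long now = threads.getThreadCpuTime(id);   // -1 if the thread has died
            long start = before.getOrDefault(id, 0L);  // threads started after the first sample
            if (now < 0 || start < 0) {
                continue;
            }
            if (now - start > maxDelta) {
                maxDelta = now - start;
                busiest = id;
            }
        }
        return busiest;
    }

    /** Hypothetical CPU-bound thread, used only for the demo. */
    static Thread startSpinner(String name) {
        Thread t = new Thread(() -> {
            long n = 0;
            while (!Thread.currentThread().isInterrupted()) {
                n++;
            }
        }, name);
        t.setDaemon(true);
        t.start();
        return t;
    }

    public static void main(String[] args) {
        startSpinner("demo-spinner");
        long id = findBusiestThread(500);
        ThreadInfo info = ManagementFactory.getThreadMXBean().getThreadInfo(id, Integer.MAX_VALUE);
        if (info != null) {
            System.out.println("Busiest thread: " + info.getThreadName());
            for (StackTraceElement frame : info.getStackTrace()) {
                System.out.println("    at " + frame);
            }
        }
    }
}
```

The printed stack trace plays the same role as the “Threads” tab or a kill -QUIT dump: it points at the code the busy thread is executing.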

PS: It seems OK to answer my own question according to http://blog.stackoverflow.com/2008/07/stack-overflow-private-beta-begins/: […] “yes, it is OK and even encouraged to answer your own questions, if you find a good answer before anyone else.” […]

PS: In case the sun.com domain no longer exists, here is how to use it. You can run JTop as a stand-alone GUI:

$ <JDK>/bin/java -jar <JDK>/demo/management/JTop/JTop.jar

Alternatively, you can run it as a JConsole plug-in:

$ <JDK>/bin/jconsole -pluginpath <JDK>/demo/management/JTop/JTop.jar 

Other tips

Fix the problem before it occurs! Use a static analysis tool like FindBugs or PMD as part of your build system. It won't find everything, but it is a good first step.

Consider using coverage tools like Cobertura. They would have shown you that you didn't test these code paths.

Testing something like this can become really cumbersome, so try to avoid it by introducing quality measurements.

Anyway, tools like VisualVM give you a nice overview of all threads, so it becomes relatively easy to identify threads that have been working for an unexpectedly long time.

License: CC-BY-SA with attribution
Not affiliated with StackOverflow