Question

Update: This looks like a memory issue. A 3.8 GB .hprof file indicated that the JVM was dumping its heap when this "blocking" occurred. Our operations team saw that the site wasn't responding, took a stack trace, and then shut down the instance. I believe they shut the site down before the heap dump finished. The log contained no errors, exceptions, or other evidence of a problem, probably because the JVM was killed before it could generate an error message.

Original Question: We had a recent situation where the application appeared, to the end user, to hang. We took a stack trace before the application restart, and I found some surprising results: of 527 threads, 463 were in thread state BLOCKED.

In the Past: Previously, blocked threads usually pointed to one of two things: 1) some obvious bottleneck, e.g. a database record lock or file system lock that caused other threads to wait; 2) all blocked threads blocking on the same class/method (e.g. the JDBC or file system classes).
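To illustrate that second pattern, here is a hypothetical sketch (not from our application, class and thread names are made up) in which every worker funnels through a single synchronized method; a thread dump of this program shows all but one of the workers BLOCKED at the same frame:

    public class SingleBottleneck {
        private static final Object LOCK = new Object();

        // Every caller contends for the same monitor, so a thread dump shows
        // many threads "waiting to lock" the same object inside this method.
        static void slowSharedCall() {
            synchronized (LOCK) {
                try {
                    Thread.sleep(10000);
                } catch (InterruptedException ignored) {
                }
            }
        }

        public static void main(String[] args) {
            for (int i = 0; i < 20; i++) {
                new Thread(new Runnable() {
                    public void run() {
                        slowSharedCall();
                    }
                }, "worker-" + i).start();
            }
            // "jstack <pid>" taken now reports 19 of the 20 workers as
            // java.lang.Thread.State: BLOCKED (on object monitor), all in slowSharedCall.
        }
    }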

Unusual Data: In this case, I see all sorts of classes/methods blocked, including JVM internal classes, JBoss classes, log4j, etc., in addition to application classes (including JDBC and Lucene calls).

The Question: What would cause a JVM to block in log4j.Hierarchy.getLogger or java.lang.reflect.Constructor.newInstance? Obviously some resource is scarce, but which resource?

thanks

will

Stack Trace Excerpts

"http-0.0.0.0-80-417" daemon prio=6 tid=0x000000000f6f1800 nid=0x1a00 waiting for monitor entry [0x000000002dd5d000]
   java.lang.Thread.State: BLOCKED (on object monitor)
                at sun.reflect.GeneratedConstructorAccessor68.newInstance(Unknown Source)
                at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
                at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
                at java.lang.Class.newInstance0(Class.java:355)
                at java.lang.Class.newInstance(Class.java:308)
                at org.jboss.ejb.Container.createBeanClassInstance(Container.java:630)

"http-0.0.0.0-80-451" daemon prio=6 tid=0x000000000f184800 nid=0x14d4 waiting for monitor entry [0x000000003843d000]
   java.lang.Thread.State: BLOCKED (on object monitor)
                at java.lang.Class.getDeclaredMethods0(Native Method)
                at java.lang.Class.privateGetDeclaredMethods(Class.java:2427)
                at java.lang.Class.getMethod0(Class.java:2670)

"http-0.0.0.0-80-449" daemon prio=6 tid=0x000000000f17d000 nid=0x2240 waiting for monitor entry [0x000000002fa5f000]
   java.lang.Thread.State: BLOCKED (on object monitor)
                at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.register(Http11Protocol.java:638)
                - waiting to lock <0x00000007067515e8> (a org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler)
                at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.createProcessor(Http11Protocol.java:630)


"http-0.0.0.0-80-439" daemon prio=6 tid=0x000000000f701800 nid=0x1ed8 waiting for monitor entry [0x000000002f35b000]
   java.lang.Thread.State: BLOCKED (on object monitor)
                at org.apache.log4j.Hierarchy.getLogger(Hierarchy.java:261)
                at org.apache.log4j.Hierarchy.getLogger(Hierarchy.java:242)
                at org.apache.log4j.LogManager.getLogger(LogManager.java:198)

Solution

These are listed roughly in the order I would try them, depending on the evidence collected:

  • Have you looked at GC behavior? Are you under memory pressure? That could result in newInstance() and a few of the others above being blocked. Run your VM with -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -verbose:gc and log the output. Are you seeing excessive GC times near the time of the failure/lockup? (A programmatic way to sample heap usage and GC time is sketched after this list.)
    • Is the condition repeatable? If so, try varying the JVM heap size (-Xmx) and see if the behavior changes substantially. If it does, look for memory leaks or size the heap properly for your app.
    • If the previous step is tough, and you're not getting an OutOfMemoryError when you should, you can tune the GC tunables... see the JDK 6.0 -XX options or the JDK 6.0 GC Tuning Whitepaper. Look specifically at -XX:+UseGCOverheadLimit and -XX:GCTimeLimit and related options. (Note these are not well documented, but may be useful...)
  • Might there be a deadlock? With only stack trace excerpts, I can't determine that here. Look for cycles amongst the monitors that threads are blocked on (vs. the ones they hold). I believe jconsole can do this for you... (yep, under the threads tab, "detect deadlocks"). The same check can also be run programmatically; see the deadlock sketch after this list.
  • Take several stack traces in a row and look for what changes vs. what stays the same...
  • Do the forensics... for each stack entry that says "BLOCKED", go look up the specific line of code and figure out whether there is a monitor there or not. If there's an actual monitor acquisition, it should be fairly easy to identify the limiting resource. However, some of your threads may show as blocked without a transparently available monitor; those will be trickier...
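Regarding the first bullet (GC behavior), here is a minimal sketch of sampling heap usage and cumulative GC time from inside the process using the standard java.lang.management beans; the class name and the 10-second sample window are illustrative, not anything from the original application:

    import java.lang.management.GarbageCollectorMXBean;
    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryUsage;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    public class GcPressureProbe {
        public static void main(String[] args) throws InterruptedException {
            // Heap pressure: how close is "used" to the -Xmx ceiling?
            MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
            System.out.printf("heap used=%d MB, max=%d MB%n",
                    heap.getUsed() >> 20, heap.getMax() >> 20);

            // GC counters are cumulative, so take two samples and diff them.
            List<GarbageCollectorMXBean> gcs = ManagementFactory.getGarbageCollectorMXBeans();
            Map<String, Long> before = new HashMap<String, Long>();
            for (GarbageCollectorMXBean gc : gcs) {
                before.put(gc.getName(), gc.getCollectionTime());
            }
            Thread.sleep(10000);   // sample window

            for (GarbageCollectorMXBean gc : gcs) {
                long spentMs = gc.getCollectionTime() - before.get(gc.getName());
                // If GC consumed most of the window, the app is thrashing, not working.
                System.out.printf("%s spent %d ms of the last 10 s in GC%n",
                        gc.getName(), spentMs);
            }
        }
    }

If most of the sample window is going to GC, you're in the "everything looks blocked" regime even without a real lock problem.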
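And for the deadlock bullet, the same check that jconsole's "detect deadlocks" button performs can be run programmatically via ThreadMXBean; a minimal sketch (illustrative class name, nothing application-specific):

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadInfo;
    import java.lang.management.ThreadMXBean;

    public class DeadlockProbe {
        public static void main(String[] args) {
            ThreadMXBean threads = ManagementFactory.getThreadMXBean();

            // Looks for cycles among monitors/ownable synchronizers; returns null if none.
            long[] ids = threads.findDeadlockedThreads();
            if (ids == null) {
                System.out.println("no deadlock detected");
                return;
            }
            for (ThreadInfo info : threads.getThreadInfo(ids)) {
                System.out.printf("%s is waiting on %s held by %s%n",
                        info.getThreadName(), info.getLockName(), info.getLockOwnerName());
            }
        }
    }

A null result here, combined with huge GC times from the previous sketch, would point back at memory pressure rather than a true monitor cycle.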