Question

I have a web app served by Jetty + MySQL. I'm running into an issue where my database connection pool gets exhausted and all threads start blocking, waiting for a connection. I've tried two connection pool libraries: (1) BoneCP and (2) HikariCP. Both exhibit the same behavior with my app.

I've done several thread dumps when I see this state, and all the blocked threads are in this state (not picking on bonecp, I'm sure it's something on my end now):

"qtp1218743501-131" prio=10 tid=0x00007fb858295800 nid=0x669b waiting on condition [0x00007fb8cd5d3000]
  java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x0000000763f42d20> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
    at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
    at com.jolbox.bonecp.DefaultConnectionStrategy.getConnectionInternal(DefaultConnectionStrategy.java:82)
    at com.jolbox.bonecp.AbstractConnectionStrategy.getConnection(AbstractConnectionStrategy.java:90)
    at com.jolbox.bonecp.BoneCP.getConnection(BoneCP.java:553)
    at com.me.Foo.start(Foo.java:30)
    ...

I'm not sure where to go from here. I expected to see some stack traces in the thread dump where my code was stuck in some lengthy operation, not waiting for a connection. For example, if my code looks like this:

public class Foo {
    public void start() {
        Connection conn = threadPool.getConnection();
        work(conn);
        conn.close();
    }

    public void work(Connection conn) {
        // ... something lengthy, like scanning every row in the database ...
    }
}

I would expect one of the threads above to have a stack trace that shows it working away in the work() method:

...
at com.me.mycode.Foo.work()
at com.me.mycode.Foo.start()

but instead they're just all waiting for a connection:

...
at com.jolbox.bonecp.BoneCP.getConnection() // ?
at com.me.mycode.Foo.work()
at com.me.mycode.Foo.start()

Any thoughts on how to continue debugging would be great.

Some other background: the app operates normally for about 45 minutes, and memory and thread dumps show nothing out of the ordinary. Then the condition is triggered and the thread count spikes. I started thinking it might be some combination of SQL statements the app performs that turns into some sort of lock on the MySQL side, but again I would expect some of the threads in the stack traces above to show that they're in that part of the code.

The thread dumps were taken using visualvm.

Thanks


Solution

Take advantage of the configuration options for the connection pool (see BoneCPConfig / HikariConfig). First of all, set a connection time-out (HikariCP connectionTimeout) and a leak detection time-out (HikariCP leakDetectionThreshold; I could not find the counterpart in BoneCP). There might be more configuration options that dump stack traces when something is not quite right.
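As a rough illustration of the HikariCP settings named above, the sketch below collects them in a java.util.Properties object, which HikariCP's HikariConfig(Properties) constructor accepts. The class name PoolSettings, the JDBC URL, the credentials, the pool size and the threshold values are all placeholders chosen for the example, not recommendations.

```java
import java.util.Properties;

class PoolSettings {
    /** Builds illustrative HikariCP settings; every value here is an assumption. */
    static Properties hikariProperties() {
        Properties props = new Properties();
        props.setProperty("jdbcUrl", "jdbc:mysql://localhost:3306/mydb"); // hypothetical URL
        props.setProperty("username", "app");                             // hypothetical credentials
        props.setProperty("password", "secret");
        props.setProperty("maximumPoolSize", "10");
        // Give up in getConnection() after 30 s instead of waiting longer.
        props.setProperty("connectionTimeout", "30000");
        // Log a warning when a connection stays out of the pool for more than 10 s.
        props.setProperty("leakDetectionThreshold", "10000");
        // Expose pool statistics over JMX.
        props.setProperty("registerMbeans", "true");
        return props;
    }
}
```

With HikariCP on the classpath, this would be handed over as `new HikariDataSource(new HikariConfig(PoolSettings.hikariProperties()))`. When the leak threshold is exceeded, HikariCP logs a warning that includes the stack trace of the code that borrowed the connection, which should point at the code path that never closed it.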

My guess is that your application does not always return a connection to the pool, so after 45 minutes the pool has no connections left (and every thread blocks trying to get one). If so, that would also explain why no thread in the dump is inside work(): the threads that leaked the connections finished long ago. Treat a connection like opening/closing a file, i.e. always use try/finally:

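public void start() throws SQLException {
    Connection conn = null;
    try {
        conn = dbPool.getConnection();
        work(conn);
    } finally {
        // Runs whether work() returns or throws, so the connection always goes back to the pool.
        if (conn != null) {
            conn.close();
        }
    }
}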
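To show why the missing finally matters, here is a small self-contained sketch that uses a toy pool (a Semaphore with a fixed number of permits) instead of a real JDBC pool. ToyPool, Lease and the method names are made up for the illustration. On Java 7 and later, try-with-resources gives the same guarantee as try/finally, and java.sql.Connection is AutoCloseable, so the same pattern applies to a pooled connection.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

class LeakDemo {
    /** Toy stand-in for a connection pool: a fixed number of permits. */
    static final class ToyPool {
        private final Semaphore permits;

        ToyPool(int size) { permits = new Semaphore(size); }

        /** Borrows a lease, or returns null if none is free within the timeout. */
        Lease borrow(long timeoutMs) {
            try {
                return permits.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS) ? new Lease(this) : null;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
        }

        int available() { return permits.availablePermits(); }
    }

    /** Closing a lease returns its permit, like close() on a pooled connection. */
    static final class Lease implements AutoCloseable {
        private final ToyPool pool;

        Lease(ToyPool pool) { this.pool = pool; }

        @Override public void close() { pool.permits.release(); }
    }

    static void failingWork() {
        throw new IllegalStateException("simulated query failure");
    }

    /** Leaky: close() is skipped whenever the work throws. */
    static void leaky(ToyPool pool) {
        Lease lease = pool.borrow(100);
        failingWork();
        lease.close(); // never reached
    }

    /** Safe: try-with-resources closes the lease even when the work throws. */
    static void safe(ToyPool pool) {
        try (Lease lease = pool.borrow(100)) {
            failingWork();
        }
    }
}
```

Calling safe() on a pool of size 2 leaves both permits available even though the work fails. Two calls to leaky() drain the pool, and every later borrow() times out while no thread is doing any work, which matches the symptom in the thread dump.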

Finally, both connection pools have options to allow JMX monitoring. You can use this to monitor for strange behavior in the pool.
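For HikariCP, the pool statistics can also be read in-process through the platform MBean server. This is a minimal sketch, assuming the pool was configured with registerMbeans=true so that HikariCP registers its pool MBean under the com.zaxxer.hikari domain. The class name PoolJmxProbe is made up for the example.

```java
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class PoolJmxProbe {
    private static final String[] ATTRIBUTES = {
        "ActiveConnections", "IdleConnections", "TotalConnections", "ThreadsAwaitingConnection"
    };

    /** Reads the gauges of every HikariCP pool MBean registered in this JVM. */
    static Map<String, Object> readGauges() {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        Map<String, Object> gauges = new LinkedHashMap<>();
        try {
            // Matches names such as "com.zaxxer.hikari:type=Pool (myPool)".
            ObjectName pattern = new ObjectName("com.zaxxer.hikari:type=Pool (*");
            for (ObjectName name : server.queryNames(pattern, null)) {
                for (String attribute : ATTRIBUTES) {
                    gauges.put(name.getKeyProperty("type") + "." + attribute,
                               server.getAttribute(name, attribute));
                }
            }
        } catch (JMException e) {
            throw new IllegalStateException("could not read pool MBeans", e);
        }
        return gauges;
    }
}
```

Logging the result of readGauges() once a minute would show whether ActiveConnections creeps up towards the pool maximum over the 45 minutes while ThreadsAwaitingConnection stays at zero, which is the typical signature of a leak. The map is empty if no pool MBean is registered.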

Other suggestions

I question the whole design.

If you have threads blocking in multithreaded network I/O, you need a better implementation of the connection handling.

I suggest you take a look at non-blocking I/O (the java.nio and java.nio.channels packages), or make your locks more fine-grained.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow