Question

Tomcat is running a webapp under Windows. After a few days (under very low load), the exception mentioned in the title starts to appear in the logs, no new connections can be established from that point on, the only fix is then to reboot the server.

Environment:

  • Latest Tomcat 6
  • Windows Server 2008 R2
  • JDK 6 update 30
  • SQL Server 2008
  • Kerberos authentication

Evidence collected so far:

  • netstat shows no excessive amount of connections
  • ProcessExplorer shows no excessive amount of open file handles
  • system main memory usage is average
  • JVM heap usage is average
  • restarting Tomcat does not solve the problem

Open questions:

  • if we were leaking connections, shouldn't they show up in netstat?
  • shouldn't a restart of the appserver resolve the problem, because the OS should free all process resources?
  • is there a way to trace the problem to its origin? E.g. installing monitoring software, maybe something similar to lsof etc.?

I'm out of ideas, any hints appreciated!

Was it helpful?

Solution

The reason we got this error is a bug in Windows Server 2008 R2 / Windows 7. The kernel leaks loopback sockets due to a race condition on machines with more than one core, this patch fixes the issue: http://support.microsoft.com/kb/2577795

OTHER TIPS

I was running Alfresco Community 4.0d on Windows 7 64 bit and had the same symptoms and errors.

The problem was fixed with Microsoft's patch: "Kernel sockets leak on a multiprocessor computer that is running Windows Server 2008 R2 or Windows 7" (http://support.microsoft.com/kb/2577795) (ie. Buddy Casino's answer (see below)).

Another observation I'd like to add is that Windows connections (Internet Explorer, Remote Desktop etc) would work again about 5-10 mins after the Alfresco services were shutdown.

Alfresco is an excellent product and I was afraid I would have to scrap it. Fortunately stackoverflow came to the rescue !

Thanks again to Buddy Casino's answer.

Boo to the person who down-voted the Question.

We are seeing the same thing on a similar setup, W2008R2, Tomcat 6.0.29, Java 1.6.0.25. Restarting tomcat does not help, but restarting the server itself does, at least for a while. After the last time we started shutting down individual services and believe we have it narrowed down to either an instance of Alfresco that is also running on the server or the Backup Exec Agent services. After those services (four in total) were stopped, the applications in Tomcat started working again, although we were still seeing the buffer/connections error in the stdout log which was strange. Will need to wait for the problem to return before confirming which are the culprit, which could be anywhere from a few days to a week or more.

Any chance you are running either Alfresco or BE on your server?

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top