Question

Hopefully someone can help us, as we've taken our own investigation about as far as it can go!

We've got a simple asynchronous socket server written in C# that accepts connections from an ASP.NET web application, is sent a message, performs some processing (usually against a DB but other systems too) and then sends a response back to the client. The client is in charge of closing the connection.

We've been having issues where if the system is under heavy load over a long period of time (days usually), CLOSE_WAIT sockets build up on the server box (netstat -a) to an extent that the process will not accept any further connections. At that point we have to bounce the process and off it runs again.

We've run load tests against our ASP.NET application to try to replicate the problem (we couldn't infer the cause from the code alone). We think we've managed it, and ended up with a Wireshark packet trace of the issue, which shows up as a SocketException in the socket server's logs:

System.Net.Sockets.SocketException: An existing connection was forcibly closed by the remote host
   at System.Net.Sockets.Socket.BeginSend(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags, AsyncCallback callback, Object state)

I've tried to reproduce the issue from the packet trace with a single-threaded process talking directly to the socket server (using the same code the ASP.NET app does), but have been unable to.

Has anybody got any suggestions of next things to try, check for or obvious things we may be doing wrong?

Solution

Look at the diagram

http://en.wikipedia.org/wiki/File:Tcp_state_diagram_fixed.svg

Your client closed the connection by calling close(), which sent a FIN to the server. The server's TCP stack ACKed the FIN and moved the server-side socket to CLOSE_WAIT, where it stays until the server program calls close() on that socket.

Your server program needs to detect that the client has closed the connection, and then close() its own socket promptly to release it. How? Refer to read(): when it reaches end-of-file (meaning a FIN has been received), it returns zero.
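
In .NET terms, the read() mentioned above corresponds to Socket.Receive (or EndReceive for the async calls). Here is a minimal sketch over a loopback connection, assuming the server has nothing more to send once the client closes. The class and method names are made up for illustration and are not from the original server.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;

static class ZeroByteReadDemo
{
    // Reads until the peer closes its side, then closes our side.
    // Returns the total number of bytes received.
    public static int DrainUntilPeerCloses(Socket connection)
    {
        var buffer = new byte[4096];
        int total = 0;
        while (true)
        {
            int read = connection.Receive(buffer);
            if (read == 0)
            {
                // 0 bytes means the peer's FIN has arrived. The socket is now in
                // CLOSE_WAIT and stays there until we close it ourselves.
                connection.Close();
                return total;
            }
            total += read;
        }
    }

    static void Main()
    {
        // Port 0 lets the OS pick a free port for this demo.
        var listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start();
        int port = ((IPEndPoint)listener.LocalEndpoint).Port;

        var client = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
        client.Connect(IPAddress.Loopback, port);
        Socket serverSide = listener.AcceptSocket();

        client.Send(Encoding.ASCII.GetBytes("ping"));
        client.Shutdown(SocketShutdown.Send);
        client.Close(); // the client closes first, as in the question

        int received = DrainUntilPeerCloses(serverSide);
        Console.WriteLine("Received {0} bytes, then closed the server-side socket.", received);
        listener.Stop();
    }
}
```

If the close in the `read == 0` branch is left out, the server-side socket should stay in CLOSE_WAIT in netstat for as long as the process keeps the handle.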

OTHER TIPS

If your server is accumulating CLOSE_WAIT sockets then it's not closing its socket when the connection is complete. If you take a look at the state diagram in the comment to Chris' post you'll see that CLOSE_WAIT transitions to LAST_ACK once the socket is closed and the FIN has been sent.

You say that it's complex to determine where to do this because of the async nature? This shouldn't be a problem: close the socket when the callback from your recv reports 0 bytes (assuming you have nothing else to do once your client closes its side of the connection). If you do still need to send, call Shutdown(SocketShutdown.Receive) at that point and make a note that your client has closed; once you're done sending, call Shutdown(SocketShutdown.Send) and then Close().

You MAY be issuing a new read from inside the callback of the read that returned 0 (the one indicating that the client has closed), and this may be what is causing your problems.
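
For the async case, the receive callback could be shaped roughly as follows. This is only a sketch using the BeginReceive/EndReceive pattern; the Connection class, its members and the loopback demo in Main are hypothetical, and the real server will have its own message handling where the placeholder comment is.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Threading;

sealed class Connection
{
    private readonly Socket _socket;
    private readonly byte[] _buffer = new byte[4096];
    public readonly ManualResetEvent Closed = new ManualResetEvent(false);

    public Connection(Socket socket) { _socket = socket; }

    public void StartReceive()
    {
        try
        {
            _socket.BeginReceive(_buffer, 0, _buffer.Length, SocketFlags.None, OnReceive, null);
        }
        catch (SocketException) { CloseSocket(); }
        catch (ObjectDisposedException) { } // already closed elsewhere
    }

    private void OnReceive(IAsyncResult result)
    {
        int read;
        try { read = _socket.EndReceive(result); }
        catch (SocketException) { CloseSocket(); return; } // e.g. connection reset
        catch (ObjectDisposedException) { return; }

        if (read == 0)
        {
            // The client has closed its side. Do NOT post another receive here;
            // close our side so the socket leaves CLOSE_WAIT.
            CloseSocket();
            return;
        }

        // ... handle the first `read` bytes of _buffer here (application-specific) ...
        StartReceive(); // only re-arm the receive after a non-empty read
    }

    private void CloseSocket()
    {
        try { _socket.Shutdown(SocketShutdown.Both); }
        catch (SocketException) { }
        catch (ObjectDisposedException) { }
        _socket.Close();
        Closed.Set();
    }
}

static class Program
{
    static void Main()
    {
        var listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start();
        int port = ((IPEndPoint)listener.LocalEndpoint).Port;

        var client = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
        client.Connect(IPAddress.Loopback, port);
        var connection = new Connection(listener.AcceptSocket());
        connection.StartReceive();

        client.Send(new byte[] { 1, 2, 3 });
        client.Close(); // client closes first

        bool closed = connection.Closed.WaitOne(TimeSpan.FromSeconds(5));
        Console.WriteLine(closed ? "server closed its socket" : "server socket still open");
        listener.Stop();
    }
}
```

Every exit path of the callback either re-arms the receive or closes the socket, so no connection is left without an owner.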

"The client is in charge of closing the connection."

Both the client and the server must Shutdown and Close the socket. Either the client is not finishing the close (unlikely, since its finalizer would run) or the server is not shutting down the socket (likely).

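using (Socket s = new Socket(/* */)) {
  /* Do stuff */
  s.Shutdown(SocketShutdown.Both); // stop further sends and receives on a connected socket
  s.Close();                       // release the handle (the using block would also dispose it)
}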

You shouldn't be leaving the responsibility of closing the TCP sockets only up to the client. What happens if the client process/machine crashes?

Ideally you should have a timeout in place so that if no traffic is received on a connected socket after a certain amount of time then it gets closed by the server.
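
One way to get such a timeout, sketched under assumptions: record the time of the last traffic per socket and let a timer close anything that has been silent for too long. IdleSweeper and its members are hypothetical names, and the 30 second limit is arbitrary. The check-then-remove in Sweep is not atomic with Touch, which is acceptable for a sketch but worth tightening in real code.

```csharp
using System;
using System.Collections.Concurrent;
using System.Net.Sockets;
using System.Threading;

sealed class IdleSweeper : IDisposable
{
    private readonly ConcurrentDictionary<Socket, DateTime> _lastActivity =
        new ConcurrentDictionary<Socket, DateTime>();
    private readonly TimeSpan _idleLimit;
    private readonly Timer _timer;

    public IdleSweeper(TimeSpan idleLimit, TimeSpan sweepInterval)
    {
        _idleLimit = idleLimit;
        _timer = new Timer(_ => Sweep(DateTime.UtcNow), null, sweepInterval, sweepInterval);
    }

    // Call on accept and after every successful receive.
    public void Touch(Socket socket) { _lastActivity[socket] = DateTime.UtcNow; }

    // Call when a connection is closed through the normal path.
    public void Forget(Socket socket)
    {
        DateTime ignored;
        _lastActivity.TryRemove(socket, out ignored);
    }

    // Closes every tracked socket that has been silent for at least the idle
    // limit as of nowUtc, and returns how many were closed.
    public int Sweep(DateTime nowUtc)
    {
        int closed = 0;
        foreach (var entry in _lastActivity)
        {
            if (nowUtc - entry.Value < _idleLimit) continue;
            DateTime ignored;
            if (_lastActivity.TryRemove(entry.Key, out ignored))
            {
                // Any pending async receive on this socket will complete with an
                // error, so its callback must tolerate a closed socket.
                entry.Key.Close();
                closed++;
            }
        }
        return closed;
    }

    public void Dispose() { _timer.Dispose(); }
}

static class Program
{
    static void Main()
    {
        using (var sweeper = new IdleSweeper(TimeSpan.FromSeconds(30), TimeSpan.FromSeconds(5)))
        {
            var socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
            sweeper.Touch(socket);

            // Pretend 31 seconds have passed without any traffic.
            int closed = sweeper.Sweep(DateTime.UtcNow.AddSeconds(31));
            Console.WriteLine("closed {0} idle socket(s)", closed);
        }
    }
}
```

This also covers the crashed-client case: if the client machine disappears without sending a FIN, no 0-byte read ever arrives, and the sweep is what eventually closes the socket.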

No matter what happens, once the client has finished all of its operations on the socket and does not need to read from it any more, the client should issue a close.

Issuing the close simply tells the listener (the server) that the connection needs to be shut down.

In simple words, when the server next issues a read (listener.read(), or listener.beginread(...) in async mode), the read returns 0 bytes. That in itself indicates that the listener needs to close the socket, because the client has ceased all other operations on it.

TIME_WAIT sockets are meant to hang around for a while after a socket is closed, so that the same address and port pair is not reused while packets from the old connection could still arrive. This will only give you grief if you're opening and closing a huge number of sockets really quickly.

EDIT - The text above originally said CLOSE_WAIT; it should be TIME_WAIT.

Licensed under: CC-BY-SA with attribution