Question

I have shipped an online (grid-based) videogame that uses TCP to ensure reliable communication in a client-server network topology. My game works fairly well, but suffers from higher than expected latency (similar TCP games in the genre seem to do a better job of keeping latency to a minimum).

While investigating, I discovered that the latency is only unexpectedly high for clients running Microsoft Windows (as opposed to Mac OS X clients). Furthermore, I discovered that if a Windows client sets TcpAckFrequency=1 in the registry and restarts their machine, their latency becomes normal.

It would appear that my network design did not take into account delayed acknowledgement:

A design that does not take into account the interaction of delayed acknowledgment, the Nagle algorithm, and Winsock buffering can drastically effect performance. (http://support.microsoft.com/kb/214397)

However, I'm finding it nearly impossible to take into account delayed acknowledgement in my game (or any game). According to MSDN, the Microsoft TCP stack uses the following criteria to decide when to send one ACK on received data packets:

  • If the second data packet is received before the delay timer expires (200ms), the ACK is sent.
  • If there are data to be sent in the same direction as the ACK before the second data packet is received and the delay timer expires, the ACK is piggybacked with the data segment and sent immediately.
  • When the delay timer expires (200ms), the ACK is sent.

(http://support.microsoft.com/kb/214397)
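
To make the three rules concrete, here is a toy model of them (not Microsoft's implementation). The function name and the "negative means never" convention are my own assumptions; the 200 ms value is the default quoted above.

```c
#define DELAYED_ACK_TIMER_MS 200  /* default delay timer quoted from KB 214397 */

/* Toy model of the three rules. Each argument is the time in ms, counted
 * from the arrival of the first data packet, at which that event happens.
 * Pass a negative value if the event never happens.
 * Returns the time in ms at which the ACK for the first packet is sent. */
int ack_sent_after_ms(int second_packet_at_ms, int outgoing_data_at_ms)
{
    int t = DELAYED_ACK_TIMER_MS;              /* rule 3: timer expiry */

    if (second_packet_at_ms >= 0 && second_packet_at_ms < t)
        t = second_packet_at_ms;               /* rule 1: second packet arrives */
    if (outgoing_data_at_ms >= 0 && outgoing_data_at_ms < t)
        t = outgoing_data_at_ms;               /* rule 2: ACK piggybacks on data */
    return t;
}
```

In this model a lone packet with no reply traffic waits the full 200 ms, while a second packet or any outgoing data releases the ACK early.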

Reading this, one would presume that the workaround for delayed acknowledgement on Microsoft's TCP stack is as follows:

  1. Disable the Nagle algorithm (TCP_NODELAY).
  2. Disable the socket's send buffer (SO_SNDBUF=0), so that a call to send can be expected to send a packet.
  3. When calling send, if no further data is expected to be sent immediately, call send again with a single byte of data that will be discarded by the receiver.

With this approach, the second data packet will be received by the receiver at around the same time as the previous data packet. As a result, the ACK should get sent immediately from the receiver to the sender (emulating what TcpAckFrequency=1 does in the registry).

However, from my testing, this improved latency by only about half as much as the registry edit does. What am I missing?


Q: Why not use UDP?

A: I chose TCP because every packet I send needs to arrive (and be in order); there are no packets that aren't worth retransmitting if they get lost (or become unordered). Only when packets can be discarded/unordered, can UDP be faster than TCP!


Solution

Since Windows Vista, the TCP_NODELAY option must be set prior to calling connect, or (on the server) prior to calling listen. If you set TCP_NODELAY after calling connect, it will not actually disable the Nagle algorithm, yet GetSocketOption will report that Nagle has been disabled! This all appears to be undocumented, and contradicts what many tutorials/articles on the subject teach.

With Nagle actually disabled, TCP delayed acknowledgements no longer cause latency.
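
A minimal sketch of that ordering: the option is set right after socket() and before any connect or listen call. The helper name make_nodelay_socket and the sock_t/BAD_SOCK/close_sock shims are my own, added so the same code compiles against Winsock and POSIX sockets. On Windows, WSAStartup is assumed to have been called already.

```c
#ifdef _WIN32
#include <winsock2.h>
#include <ws2tcpip.h>
typedef SOCKET sock_t;
#define BAD_SOCK INVALID_SOCKET
#define close_sock closesocket
#else
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <unistd.h>
typedef int sock_t;
#define BAD_SOCK (-1)
#define close_sock close
#endif

/* Hypothetical helper: create a TCP socket and disable Nagle on it
 * BEFORE the caller gets a chance to connect() or listen().
 * Returns BAD_SOCK on failure. */
sock_t make_nodelay_socket(void)
{
    int on = 1;
    sock_t s = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);

    if (s == BAD_SOCK)
        return BAD_SOCK;
    if (setsockopt(s, IPPROTO_TCP, TCP_NODELAY,
                   (const char *)&on, sizeof on) != 0) {
        close_sock(s);
        return BAD_SOCK;
    }
    return s;   /* caller now calls connect(), or bind() + listen() */
}
```

The caller then uses the returned socket for connect (client) or bind and listen (server). Reading the option back with getsockopt only confirms that the flag is stored; as noted above, it does not prove that Nagle is really off.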

OTHER TIPS

There should be nothing you need to do. All of the workarounds you're suggesting are to help protocols that weren't properly designed to work over TCP. Presumably your protocol was designed to work over TCP, right?

Your problem is almost definitely one or both of these:

  1. You are calling TCP send functions with small bits of data even though there is no reason you couldn't call them with larger chunks.

  2. You did not implement application-level acknowledgements of application protocol data units. Implement these so that the ACKs can piggy-back on them.
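
For point 1, a rough sketch of what "larger chunks" could look like: queue the small messages in a user-space buffer and pass the whole buffer to send once per tick. The OutBuf type, the function name, and the 1400-byte capacity are illustrative assumptions, not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OUTBUF_CAP 1400   /* assumed: a bit under one Ethernet-sized segment */

typedef struct {
    unsigned char data[OUTBUF_CAP];
    size_t len;           /* bytes queued so far */
} OutBuf;

/* Queue one small message. Returns 0 on success, or -1 if it does not fit,
 * in which case the caller should send what is queued and retry. */
int outbuf_append(OutBuf *b, const void *msg, size_t n)
{
    assert(b != NULL && msg != NULL);
    if (n > OUTBUF_CAP - b->len)
        return -1;
    memcpy(b->data + b->len, msg, n);
    b->len += n;
    return 0;
}
```

At the end of each game tick the caller would make a single send(sock, b.data, b.len, 0) call, check the result, and reset b.len to 0, instead of one send per message.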

Use a reliable UDP library and write your own congestion control algorithm; this will definitely overcome your TCP latency problem.

Try the following library, which I use for reliable UDP transfers:

http://udt.sourceforge.net/

With this approach, the second data packet will be received by the receiver at around the same time as the previous data packet. As a result, the ACK should get sent immediately from the receiver to the sender (emulating what TcpAckFrequency=1 does in the registry).

I'm not convinced that this will always cause a second, separate packet to be sent. I know you have Nagle disabled and a zero send buffer, but I've seen stranger things. Some Wireshark dumps might be helpful.

One idea: Instead of your 'canary' packet being only one byte, send a full MSS's worth of data (typically, what, 1460 bytes on a 1500-MTU network).

To solve the problem, it's necessary to understand the normal functioning of TCP connections. Telnet is a good example to analyze.

TCP guarantees delivery by acknowledging successful data transmission. The "Ack" can be sent as a message by itself, but this introduces quite some overhead: an Ack is a very small message itself, but the lower-level protocols add extra headers. For this reason, TCP prefers to piggyback the Ack message on another packet it's sending anyway. Looking at an interactive shell via Telnet, there's a steady stream of keystrokes and responses. And if there's a small pause in typing, there's nothing to echo on the screen. The only case when the flow stops is if you have output without corresponding input. But since you can only read so fast, it's OK to wait a few hundred milliseconds to see if there's a keystroke to piggyback the Ack on.

So, summarizing, we have a steady flow of packets both ways, and the Ack usually piggybacks. If there's an interruption in the flow for application reasons, delaying the Ack won't be noticed.

Back to your protocol: You apparently don't have a request/response protocol. That means the Ack can't be piggy-backed (problem 1). And while the receiving OS will then send separate Acks, it won't spam those.

Your workaround via TCP_NODELAY and two packets on the sending (Windows) side assumes that the receiving side is Windows too, or at least behaves as such. This is wishful thinking, not engineering. The other OS may decide to wait for three packets to send an Ack, which completely breaks your use of TCP_NODELAY to force one extra packet. "Wait for 3 packets" is just an example; there are many other valid algorithms to prevent Ack spam which would not be fooled by your second one-byte dummy packet.

What is the real solution? Send a response at protocol level. No matter the OS then, it will piggyback the TCP Ack on your protocol response. In turn, this response will also force an Ack in the other direction (the response too is a TCP message) but you don't care about the latency of the response. The response is there just so the receiving OS piggybacks the first Ack.
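
As a sketch of what such a protocol-level response might look like, here is a tiny acknowledgement message: one type byte plus the sequence number of the message being acknowledged. The wire format, the MSG_APP_ACK value, and the function names are all made up for illustration; your protocol will have its own framing.

```c
#include <stddef.h>
#include <stdint.h>

#define MSG_APP_ACK  0x02  /* hypothetical message-type byte */
#define APP_ACK_SIZE 5     /* 1 type byte + 4-byte sequence number */

/* Encode "I processed message number seq", big-endian on the wire. */
void app_ack_encode(uint32_t seq, unsigned char out[APP_ACK_SIZE])
{
    out[0] = MSG_APP_ACK;
    out[1] = (unsigned char)(seq >> 24);
    out[2] = (unsigned char)(seq >> 16);
    out[3] = (unsigned char)(seq >> 8);
    out[4] = (unsigned char)seq;
}

/* Decode an acknowledgement. Returns 0 on success, -1 if `in` is not one. */
int app_ack_decode(const unsigned char *in, size_t n, uint32_t *seq)
{
    if (n != APP_ACK_SIZE || in[0] != MSG_APP_ACK)
        return -1;
    *seq = ((uint32_t)in[1] << 24) | ((uint32_t)in[2] << 16) |
           ((uint32_t)in[3] << 8)  |  (uint32_t)in[4];
    return 0;
}
```

The receiver would send these five bytes right after handling each message, which gives its TCP stack outgoing data to carry the Ack on.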

every packet I send needs to arrive (and be in order);

This requirement is the cause of your latency.

Either you have a network with negligible packet loss, in which case UDP would be delivering every packet, or you have loss, in which case TCP is retransmitting and delaying everything by (multiples of) the retransmit interval (which is at least the round-trip time). This delay is not consistent, as it is triggered by lost packets; jitter usually has worse consequences than the predictable delay in acknowledgement caused by packet-combining.

Only when packets can be discarded/unordered, can UDP be faster than TCP!

This is an easy assumption to make, but erroneous.

There are other ways to improve drop rates besides ARQ which provide lower latency: forward error correction methods achieve improved latency for drop recovery at the expense of additional bandwidth required.
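
The simplest form of forward error correction is a single XOR parity packet per group, sketched below. It can rebuild exactly one lost packet per group without waiting for a retransmit. The function names and the flat buffer layout (packet i starts at packets + i * len) are assumptions for the example; real FEC schemes are more elaborate.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* XOR `count` equal-sized packets together into `parity` (`len` bytes).
 * The packets are stored back to back in one flat buffer. */
void fec_make_parity(const unsigned char *packets, size_t count,
                     size_t len, unsigned char *parity)
{
    memset(parity, 0, len);
    for (size_t i = 0; i < count; i++)
        for (size_t j = 0; j < len; j++)
            parity[j] ^= packets[i * len + j];
}

/* Rebuild the one missing packet (index `lost`) into `out` by XOR-ing the
 * parity packet with every packet that did arrive. The slot of the lost
 * packet inside `packets` is never read. */
void fec_recover(const unsigned char *packets, size_t count, size_t len,
                 size_t lost, const unsigned char *parity, unsigned char *out)
{
    assert(lost < count);
    memcpy(out, parity, len);
    for (size_t i = 0; i < count; i++) {
        if (i == lost)
            continue;
        for (size_t j = 0; j < len; j++)
            out[j] ^= packets[i * len + j];
    }
}
```

The cost is one extra packet per group, which is the additional bandwidth mentioned above.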

I would suggest you leave the Nagle algorithm and buffers turned on, as their basic purpose is to collect small writes into full/larger packets (this improves performance a lot), but at the same time use FlushFileBuffers() on the socket once you are done sending for a while.

I assume here that your game has some sort of main loop, which processes stuff and then waits for some amount of time before going into the next round:

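while (run_my_game)
{
    process_game_events_and_send_data_over_network();
    /* Sleep for the rest of the 20 ms frame; skip if the frame overran. */
    if (time_spent_processing < 20)
        Sleep(20 - time_spent_processing);
}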

I would now suggest inserting FlushFileBuffers() before the Sleep() call:

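while (run_my_game)
{
    process_game_events_and_send_data_over_network();
    /* Push out whatever is still buffered before going to sleep. */
    FlushFileBuffers((HANDLE)my_socket);
    /* Sleep for the rest of the 20 ms frame; skip if the frame overran. */
    if (time_spent_processing < 20)
        Sleep(20 - time_spent_processing);
}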

That way, you delay sending packets at the latest until the moment before your application goes to sleep to wait for the next round. You should get the performance benefit of Nagle's algorithm and still minimize delay.

In case this doesn't work, it would be helpful if you post a bit of (pseudo-) code which explains how your program actually works.

EDIT: There were two more things that came to mind when I thought about your question again:

a) Delayed ACK packets should indeed NOT cause any lag, as they travel in the opposite direction to the data you are sending. At worst they block the send queue. TCP will resolve this after a few packets, provided the bandwidth of the connection and memory limits permit it. So unless your machine is really low on RAM (not enough to hold a bigger send queue), or you are really transmitting more data than your connection allows, delayed ACK packets are an optimisation and will actually improve performance.

b) You are using a dedicated thread for sending. I wonder why. AFAIK the socket API is thread-safe, so every producing thread could send its data by itself. Unless your application requires such a queue, I would suggest removing this dedicated sending thread, and with it the additional synchronisation overhead and delay it might cause.

I'm specifically mentioning the delay here, as the operating system might decide not to immediately schedule the send thread for execution again when it becomes unblocked on its queue. Typical re-scheduling delays are in the 10ms range, but under load they can skyrocket to 50ms or more. As a workaround, you could try fiddling with the scheduling priorities, but this will not reduce the delay imposed by the operating system itself.

Btw, you can easily benchmark TCP and your network by having one thread on the client and one on the server that just play ping/pong with some data.
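
A rough single-process stand-in for such a ping/pong test is sketched below. It bounces one byte back and forth over a loopback TCP connection and reports the mean round trip. It is written against POSIX sockets so it can be tried quickly; on Windows the Winsock equivalents (WSAStartup, closesocket, QueryPerformanceCounter) would be needed, and a real measurement should run the two halves on separate machines. The function names are my own.

```c
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

/* Monotonic clock reading in microseconds. */
static double now_us(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1e6 + ts.tv_nsec / 1e3;
}

/* Hypothetical benchmark: returns the mean round-trip time in microseconds
 * over `rounds` one-byte ping/pong exchanges, or -1.0 on any error. */
double loopback_ping_pong_us(int rounds)
{
    double result = -1.0, start;
    int on = 1, lst = -1, cli = -1, srv = -1;
    struct sockaddr_in addr;
    socklen_t alen = sizeof addr;
    char byte = 'p';

    if (rounds <= 0)
        return -1.0;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                      /* let the OS pick a free port */

    lst = socket(AF_INET, SOCK_STREAM, 0);
    cli = socket(AF_INET, SOCK_STREAM, 0);
    if (lst < 0 || cli < 0) goto done;
    /* Set TCP_NODELAY before listen() and connect(), per the accepted answer. */
    if (setsockopt(lst, IPPROTO_TCP, TCP_NODELAY, &on, sizeof on) != 0) goto done;
    if (setsockopt(cli, IPPROTO_TCP, TCP_NODELAY, &on, sizeof on) != 0) goto done;
    if (bind(lst, (struct sockaddr *)&addr, sizeof addr) != 0) goto done;
    if (listen(lst, 1) != 0) goto done;
    if (getsockname(lst, (struct sockaddr *)&addr, &alen) != 0) goto done;
    if (connect(cli, (struct sockaddr *)&addr, sizeof addr) != 0) goto done;
    srv = accept(lst, NULL, NULL);
    if (srv < 0) goto done;

    start = now_us();
    for (int i = 0; i < rounds; i++) {
        if (send(cli, &byte, 1, 0) != 1) goto done;   /* ping */
        if (recv(srv, &byte, 1, 0) != 1) goto done;
        if (send(srv, &byte, 1, 0) != 1) goto done;   /* pong */
        if (recv(cli, &byte, 1, 0) != 1) goto done;
    }
    result = (now_us() - start) / rounds;
done:
    if (srv >= 0) close(srv);
    if (cli >= 0) close(cli);
    if (lst >= 0) close(lst);
    return result;
}
```

Calling loopback_ping_pong_us(1000) gives a baseline for the machine itself; comparing it with the same exchange across the real network shows how much of the latency comes from the network rather than from the game code.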

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow