The basic problem is latency: the time it takes for a network frame/packet to reach its destination.
For instance, with a synchronous request/reply protocol, a 1 ms round-trip latency limits you to at most 1000 frames/second; 2 ms can handle 500 fps, 10 ms gives 100 fps, etc.
In this case, where latency is 0.5 ms, you would expect roughly double the rate, i.e. about 1600 fps (800*2).
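The rule of thumb above is just the reciprocal of the round-trip latency, since each request must complete before the next one is sent. A tiny illustration (the function name is mine, not from any library):

```python
def max_fps(latency_s: float) -> float:
    """Upper bound on synchronous request/reply rate.

    One request must finish its round trip before the next is sent,
    so the rate is capped at 1 / latency.
    """
    return 1.0 / latency_s

print(max_fps(0.001))   # 1 ms  -> about 1000 fps
print(max_fps(0.002))   # 2 ms  -> about  500 fps
print(max_fps(0.010))   # 10 ms -> about  100 fps
```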
I think this is because you manage to pack more data into each frame. It will fill up the TCP buffer on the client after a while, though.
Batch (pipeline) the messages if possible: send 10 messages from the client in one batch, then wait for the server to reply. The server should likewise send all 10 replies in a single chunk. In theory this should make the throughput 10x higher, since you pay one round trip per batch instead of one per message.
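Here is a sketch of that idea over a plain TCP-style socket. Everything in it is made up for illustration: the length-prefixed framing, the `BATCH` size of 10, and the toy echo server that answers a whole batch in one `sendall`. The point is only the shape of the client loop: send the entire batch first, then collect all the replies.

```python
import socket
import struct
import threading

BATCH = 10  # messages pipelined per round trip


def recv_exact(sock, n):
    """Read exactly n bytes (TCP is a byte stream, not a message stream)."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed")
        buf += chunk
    return buf


def send_msg(sock, payload: bytes):
    # 4-byte big-endian length prefix so message boundaries survive TCP
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_msg(sock):
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)


def echo_server(sock):
    """Toy server: read a full batch, then send all replies in one chunk."""
    try:
        while True:
            msgs = [recv_msg(sock) for _ in range(BATCH)]
            reply = b"".join(struct.pack("!I", len(m)) + m for m in msgs)
            sock.sendall(reply)  # all 10 replies in a single write
    except ConnectionError:
        pass


def pipelined_client(sock, payloads):
    """Send BATCH requests back-to-back, then wait for the BATCH replies."""
    replies = []
    for i in range(0, len(payloads), BATCH):
        batch = payloads[i:i + BATCH]
        for p in batch:            # send the whole batch first...
            send_msg(sock, p)
        for _ in batch:            # ...then block for the replies
            replies.append(recv_msg(sock))
    return replies


if __name__ == "__main__":
    a, b = socket.socketpair()
    threading.Thread(target=echo_server, args=(b,), daemon=True).start()
    msgs = [f"msg-{i}".encode() for i in range(30)]
    assert pipelined_client(a, msgs) == msgs  # 30 messages, 3 round trips
    a.close()
```

Compared to one-message-at-a-time request/reply, the latency cost here is paid once per batch of 10, which is where the theoretical 10x comes from.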