Question

I've pretty much always used send() with sockets and now I'm moving onto the WSA functions. With send(), I have a sendall() helper that ensured all data is delivered even if it didn't happen in one try and a partial send occurred on first call.

So, instead of learning the hard way or over-complicating code when I don't have to, decided to ask you:

Can a blocking WSASend() send partial data or does it send everything before it returns or fails? Or should I check the bytes sent vs. expected to send and keep at it until everything is delivered?

ANSWER: Overlapped WSASend() does not send partial data but if it does, it means the connection has terminated. I've never encountered the case yet.

Was it helpful?

Solution

From the WSASend docs:

If the socket is non-blocking and stream-oriented, and there is not sufficient space in the transport's buffer, WSASend will return with only part of the application's buffers having been consumed. Given the same buffer situation and a blocking socket, WSASend will block until all of the application buffer contents have been consumed.

I haven't tried this behavior though. BTW, why do you rewrite your code to use WSA functions? Switching from standard bsd socket api just to use the socket basically with the same blocking behavior doesn't really seem to be a good idea for me. Just keep the old blocking code with send with the "retry code", this way its portable and bulletproof. It is not saving 1-2 comparisons is that makes your IO code performant.

Switch to specialized WSA functions only if you are trying to exploit some windows specific strengths, or if you want to use for non-blocking sockets with WSAWaitForMultipleObjects that is a bit better than the standard select but even in that case you can simply go with send and recv as I did it.

In my opinion using epoll/kqueue/iocp (or a library that abstracts these away) with sockets are the way to go. There are some very basic tasks that can be done with blocking sockets but if you cross the line and you need nonblocking socks then switching straight to epoll/kqueue/iocp is the way to go instead of programming painful select or WSAWaitForMultipleObjects based apis. epoll/kqueue/iocp are not only better but also easier to program than the select based alternatives. Really. They are more modern apis that were invented based on more experience. (Although they are not crossplatform, but even select has portability issues...).

The previously mentioned apis for linux/bsd/windows are based on the same concept but in my opinion the simplest and easiest to learn is the epoll api of linux. It is ways better than a select call but its 100x easier to program once you get the idea. If you start using IOCP on windows than it my seem a bit more complicated.

If you haven't yet used these apis then definitely give epoll a go if you are familiar with linux and then on windows implement the same with IOCP that is based on a similar concept with a bit more complicated overlapped IO programming. With IOCP you will have a reason for using WSASend because you can not start overlapped IO on a socket with send but you can do that with WSASend (or WriteFile).

EDIT: If you are going for max performance with IOCP then here are some additional hints:

  • Drop blocking operations. This is very important. A serious networking engine can not afford blocking IO. It simply doesn't scale on any of the platforms. Do overlapped operations for both send and receive, overlapped IO is the big gun of windows.
  • Setup a thread pool that processes the completed IO operations. Setup test clients that bomb your server with real-world-usage-like messages and parallel connection counts and under stress tweak the buffer sizes and thread counts for your actual target hardware.
  • Set the SO_RCVBUF and SO_SNDBUF sizes of your sockets to zero and play around with the size of the buffers that you are using to send and receive data. Setting the rcv/send buf of the socket handle to zero allows the tcp stack to receive/send data directly to/from your buffers avoiding an additional copy between your userspace buffers and the socket buffers. The optimal size for these buffers is also subject to tweaking. I usually use at least a few ten K buffers sizes but sometimes in case of large volume transfers 1-2M buffer sizes are better depending on the number of parallel busy connections. Again, tweak the values while stressing the server with some test clients that do activity similar to real world clients. When you are ready with the first working version of your network engine on top of it lets build a test client that can simulate many (maybe thousands of) parallel clients depending on the real world usage of your server.
  • You will need "per connection software send buffers" inside your network engine and you may (or may not) want to control the max size of the send buffers. In case of reaching the max send buffer size you may want to block or discard messages/data depending on what you want to do, encapsulate this special buffer and provide two nice interfaces to it: one for the threads that are putting data into this buffer and another interface that is used by the IOCP sender code. This buffer is usually a very critical part of the whole thing and I usually had a lot of bugs around this part of the code so make sure to design its interface nicely to minimize the number of bugs. Depending on how your application constructs and puts messages into the queue you can play around a lot with the internal implementation (size of storage chunks, nagle-like optimizations, ...).
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top