Question

I'm working on my own FTP client in C++, but I'm stuck at function recv(). When I get data with recv(), they can be incomplete, because I'm using TCP protocol, so I have to use recv in loop. Problem is that when I call recv after everything that should be received was received server blocks, and my program is stuck. I don't know how many bytes im going to recieve so I can't control it and stop it when its done. I found two not very elegant solutions right now:

  1. is to use string.substr() (or TR1 regex) to find needed expression and then stop calling recv before it blocks
  2. second is to set up timeval structure and then control socket through setsockopt() function. Problem is long response time when i can get incomplete corrupted data.

Question is, is there any clean and elegant solution for this?

Was it helpful?

Solution

The obvious thing to do is to transmit the length of the to-be-received message ahead (many protocols, including for example HTTP do that, to address the exact same issue). That way, you know that when you have received amount X, no more will come.

This will work fine 99.9% of the time and will catastrophically fail in the 0.1% of cases where the server is lying to you or where the server crashes unexpectedly or someone stumbles over the network cable (or something similar happens). Sadly, the "connection" established by TCP is an illusion, and you don't have much of a means to detect when the connection dies. The other end can go down, and you will not notice anything, unless you try to send and get an error (or until several hours later).

Therefore, you also need a backup strategy for when things don't go quite as good as expected. You might either use select or poll to know when data is available, so you don't block forever for a message that will never come.

Using threads to solve the block-at-end problem (as proposed in other answers) is not a very good option since blocking isn't the actual problem. The actual problem is that you don't know when you have reached the end of the transmission. Having a worker thread block at the end of the transmission will "work", but will leave the worker thread blocked indefinitely, consuming resources and with an uncertain, system-dependent fate.

You cannot join the thread before exiting, since it is blocked (so trying to join it would deadlock your main thread). When your process exits and the socket is closed, the thread will unblock, but will (at least on some operating systems, e.g. Windows) be terminated immediately after. This likely won't do much evil, but terminating a thread in an uncontrolled way is always less desirable than having it exit properly. On other operating systems, you may have a lingering thread remaining.

OTHER TIPS

Since you are using C++, there are alternative libraries that greatly simplify network programming compared to stock C. My personal favourite is Boost::Asio, however others are available. These libraries not only save you the pain of coding in C, but also provide asynchronous capabilities to work around your blocking problem.

The typical approach is to use select()/pselect() or poll()/ppoll(). Both allow to specify a timeout in order to exit if there are no incoming data.

However I don't see how you should "call recv after everything that should be received". It would be extremely inefficient to rely on the timeout also when there are not network problems...

Or you send the size of data being sent, before the data, and that's what you read, or the data connection is terminated with an EOF. In this case read() will return -1 and you exit.

I can think of two options that will not require a major rewrite of your existing code and a third one which is more radical:

  1. use non-blocking I/O and poll for data periodically. You can do other work while a message remains incomplete or no further data can be read from the socket.
  2. use a separate worker thread to do the I/O. Even if it blocks on synchronous recv() calls, your main thread can continue to do work. The worker thread can transfer the data it receives to the main thread for processing once a complete message is received via TCP.
  3. use an OS specific feature (I/O completion ports on Windows or aio on Linux), but these are far more complex and you should definitely consider Boost.Asio before going this route.

You can put the recv function in it's own thread and do the processing in another thread.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top