Don't use a thread per peer; past the number of processors, additional threads are likely only to hurt performance. You'd also be expected to tweak `dwStackSize` so that 1,000 idle peers don't cost you 1,000 MB of RAM.
You can use a thread pool (`X` threads handling `Y` sockets) to get a performance boost (or, ideally, I/O Completion Ports), but this tends to work incredibly well for certain kinds of applications and not at all for others. Unless you're certain that yours is suited to it, I couldn't justify taking the risk.
It's entirely permissible to use a single thread and poll/send from a large quantity of sockets. I don't know precisely at what point "large" starts to incur noticeable overhead, but I'd (conservatively) ballpark it somewhere between 2,000 and 5,000 sockets (on below-average hardware).
The workaround for `WSAEWOULDBLOCK` is to keep a `std::queue<BYTE>` of bytes (not a queue of "packet objects") for each socket in your application (you populate this queue with the data you want to send), and to have a single background thread whose sole purpose is to drain the queues into the respective sockets via `send` (`X` bytes at a time). You can use a blocking socket for this now (since it's a background worker), but if you do use a non-blocking socket and get `WSAEWOULDBLOCK`, you can just keep trying to drain the queue (here it won't obstruct the flow of your application).