It's impossible to say without looking at the code, but when a select()-based loop starts spinning at ~100% CPU usage, it's often because one or more of the sockets you told select() to watch are ready-for-read (and/or ready-for-write), so select() returns right away instead of blocking... but then the code neglects to actually recv() (or send()) any data on that socket. After failing to read/write anything, your event loop tries to go back to sleep by calling select() again, but of course the socket's data (or buffer space, in the ready-for-write case) is still there waiting to be handled, so select() returns immediately again, the buggy code neglects to do the recv() (or send()) again, and around and around we go at top speed :)
Another possibility is that you are passing select() a timeout value that is zero or near-zero, causing select() to return very quickly even when no sockets are ready-for-anything. That often happens when people forget to re-initialize the timeval struct before each call to select(). You need to re-initialize it every time because some implementations of select() (Linux, for example) modify it before returning, overwriting it with whatever time was left on the timeout.
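
A rough sketch of the safe pattern, assuming a POSIX system: the timeval is filled in inside the loop, immediately before every call. The function name timed_select_loop() and its parameters are invented for this example, and it passes no descriptors at all so that each select() call simply sleeps for the timeout.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper: calls select() `iterations` times with no
 * descriptors, so each call should just sleep for timeout_ms.
 * Returns the total elapsed wall-clock time in milliseconds. */
static long timed_select_loop(int iterations, long timeout_ms)
{
    assert(iterations >= 0 && timeout_ms >= 0);

    struct timeval start, end;
    gettimeofday(&start, NULL);

    for (int i = 0; i < iterations; i++) {
        /* Re-initialize the timeout on every pass. If tv were set once
         * outside the loop, an implementation that rewrites it with the
         * remaining time could leave it at zero after the first call. */
        struct timeval tv;
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;

        if (select(0, NULL, NULL, NULL, &tv) < 0) {
            perror("select");
            break;
        }
    }

    gettimeofday(&end, NULL);
    return (end.tv_sec - start.tv_sec) * 1000L
         + (end.tv_usec - start.tv_usec) / 1000L;
}
```

If you move the three tv lines above the for loop and the total time collapses to roughly one timeout instead of `iterations` of them, your platform is one of the ones that modifies the struct.
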
My suggestion is to put some printf's (or your favorite equivalent) immediately before and immediately after your call to select(), and watch that output as you reproduce the fault. That will show you whether the spinning is happening inside a single call to select(), or whether something is causing select() to return immediately over and over again.
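
One way to do that without touching the rest of the loop is a small wrapper you temporarily call in place of select(). This is only a sketch: traced_select() is a made-up name, and it assumes a POSIX system with gettimeofday() available.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical drop-in wrapper: logs to stderr right before and right
 * after the real select() call, including how long the call blocked. */
static int traced_select(int nfds, fd_set *readfds, fd_set *writefds,
                         fd_set *exceptfds, struct timeval *timeout)
{
    static unsigned long call_no = 0;
    struct timeval before, after;

    call_no++;
    gettimeofday(&before, NULL);
    fprintf(stderr, "[%lu] entering select()\n", call_no);

    int rc = select(nfds, readfds, writefds, exceptfds, timeout);
    int saved_errno = errno;    /* the logging calls below may change errno */

    gettimeofday(&after, NULL);
    long waited_us = (after.tv_sec - before.tv_sec) * 1000000L
                   + (after.tv_usec - before.tv_usec);
    fprintf(stderr, "[%lu] select() returned %d after %ld us\n",
            call_no, rc, waited_us);

    errno = saved_errno;
    return rc;
}
```

A flood of lines showing a positive return value after a few microseconds points at the first cause (a ready socket that never gets serviced). A flood of lines returning 0 almost instantly points at the timeout. No new lines at all while the CPU is pegged means the spin is somewhere other than the select() call.
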