How to make thread sleep less than a millisecond on Windows

https://stackoverflow.com/questions/85122

01-07-2019

Question

On Windows I have a problem I never encountered on Unix. That is how to get a thread to sleep for less than one millisecond. On Unix you typically have a number of choices (sleep, usleep and nanosleep) to fit your needs. On Windows, however, there is only Sleep with millisecond granularity.

On Unix, I can use the select system call to create a microsecond sleep, which is pretty straightforward:

int usleep(long usec)
{
    struct timeval tv;
    tv.tv_sec = usec/1000000L;
    tv.tv_usec = usec%1000000L;
    return select(0, 0, 0, 0, &tv);
}

How can I achieve the same on Windows?

Solution 18

On Windows the use of select forces you to include the Winsock library which has to be initialized like this in your application:

WORD wVersionRequested = MAKEWORD(1,0);
WSADATA wsaData;
WSAStartup(wVersionRequested, &wsaData);

And then select won't let you call it without any socket, so you have to do a little more to create a microsleep method:

int usleep(long usec)
{
    struct timeval tv;
    fd_set dummy;
    SOCKET s = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP);
    FD_ZERO(&dummy);
    FD_SET(s, &dummy);
    tv.tv_sec = usec/1000000L;
    tv.tv_usec = usec%1000000L;
    return select(0, 0, 0, &dummy, &tv);
}

All these created usleep methods return zero when successful and non-zero for errors.

OTHER TIPS

This indicates a misunderstanding of sleep functions. The parameter you pass is a minimum time for sleeping. There's no guarantee that the thread will wake up after exactly the time specified. In fact, threads don't 'wake up' at all, but are rather chosen for execution by the scheduler. The scheduler might choose to wait much longer than the requested sleep duration to activate a thread, especially if another thread is still active at that moment.
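To see the "minimum, not exact" behaviour in practice, a small sketch like the following can time a sleep request. It uses portable C++ (std::this_thread::sleep_for and std::chrono::steady_clock) rather than the Windows API, and the helper name measure_sleep is made up for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical helper: requests a sleep of `requested` and returns how long
// the thread was actually away, measured with a monotonic clock.
std::chrono::microseconds measure_sleep(std::chrono::microseconds requested) {
    assert(requested.count() >= 0);
    const auto start = std::chrono::steady_clock::now();
    std::this_thread::sleep_for(requested);  // blocks for at least `requested`
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start);
}
```

Calling measure_sleep(std::chrono::microseconds(100)) will usually report more than 100 microseconds; how much more depends on the OS timer resolution and on system load.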

As Joel says, you can't meaningfully 'sleep' (i.e. relinquish your scheduled CPU) for such short periods. If you want to delay for some short time, then you need to spin, repeatedly checking a suitably high-resolution timer (e.g. the 'performance timer') and hoping that something of high priority doesn't pre-empt you anyway.
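The spinning approach described above could look roughly like this. It is a minimal sketch in portable C++ in which std::chrono::steady_clock stands in for the Windows performance counter (an assumption of this example); spin_for is a hypothetical name.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: burns CPU until `delay` has elapsed on a monotonic clock.
void spin_for(std::chrono::nanoseconds delay) {
    assert(delay.count() >= 0);
    const auto deadline = std::chrono::steady_clock::now() + delay;
    while (std::chrono::steady_clock::now() < deadline) {
        // Busy-wait: the thread never gives up the CPU voluntarily, so the
        // only extra delay comes from being pre-empted by something else.
    }
}
```

This trades CPU time for latency: one core runs flat out for the whole delay, so it only makes sense for very short waits.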

If you really care about accurate delays of such short times, you should not be using Windows.

Use the high resolution timers available in winmm.lib. See this for an example.

Yes, you need to understand your OS's time quanta. On Windows, you won't even get 1 ms resolution unless you change the time quantum to 1 ms (using, for example, timeBeginPeriod()/timeEndPeriod()). That still won't really guarantee anything. Even a little load or a single crappy device driver will throw everything off.

SetThreadPriority() helps, but is quite dangerous. Bad device drivers can still ruin you.

You need an ultra-controlled computing environment to make this ugly stuff work at all.

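#include <Windows.h>

// Undocumented ntdll exports, resolved at load time. The signatures below are
// the commonly used ones and are not guaranteed by Microsoft.
typedef NTSTATUS(__stdcall *NtDelayExecution_t)(BOOLEAN Alertable, PLARGE_INTEGER DelayInterval);
typedef NTSTATUS(__stdcall *ZwSetTimerResolution_t)(ULONG RequestedResolution, BOOLEAN Set, PULONG ActualResolution);

// GetModuleHandleA is used so the narrow string also compiles in UNICODE builds.
static NtDelayExecution_t NtDelayExecution = (NtDelayExecution_t) GetProcAddress(GetModuleHandleA("ntdll.dll"), "NtDelayExecution");
static ZwSetTimerResolution_t ZwSetTimerResolution = (ZwSetTimerResolution_t) GetProcAddress(GetModuleHandleA("ntdll.dll"), "ZwSetTimerResolution");

static void SleepShort(float milliseconds) {
    static bool once = true;
    if (once) {
        // Ask for the finest timer resolution (the value is in 100 ns units).
        ULONG actualResolution;
        ZwSetTimerResolution(1, TRUE, &actualResolution);
        once = false;
    }

    // A negative interval means a relative delay, in 100 ns units.
    LARGE_INTEGER interval;
    interval.QuadPart = -(LONGLONG)(milliseconds * 10000.0f);
    NtDelayExecution(FALSE, &interval);
}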

Yes, it uses some undocumented kernel functions, but it works very well. I use SleepShort(0.5); in some of my threads.

If you want so much granularity you are in the wrong place (in user space).

Remember that if you are in user space your time is not always precise.

The scheduler decides when your thread (or app) starts and when it runs, so you depend on the OS scheduler.

If you are looking for something precise you have to either: 1) go into kernel space (like drivers), or 2) choose an RTOS.

Anyway, if you are looking for some granularity (but remember the problem with user space), look at the QueryPerformanceCounter and QueryPerformanceFrequency functions on MSDN.
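As a rough illustration of how those two functions fit together, the sketch below turns the performance counter into a microsecond timestamp. The helper names now_us and elapsed_us are hypothetical, and the non-Windows branch is only a fallback (an assumption of this example) so that the code also builds elsewhere.

```cpp
#include <cassert>
#include <cstdint>
#ifdef _WIN32
#include <windows.h>
#else
#include <chrono>
#endif

// Hypothetical helper: monotonic timestamp in microseconds.
std::int64_t now_us() {
#ifdef _WIN32
    LARGE_INTEGER freq, count;
    QueryPerformanceFrequency(&freq);  // counter ticks per second
    QueryPerformanceCounter(&count);   // current tick count
    // Convert in two steps so the multiplication cannot overflow.
    const std::int64_t seconds = count.QuadPart / freq.QuadPart;
    const std::int64_t rest = count.QuadPart % freq.QuadPart;
    return seconds * 1000000 + rest * 1000000 / freq.QuadPart;
#else
    // Fallback so the sketch also builds off Windows.
    using namespace std::chrono;
    return duration_cast<microseconds>(steady_clock::now().time_since_epoch()).count();
#endif
}

// Hypothetical helper: microseconds elapsed since an earlier now_us() value.
std::int64_t elapsed_us(std::int64_t since) {
    const std::int64_t now = now_us();
    assert(now >= since);
    return now - since;
}
```

This only measures time; it does not make the thread wake up any more precisely, which is the user-space limitation mentioned above.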

As several people have pointed out, sleep and other related functions are by default dependent on the "system tick". This is the minimum unit of time between OS tasks; the scheduler, for instance, will not run faster than this. Even with a realtime OS, the system tick is not usually less than 1 ms. While it is tunable, this has implications for the entire system, not just your sleep functionality, because your scheduler will be running more frequently, and potentially increasing the overhead of your OS (amount of time for the scheduler to run, vs. amount of time a task can run).

The solution to this is to use an external, high-speed clock device. Most Unix systems will allow you to specify to your timers and such a different clock to use, as opposed to the default system clock.

What are you waiting for that requires such precision? In general if you need to specify that level of precision (e.g. because of a dependency on some external hardware) you are on the wrong platform and should look at a real time OS.

Otherwise you should consider whether there is an event you can synchronize on, or in the worst case just busy-wait on the CPU and use the high performance counter API to measure the elapsed time.

Generally a sleep will last at least until the next system interrupt occurs. However, this depends on the settings of the multimedia timer resources. The interrupt period may be set to something close to 1 ms; some hardware even allows interrupt periods of 0.9765625 ms. (The ActualResolution reported by NtQueryTimerResolution will show 0.9766, but that is only rounding, because the ActualResolution format cannot hold the exact value. It is 0.9765625 ms at 1024 interrupts per second.)

There is one exception which allows us to escape from the fact that it may be impossible to sleep for less than the interrupt period: the famous Sleep(0). This is a very powerful tool and it is not used as often as it should be! It relinquishes the remainder of the thread's time slice. This way the thread will stop until the scheduler gives it CPU service again. Sleep(0) is an asynchronous service: the call forces the scheduler to react independently of an interrupt.

A second way is the use of a waitable object. A wait function like WaitForSingleObject() can wait for an event. In order to have a thread sleep for any length of time, including times in the microsecond regime, the thread needs to set up a service thread which will generate an event after the desired delay. The "sleeping" thread sets up this service thread and then pauses at the wait function until the service thread signals the event.

This way any thread can "sleep" or wait for any time. The service thread can be very complex and may offer system-wide services such as timed events at microsecond resolution. However, microsecond resolution may force the service thread to spin on a high-resolution time service for at most one interrupt period (~1 ms). If care is taken, this can run very well, particularly on multi-processor or multi-core systems. A 1 ms spin does not hurt considerably on a multi-core system when the affinity masks of the calling thread and the service thread are carefully handled.
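The service-thread idea can be sketched in portable C++ as follows. Here a std::promise/std::future pair stands in for the Windows event and WaitForSingleObject(), and a fresh thread is started per call purely for brevity; the setup described above would keep one long-lived service thread instead. All names are hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <thread>

// Hypothetical sketch: the caller blocks on an "event" (a future) that a
// service thread signals after spinning on the clock until the deadline.
void wait_via_service_thread(std::chrono::microseconds delay) {
    assert(delay.count() >= 0);
    // Fix the deadline first so thread start-up time counts toward the delay.
    const auto deadline = std::chrono::steady_clock::now() + delay;

    std::promise<void> event;
    std::future<void> signaled = event.get_future();

    std::thread service([&event, deadline] {
        while (std::chrono::steady_clock::now() < deadline) {
            // The service thread spins; the caller does not.
        }
        event.set_value();  // "set the event"
    });

    signaled.wait();  // the "sleeping" thread pauses here
    service.join();
}
```

Thread creation alone can cost more than a short delay, which is one reason a shared, already-running service thread is the more realistic design.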

Code, description, and testing can be visited at the Windows Timestamp Project

I have the same problem and nothing seems to be faster than a ms, even the Sleep(0). My problem is the communication between a client and a server application where I use the _InterlockedExchange function to test and set a bit and then I Sleep(0).

I really need to perform thousands of operations per second this way and it doesn't work as fast as I planned.

Since I have a thin client dealing with the user, which in turn invokes an agent which then talks to a thread, I will move soon to merge the thread with the agent so that no event interface will be required.

Just to give you guys an idea how slow this Sleep is, I ran a test for 10 seconds performing an empty loop (getting something like 18,000,000 loops) whereas with the event in place I only got 180,000 loops. That is, 100 times slower!

Actually, using this usleep function will cause a big memory/resource leak (depending on how often it is called), because the socket is never closed.

Use this corrected version (sorry, can't edit?):

bool usleep(unsigned long usec)
{
    struct timeval tv;
    fd_set dummy;
    SOCKET s = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP);
    FD_ZERO(&dummy);
    FD_SET(s, &dummy);
    tv.tv_sec = usec / 1000000ul;
    tv.tv_usec = usec % 1000000ul;
    bool success = (0 == select(0, 0, 0, &dummy, &tv));
    closesocket(s);
    return success;
}

Like everybody mentioned, there are indeed no guarantees about the sleep time. But nobody wants to admit that sometimes, on an idle system, the usleep command can be very precise, especially with a tickless kernel. Windows Vista has it and Linux has had it since 2.6.16.

Tickless kernels exist to help improve laptop battery life; cf. Intel's powertop utility.

Under those conditions, I happened to measure the Linux usleep command and it respected the requested sleep time very closely, down to half a dozen microseconds.

So maybe the OP wants something that will roughly work most of the time on an idle system, and wants to be able to ask for microsecond scheduling! I would actually want that on Windows too.

Also, Sleep(0) sounds like boost::thread::yield(), whose terminology is clearer.

I wonder if Boost timed locks have better precision. Because then you could just lock on a mutex that nobody ever releases, and when the timeout is reached, continue on... Timeouts are set with boost::system_time + boost::milliseconds and the like (xtime is deprecated).
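The timed-lock idea can be tried out with the standard library instead of Boost. The sketch below swaps in std::timed_mutex (an assumption of this example, not what the post uses) and lets a helper thread hold the mutex, because calling try_lock_for from the thread that already owns the mutex is undefined behaviour. The function name is hypothetical, and whether this is any more precise than a plain sleep is left open, as in the post.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <mutex>
#include <thread>

// Hypothetical sketch: "sleeps" by timing out on a mutex held by another
// thread. Returns true if the wait ended by timeout, as intended.
bool timed_lock_sleep(std::chrono::microseconds timeout) {
    assert(timeout.count() >= 0);
    std::timed_mutex held;
    std::promise<void> locked, done;

    // The holder owns the mutex for the whole duration of the wait.
    std::thread holder([&] {
        held.lock();
        locked.set_value();
        done.get_future().wait();
        held.unlock();
    });

    locked.get_future().wait();                       // mutex is now taken
    const bool acquired = held.try_lock_for(timeout); // expected to time out
    if (acquired) held.unlock();                      // defensive, not expected
    done.set_value();
    holder.join();
    return !acquired;
}
```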

Try using SetWaitableTimer...
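A minimal sketch of that suggestion, assuming a one-shot manual-reset timer and a relative due time: SetWaitableTimer takes the due time in 100 ns units, with negative values meaning "relative to now". The function name is hypothetical, and the non-Windows branch is only a fallback so the example builds elsewhere. The wake-up is presumably still subject to the system timer resolution and to scheduling.

```cpp
#include <cassert>
#ifdef _WIN32
#include <windows.h>
#else
#include <chrono>
#include <thread>
#endif

// Hypothetical helper: waits roughly `usec` microseconds on a waitable timer.
// Returns true on success.
bool waitable_timer_sleep_us(long long usec) {
    assert(usec >= 0);
#ifdef _WIN32
    HANDLE timer = CreateWaitableTimer(NULL, TRUE, NULL);  // manual-reset, unnamed
    if (timer == NULL) return false;

    LARGE_INTEGER due;
    due.QuadPart = -(usec * 10);  // negative = relative time, 100 ns units

    const bool ok = SetWaitableTimer(timer, &due, 0, NULL, NULL, FALSE) &&
                    WaitForSingleObject(timer, INFINITE) == WAIT_OBJECT_0;
    CloseHandle(timer);  // always release the handle
    return ok;
#else
    // Fallback so the sketch also builds off Windows.
    std::this_thread::sleep_for(std::chrono::microseconds(usec));
    return true;
#endif
}
```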

Try boost::xtime and a timed_wait(); it has nanosecond accuracy.

Just use Sleep(0). 0 is clearly less than a millisecond. Now, that sounds funny, but I'm serious. Sleep(0) tells Windows that you don't have anything to do right now, but that you do want to be reconsidered as soon as the scheduler runs again. And since obviously the thread can't be scheduled to run before the scheduler itself runs, this is the shortest delay possible.

Note that you can pass a microsecond count to your usleep, but you can do the same with void usleep(__int64 t) { Sleep(t/1000); }, and in neither case is there any guarantee of actually sleeping for that period.

Sleep function that is way less than a millisecond (maybe)

I found that sleep(0) worked for me. On a system with a near 0% load on the cpu in task manager, I wrote a simple console program and the sleep(0) function slept for a consistent 1-3 microseconds, which is way less than a millisecond.

But from the above answers in this thread, I know that the amount sleep(0) sleeps can vary much more wildly than this on systems with a large cpu load.

But as I understand it, the sleep function should not be used as a timer. It should be used to make the program use as little of the CPU as possible while still executing as frequently as possible. For my purposes, such as moving a projectile across the screen in a videogame much faster than one pixel per millisecond, sleep(0) works, I think.

You would just make sure the timing interval is much larger than the longest time a single sleep call might take. You don't use the sleep as a timer, but just to make the game use the minimum amount of CPU possible. You would use a separate function that has nothing to do with sleep to find out when a particular amount of time has passed, and then move the projectile one pixel across the screen, say every 1/10th of a millisecond (100 microseconds).

The pseudo-code would go something like this.

while (elapsed < 100 microseconds) {
    sleep(0);
}

if (elapsed >= 100 microseconds) {
    move projectile one pixel
    reset elapsed
}

//Rest of code in iteration here
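The same loop can be written as a small runnable sketch. This version uses portable C++, with std::this_thread::yield() standing in for sleep(0) and std::chrono::steady_clock as the separate timer (both are assumptions of this example); move_projectile is a hypothetical name.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical sketch: advances a projectile one pixel every `step`. The clock
// decides when to move; yielding only keeps the loop from hogging the CPU.
int move_projectile(int pixels, std::chrono::microseconds step) {
    assert(pixels >= 0 && step.count() > 0);
    int position = 0;
    auto next_move = std::chrono::steady_clock::now() + step;
    while (position < pixels) {
        if (std::chrono::steady_clock::now() >= next_move) {
            ++position;         // move projectile one pixel
            next_move += step;  // schedule the next move from the clock
        } else {
            std::this_thread::yield();  // stand-in for sleep(0)
        }
        // The rest of the per-iteration work would go here.
    }
    return position;
}
```

Because next_move advances by a fixed step, a late wake-up is followed by a shorter wait, so delays do not accumulate across pixels.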

I know the answer may not work for advanced issues or programs but may work for some or many programs.

If your goal is to "wait for a very short amount of time" because you are doing a spinwait, then there are increasing levels of waiting you can perform.

void SpinOnce(ref Int32 spin)
{
   /*
      SpinOnce is called each time we need to wait. 
      But the action it takes depends on how many times we've been spinning:

      1..12 spins: spin 2..4096 cycles
      13..32: call SwitchToThread (allow another thread that is ready to run on this core to execute)
      over 32 spins: Sleep(0) (give up the remainder of our timeslice to any other thread ready to run, also allows APC and I/O callbacks)
   */
   spin += 1;

   if (spin > 32)
      Sleep(0); //give up the remainder of our timeslice
   else if (spin > 12)
      SwitchToThread(); //allow another thread on our CPU to have the remainder of our timeslice
   else
   {
      int loops = (1 << spin); //1..12 ==> 2..4096
      while (loops > 0)
         loops -= 1;
   }
}

So if your goal is actually to wait only for a little bit, you can use something like:

int spin = 0;
while (!TryAcquireLock()) 
{ 
   SpinOnce(ref spin);
}

The virtue here is that we wait longer each time, eventually going completely to sleep.
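For comparison, here is a portable C++ rendering of the same escalating backoff. It is only a sketch: std::this_thread::yield() stands in for SwitchToThread(), a 1 ms sleep_for stands in for the final Sleep(0) stage, and BackoffLock is a hypothetical spin lock added so the loop has something to acquire.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Called each time the caller has to wait; escalates with the spin count.
void spin_once(int& spin) {
    assert(spin >= 0);
    ++spin;
    if (spin > 32) {
        // Final stage: actually sleep (stand-in for Sleep(0)).
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    } else if (spin > 12) {
        // Middle stage: let another ready thread run.
        std::this_thread::yield();
    } else {
        const int loops = 1 << spin;  // 1..12 ==> 2..4096
        volatile int sink = 0;        // volatile keeps the loop from being removed
        for (int i = 0; i < loops; ++i) sink = i;
    }
}

// Hypothetical minimal spin lock built on std::atomic_flag.
class BackoffLock {
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
public:
    bool try_acquire() { return !flag_.test_and_set(std::memory_order_acquire); }
    void release() { flag_.clear(std::memory_order_release); }
    // Returns how many times the caller had to back off before acquiring.
    int acquire() {
        int spin = 0;
        while (!try_acquire()) spin_once(spin);
        return spin;
    }
};
```

An uncontended acquire() returns 0 without waiting at all, so the backoff cost is only paid under contention.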

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow