Part of the solution was a simple precision data type problem. Exchange the speed calculation by a constant, and you'll see a extremely smooth movement. Analysing the calculation showed that you're storing the result from QueryPerformanceCounter()
inside a float. QueryPerformanceCounter()
returns a number which looks like this on my computer: 724032629776
. This number requires at least 5 bytes to be stored. How ever a float
uses 4 bytes (and only 24 bits for actual number) to store the value. So precision is lost when you convert the result of QueryPerformanceCounter()
to float
. And sometimes this leads to a deltaTime
of zero causing stuttering.
This explains partly why some users do not experience this problem. It all depends on if the result of QueryPerformanceCounter()
does fit into a float
.
The solution for this part of the problem is: use double
(or as Ben Voigt suggested: store the initial performance counter, and subtract this from new values before converting to float
. This would give you at least more head room, but might eventually hit the float
resolution limit again, when the application runs for a long time (depends on the growth speed of the performance counter).)
After fixing this, the stuttering was much less but did not disappear completely. Analyzing the runtime behaviour showed that a frame is skipped now and then. The application GPU command buffer is flushed by Present
but the present command remains in the application context queue until the next vsync (even though Present
was invoked long before vsync (14ms)). Further analysis showed that a back ground process (f.lux) told the system to set the gamma ramp once in a while. This command required the complete GPU queue to run dry before it was executed. Probably to avoid side effects. This GPU flush was started just before the 'present' command was moved to the GPU queue. The system blocked the video scheduling until the GPU ran dry. This took until the next vsync. So the present packet was not moved to GPU queue until the next frame. The visible effect of this: stutter.
It's unlikely that you're running f.lux on your computer too. But you're probably experiencing a similar background intervention. You'll need to look for the source of the problem on your system yourself. I've written a blog post about how to diagnose frame skips: Diagnose frame skips and stutter in DirectX applications. You'll also find the whole story of diagnosing f.lux as the culprit there.
But even if you find the source of your frame skip, I doubt that you'll achieve stable 60fps while dwm window composition is enabled. The reason is, you're not drawing to the screen directly. But instead you draw to a shared surface of dwm. Since it's a shared resource it can be locked by others for an arbitrary amount of time making it impossible for you to keep the frame rate stable for your application. If you really need a stable frame rate, go full screen, or disable window composition (on Windows 7. Windows 8 does not allow disabling window composition):
#include <dwmapi.h>
...
HRESULT hr = DwmEnableComposition(DWM_EC_DISABLECOMPOSITION);
if (!SUCCEEDED(hr)) {
// log message or react in a different way
}