Question

I am trying to use a compute shader with DirectX 11 to do some simple but expensive calculations (think Mandelbrot set). The results of the calculation are written to a texture and are non-overlapping. It does not need to be real-time; the computation is expected to take between 1 and 10 seconds, but the result should be displayed in the UI as soon as it finishes.

I am using WPF and SharpDX via http://directx4wpf.codeplex.com/ . That library provides a DX11 view object with a RenderScene function in which the DX render calls (including the compute shader) are made. RenderScene runs on the main thread and, to my understanding, is called as often as possible (it tries to maximize FPS). Clearly, sticking the whole compute shader workload there is not an option, as it would block the main thread along with the rest of the UI, and even the rest of the OS if it also uses the GPU.

The question is: how should I perform these computations without causing any lag in the UI or the rest of the application?

If this ran purely on the CPU, I would just use a separate thread. However, my understanding of the GPU is elementary, and I am currently under the impression that the GPU is not very good at scheduling/context switching. Therefore, even if I issue the computation from another thread (using deferred contexts in DX11), it will still block the GPU until it is done. Is that correct?

I have attempted to split the work of the compute shader into smaller pieces (of roughly 8000 threads each). This is doable since there is no underlying geometry; I can just add an offset every time I call the compute shader, as in the sketch below. That doesn't really work, though, as there is massive overhead: dividing the work into N pieces (by calling the compute shader N times in succession) seems to result in a roughly N-fold slowdown overall. I'm not sure whether this is because of the WPF library linked above, SharpDX, or just an unavoidable fact about using the GPU.
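To make the splitting concrete, here is a minimal sketch of what I mean, written against SharpDX. The names (DispatchInTiles, mandelbrotShader, outputUav, offsetBuffer) and the 8x8 thread-group size are placeholders for illustration, not my actual code; the point is that each Dispatch call only covers one tile, with the tile's pixel offset passed in through a small constant buffer:

```
using SharpDX.Direct3D11;

// Sketch only: assumes an HLSL kernel declared with [numthreads(8, 8, 1)] that
// adds the offset from a constant buffer to its dispatch thread ID before
// writing into the output texture. All names below are hypothetical.
static class TiledDispatch
{
    public static void DispatchInTiles(DeviceContext context, ComputeShader mandelbrotShader,
                                       UnorderedAccessView outputUav,
                                       SharpDX.Direct3D11.Buffer offsetBuffer,
                                       int width, int height, int tileSize)
    {
        context.ComputeShader.Set(mandelbrotShader);
        context.ComputeShader.SetUnorderedAccessView(0, outputUav);
        context.ComputeShader.SetConstantBuffer(0, offsetBuffer);

        for (int y = 0; y < height; y += tileSize)
        {
            for (int x = 0; x < width; x += tileSize)
            {
                // Upload this tile's pixel offset (padded to 16 bytes for the cbuffer).
                context.UpdateSubresource(new[] { x, y, 0, 0 }, offsetBuffer);

                // One tile = (tileSize / 8)^2 thread groups of 8x8 threads each.
                context.Dispatch(tileSize / 8, tileSize / 8, 1);
            }
        }

        // Unbind the UAV so the texture can later be sampled by the pixel shader.
        context.ComputeShader.SetUnorderedAccessView(0, null);
    }
}
```

Calling this in one go from RenderScene still submits all the tiles back to back, which is where I see the N-fold slowdown.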

Does anyone have suggestions on how to proceed in such scenarios, perhaps sample projects that do something similar, or good resources to read? Am I making any false assumptions?


Solution

It is correct that a single 10-second dispatch would block your GPU. In my experience you get erratic behaviour or outright crashes from the GPU driver when calculations take that long, especially when you mix UI and DirectX content; on Windows, Timeout Detection and Recovery (TDR) will reset the display driver if a GPU operation runs for more than about two seconds by default. I suggest you continue down the path of splitting the calculation into smaller dispatches, and look for other ways to optimize or even refactor your code.
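If it helps, here is a rough sketch of that idea in the same spirit as the tiled dispatch from the question: dispatch only one small tile per RenderScene call, so no single GPU submission runs long enough to stall the UI (or trip the driver timeout), and the texture fills in progressively over a number of frames. The member names and the 8x8 thread-group size are placeholder assumptions, not code from your project or from directx4wpf:

```
using SharpDX.Direct3D11;

// Sketch only: one tile is dispatched per frame from the view's RenderScene
// callback; all members below are hypothetical placeholders.
class MandelbrotRenderer
{
    ComputeShader _shader;
    UnorderedAccessView _outputUav;
    SharpDX.Direct3D11.Buffer _offsetBuffer;
    int _width = 1024, _height = 1024, _tileSize = 256;
    int _nextX, _nextY;

    public bool Finished { get; private set; }

    // Call this once per frame from the directx4wpf view's RenderScene method.
    public void RenderTile(DeviceContext context)
    {
        if (Finished)
            return; // everything computed; just keep presenting the texture

        context.ComputeShader.Set(_shader);
        context.ComputeShader.SetUnorderedAccessView(0, _outputUav);
        context.ComputeShader.SetConstantBuffer(0, _offsetBuffer);

        // Dispatch exactly one tile this frame.
        context.UpdateSubresource(new[] { _nextX, _nextY, 0, 0 }, _offsetBuffer);
        context.Dispatch(_tileSize / 8, _tileSize / 8, 1);

        context.ComputeShader.SetUnorderedAccessView(0, null);

        // Advance to the next tile; the partially filled texture is still shown.
        _nextX += _tileSize;
        if (_nextX >= _width)
        {
            _nextX = 0;
            _nextY += _tileSize;
            if (_nextY >= _height)
                Finished = true;
        }
    }
}
```

Per-dispatch overhead still applies, but spreading the tiles across frames keeps each frame's GPU work bounded, which is usually enough to keep WPF responsive while the image fills in.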

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow