GPUs have hardware warp schedulers to which you, as a programmer, do not have access.
For the Fermi architecture, for example, the device has a GigaThread scheduler that distributes thread blocks across the Streaming Multiprocessors, and each Multiprocessor has a dual warp scheduler that dispatches warps to the execution cores. But all of this is transparent to the user.
What you can do to profile an individual instruction or a sequence of instructions is to use the NVTX (NVIDIA Tools Extension) tracing library, which lets you annotate parts of your code so that they can subsequently be identified in Parallel Nsight traces.
You can find some material on the NVTX library at
CUDA Pro Tip: Generate Custom Application Profile Timelines with NVTX
Optimizing Application Performance with CUDA Profiling Tools
and in Chapter 3 of the book "CUDA Application Design and Development" by Rob Farber.
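As a minimal sketch of the idea, the NVTX C API lets you bracket a region of host code with a named range that then appears on the profiler timeline. The kernel below and its name are placeholders for illustration only; the example assumes the CUDA toolkit's `nvToolsExt` library is available (link with `-lnvToolsExt`).

```cuda
// Minimal NVTX annotation sketch: wrap a kernel launch in a named range
// so it shows up as "dummyKernel launch" in the profiler timeline.
// dummyKernel is a placeholder kernel, not from the original post.
#include <nvToolsExt.h>
#include <cuda_runtime.h>

__global__ void dummyKernel(float *d, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] *= 2.0f;
}

int main() {
    const int N = 1 << 20;
    float *d;
    cudaMalloc(&d, N * sizeof(float));

    nvtxRangePushA("dummyKernel launch");        // open a named range
    dummyKernel<<<(N + 255) / 256, 256>>>(d, N);
    cudaDeviceSynchronize();                     // include kernel time in the range
    nvtxRangePop();                              // close the range

    cudaFree(d);
    return 0;
}
```

Ranges opened with `nvtxRangePushA` can be nested, so you can annotate both a coarse phase of the application and the individual launches inside it.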
Concerning the use of NVTX, have a look at my question here: