How to calculate theoretical fp32 instructions per cycle (IPC) on an NVIDIA GPU

https://stackoverflow.com/questions/22884373

28-06-2023

Question

I'm having a hard time understanding why the theoretical instructions per cycle (IPC) for a Fermi-architecture NVIDIA GPU is 2, as stated on page 9 of http://on-demand.gputechconf.com/gtc-express/2011/presentations/Inst_limited_kernels_Oct2011.pdf.

According to section 5.4.1 of the programming guide (http://docs.nvidia.com/cuda/cuda-c-programming-guide/#arithmetic-instructions), the throughput for 32-bit floating-point arithmetic is 32 fp32 instructions per SM per clock cycle.

How do the two quantities relate?

Solution

The answer was provided on the NVIDIA developer forums:

https://devtalk.nvidia.com/default/topic/722525/cuda-programming-and-performance/how-to-calculate-theoretical-fp32-instructions-per-cycle-ipc-on-nvidia-gpu/
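
As a starting point for relating the two figures, note that the programming guide's throughput table counts per-thread operations per clock per SM, while a profiler-style IPC counts warp-level instructions. The guide itself states that, for a warp size of 32, N operations per clock correspond to N/32 instructions per clock. The following is only a minimal sketch of that unit conversion; the function name is hypothetical, and it does not reproduce the content of the forum answer.

```python
# Threads per warp on NVIDIA GPUs.
WARP_SIZE = 32


def warp_instructions_per_clock(ops_per_clock_per_sm, warp_size=WARP_SIZE):
    """Convert thread-level operations/clock/SM into warp-level
    instructions/clock/SM (hypothetical helper, for illustration only)."""
    return ops_per_clock_per_sm / warp_size


# 32 is the fp32 figure quoted in the question from the programming guide.
fp32_ops_per_clock_per_sm = 32
print(warp_instructions_per_clock(fp32_ops_per_clock_per_sm))  # prints 1.0
```

On its own, the fp32 figure therefore works out to 1 warp instruction per clock per SM, not 2, which is exactly the discrepancy the question asks about.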

Licensed under: CC-BY-SA with attribution
