Question

At the moment I am trying to optimize some CUDA kernels.

If I compile with the option --ptxas-options=-v, I get information about register usage and so on.

In my case I always get some extra lines, which make no sense to me:

ptxas : info : Compiling entry function '_Z20backprojLinTexInterpP7double3S0_S0_P7double2iiiiiS2_PdPf' for 'sm_20'
ptxas : info : Function properties for _Z20backprojLinTexInterpP7double3S0_S0_P7double2iiiiiS2_PdPf
8 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas : info : Used 47 registers, 32 bytes smem, 112 bytes cmem[0], 56 bytes cmem[16]
ptxas : info : Function properties for __internal_trig_reduction_slowpathd
40 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads

Lines 1 to 4 are clear to me, but what are the last two lines?

Google does not help here; I already tried.

Does anybody have an idea what these lines mean? I get them for every compiled kernel in my program.

Solution

__internal_trig_reduction_slowpathd() is an internal subroutine in the CUDA math library. It performs accurate argument reduction for the double-precision trig functions (sin, cos, sincos, tan) when the argument is very large in magnitude. A Payne-Hanek style argument reduction is used for these large arguments. For sm_20 and up, it is compiled as a called subroutine rather than inlined, which minimizes code size in apps that invoke trig functions frequently. The last two lines of your output are therefore the function properties (stack frame and spill statistics) of that subroutine, not of your kernel. You can see the code by looking at the file math_functions_dbl_ptx3.h, which is in the CUDA include file directory.
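As a rough way to try this yourself, here is a minimal sketch of a kernel that calls double-precision sin(). The names sinKernel and runSin and the file name trig_demo.cu are hypothetical and chosen only for illustration. Whether ptxas reports a separate "Function properties" entry for the slow-path routine, and under what name, is likely to depend on the toolkit version and target architecture, so treat the output in the question as one possible result rather than a guaranteed one.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: each thread computes the double-precision sine
// of one input element. Calling sin() on a double is what pulls in the
// math library's argument-reduction code.
__global__ void sinKernel(const double* in, double* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = sin(in[i]);
    }
}

// Hypothetical host helper: copies n values to the device, runs the
// kernel, and copies the results back. Returns true on success.
bool runSin(const double* hostIn, double* hostOut, int n)
{
    double* dIn = nullptr;
    double* dOut = nullptr;
    size_t bytes = static_cast<size_t>(n) * sizeof(double);

    if (cudaMalloc(&dIn, bytes) != cudaSuccess) return false;
    if (cudaMalloc(&dOut, bytes) != cudaSuccess) { cudaFree(dIn); return false; }

    bool ok = cudaMemcpy(dIn, hostIn, bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        int threads = 128;
        int blocks = (n + threads - 1) / threads;
        sinKernel<<<blocks, threads>>>(dIn, dOut, n);
        // The blocking copy also surfaces errors from the kernel launch.
        ok = cudaMemcpy(hostOut, dOut, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    cudaFree(dIn);
    cudaFree(dOut);
    return ok;
}
```

Compiling this with something like nvcc -c --ptxas-options=-v trig_demo.cu prints the per-function resource report. Feeding the kernel a very large argument such as 1.0e22 is the kind of input that the slow-path reduction exists for, while small arguments take the cheaper path.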

Licensed under: CC-BY-SA with attribution