Question

I am trying to measure register spilling in my CUDA project in Visual Studio. To do so I am using the flag -Xptxas -v,-abi=no, as described here: http://on-demand.gputechconf.com/gtc-express/2011/presentations/register_spilling.pdf

In the properties of my VS 2010 project I tried putting this flag in:

  1. properties / cuda / host / additional compilation flags - no effect.
  2. properties / cuda / command line - compilation exits with -1.
  3. properties / c / command line - compilation error.

In the CUDA properties I have also set two options to Yes: Generate GPU debug information and Verbose PTXAS output. I am looking for the result in the Output window. How do I do this properly? My GPU has CC = 2.1.

EDIT: so the correct place to put the flag, as the answers indicate, is properties / cuda / command line. But I still do not get the expected output (even in sample projects). Below are the other options I have set in properties: cuda / device.

  1. C interleaved in PTXAS output - No
  2. Code generation - compute_20, sm_21
  3. generate GPU debug info - Yes
  4. max used register - 0
  5. verbose ptxas output (yes/ no - tested both).

Solution

I think the steps are pretty straightforward. I did a clean install of VS2010 Express, followed by an install of CUDA 5.0 for windows 7.

I chose the VectorAdd sample code, which is in the CUDA 5.0 samples package. By default, my project was set up to compile for Win32 and Debug.

The only change I had to make was to select Project...Properties...CUDA C/C++...Command Line

I then added the -Xptxas -v options in the Additional Options text box at the bottom of the properties dialog.

After that, press Apply and OK. Then hit F7 to build the project, and you should see output like this in the Output window (your Output window should automatically display "Build" output when you are compiling):

1>------ Rebuild All started: Project: vectorAdd, Configuration: Debug Win32 -----
1> 
1> C:\ProgramData\NVIDIA Corporation\CUDA Samples\v5.0\0_Simple\vectorAdd>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.0\bin\nvcc.exe" -ccbin "C:\Program Files\Microsoft Visual Studio 10.0\VC\bin" -I"../../common/inc" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.0\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.0\include" -G --keep-dir "Debug" -maxrregcount=0 --machine 32 --compile -Xptxas -v -g -DWIN32 -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /Zi /RTC1 /MTd " -o "Win32/Debug/vectorAdd.cu.obj" "C:\ProgramData\NVIDIA Corporation\CUDA Samples\v5.0\0_Simple\vectorAdd\vectorAdd.cu" -clean 
1> Compiling CUDA source file vectorAdd.cu...
1> 
1> C:\ProgramData\NVIDIA Corporation\CUDA Samples\v5.0\0_Simple\vectorAdd>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.0\bin\nvcc.exe" -gencode=arch=compute_10,code=\"sm_10,compute_10\" -gencode=arch=compute_20,code=\"sm_20,compute_20\" -gencode=arch=compute_30,code=\"sm_30,compute_30\" -gencode=arch=compute_35,code=\"sm_35,compute_35\" --use-local-env --cl-version 2010 -ccbin "C:\Program Files\Microsoft Visual Studio 10.0\VC\bin" -I"../../common/inc" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.0\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.0\include" -G --keep-dir "Debug" -maxrregcount=0 --machine 32 --compile -Xptxas -v -g -DWIN32 -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /Zi /RTC1 /MTd " -o "Win32/Debug/vectorAdd.cu.obj" "C:\ProgramData\NVIDIA Corporation\CUDA Samples\v5.0\0_Simple\vectorAdd\vectorAdd.cu" 
1> ptxas : info : 0 bytes gmem
1> ptxas : info : Compiling entry function '_Z9vectorAddPKfS0_Pfi' for 'sm_10'
1> ptxas : info : Used 4 registers, 32 bytes smem, 4 bytes cmem[1]
1> ptxas : info : 0 bytes gmem
1> ptxas : info : Compiling entry function '_Z9vectorAddPKfS0_Pfi' for 'sm_20'
1> ptxas : info : Function properties for _Z9vectorAddPKfS0_Pfi
1> 0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
1> ptxas : info : Used 8 registers, 48 bytes cmem[0]
1> ptxas : info : 0 bytes gmem
1> ptxas : info : Compiling entry function '_Z9vectorAddPKfS0_Pfi' for 'sm_30'
1> ptxas : info : Function properties for _Z9vectorAddPKfS0_Pfi
1> 0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
1> ptxas : info : Used 8 registers, 336 bytes cmem[0]
1> ptxas : info : 0 bytes gmem
1> ptxas : info : Compiling entry function '_Z9vectorAddPKfS0_Pfi' for 'sm_35'
1> ptxas : info : Function properties for _Z9vectorAddPKfS0_Pfi
1> 0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
1> ptxas : info : Used 8 registers, 336 bytes cmem[0]
1> tmpxft_00001438_00000000-39_vectorAdd.compute_10.ii
1> vectorAdd_vs2010.vcxproj -> C:\ProgramData\NVIDIA Corporation\CUDA Samples\v5.0\0_Simple\vectorAdd\../../bin/win32/Debug/vectorAdd.exe
========== Rebuild All: 1 succeeded, 0 failed, 0 skipped ==========

Note that whether or not you see any actual spilling is a function of the code you are compiling. This code has no spilling, but if there were any, this is where the compiler would report it.
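
If the build log is long, it can help to pull the spill numbers out with a small script. Below is a minimal Python sketch, not part of the CUDA toolkit, that assumes the log has the same line shapes as the output above. The function name `parse_ptxas_log` and the dictionary keys are made up for illustration.

```python
import re

# Patterns match the ptxas lines shown in the build output above.
ENTRY_RE = re.compile(r"Compiling entry function '([^']+)' for '([^']+)'")
SPILL_RE = re.compile(
    r"(\d+) bytes stack frame, (\d+) bytes spill stores, (\d+) bytes spill loads"
)
REGS_RE = re.compile(r"Used (\d+) registers")


def parse_ptxas_log(text):
    """Return one dict per (kernel, arch) pair found in ptxas verbose output."""
    results = []
    current = None
    for line in text.splitlines():
        m = ENTRY_RE.search(line)
        if m:
            # A new "Compiling entry function" line starts a new record.
            current = {
                "kernel": m.group(1),
                "arch": m.group(2),
                "stack": None,          # stays None if ptxas prints no spill line
                "spill_stores": None,
                "spill_loads": None,
                "registers": None,
            }
            results.append(current)
            continue
        if current is None:
            continue
        m = SPILL_RE.search(line)
        if m:
            stack, stores, loads = (int(g) for g in m.groups())
            current["stack"] = stack
            current["spill_stores"] = stores
            current["spill_loads"] = loads
            continue
        m = REGS_RE.search(line)
        if m:
            current["registers"] = int(m.group(1))
    return results


# Lines copied from the build output above (sm_10 and sm_20 only).
SAMPLE = """\
1> ptxas : info : Compiling entry function '_Z9vectorAddPKfS0_Pfi' for 'sm_10'
1> ptxas : info : Used 4 registers, 32 bytes smem, 4 bytes cmem[1]
1> ptxas : info : 0 bytes gmem
1> ptxas : info : Compiling entry function '_Z9vectorAddPKfS0_Pfi' for 'sm_20'
1> ptxas : info : Function properties for _Z9vectorAddPKfS0_Pfi
1> 0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
1> ptxas : info : Used 8 registers, 48 bytes cmem[0]
"""

for entry in parse_ptxas_log(SAMPLE):
    print(entry["arch"], entry["registers"], entry["spill_stores"], entry["spill_loads"])
```

Any entry with a nonzero `spill_stores` or `spill_loads` value is a kernel that spills. In the sample above, sm_10 prints no spill line at all, so those fields stay `None` for it.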

You don't need the -abi=no option in order to see the spilling results of the compiler.

Note that individual file options can override project settings (right click on one of your project source files, then click properties), but if you haven't modified any of these, they should not override your project settings.

There are probably other project settings that can interfere with this as well, so my suggestion is to try one of the CUDA sample codes that you haven't modified, and use the above steps as a sanity check to demonstrate that you can get it working there first. Then try it on your project.

Make sure you are modifying the settings (e.g. Win32/x64, Release/Debug) that correspond to the project you are actually building.

EDIT: The above case uses CUDA 5.0. The original question did not specify CUDA version. I found that with a previous version of CUDA in Visual Studio, the command line "Additional options" method did not seem to work, but using the selection/dropdown box to specify Verbose PTXAS output (Yes) did work.

EDIT2: OK I did a clean install of VS2010 followed by a clean install of CUDA 4.2 toolkit, and I was able to reproduce the issue. I used the following steps to be able to see the actual ptxas verbose output:

  1. In Tools...Settings select "Expert Settings"
  2. In Project...Properties...Configuration Properties...CUDA C/C++...Device change the ptxas verbose drop-down box to "Yes (--ptxas-options=-v)"
  3. In Tools...Options...Projects and Solutions...Build and Run change the "MSBuild project build output verbosity" setting from "Minimal" to "Normal"
  4. Then select Build...Rebuild Solution, and you should see the ptxas verbose output in the build output window.

OTHER TIPS

I am using --ptxas-options=-v (without spaces), but admittedly I am still using an older CUDA version.
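
To rule out Visual Studio settings entirely, one option is to call nvcc directly with either spelling of the flag. The following is a rough Python sketch, assuming nvcc is on the PATH. The helper names `build_nvcc_command` and `run_nvcc` and the file names are hypothetical, and the extra flags (-c, -o, -arch) are just an ordinary compile-only invocation, not something taken from the question.

```python
import subprocess


def build_nvcc_command(source, output, long_form=False, arch=None):
    """Build an nvcc compile-only command that requests verbose ptxas output."""
    # -Xptxas -v and --ptxas-options=-v both forward -v to ptxas.
    verbose = ["--ptxas-options=-v"] if long_form else ["-Xptxas", "-v"]
    cmd = ["nvcc"] + verbose
    if arch is not None:
        # Example value: "sm_21"; must be one your toolkit version supports.
        cmd.append("-arch=" + arch)
    cmd += ["-c", source, "-o", output]
    return cmd


def run_nvcc(cmd):
    """Run the command; return (exit code, combined stdout and stderr text)."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge streams so no ptxas message is missed
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout


# Hypothetical file names; only prints the command, does not run it.
print(" ".join(build_nvcc_command("vectorAdd.cu", "vectorAdd.obj", long_form=True)))
```

Calling `run_nvcc(build_nvcc_command(...))` from a command prompt that has the Visual Studio environment set up should give the same ptxas lines without going through the IDE.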

As for where to put:

  • Ad 1) properties / cuda / host / additional compilation flags -- this will alter your CPU code compilation of the CUDA source (functions marked as __host__). This is -not- where you want to put the flag.
  • Ad 2) properties / cuda / command line -- this should alter your GPU code compilation. If compilation exits with an error, what is the error message?
  • Ad 3) properties / c / command line -- this will affect your native C/C++ compiler, which understands neither --ptxas-options nor -Xptxas.
Licensed under: CC-BY-SA with attribution