Question

I have tried to write programm for multiply CSR matrix and vector on C/CUDA 5.5 in MS Visual Studio 2012, but faced with ptx error. Listing of my code:

__global__ void multKernelSymm(double* s, double* u, double* val, int* rowPtr, int* colInd)
    {
        int l = 0, jl, i;
        int idx = blockDim.x*blockIdx.x;

        l = rowPtr[idx] - 1;
        for ( i = 0; i < (rowPtr[idx + 1] - rowPtr[idx]); i++){
            jl = colInd[l] - 1;
            s[idx] = s[idx] + val[l] * u[jl];
            l+=1;
        }

        l = 0;
        l = rowPtr[idx] - 1;
        for (int i = 0; i < (rowPtr[idx + 1] - rowPtr[idx]); i++){
            jl = colInd[l] - 1;
            if(jl > idx)
                atomicAdd(&s[jl], val[l] * u[idx]);
                //s[jl] = s[jl] + val[l] * u[idx];
            l+=1;
        }


    }

    __device__ double atomicAdd(double* address, double val)
    {
         unsigned long long int* address_as_ull =
         (unsigned long long int*)address;
         unsigned long long int old = *address_as_ull, assumed;
         do {
            assumed = old;
            old = atomicCAS(address_as_ull, assumed, __double_as_longlong(val + __longlong_as_double(assumed)));
         } while (assumed != old);
         return __longlong_as_double(old);
    }

This I get in VS:

 C:\Users\Дмитрий\Documents\Visual Studio 2012\Projects\matrix mult simple\matrix mult simple>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.0\bin\nvcc.exe" -gencode=arch=compute_20,code=\"sm_20,compute_20\" --use-local-env --cl-version 2012 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\bin\x86_amd64"  -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.0\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.0\include"  -G   --keep-dir x64\Debug -maxrregcount=0  --machine 64 --compile -cudart static  -g   -DWIN64 -D_DEBUG -D_CONSOLE -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /Zi /RTC1 /MDd  " -o x64\Debug\kernel.cu.obj "C:\Users\Дмитрий\Documents\Visual Studio 2012\Projects\matrix mult simple\matrix mult simple\kernel.cu" 
1>  ptxas C:/Users/AACE~1/AppData/Local/Temp/tmpxft_00001b88_00000000-4_kernel.ptx, line 479; : fatal error : Parsing error near '-': syntax error
1>ptxas : fatal error : Ptx assembly aborted due to errors
1>  kernel.cu
1>C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V110\BuildCustomizations\CUDA 6.0.targets(597,9): error MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.0\bin\nvcc.exe" -gencode=arch=compute_20,code=\"sm_20,compute_20\" --use-local-env --cl-version 2012 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\bin\x86_amd64"  -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.0\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.0\include"  -G   --keep-dir x64\Debug -maxrregcount=0  --machine 64 --compile -cudart static  -g   -DWIN64 -D_DEBUG -D_CONSOLE -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /Zi /RTC1 /MDd  " -o x64\Debug\kernel.cu.obj "C:\Users\Дмитрий\Documents\Visual Studio 2012\Projects\matrix mult simple\matrix mult simple\kernel.cu"" exited with code -1.
1>
 >     ========== Build: 0 succeeded, 1 failed, 0 up-to-date, 0 skipped ==========

Ho I can fix it?

Was it helpful?

Solution

I was able to get your code to compile by making 3 changes:

  1. As @PaulR indicated, remove the double keyword when calling your custom atomicAdd function.
  2. The first parameter of atomicAdd should be a pointer to the address you want to update (the address you want to add the value to.) Instead you are passing the value at that address. So your function call should look like this:

    atomicAdd(&(s[jl]), val[l] * u[idx]);
    
  3. Finally, move the declaration and definition of your custom atomicAdd function to before your kernel definition.

With those changes I could get your code to compile.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top