Question

Issue:

I'm trying to use my graphics card to do some computation using CUDAfy.NET. I've run two versions of my kernel now and I keep getting errors at specific intervals, i.e. every 2nd location in the array is 0.0 but should be something much larger. Below is a table of what the GPU returns vs. the correct value. Note: I've read that comparing floats isn't ideal, but getting 0.0 when I should be getting something as large as 6.34419E17 seems wrong.

I              GPU    Correct Value

16,777,217     0.0    6.34419E17
16,777,219     0.0    6.34419E17
...            ...    ...

From quickly scanning through them, the errors seem to occur at every 2nd i value.

Checked thus far:

I've also run the code below with a different start value, as I believed it might be an issue with the data, but I still get the same i values for the errors.

I've also changed the order in which the memory is allocated on the GPU, but that doesn't seem to affect the results. Note: since I'm debugging in Visual Studio, I'm not explicitly freeing the memory on the GPU after I stop. Is it cleared once I stop debugging? The error is still present after I restart my PC.

Graphics Card:

My graphics card is an EVGA GTX 660 SC.

Code:

My kernel (note: it has several parameters which aren't used below, but I haven't removed them since I wanted to remove one thing at a time in order to nail down what's causing this error):

    [Cudafy]
    public static void WorkerKernelOnGPU(GThread thread, float[] value1, float[] value2, float[] value3, float[] dateTime, float[,] output)
    {
        float threadIndex = thread.threadIdx.x;
        float blockIndex = thread.blockIdx.x;
        float threadsPerBlock = thread.blockDim.x;
        int tickPosition = (int)(threadIndex + (blockIndex * threadsPerBlock));

        //Check to ensure threads don't go out of range.
        if (tickPosition < dateTime.Length)
        {
            output[tickPosition, 0] = dateTime[tickPosition];
            output[tickPosition, 1] = -1;
        }
    }

Below is the segment of code which I'm using to call the kernel and then check the results.

        CudafyModule km = CudafyTranslator.Cudafy();            
        _gpu = CudafyHost.GetDevice(eGPUType.Cuda);
        _gpu.LoadModule(km);

        float[,] Output = new float[SDS.dateTime.Length,2];
        float[] pm = new float[]{0.004f};

        //Otherwise we'd need to allocate first, then specify the pointer in CopyToDevice so it knows which pointer to copy the data into
        float[] dev_tpc = _gpu.CopyToDevice(pm);
        float[] dev_p = _gpu.CopyToDevice(SDS.p);                                         
        float[] dev_s = _gpu.CopyToDevice(SDS.s);                                        
        float[,] dev_o = _gpu.CopyToDevice(Output);                                           
        float[] dev_dt = _gpu.CopyToDevice(SDS.dateTime);                                     


        dim3 grid = new dim3(20000, 1, 1);
        dim3 block = new dim3(1024, 1, 1);

        Stopwatch sw = new Stopwatch();
        sw.Start();

        _gpu.Launch(grid, block).WorkerKernelOnGPU(dev_tpc,dev_p, dev_s, dev_dt, dev_o);
        _gpu.CopyFromDevice(dev_o, Output);

        sw.Stop();      //0.29 seconds
        string resultGPU = sw.Elapsed.ToString();  
        sw.Reset();

        //Variables used to record errors.
        bool failed = false;
        float[,] wrongValues = new float[Output.Length, 3];
        int counterError = 0;

        //Check the GPU values are as expected. If not record GPU value, Expected value, position.
        for (int i = 0; i < 20480000; i++)
        {
            float gpuValue = Output[i, 0];
            if (SDS.dateTime[i] != gpuValue)
            {
                failed = true;
                wrongValues[counterError, 0] = gpuValue;
                wrongValues[counterError, 1] = SDS.dateTime[i];
                wrongValues[counterError, 2] = (float)i;
                counterError++;
            }
        }
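As a side note on the launch configuration above: the grid of 20,000 blocks of 1,024 threads gives exactly 20,480,000 threads, one per element of dateTime, which is why the kernel's bounds check matters whenever the array length isn't an exact multiple of the block size. The usual way to derive the block count is ceiling division; a minimal sketch (in Python for easy testing, with a helper name of my own choosing, not from the original code):

```python
def blocks_needed(n_elements, threads_per_block=1024):
    """Ceiling division: the smallest number of blocks whose combined
    threads cover every element (surplus threads are filtered out by
    the kernel's bounds check)."""
    return (n_elements + threads_per_block - 1) // threads_per_block

# 20,000 blocks x 1,024 threads = 20,480,000 threads, one per array element.
print(blocks_needed(20_480_000))  # 20000
print(blocks_needed(20_480_001))  # 20001 (one extra block for the remainder)
```

With a count derived this way, the grid size tracks the input length automatically instead of being hard-coded.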

I only have a single graphics card at my disposal at the moment, so I can't quickly check whether it's an error with the card or not. The card is less than 8 months old and was new when bought.

Any ideas on what could be causing the above error?

Thanks for your time.

Edit: I've just tried reducing my GTX 660 to the stock speeds of a 660. I'm still experiencing the error though.

Edit 2: I've used _gpu.FreeMemory to determine whether I was exceeding the card's memory. I still have 1,013,202,944 bytes free though.

Edit 3: I've just changed the datatype of the output array to long instead of float. I now seem to have just over 500 MB of free space on the card, yet I still get the wrong results from the same value, i.e. i = 16,777,217. I guess this suggests the index is possibly the issue?


Solution

    float threadIndex = thread.threadIdx.x;
    float blockIndex = thread.blockIdx.x;
    float threadsPerBlock = thread.blockDim.x;
    int tickPosition = (int)(threadIndex + (blockIndex * threadsPerBlock));

The issue was the fact that I was using float for threadIndex etc.: a float can only represent integers exactly up to 2^24 = 16,777,216, so larger thread indices were being rounded. Once these were changed to int, the issue was resolved.
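This rounding also explains why the failures start at exactly i = 16,777,217 and then hit every 2nd index: above 2^24, consecutive representable float values are 2 apart, so odd thread indices snap to their even neighbours and those output slots are never written, leaving them at 0.0. A quick illustration (in Python, since it's easy to run; the `to_float32` helper is my own, but the same IEEE-754 rounding applies to the C# float in the kernel):

```python
import struct

def to_float32(x):
    """Round a Python double through IEEE-754 single precision (32-bit)."""
    return struct.unpack('f', struct.pack('f', x))[0]

print(to_float32(16_777_216.0))  # 16777216.0  (2**24 is exactly representable)
print(to_float32(16_777_217.0))  # 16777216.0  (2**24 + 1 rounds back to 2**24)
print(to_float32(16_777_219.0))  # 16777220.0  (odd values snap to an even neighbour)
```

An int holds every value up to 2^31 - 1 exactly, which is why switching the index arithmetic to int fixes the kernel.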

Time for this fool to get some time away from the PC.

Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow