Accessing device memory in CUDA

https://stackoverflow.com/questions/23351331

c
cuda

11-07-2023

Question

I am trying to read device memory after copying host memory to it. When I print the data that was copied to the device, the program crashes with a segmentation fault. As far as I can tell, I am trying to print device memory that is not available or cannot be accessed.

How can I access this device memory? I also want to make sure that if I modify the data in host memory, the change is visible in device memory when I print it.

Here is my code:

// includes, system
#include <stdio.h>
#include <stdlib.h>   // exit
#include <assert.h>

// includes, CUDA runtime
#include <cuda_runtime.h>

// Simple utility function to check for CUDA runtime errors
void checkCUDAError(const char *msg);


int main( int argc, char** argv)
{
    // pointer and dimension for host memory
    int n, dimA;
    float *h_a;

    // pointers for device memory
    float *d_a, *d_b;

    // allocate and initialize host memory
    /** Bonus: try using cudaMallocHost in place of malloc **/

    dimA = 8;
    size_t memSize = dimA*sizeof(float);
    cudaMallocHost((void**)&h_a, memSize);
    //h_a = (float *) malloc(dimA*sizeof(float));
    for (n=0; n<dimA; n++)
    {
        h_a[n] = (float) n;
    }

    // Part 1 of 5: allocate device memory

    cudaMalloc( (void**)&d_a, memSize );
    cudaMalloc( (void**)&d_b, memSize );

    // Part 2 of 5: host to device memory copy
    cudaMemcpy( d_a, h_a, memSize, cudaMemcpyHostToDevice   );

    // Part 3 of 5: device to device memory copy
    cudaMemcpy( d_b, d_a, memSize, cudaMemcpyDeviceToDevice );

    // print the data, then clear host memory
    for (n=0; n<dimA; n++)
    {
        printf("Data in host memory h_a %f\n", h_a[n]);
        printf("Data in device memory d_a %f\n", d_a[n]);
        //printf("Data in device memory d_b %f\n", d_b[n]);
        h_a[n] = 0.f;
    }

    // Part 4 of 5: device to host copy
    cudaMemcpy( h_a, d_b, memSize, cudaMemcpyDeviceToHost );

    // Check for any CUDA errors
    checkCUDAError("cudaMemcpy calls");

    // verify the data on the host is correct
    for (n=0; n<dimA; n++)
    {
        assert(h_a[n] == (float) n);
    }

    // Part 5 of 5: free device memory pointers d_a and d_b
    cudaFree( d_b );
    cudaFree( d_a );

    // Check for any CUDA errors
    checkCUDAError("cudaFree");

    // free host memory pointer h_a
    // Bonus: be sure to use cudaFreeHost for memory allocated with cudaMallocHost

    cudaFreeHost(h_a);
    //free(h_a);

    // If the program makes it this far, then the results are correct and
    // there are no run-time errors.  Good work!
    printf("cudaMallocHost is working Correct!\n");

    return 0;
}

void checkCUDAError(const char *msg)
{
    cudaError_t err = cudaGetLastError();
    if( cudaSuccess != err)
    {
        fprintf(stderr, "Cuda error: %s: %s.\n", msg, cudaGetErrorString( err) );
        exit(-1);
    }
}

In the code above, after copying from d_a to d_b, printing the data in d_b gives an error, while printing h_a works fine. Am I doing something wrong in trying to print the data in d_b?

Solution

You cannot access device memory from host code. This line is illegal:

    printf("Data in device memory d_a %f\n", d_a[n]);

It dereferences a device pointer (a pointer to device memory) in host code, which is illegal in CUDA (except when using Unified Memory).
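For reference, here is a minimal sketch of the Unified Memory exception, assuming a GPU and CUDA toolkit that support `cudaMallocManaged`. The kernel `add_one` and the helper `managed_roundtrip` are hypothetical names made up for this illustration; they are not part of the question's code. With managed memory the same pointer can be dereferenced on the host and on the device, so host writes are visible to a kernel without an explicit `cudaMemcpy`:

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: adds 1 to every element on the device.
__global__ void add_one(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] += 1.0f;
}

// Hypothetical helper: fills a managed buffer on the host, modifies it on the
// device, then reads it back on the host through the same pointer.
// Returns the number of mismatching elements, or -1 on a CUDA error.
int managed_roundtrip(int n)
{
    float *a = NULL;

    // Managed allocation: one pointer that is valid in host and device code.
    cudaError_t err = cudaMallocManaged((void **)&a, n * sizeof(float));
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMallocManaged: %s\n", cudaGetErrorString(err));
        return -1;
    }

    for (int i = 0; i < n; i++)
        a[i] = (float)i;                 // host write, no cudaMemcpy needed

    add_one<<<1, n>>>(a, n);             // device sees the host's values

    // Wait for the kernel before touching the buffer on the host again.
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel: %s\n", cudaGetErrorString(err));
        cudaFree(a);
        return -1;
    }

    int mismatches = 0;
    for (int i = 0; i < n; i++) {
        printf("a[%d] = %f\n", i, a[i]); // legal: managed pointer on the host
        if (a[i] != (float)i + 1.0f)
            mismatches++;
    }

    cudaFree(a);
    return mismatches;
}

int main(void)
{
    return managed_roundtrip(8) == 0 ? 0 : 1;
}
```

Note that this only works because the buffer comes from `cudaMallocManaged`. A pointer returned by plain `cudaMalloc`, as in your code, still must not be dereferenced on the host.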

To confirm that device memory was set properly, copy the data from device memory back to host memory, which your code already does (and checks) in the lines that follow. So just remove that printf statement.
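If you do want to print the device data at that point, one option is to stage it through a temporary host buffer first. The following is only a sketch under that assumption; `print_device_floats` is a hypothetical helper written for this answer, not a CUDA API function:

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

// Hypothetical helper: copies n floats from device memory into a temporary
// host buffer and prints them. Returns 0 on success, -1 on failure.
int print_device_floats(const char *label, const float *d_ptr, int n)
{
    float *tmp = (float *)malloc(n * sizeof(float));
    if (tmp == NULL)
        return -1;

    // The device pointer is only passed to cudaMemcpy, never dereferenced here.
    cudaError_t err = cudaMemcpy(tmp, d_ptr, n * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMemcpy: %s\n", cudaGetErrorString(err));
        free(tmp);
        return -1;
    }

    for (int i = 0; i < n; i++)
        printf("%s[%d] = %f\n", label, i, tmp[i]);

    free(tmp);
    return 0;
}

int main(void)
{
    const int n = 8;
    float h_a[8];
    float *d_a = NULL;

    for (int i = 0; i < n; i++)
        h_a[i] = (float)i;

    if (cudaMalloc((void **)&d_a, sizeof(h_a)) != cudaSuccess)
        return 1;
    if (cudaMemcpy(d_a, h_a, sizeof(h_a), cudaMemcpyHostToDevice) != cudaSuccess) {
        cudaFree(d_a);
        return 1;
    }

    int rc = print_device_floats("d_a", d_a, n);

    cudaFree(d_a);
    return rc == 0 ? 0 : 1;
}
```

Keep in mind that `d_a` holds a separate copy of the data. Changing `h_a` afterwards does not change `d_a`; the device only sees the new values after another `cudaMemcpy` with `cudaMemcpyHostToDevice`.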

Licensed under: CC-BY-SA with attribution
