Question

I'm allocating a cl_mem buffer on a GPU and working on it, which works fine until a certain size is exceeded. Beyond that size the allocation itself still succeeds, but kernel execution or copying fails. I want to use the device's memory for faster operation, so I allocate like this:

buf = clCreateBuffer (cxGPUContext, CL_MEM_WRITE_ONLY, buf_size, NULL, &ciErrNum);

What I don't understand is the size limit. I'm copying about 16 MByte but should be able to use about 128 MByte (see CL_DEVICE_MAX_MEM_ALLOC_SIZE).

Why do these numbers differ so much?


Here's an excerpt from oclDeviceQuery:

 CL_PLATFORM_NAME:  NVIDIA
 CL_PLATFORM_VERSION:  OpenCL 1.0 
 OpenCL SDK Version:  4788711

  CL_DEVICE_NAME:          GeForce 8600 GTS
  CL_DEVICE_TYPE:          CL_DEVICE_TYPE_GPU
  CL_DEVICE_ADDRESS_BITS:              32
  CL_DEVICE_MAX_MEM_ALLOC_SIZE:  128 MByte
  CL_DEVICE_GLOBAL_MEM_SIZE:     255 MByte
  CL_DEVICE_LOCAL_MEM_TYPE:      local
  CL_DEVICE_LOCAL_MEM_SIZE:      16 KByte
  CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE:  64 KByte

Solution

clCreateBuffer does not necessarily allocate the buffer on the device. This makes sense: at creation time the driver does not know which device will use the buffer (recall that a context can contain multiple devices). The memory is typically allocated on the actual device only when you enqueue a write or launch a kernel that takes the buffer as an argument, which is why a failure can show up at that point rather than at creation.

As for the 16 MB limit, are you using the latest driver (195.xx)? If so, you should contact NVIDIA, either through the forums or directly.

OTHER TIPS

Don't forget whatever other memory you happen to have used on the device (and, if this is also your graphics card, the memory that your display is using).

(Is there a way to get the current available memory, or the largest fragment, or somesuch?)
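On that last question: core OpenCL has no device query for the currently free memory or the largest free block, so one workaround is to probe empirically. Below is a minimal sketch of a binary search for the largest size that still works, assuming that if a given size succeeds then every smaller size succeeds too. largest_usable_size, try_alloc_fn and fits_under_limit are hypothetical names; in real use the callback would create a buffer of the requested size, write to it, and report whether both steps succeeded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical probe hook: returns nonzero if a buffer of `bytes` bytes
 * could be created AND used, zero otherwise. */
typedef int (*try_alloc_fn)(size_t bytes, void *user);

/* Binary search for the largest multiple of `granularity` in [0, upper]
 * for which try_alloc succeeds. Assumes success is monotonic in size. */
size_t largest_usable_size(size_t upper, size_t granularity,
                           try_alloc_fn try_alloc, void *user)
{
    assert(try_alloc != NULL);
    if (granularity == 0)
        granularity = 1;

    size_t lo = 0;                    /* largest granule count known to work */
    size_t hi = upper / granularity;  /* largest granule count worth trying  */

    while (lo < hi) {
        /* Round the midpoint up so the loop always makes progress. */
        size_t mid = lo + (hi - lo + 1) / 2;
        if (try_alloc(mid * granularity, user))
            lo = mid;                 /* mid works: search above it */
        else
            hi = mid - 1;             /* mid fails: search below it */
    }
    return lo * granularity;
}

/* Stand-in probe for trying the search without a GPU: pretends every
 * size up to *(size_t *)user succeeds. */
int fits_under_limit(size_t bytes, void *user)
{
    return bytes <= *(const size_t *)user;
}
```

With upper set to the reported CL_DEVICE_MAX_MEM_ALLOC_SIZE and a granularity of 1 MByte, the search needs only a handful of probes. The result is a snapshot: other allocations on the device, including the display's, can change it at any time.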

Licensed under: CC-BY-SA with attribution