I'm allocating a cl_mem buffer on a GPU and working on it, which works fine until the buffer exceeds a certain size. Beyond that size the allocation itself still succeeds, but execution or copying fails. I want to use the device's memory for faster operation, so I allocate like this:
buf = clCreateBuffer (cxGPUContext, CL_MEM_WRITE_ONLY, buf_size, NULL, &ciErrNum);
What I don't understand is the size limit. I'm copying about 16 MByte but should be able to use about 128 MByte (see CL_DEVICE_MAX_MEM_ALLOC_SIZE).
Why do these numbers differ so much?
Here's an excerpt from oclDeviceQuery:
 CL_PLATFORM_NAME:  NVIDIA
 CL_PLATFORM_VERSION:  OpenCL 1.0 
 OpenCL SDK Version:  4788711
  CL_DEVICE_NAME:          GeForce 8600 GTS
  CL_DEVICE_TYPE:          CL_DEVICE_TYPE_GPU
  CL_DEVICE_ADDRESS_BITS:              32
  CL_DEVICE_MAX_MEM_ALLOC_SIZE:  128 MByte
  CL_DEVICE_GLOBAL_MEM_SIZE:     255 MByte
  CL_DEVICE_LOCAL_MEM_TYPE:      local
  CL_DEVICE_LOCAL_MEM_SIZE:      16 KByte
  CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE:  64 KByte
clCreateBuffer will not actually create a buffer on the device. This makes sense, since at the time of creation the driver does not know which device will use the buffer (recall that a context can have multiple devices). The buffer is created on the actual device when you enqueue a write or when you launch a kernel that takes the buffer as a parameter. That is why clCreateBuffer can report success while the later copy or kernel execution fails.
As for the 16 MB limit: are you using the latest driver (195.xx)? If so, you should contact NVIDIA, either through the forums or directly.
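Before blaming the driver, it is worth ruling out the documented limits. The sketch below is plain C arithmetic with no OpenCL calls; check_buffer_size and the size_check enum are hypothetical names, and it assumes that the "MByte" values printed by oclDeviceQuery mean 1024 * 1024 bytes. In real code the two limits would come from clGetDeviceInfo with CL_DEVICE_MAX_MEM_ALLOC_SIZE and CL_DEVICE_GLOBAL_MEM_SIZE.

```c
#include <stddef.h>

/* Assumption: one "MByte" in the device query is 1024 * 1024 bytes. */
#define MIB (1024ULL * 1024ULL)

/* Hypothetical result codes for the sanity check below. */
typedef enum {
    SIZE_OK = 0,
    SIZE_EXCEEDS_MAX_ALLOC,
    SIZE_EXCEEDS_GLOBAL_MEM
} size_check;

/* Compares a requested buffer size (in bytes) against the two limits
 * a device reports. Global memory is checked first because a request
 * larger than the whole device memory can never fit. */
size_check check_buffer_size(unsigned long long requested,
                             unsigned long long max_alloc,
                             unsigned long long global_mem)
{
    if (requested > global_mem)
        return SIZE_EXCEEDS_GLOBAL_MEM;
    if (requested > max_alloc)
        return SIZE_EXCEEDS_MAX_ALLOC;
    return SIZE_OK;
}
```

With the numbers from the question, check_buffer_size(16 * MIB, 128 * MIB, 255 * MIB) yields SIZE_OK, so a 16 MByte buffer is well inside both reported limits and the failure has to come from somewhere else, such as the driver.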