I want to optimize my local memory access pattern within my OpenCL kernel. I read at somewhere about configurable local memory. E.g. we should be able to configure which amount is used for local mem and which amount is used for automatic caching.
Also i read that the bank size can be chosen for the latest (Kepler) Nvidia hardware here: http://www.acceleware.com/blog/maximizing-shared-memory-bandwidth-nvidia-kepler-gpus. This point seems to be very crucial for double precision value storing in local memory.
Does Nvidia provide the functionality of setting up the local memory exclusively for CUDA users? I can't find similar methods for OpenCL. So is this maybe called in a different way or does it really not exist?
Unfortunately there is no way to control the L1 cache/local memory configuration when using OpenCL. This functionality is only provided by the CUDA runtime (via cudaDeviceSetCacheConfig or cudaFuncSetCacheConfig).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With