Use dynamic shared memory allocation for two different vectors

Question

In a kernel function, I want two vectors in shared memory, each holding size floats (that is, sizeof(float)*size bytes each).

Since size is only known at run time, I can't declare the arrays with a fixed size inside the kernel, so I had to allocate the shared memory dynamically at launch, like:

    myKernel<<<numBlocks, numThreads, 2*sizeof(float)*size>>> (...);

and, inside the kernel:

    extern __shared__ float row[];
    extern __shared__ float results[];

But this doesn't work.

Instead, I declared a single array, extern __shared__ float rowresults[], that holds all the data in the 2*size floats allocated. Accesses to row stay the same, and accesses to results become rowresults[size+previousIndex]. This does work.

It is not a big problem, since I get the expected results anyway, but is there a way to split my dynamically allocated shared memory into two (or more) separate variables? Purely for aesthetics.

lmortenson · Accepted Answer

The CUDA C Programming Guide section on __shared__ includes an example showing how to lay out multiple arrays in dynamically allocated shared memory:

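    extern __shared__ float array[];
    __device__ void func()      // __device__ or __global__ function
    {
        short* array0 = (short*)array;          // first array starts at the base of the buffer
        float* array1 = (float*)&array0[128];   // starts right after 128 shorts
        int*   array2 =   (int*)&array1[64];    // starts right after 64 floats
    }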

Since each new array is just a pointer to an element of the original buffer, I believe you could adapt this to use offsets computed at run time (such as size) instead of the static offsets in the example. The guide also notes that each pointer must be aligned to the type it points to, which shouldn't be an issue in your case because both arrays are float.
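Applied to the question, the idea above could be sketched as follows. This is a minimal sketch, not a tested drop-in: the kernel name splitSharedKernel, the host helper runSplitSharedDemo, the in/out parameters and the "double each element" computation are all hypothetical placeholders. Only the pointer split of rowresults into row and results reflects the technique discussed here.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: splits one dynamic shared buffer into two float arrays.
__global__ void splitSharedKernel(const float* in, float* out, int size)
{
    // Single dynamically sized shared buffer. Its byte size comes from the
    // third launch parameter (here 2 * size * sizeof(float)).
    extern __shared__ float rowresults[];

    float* row     = rowresults;         // elements [0, size)
    float* results = rowresults + size;  // elements [size, 2*size)

    // Stage the input in `row`.
    for (int i = threadIdx.x; i < size; i += blockDim.x)
        row[i] = in[i];
    __syncthreads();

    // Placeholder computation: double each element into `results`, then copy
    // it out. Each thread reads back only the entries it wrote itself, so no
    // second barrier is needed in this particular sketch.
    for (int i = threadIdx.x; i < size; i += blockDim.x) {
        results[i] = 2.0f * row[i];
        out[i] = results[i];
    }
}

// Hypothetical host helper: runs the kernel on one block and checks the output.
bool runSplitSharedDemo(int size)
{
    std::vector<float> hIn(size), hOut(size, 0.0f);
    for (int i = 0; i < size; ++i) hIn[i] = static_cast<float>(i);

    size_t bytes = size * sizeof(float);
    float *dIn = nullptr, *dOut = nullptr;
    if (cudaMalloc(&dIn, bytes) != cudaSuccess) return false;
    if (cudaMalloc(&dOut, bytes) != cudaSuccess) { cudaFree(dIn); return false; }

    bool ok = cudaMemcpy(dIn, hIn.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess;

    // Third launch parameter: room for both arrays in shared memory.
    splitSharedKernel<<<1, 128, 2 * bytes>>>(dIn, dOut, size);
    ok = ok && cudaGetLastError() == cudaSuccess;
    ok = ok && cudaDeviceSynchronize() == cudaSuccess;
    ok = ok && cudaMemcpy(hOut.data(), dOut, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;

    cudaFree(dIn);
    cudaFree(dOut);

    for (int i = 0; ok && i < size; ++i)
        ok = (hOut[i] == 2.0f * hIn[i]);
    return ok;
}
```

Calling runSplitSharedDemo(256) from main would exercise the kernel with 2 * 256 floats of dynamic shared memory. If the arrays had different element types, placing the type with the strictest alignment first is one way to keep every derived pointer correctly aligned.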

Tags:

cuda

BobCormorano

1 Answers

lmortenson
