
In CUDA, how to copy an array of device pointers to device memory?

For example, I allocate the following pointers:

float *data_1, *data_2, *data_3, *data_4;

// Use malloc to allocate host memory and fill these pointers with some data
......
// Filling complete

float *data_d1,*data_d2,*data_d3,*data_d4;

cudaMalloc((void **)&data_d1,size1);
cudaMalloc((void **)&data_d2,size2);
cudaMalloc((void **)&data_d3,size3);
cudaMalloc((void **)&data_d4,size4);

cudaMemcpy(data_d1,data_1,size1,cudaMemcpyHostToDevice);
cudaMemcpy(data_d2,data_2,size2,cudaMemcpyHostToDevice);
cudaMemcpy(data_d3,data_3,size3,cudaMemcpyHostToDevice);
cudaMemcpy(data_d4,data_4,size4,cudaMemcpyHostToDevice);

After this, I should have 4 device pointers containing exactly the same data as the host pointers. Now I'd like to store these pointers in one array of pointers, as follows:

float *ptrs[4];

ptrs[0] = data_d1;
ptrs[1] = data_d2;
ptrs[2] = data_d3;
ptrs[3] = data_d4;

Now I'd like to pass this array of pointers to the CUDA kernel. However, since ptrs itself lives in host memory, I know I need to allocate a new pointer on the device. So I did this:

float **ptrs_d;
size_t size = 4 * sizeof(float*);
cudaMalloc((void ***)&ptrs_d,size);
cudaMemcpy(ptrs_d,ptrs,size,cudaMemcpyHostToDevice);

And then invoke the kernel:

kernel_test<<<dimGrid,dimBlock>>>(ptrs_d, ...);
//Declaration should be 
//__global__ void kernel_test(float **ptrs_d, ...);

In kernel_test, the data is loaded with the following syntax:

if (threadIdx.x < length_of_data_1d)
{
    float element0 = ptrs_d[0][threadIdx.x];
}

It compiles fine, but when I debug it, I get an access violation error.

There are probably several errors in my code, but I mainly want to understand why I can't pass device pointers this way, and, if CUDA does allow passing an array of device pointers to a kernel, what the proper way to access them is.

So how should I fix this issue? Any suggestions are appreciated. Thanks in advance.

asked Dec 10 '13 by Coding_new_bird


1 Answer

One possibility is to allocate a void pointer, which is what the CUDA API expects as standard anyway. When passing it into your kernel, you can cast it to float**. I did it this way:

void* ptrs_d = 0;
cudaMalloc(&ptrs_d, 4*sizeof(float*));
cudaMemcpy(ptrs_d, ptrs, 4*sizeof(float*), cudaMemcpyHostToDevice);
kernel_test<<<dimGrid, dimBlock>>>((float**)ptrs_d);
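
For reference, here is a minimal, self-contained sketch of the whole pattern with basic error checking. It uses only two buffers instead of four for brevity, and the CHECK macro, the element count n, and the kernel body are placeholder assumptions of mine, not taken from the question:

#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Simple error-checking wrapper around CUDA runtime calls (my own addition).
#define CHECK(call) do { \
    cudaError_t err = (call); \
    if (err != cudaSuccess) { \
        printf("CUDA error %s at %s:%d\n", cudaGetErrorString(err), __FILE__, __LINE__); \
    } \
} while (0)

__global__ void kernel_test(float **ptrs_d, int n)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n)
    {
        // First dereference reads the device-side pointer array,
        // second dereference reads the float data it points to.
        float element0 = ptrs_d[0][idx];
        ptrs_d[1][idx] = element0;   // e.g. copy buffer 0 into buffer 1
    }
}

int main()
{
    const int n = 1024;                    // assumed element count per buffer
    const size_t bytes = n * sizeof(float);

    // Host data, filled with something trivial.
    float *data_1 = (float*)malloc(bytes);
    float *data_2 = (float*)malloc(bytes);
    for (int i = 0; i < n; ++i) { data_1[i] = (float)i; data_2[i] = 0.0f; }

    // Device buffers.
    float *data_d1, *data_d2;
    CHECK(cudaMalloc((void**)&data_d1, bytes));
    CHECK(cudaMalloc((void**)&data_d2, bytes));
    CHECK(cudaMemcpy(data_d1, data_1, bytes, cudaMemcpyHostToDevice));
    CHECK(cudaMemcpy(data_d2, data_2, bytes, cudaMemcpyHostToDevice));

    // Host array of device pointers, copied into a device-side pointer array.
    float *ptrs[2] = { data_d1, data_d2 };
    void *ptrs_d = 0;
    CHECK(cudaMalloc(&ptrs_d, 2 * sizeof(float*)));
    CHECK(cudaMemcpy(ptrs_d, ptrs, 2 * sizeof(float*), cudaMemcpyHostToDevice));

    kernel_test<<<(n + 255) / 256, 256>>>((float**)ptrs_d, n);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());

    // Copy one buffer back just to confirm the kernel ran.
    CHECK(cudaMemcpy(data_2, data_d2, bytes, cudaMemcpyDeviceToHost));
    printf("data_2[10] = %f\n", data_2[10]);

    cudaFree(data_d1); cudaFree(data_d2); cudaFree(ptrs_d);
    free(data_1); free(data_2);
    return 0;
}

Note that ptrs_d[0][idx] performs two separate global-memory reads: the first fetches the device pointer stored in the device-side pointer array, the second fetches the float it points to. As long as both the pointer array and the buffers it points to live in device memory, this is a valid pattern.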
answered Nov 16 '22 by hubs