
In CUDA, how to copy an array of device pointers to device memory?

For example, I allocate the following pointers:

float *data_1, *data_2, *data_3, *data_4;

// Use malloc to allocate host memory and fill these pointers with some data
......
// Filling complete

float *data_d1,*data_d2,*data_d3,*data_d4;

cudaMalloc((void **)&data_d1,size1);
cudaMalloc((void **)&data_d2,size2);
cudaMalloc((void **)&data_d3,size3);
cudaMalloc((void **)&data_d4,size4);

cudaMemcpy(data_d1,data_1,size1,cudaMemcpyHostToDevice);
cudaMemcpy(data_d2,data_2,size2,cudaMemcpyHostToDevice);
cudaMemcpy(data_d3,data_3,size3,cudaMemcpyHostToDevice);
cudaMemcpy(data_d4,data_4,size4,cudaMemcpyHostToDevice);

After this, I should have 4 device pointers containing exactly the same data as the host pointers. Now I'd like to store these pointers in one array of pointers, as follows:

float *ptrs[4];

ptrs[0] = data_d1;
ptrs[1] = data_d2;
ptrs[2] = data_d3;
ptrs[3] = data_d4;

Now I'd like to pass this array of pointers to the CUDA kernel. However, since ptrs itself lives in host memory, I know I need to allocate a new pointer on the device. So I did this:

float **ptrs_d;
size_t size = 4 * sizeof(float*);
cudaMalloc((void ***)&ptrs_d,size);
cudaMemcpy(ptrs_d,ptrs,size,cudaMemcpyHostToDevice);

And then invoke the kernel:

kernel_test<<<dimGrid,dimBlock>>>(ptrs_d, ...);
//Declaration should be 
//__global__ void kernel_test(float **ptrs_d, ...);

In kernel_test, the data is loaded with the following syntax:

if (threadIdx.x < length_of_data_1d)
{
    float element0 = ptrs_d[0][threadIdx.x];
}

It compiles fine, but when I debug it, I get an access violation error.

There are probably several errors in my code, but I mainly want to understand why I can't pass device pointers this way, and, if CUDA does allow passing an array of device pointers to a kernel, what the proper way to access them is.

So how should I fix this issue? Any suggestions are appreciated. Thanks in advance.

asked Dec 10 '13 by Coding_new_bird


1 Answer

One possibility is to allocate a void pointer, which is what the CUDA API expects as standard anyway. When passing it into your kernel, you can cast it to float**. I did it this way:

void* ptrs_d = 0;
cudaMalloc(&ptrs_d, 4*sizeof(float*));
cudaMemcpy(ptrs_d, ptrs, 4*sizeof(float*), cudaMemcpyHostToDevice);
kernel_test<<<dimGrid, dimBlock>>>((float**)ptrs_d);
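
For reference, here is a minimal, self-contained sketch of the whole pattern with basic error checking. It uses only two buffers instead of four for brevity, and the CHECK macro, the element count n, and the kernel body are placeholder assumptions of mine, not taken from the question:

#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Simple error-checking wrapper around CUDA runtime calls (my own addition).
#define CHECK(call) do { \
    cudaError_t err = (call); \
    if (err != cudaSuccess) { \
        printf("CUDA error %s at %s:%d\n", cudaGetErrorString(err), __FILE__, __LINE__); \
    } \
} while (0)

__global__ void kernel_test(float **ptrs_d, int n)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n)
    {
        // First dereference reads the device-side pointer array,
        // second dereference reads the float data it points to.
        float element0 = ptrs_d[0][idx];
        ptrs_d[1][idx] = element0;   // e.g. copy buffer 0 into buffer 1
    }
}

int main()
{
    const int n = 1024;                    // assumed element count per buffer
    const size_t bytes = n * sizeof(float);

    // Host data, filled with something trivial.
    float *data_1 = (float*)malloc(bytes);
    float *data_2 = (float*)malloc(bytes);
    for (int i = 0; i < n; ++i) { data_1[i] = (float)i; data_2[i] = 0.0f; }

    // Device buffers.
    float *data_d1, *data_d2;
    CHECK(cudaMalloc((void**)&data_d1, bytes));
    CHECK(cudaMalloc((void**)&data_d2, bytes));
    CHECK(cudaMemcpy(data_d1, data_1, bytes, cudaMemcpyHostToDevice));
    CHECK(cudaMemcpy(data_d2, data_2, bytes, cudaMemcpyHostToDevice));

    // Host array of device pointers, copied into a device-side pointer array.
    float *ptrs[2] = { data_d1, data_d2 };
    void *ptrs_d = 0;
    CHECK(cudaMalloc(&ptrs_d, 2 * sizeof(float*)));
    CHECK(cudaMemcpy(ptrs_d, ptrs, 2 * sizeof(float*), cudaMemcpyHostToDevice));

    kernel_test<<<(n + 255) / 256, 256>>>((float**)ptrs_d, n);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());

    // Copy one buffer back just to confirm the kernel ran.
    CHECK(cudaMemcpy(data_2, data_d2, bytes, cudaMemcpyDeviceToHost));
    printf("data_2[10] = %f\n", data_2[10]);

    cudaFree(data_d1); cudaFree(data_d2); cudaFree(ptrs_d);
    free(data_1); free(data_2);
    return 0;
}

Note that ptrs_d[0][idx] performs two separate global-memory reads: the first fetches the device pointer stored in the device-side pointer array, the second fetches the float it points to. As long as both the pointer array and the buffers it points to live in device memory, this is a valid pattern.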
answered Nov 16 '22 by hubs