For example, cudaMalloc((void**)&device_array, num_bytes);
This question has been asked before, and the reply was "because cudaMalloc returns an error code", but I don't get it - what has a double pointer got to do with returning an error code? Why can't a simple pointer do the job?
If I write
cudaError_t catch_status; catch_status = cudaMalloc((void**)&device_array, num_bytes);
the error code will be put in catch_status, and returning a simple pointer to the allocated GPU memory should suffice, shouldn't it?
In C, data can be passed to functions by value or via simulated pass-by-reference (i.e. by a pointer to the data). Passing by value is one-way: the function receives a copy, so nothing it does to that copy is visible to the caller. Passing by pointer allows two-way data flow between the function and its calling environment.
When a data item is passed to a function via the function parameter list, and the function is expected to modify the original data item so that the modified value shows up in the calling environment, the correct C method for this is to pass the data item by pointer. In C, when we pass by pointer, we take the address of the item to be modified, creating a pointer (perhaps a pointer to a pointer in this case) and hand the address to the function. This allows the function to modify the original item (via the pointer) in the calling environment.
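To make the difference concrete, here is a minimal sketch in plain C (no CUDA involved). The function names set_by_value, set_by_pointer and point_at are made up for this illustration and are not part of any library.

```c
#include <stddef.h>

/* Hypothetical helper: receives a COPY of the caller's int.
   The assignment changes only the copy. */
void set_by_value(int x)
{
    x = 42;
    (void)x; /* silence "set but not used" warnings */
}

/* Hypothetical helper: receives the ADDRESS of the caller's int
   and writes through it, so the caller sees the new value. */
void set_by_pointer(int *x)
{
    *x = 42;
}

/* Same idea one level up: the item to modify is itself a pointer,
   so the function needs a pointer to that pointer. */
void point_at(int **p, int *target)
{
    *p = target;
}
```

Calling set_by_value(a) leaves a unchanged, while set_by_pointer(&a) sets it to 42. Likewise, point_at(&p, &a) changes where the caller's p points, which a function taking a plain int * parameter could not do.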
Normally malloc returns a pointer, and we can use assignment in the calling environment to assign this returned value to the desired pointer. In the case of cudaMalloc, the CUDA designers chose to use the returned value to carry an error status rather than a pointer. Therefore the setting of the pointer in the calling environment must occur via one of the parameters passed to the function, by reference (i.e. by pointer). Since it is a pointer value that we want to set, we must take the address of the pointer (creating a pointer to a pointer) and pass that address to the cudaMalloc function.
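The same pattern can be sketched on the host without a GPU. The code below is only a stand-in: alloc_status, status_alloc and broken_alloc are hypothetical names invented for this example, and malloc plays the role of the device allocator. It is meant to show the shape of the interface, not how cudaMalloc is actually implemented.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical status type, standing in for cudaError_t. */
typedef enum { ALLOC_OK = 0, ALLOC_FAILED = 1 } alloc_status;

/* Hypothetical stand-in with the same shape as cudaMalloc:
   the return value carries the status, and the allocated address
   is delivered through out_ptr (a pointer to the caller's pointer). */
alloc_status status_alloc(void **out_ptr, size_t num_bytes)
{
    void *p = malloc(num_bytes);
    *out_ptr = p;                  /* writes into the caller's pointer */
    return (p != NULL) ? ALLOC_OK : ALLOC_FAILED;
}

/* What a "simple pointer" parameter would give us: ptr is a copy of the
   caller's pointer, so the caller never learns the new address. */
alloc_status broken_alloc(void *ptr, size_t num_bytes)
{
    ptr = malloc(num_bytes);       /* overwrites only the local copy */
    alloc_status status = (ptr != NULL) ? ALLOC_OK : ALLOC_FAILED;
    free(ptr);                     /* caller can never reach this block, so release it */
    return status;
}
```

With float *device_array = NULL; in the caller, broken_alloc(device_array, num_bytes) returns a status but leaves device_array still NULL, whereas status_alloc((void**)&device_array, num_bytes) returns a status and also sets device_array. Since the return slot is already taken by the status, the pointer to pointer is the only way left to hand the address back.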