Given the following setup:
import pyopencl as cl
import pyopencl.array as cl_array
import numpy
a = numpy.random.rand(50000).astype(numpy.float32)
mf = cl.mem_flags
What is the difference between
a_gpu = cl.Buffer(self.ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
and
a_gpu = cl_array.to_device(self.ctx, self.queue, a)
?
And what is the difference between
result = numpy.empty_like(a)
cl.enqueue_copy(self.queue, result, result_gpu)
and
result = result_gpu.get()
?
Buffers are CL's version of malloc, while pyopencl.array.Array is a workalike of numpy arrays on the compute device.
So for the second version of the first part of your question, you may write a_gpu + 2 to get a new array that has 2 added to each number in your array, whereas in the case of the Buffer, PyOpenCL only sees a bag of bytes and cannot perform any such operation.
The second part of your question is the same in reverse: if you've got a PyOpenCL array, .get() copies the data back and converts it into a (host-based) numpy array. Since numpy arrays are one of the more convenient ways to get contiguous memory in Python, the second variant with enqueue_copy also ends up in a numpy array. Note, however, that you could have copied this data into an array of any size (as long as it's big enough) and any type: the copy is performed as a bag of bytes, whereas .get() makes sure you get the same size and type on the host.
Bonus fact: There is of course a Buffer underlying each PyOpenCL array. You can get it from the .data attribute.
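The distinction above can be sketched roughly as follows. This is only an illustration under some assumptions: the function name compare_transfers is made up for this example, an OpenCL device is assumed to be available, and a recent PyOpenCL release is assumed, where to_device takes the queue as its first argument (rather than a context and a queue, as in the question).

```python
import numpy as np


def compare_transfers(a):
    """Upload `a` as a Buffer and as an Array, then read back in both styles.

    Hypothetical helper for illustration only.
    """
    # Imported here so that loading this sketch does not require an OpenCL runtime.
    import pyopencl as cl
    import pyopencl.array as cl_array

    ctx = cl.create_some_context(interactive=False)
    queue = cl.CommandQueue(ctx)
    mf = cl.mem_flags

    # Raw buffer: an untyped allocation initialised from the bytes of `a`.
    buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)

    # Array: remembers shape and dtype, so arithmetic is possible.
    arr = cl_array.to_device(queue, a)
    shifted = arr + 2

    # .get() returns a host array with the same shape and dtype.
    via_get = shifted.get()

    # enqueue_copy needs a host array that we allocate ourselves;
    # shifted.data is the Buffer underlying the Array.
    via_copy = np.empty(shifted.shape, dtype=shifted.dtype)
    cl.enqueue_copy(queue, via_copy, shifted.data)

    # The same bytes can land in a host array of a different type,
    # here one uint8 per byte of the original float32 data.
    as_bytes = np.empty(a.nbytes, dtype=np.uint8)
    cl.enqueue_copy(queue, as_bytes, buf)

    return via_get, via_copy, as_bytes
```

Called with the float32 array a from the question, the first two results should hold a + 2, and as_bytes.view(numpy.float32) should reproduce a, since only the host-side interpretation of the bytes differs.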
To answer the first question, Buffer(hostbuf=...) can be called with anything that implements the buffer interface (reference). pyopencl.array.to_device(...) must be called with an ndarray (reference). ndarray implements the buffer interface and works in either place. However, only hostbuf=... would be expected to work with, for example, a bytearray (which also implements the buffer interface). I have not confirmed this, but it appears to be what the docs suggest.
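To make "implements the buffer interface" concrete, here is a small host-only check. It is a sketch that tests only whether Python can take a memoryview of an object; it does not confirm what PyOpenCL itself accepts for hostbuf. The helper name supports_buffer_interface is invented for this example.

```python
import numpy as np


def supports_buffer_interface(obj):
    """Return True if Python can expose obj's memory through a memoryview."""
    try:
        memoryview(obj)
    except TypeError:
        return False
    return True


candidates = {
    "ndarray": np.zeros(4, dtype=np.float32),
    "bytearray": bytearray(16),
    "bytes": bytes(16),
    "list": [0.0, 0.0, 0.0, 0.0],
}

for name, obj in candidates.items():
    print(name, supports_buffer_interface(obj))
```

The ndarray, bytearray and bytes objects expose their memory this way, while a plain list does not, which is why a list would have to be converted to an ndarray (or similar) before being handed to either call.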
On the second question, I am not sure what type result_gpu is supposed to be when you call get() on it (did you mean Buffer.get_host_array()?). In any case, enqueue_copy() works between any combination of Buffer, Image and host, can take offsets and regions, and can be asynchronous (with is_blocking=False). I think these capabilities are only available that way, whereas get() would be blocking and return the whole buffer. (reference)
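As a rough sketch of the asynchronous case mentioned above: the function name nonblocking_readback is made up, an OpenCL device is assumed to be available, and the partial read relies on my understanding that the size of a Buffer-to-host transfer follows the size of the host array.

```python
import numpy as np


def nonblocking_readback(a):
    """Copy a device Buffer back to the host without blocking on the call.

    Hypothetical helper for illustration only.
    """
    # Imported here so that loading this sketch does not require an OpenCL runtime.
    import pyopencl as cl

    ctx = cl.create_some_context(interactive=False)
    queue = cl.CommandQueue(ctx)
    mf = cl.mem_flags
    buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)

    # Non-blocking copy: the call returns an event right away.
    result = np.empty_like(a)
    event = cl.enqueue_copy(queue, result, buf, is_blocking=False)

    # ... other host-side work could happen here ...

    # `result` must not be read until the event has completed.
    event.wait()

    # A smaller host array receives only the leading part of the buffer
    # (assumption: the transfer size is taken from the host array).
    head = np.empty(len(a) // 2, dtype=a.dtype)
    cl.enqueue_copy(queue, head, buf)

    return result, head
```

With the array a from the question, result should equal a and head should equal its first half.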