Pyopencl: difference between to_device and Buffer

Question

Let

import pyopencl as cl
import pyopencl.array as cl_array
import numpy
a = numpy.random.rand(50000).astype(numpy.float32)
mf = cl.mem_flags

What is the difference between

a_gpu = cl.Buffer(self.ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)

and

a_gpu = cl_array.to_device(self.ctx, self.queue, a)

?

And what is the difference between

result =  numpy.empty_like(a)
cl.enqueue_copy(self.queue, result, result_gpu)

and

result = result_gpu.get()

?

Andreas Klöckner · Accepted Answer

Buffers are CL's version of malloc, while pyopencl.array.Array is a workalike of numpy arrays on the compute device.

So for the second version of the first part of your question, you may write a_gpu + 2 to get a new arrays that has 2 added to each number in your array, whereas in the case of the Buffer, PyOpenCL only sees a bag of bytes and cannot perform any such operation.

The second part of your question is the same in reverse: If you've got a PyOpenCL array, .get() copies the data back and converts it into a (host-based) numpy array. Since numpy arrays are one of the more convenient ways to get contiguous memory in Python, the second variant with enqueue_copy also ends up in a numpy array--but note that you could've copied this data into an array of any size (as long as it's big enough) and any type--the copy is performed as a bag of bytes, whereas .get() makes sure you get the same size and type on the host.

Bonus fact: There is of course a Buffer underlying each PyOpenCL array. You can get it from the .data attribute.

Alex I · Answer

To answer the first question, Buffer(hostbuf=...) can be called with anything that implements the buffer interface (reference). pyopencl.array.to_device(...) must be called with an ndarray (reference). ndarray implements the buffer interface and works in either place. However, only hostbuf=... would be expected to work with for example a bytearray (which also implements the buffer interface). I have not confirmed this, but it appears to be what the docs suggest.

On the second question, I am not sure what type result_gpu is supposed to be when you call get() on it (did you mean Buffer.get_host_array()?) In any case, enqueue_copy() works between combination of Buffer, Image and host, can have offsets and regions, and can be asynchronous (with is_blocking=False), and I think these capabilities are only available that way (whereas get() would be blocking and return the whole buffer). (reference)

Pyopencl: difference between to_device and Buffer

Tags:

python

numpy

opencl

pyopencl

petRUShka

2 Answers

Andreas Klöckner

Alex I

Recent Activity

Donate For Us

Pyopencl: difference between to_device and Buffer

Tags:

python

numpy

opencl

pyopencl

petRUShka

2 Answers

Andreas Klöckner

Alex I

Related questions

Recent Activity

Donate For Us