Buffers and Memoryview Objects explained for the non-C programmer

Python 2.7 has introduced a new API for buffers and memoryview objects.

I read the documentation on them and I think I got the basic concept (accessing the internal data of an object in raw form without copying it, which I suppose means a "faster and less memory-hungry" way to get at an object's data), but really understanding the documentation takes more knowledge of C than I have.

I would be very grateful if somebody would take the time to:

  • explain buffers and memoryview objects in "layman terms" and
  • describe a scenario in which using buffers and memoryview objects would be "the Pythonic way" of doing things
mac asked Jul 18 '11 17:07



1 Answer

Here's a line from a hash function I wrote:

M = tuple(buffer(M, i, Nb) for i in range(0, len(M), Nb))

This will split a long string, M, into shorter 'strings' of length Nb, where Nb is the number of bytes / characters I can handle at a time. It does this WITHOUT copying any part of the string, as would happen if I sliced the string like so:

M = tuple(M[i:i+Nb] for i in range(0, len(M), Nb))

I can now iterate over M just as I would had I sliced it:

H = key
for Mi in M:
    H = encrypt(H, Mi)
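
Note that buffer() only exists in Python 2. Here is a rough sketch of the same chunking idea using memoryview, which is available in Python 2.7 and Python 3. The chunk_views helper and the sample message are made up for illustration, and hashlib's update() is only a stand-in for the encrypt() function above, not the actual hash function:

```python
import hashlib


def chunk_views(data, size):
    """Split `data` into memoryview slices of at most `size` bytes each.

    chunk_views is a hypothetical helper. Slicing a memoryview yields
    another memoryview over the same memory, so no bytes are copied.
    """
    view = memoryview(data)
    return tuple(view[i:i + size] for i in range(0, len(view), size))


message = b"abcdefghij" * 3  # 30 bytes of sample data
Nb = 8                       # chunk size, as in the answer above

chunks = chunk_views(message, Nb)

# Stand-in for `H = encrypt(H, Mi)`: hashlib accepts memoryviews directly.
digest = hashlib.sha256(b"key")
for Mi in chunks:
    digest.update(Mi)

print([len(c) for c in chunks])
print(digest.hexdigest())
```

A copy is only made when you explicitly ask for one, for example with bytes(chunks[0]).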

Basically, buffers and memoryviews are efficient ways to deal with the immutability of strings in Python and with the copying that slicing and similar operations normally do. A memoryview is much like a buffer, except that you can also write through it, not just read, provided the underlying object is itself writable (a bytearray, for example).
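
To make the read / write difference concrete, here is a small sketch, assuming Python 3 (where indexing a memoryview gives integers). The variable names are only illustrative. Writing through a view of a bytearray changes the original object in place, while a view of an immutable bytes object is read-only:

```python
# A bytearray is mutable, so a memoryview over it is writable.
data = bytearray(b"hello world")
view = memoryview(data)

# Slice assignment must keep the same length; it modifies `data` in place.
view[0:5] = b"HELLO"
print(data)

# A bytes object is immutable, so its memoryview is read-only.
readonly_view = memoryview(b"hello world")
try:
    readonly_view[0] = 72
    wrote = True
except TypeError:
    # Python refuses to modify read-only memory.
    wrote = False

print(readonly_view.readonly, wrote)
```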

While the main buffer / memoryview doc is about the C implementation, the standard types page has a bit of info under memoryview: http://docs.python.org/library/stdtypes.html#memoryview-type

Edit: I found this in my bookmarks, and it is a REALLY good brief writeup: http://webcache.googleusercontent.com/search?q=cache:Ago7BXl1_qUJ:mattgattis.com/2010/3/9/python-memory-views+site:mattgattis.com+python&hl=en&client=firefox-a&gl=us&strip=1

Edit 2: It turns out I got that link from the question "When should a memoryview be used?" in the first place. That question was never answered in detail and the link was dead, so hopefully this helps.

agf answered Sep 29 '22 23:09