I am looking for the Python equivalent of the Python C-API PyObject_CheckBuffer.
I.e. I would like to check if an object supports the buffer protocol, but from Python.
The Python buffer protocol, also known in the community as PEP 3118, is a framework in which Python objects can expose raw byte arrays to other Python objects. This can be extremely useful for scientific computing, where we often use packages such as NumPy to efficiently store and manipulate large arrays of data.
memoryview objects allow Python code to access the internal data of an object that supports the buffer protocol without copying it. The built-in memoryview() accepts any such object (bytes, bytearray, array.array, NumPy arrays and so on) and returns a view onto its byte-oriented data. The view is writable only when the underlying object is, so a view of a bytes object is read-only.
"Without copying" here means that the view refers to the original object's memory rather than to a duplicate of the data.
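The no-copy behaviour described above can be seen in a short sketch. The names buf, view, head and ro are made up for this illustration; only built-in types are used.

```python
# Illustrative sketch: 'buf', 'view', 'head' and 'ro' are example names.
buf = bytearray(b"hello world")
view = memoryview(buf)

# Slicing a memoryview gives another view onto the same memory, not a copy.
head = view[:5]
head[0] = ord("J")      # writes through to the underlying bytearray

print(buf)              # bytearray(b'Jello world')
print(bytes(head))      # b'Jello' (bytes() makes an explicit copy)

# A read-only exporter such as bytes yields a read-only view.
ro = memoryview(b"abc")
print(ro.readonly)      # True
```

Because head shares memory with buf, the assignment changes the original object; an explicit bytes(...) call is the point at which a copy is actually made.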
Working with short data snippets in performance-critical code, I had to try out several approaches for turning a str-or-bytes-like argument into bytes. Depending on your application, one might be better than the others.
def ensure_bytes__try(data):
    try:
        # memoryview used only for testing type; 'with' releases the view instantly
        with memoryview(data):
            return data
    except TypeError:
        return data.encode()

def ensure_bytes__isinstance(data):
    # Explicitly test for some bytes-like types
    # - misses array.array, numpy.array and all other types not listed here
    return data if isinstance(data, (bytes, bytearray, memoryview)) else data.encode()

def ensure_bytes__hasattr(data):
    # Works as long as your bytes-like doesn't have 'encode'
    return data.encode() if hasattr(data, "encode") else data

def ensure_bytes__args(data=None, data_bytes=None):
    # Avoid autodetection by using explicit arguments
    return data_bytes if data is None else data.encode()
The following benchmark shows the time used by each implementation on Python 3.7.4:
ensure_bytes__try(b"foo") ▒▒▒▒█████████████████ 438 ns
ensure_bytes__try("foo") ▒▒▒▒▒██████████████████████████████████ 797 ns
ensure_bytes__isinstance(b"foo") ▒▒▒▒█████████ 277 ns
ensure_bytes__isinstance("foo") ▒▒▒▒▒███████████████████ 489 ns
ensure_bytes__hasattr(b"foo") ▒▒▒▒████ 171 ns
ensure_bytes__hasattr("foo") ▒▒▒▒▒█████████ 287 ns
ensure_bytes__args(data_bytes=b"foo") ▒▒▒▒██ 121 ns
ensure_bytes__args(data="foo") ▒▒▒▒▒█████ 216 ns
Shorter bar means faster. The light-shaded part of each bar represents the reference time benchmarked on ref_bytes(b"foo") (84 ns) and ref_str("foo") (100 ns):
def ref_bytes(data): return data
def ref_str(data): return data.encode()
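The answer does not say how the timings were taken. One way to get comparable numbers on your own machine is the standard timeit module, sketched below. The helper time_call and its number/repeat defaults are assumptions made for this example, not the original benchmark harness, and the lambda wrapper adds its own call overhead, so absolute figures will differ from those quoted above.

```python
import timeit

def ref_bytes(data):
    return data

def ensure_bytes__hasattr(data):
    return data.encode() if hasattr(data, "encode") else data

def time_call(func, arg, number=100_000, repeat=5):
    """Best observed time per call in nanoseconds (hypothetical helper)."""
    # timeit.repeat returns one total time in seconds per repetition;
    # the minimum is the run least disturbed by other activity on the machine.
    totals = timeit.repeat(lambda: func(arg), number=number, repeat=repeat)
    return min(totals) / number * 1e9

for func in (ref_bytes, ensure_bytes__hasattr):
    for arg in (b"foo", "foo"):
        print(f"{func.__name__}({arg!r}): {time_call(func, arg):.0f} ns")
```

Comparing each candidate against the reference function measured the same way is what matters; the lambda overhead is then present on both sides.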
I think you're just supposed to use the standard try-it-and-see-if-it-works technique:
# New-style buffer API, for Python 2.7 and 3.x.
# PyObject_CheckBuffer uses the new-style API.
# 2.6 also has the new-style API, but no memoryview,
# so you can't use it or check compatibility from Python code.
try:
    memoryview(thing)
except TypeError:
    # Doesn't support it!
    pass

# Old-style API. Doesn't exist in 3.x.
# Not quite equivalent to PyObject_CheckBuffer.
try:
    buffer(thing)
except TypeError:
    # Doesn't support it!
    pass
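For Python 3 the try-it-and-see check can be wrapped in a small predicate. This is a minimal sketch: supports_buffer is a hypothetical name, not a standard-library function. It probes by actually requesting a buffer, whereas PyObject_CheckBuffer only checks that the type provides the buffer slot, so treat it as a close approximation. On Python 3.12 and later, collections.abc.Buffer (PEP 688) also allows an isinstance test; the sketch imports it only if available.

```python
import array

try:
    # collections.abc.Buffer was added in Python 3.12 (PEP 688).
    from collections.abc import Buffer
except ImportError:
    Buffer = None

def supports_buffer(obj):
    """Return True if memoryview(obj) succeeds (hypothetical helper)."""
    try:
        # 'with' releases the temporary view as soon as the check is done.
        with memoryview(obj):
            return True
    except TypeError:
        return False

for candidate in (b"foo", bytearray(b"foo"), array.array("i", [1, 2]), "foo", 42):
    print(type(candidate).__name__, supports_buffer(candidate))

if Buffer is not None:
    # Only reached on Python 3.12+.
    print(isinstance(b"foo", Buffer), isinstance("foo", Buffer))
```

The memoryview probe works on every Python 3 release, which is why it is used as the primary check here; the isinstance form is shown only as an alternative for newer interpreters.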