
What is the proper way to determine if an object is a bytes-like object in Python?

I have code that expects str but will handle the case of being passed bytes in the following way:

if isinstance(data, bytes):
    data = data.decode()

Unfortunately, this does not work for bytearray. Is there a more generic way to test whether an object is either bytes or bytearray, or should I just check for both? Is hasattr(data, 'decode') as bad as I feel it would be?

A. Wilcox asked Oct 11 '22 13:10



3 Answers

There are a few approaches you could use here.

Duck typing

Since Python is duck typed, you could simply do as follows (which seems to be the way usually suggested):

try:
    data = data.decode()
except (UnicodeDecodeError, AttributeError):
    pass

You could use hasattr as you describe, however, and it'd probably be fine. This is, of course, assuming the .decode() method for the given object returns a string, and has no nasty side effects.

I personally recommend either the exception or hasattr method, but whatever you use is up to you.
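As a minimal sketch of the hasattr variant, assuming UTF-8 as the default encoding (the helper name ensure_text is hypothetical, not something from the question):

```python
def ensure_text(data, encoding="utf-8"):
    """Return data as str, decoding it first if it exposes a .decode() method.

    Hypothetical helper for illustration; assumes .decode() returns a str
    and has no side effects, as noted above.
    """
    if hasattr(data, "decode"):
        # bytes and bytearray both provide .decode(); str in Python 3 does not
        return data.decode(encoding)
    return data
```

Both bytes and bytearray take the decoding branch, while a str is returned unchanged.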

Use str()

This approach is uncommon, but is possible:

data = str(data, "utf-8")

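Other encodings are permissible, just as with the .decode() method of bytes and bytearray. You can also pass a third parameter to specify error handling. Note that this raises TypeError if data is already a str, so it only suits input that is known to be bytes-like.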
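To illustrate the error-handling parameter, here is a small sketch with made-up sample data containing one invalid UTF-8 byte. It also shows that str() accepts other bytes-like objects such as memoryview:

```python
# Sample input (made up for illustration): valid UTF-8 for "café ",
# followed by the invalid byte 0xFF.
raw = bytearray(b"caf\xc3\xa9 \xff")

# The default error handling is "strict", which raises on the bad byte.
strict_failed = False
try:
    str(raw, "utf-8")
except UnicodeDecodeError:
    strict_failed = True

# "replace" substitutes U+FFFD for undecodable bytes; "ignore" drops them.
replaced = str(raw, "utf-8", "replace")
ignored = str(raw, "utf-8", "ignore")

# str() also accepts other bytes-like objects, for example a memoryview.
from_view = str(memoryview(b"abc"), "utf-8")
```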

Single-dispatch generic functions (Python 3.4+)

Python 3.4 and above include a nifty feature called single-dispatch generic functions, via functools.singledispatch. This is a bit more verbose, but it's also more explicit:

from functools import singledispatch

@singledispatch
def func(data):
    # This is the generic implementation, used for any type without a handler
    data = data.decode()
    ...

@func.register(str)
def _(data):
    # data will already be a string
    ...

You could also make special handlers for bytearray and bytes objects if you so chose.
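A sketch of what such dedicated handlers might look like, assuming UTF-8 and a generic fallback that rejects unknown types (the name to_text is hypothetical):

```python
from functools import singledispatch


@singledispatch
def to_text(data):
    # Generic fallback: reject anything without a registered handler.
    raise TypeError(f"unsupported type: {type(data).__name__}")


@to_text.register(bytes)
@to_text.register(bytearray)
def _(data):
    # One handler registered for both binary types (UTF-8 is an assumption).
    return data.decode("utf-8")


@to_text.register(str)
def _(data):
    # Already text, so return it unchanged.
    return data
```

Stacking register() calls lets one implementation serve several types, and failing loudly in the fallback avoids calling .decode() on objects that may not have it.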

Beware: single-dispatch functions dispatch only on the type of the first argument! This is an intentional feature, see PEP 443.

Elizafox answered Oct 20 '22 17:10


You can use:

isinstance(data, (bytes, bytearray))

Both types must be listed because they have different base classes:

>>> bytes.__base__
<type 'basestring'>
>>> bytearray.__base__
<type 'object'>

Checking a bytes object against basestring succeeds:

>>> by = bytes()
>>> isinstance(by, basestring)
True

However, a bytearray fails the same check:

>>> buf = bytearray()
>>> isinstance(buf, basestring)
False

The code above was tested under Python 2.7.

Unfortunately, under Python 3.4 both types have the same base class, object, so there is no common base narrower than object to test against:

>>> bytes.__base__
<class 'object'>
>>> bytearray.__base__
<class 'object'>
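Under Python 3 the tuple form of isinstance still works. As a rough sketch, it could be wrapped like this, alongside a broader check based on the buffer protocol for cases where other bytes-like objects (memoryview, array.array) should also count. Both helper names are hypothetical, and UTF-8 is an assumed default:

```python
def to_text(data, encoding="utf-8"):
    """Decode bytes or bytearray to str; return anything else unchanged."""
    if isinstance(data, (bytes, bytearray)):
        return data.decode(encoding)
    return data


def is_bytes_like(obj):
    """Return True if obj supports the buffer protocol."""
    try:
        # memoryview() raises TypeError for objects without buffer support
        memoryview(obj).release()
    except TypeError:
        return False
    return True
```

Note that not every bytes-like object has a .decode() method (memoryview does not), so the broader check is only a test of the type, not a guarantee that decoding will work the same way.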
zangw answered Oct 20 '22 19:10


>>> content = b"hello"
>>> text = "hello"
>>> type(content)
<class 'bytes'>
>>> type(text)
<class 'str'>
>>> type(text) is str
True
>>> type(content) is bytes
True
Jeeva answered Oct 20 '22 17:10