
Compare string to bytes that works in both Python 2 and 3

What is the best way to compare a string object to a bytes object that works in both Python 2 and Python 3? Assume both are UTF-8. More generally, how does one write a Python 2 and Python 3 compatible comparison of two objects that may each be a string, bytes, or Unicode object?

The problem is that "asdf" == b"asdf" is True in Python 2 and False in Python 3.

Meanwhile, one cannot blindly encode or decode objects: a str in Python 2 has both encode and decode methods, but a str in Python 3 has only an encode method.

Finally, isinstance(obj, bytes) returns True for any non-unicode string in Python 2 and returns True for only bytes objects in Python 3.
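
The three behaviors described above can be checked with a small probe that runs unchanged on either interpreter. This is only an illustrative sketch; the variable names are made up for the example.

```python
import sys

native = "asdf"   # str on both versions (bytes-like on 2, text on 3)
raw = b"asdf"     # bytes on Python 3; an alias for str on Python 2

# Naive comparison: True on Python 2, False on Python 3.
naive_equal = (native == raw)

# Python 2 str has decode(); Python 3 str does not.
native_has_decode = hasattr(native, "decode")

# bytes objects have decode() on both versions.
raw_has_decode = hasattr(raw, "decode")

# True on Python 2 (bytes is str), False on Python 3.
native_is_bytes = isinstance(native, bytes)

is_py2 = sys.version_info[0] == 2
print(naive_equal, native_has_decode, raw_has_decode, native_is_bytes)
```

On Python 2 the first, second, and fourth values are all True; on Python 3 they are all False. Only raw_has_decode is True on both, which is the property a portable comparison can rely on.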

asked Jun 16 '16 21:06 by Zags



1 Answer

In both Python 2 and Python 3, anything that is an instance of bytes has a decode method. Thus, you can do the following:

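def compare(a, b, encoding="utf8"):
    # Normalize both operands to text before comparing. In Python 2, bytes is
    # an alias for str, so native strings are decoded here as well.
    if isinstance(a, bytes):
        a = a.decode(encoding)
    if isinstance(b, bytes):
        b = b.decode(encoding)
    return a == b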
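
As a quick usage check, the function can be exercised with mixed text and bytes operands. The sketch below repeats the definition so that it runs standalone; the sample values are illustrative and not from the original answer.

```python
def compare(a, b, encoding="utf8"):
    # Decode any bytes operand so both sides are text before comparing.
    if isinstance(a, bytes):
        a = a.decode(encoding)
    if isinstance(b, bytes):
        b = b.decode(encoding)
    return a == b

# Native string against a bytes literal.
same_ascii = compare("asdf", b"asdf")

# Non-ASCII text against its UTF-8 encoded form.
cafe = u"caf\u00e9"
same_unicode = compare(cafe, cafe.encode("utf8"))

# Operands with different content still compare unequal.
different = compare(u"asdf", b"qwer")

print(same_ascii, same_unicode, different)
```

Note that decode raises UnicodeDecodeError if a bytes operand is not valid in the given encoding, so callers dealing with untrusted input may want to catch that exception.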
answered Sep 27 '22 22:09 by Zags