Can you make Python 3 give an error when comparing strings to bytes?

When converting code from Python 2 to Python 3 one issue is that the behaviour when testing strings and bytes for equality has changed. For example:

foo = b'foo'
if foo == 'foo':
    print("They match!")

prints nothing on Python 3 and "They match!" on Python 2. In this case it is easy to spot, but in many cases the check is performed on variables that may have been defined elsewhere, so there is no obvious type information.

I would like to make the Python 3 interpreter give an error whenever there is an equality test between string and bytes rather than silently conclude that they are different. Is there any way to accomplish this?

Carcophan asked May 30 '20 09:05




2 Answers

(EDITED: to fix an issue, pointed out by @user2357112supportsMonica, where I was incorrectly suggesting that modifying __eq__ on the instance would affect the == evaluation.)

Normally, you would do this by overriding the __eq__ method of the type(s) you would like to guard. Unfortunately for you, this cannot be done for built-in types, notably str and bytes, therefore code like this:

foo = b'foo'
bytes.__eq__ = ...  # a custom equal function
# str.__eq__ = ...  # if it were 'foo' == foo (or `type(foo)`)
if foo == 'foo':
    print("They match!")

would just raise a TypeError (the exact wording varies by Python version):

TypeError: can't set attributes of built-in/extension type 'bytes'

You may need to manually guard the comparison with something like:

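def str_eq_bytes(x, y):
    """Raise TypeError on a str/bytes mix; otherwise return False."""
    if isinstance(x, str) and isinstance(y, bytes):
        raise TypeError("Comparison between `str` and `bytes` detected.")
    elif isinstance(x, bytes) and isinstance(y, str):
        raise TypeError("Comparison between `bytes` and `str` detected.")
    # Always falsy, so `str_eq_bytes(a, b) or a == b` falls through to `a == b`.
    return False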

to be used as follows:

foo = 'foo'
if str_eq_bytes(foo, 'foo') or foo == 'foo':
    print("They match!")
# They match!

foo = 'bar'
if str_eq_bytes(foo, 'foo') or foo == 'foo':
    print("They match!")
# <nothing gets printed>

foo = b'foo'
if str_eq_bytes(foo, 'foo') or foo == 'foo':
    print("They match!")
# TypeError: Comparison between `bytes` and `str` detected.
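
A variation on the guard above folds the type check and the comparison into one call, so the call site does not need the `or` idiom. This is only a sketch: `checked_eq` is a hypothetical name, and treating `bytearray` like `bytes` is an assumption that the answer itself does not make.

```python
def checked_eq(x, y):
    """Return x == y, but raise TypeError when text is compared with binary data."""
    binary_types = (bytes, bytearray)  # assumption: bytearray is guarded too
    if isinstance(x, str) and isinstance(y, binary_types):
        raise TypeError("Comparison between `str` and `bytes` detected.")
    if isinstance(x, binary_types) and isinstance(y, str):
        raise TypeError("Comparison between `bytes` and `str` detected.")
    return x == y


if checked_eq('foo', 'foo'):
    print("They match!")
```

Like the original guard, this only protects the call sites that are rewritten to use it; any plain `==` left in the code base is still unchecked.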

The other option would be to maintain your own fork of Python and override __eq__ there. Note that PyPy does not allow you to override methods of built-in types either.
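
A lighter-weight alternative, not part of the answer above, is to subclass `bytes` rather than patch it: a subclass is free to define its own `__eq__`. The sketch below uses a hypothetical `StrictBytes` class and only helps where you control how the bytes objects are created; plain `bytes` values are unaffected.

```python
class StrictBytes(bytes):
    """bytes subclass whose == and != refuse to compare against str."""

    def __eq__(self, other):
        if isinstance(other, str):
            raise TypeError("Comparison between `bytes` and `str` detected.")
        return super().__eq__(other)

    def __ne__(self, other):
        if isinstance(other, str):
            raise TypeError("Comparison between `bytes` and `str` detected.")
        return super().__ne__(other)

    # Defining __eq__ sets __hash__ to None, so restore the bytes hash.
    __hash__ = bytes.__hash__


foo = StrictBytes(b'foo')
print(foo == b'foo')  # comparing with bytes still works
```

Because `str.__eq__` returns `NotImplemented` for a non-`str` operand, Python falls back to the reflected `StrictBytes.__eq__`, so `'foo' == foo` raises as well as `foo == 'foo'`.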

norok2 answered Sep 24 '22 22:09



There is an option, -b, that you can pass to the Python interpreter to make it emit a warning (or, with -bb, an error) when comparing bytes with str.

> python --help
usage: /bin/python [option] ... [-c cmd | -m mod | file | -] [arg] ...
Options and arguments (and corresponding environment variables):
-b     : issue warnings about str(bytes_instance), str(bytearray_instance)
         and comparing bytes/bytearray with str. (-bb: issue errors)

With -bb, the comparison raises a BytesWarning, as seen here:

> python -bb -i
Python 3.8.0
Type "help", "copyright", "credits" or "license" for more information.
>>> v1 = b'foo'
>>> v2 = 'foo'
>>> v1 == v2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
BytesWarning: Comparison between bytes and string
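
The same behaviour can be checked from a script, for example in a test suite. The sketch below is an illustration rather than an established recipe: `run_comparison` is a hypothetical helper that starts a child interpreter with or without the flag, and it reads `sys.flags.bytes_warning` to show which mode the child is in (0 without the flag, 1 for -b, 2 for -bb).

```python
import subprocess
import sys


def run_comparison(flag=None):
    """Run b'foo' == 'foo' in a child interpreter, optionally with -b or -bb."""
    cmd = [sys.executable]
    if flag:
        cmd.append(flag)
    code = "import sys; print(sys.flags.bytes_warning); print(b'foo' == 'foo')"
    cmd += ["-c", code]
    return subprocess.run(cmd, capture_output=True, text=True)


plain = run_comparison()        # comparison silently evaluates to False
strict = run_comparison("-bb")  # comparison raises BytesWarning

print(plain.returncode, plain.stdout.split())
print(strict.returncode, strict.stderr.strip().splitlines()[-1])
```

The flag has to be given when the interpreter starts; without it, CPython does not emit the warning at all, so a warnings filter installed later has nothing to act on.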
Carcophan answered Sep 26 '22 22:09
