When converting code from Python 2 to Python 3, one issue is that the behaviour of equality tests between strings and bytes has changed. For example:
foo = b'foo'
if foo == 'foo':
    print("They match!")
prints nothing on Python 3 and "They match!" on Python 2. In this case the mismatch is easy to spot, but in many cases the check is performed on variables that may have been defined elsewhere, so there is no obvious type information.
I would like to make the Python 3 interpreter raise an error whenever there is an equality test between a string and bytes, rather than silently concluding that they are different. Is there any way to accomplish this?
Python includes a number of comparison operators that can be used to compare strings. These operators allow you to check how strings compare to each other, and return a True or False value based on the outcome. This tutorial will discuss the comparison operators available for comparing strings in Python.
Python compares strings character by character. At the first position where the characters differ, their Unicode code points are compared, and the string with the lower code point at that position is considered smaller.
There is no special function for comparing two strings. To test whether two strings have the same value, use the == operator: it returns True if the strings are identical and False otherwise. The other comparison operators (!=, <, <=, >, >=) also work on strings in Python.
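As a small illustration of the character-by-character rule, here is a sketch that reports where two strings first differ. The helper name first_difference is made up for this example and is not part of Python.

```python
def first_difference(a, b):
    """Return (index, code point in a, code point in b) at the first
    position where the two strings differ, or None if there is none."""
    for i, (ca, cb) in enumerate(zip(a, b)):
        if ca != cb:
            return i, ord(ca), ord(cb)
    return None  # equal strings, or one is a prefix of the other

# 'p' (112) vs 'r' (114) at index 2 decides the ordering.
print(first_difference("apple", "apricot"))  # (2, 112, 114)
print("apple" < "apricot")                   # True
# Uppercase letters have lower code points than lowercase ones.
print("Zebra" < "apple")                     # True
# A proper prefix sorts before the longer string.
print("app" < "apple")                       # True
```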
(EDITED: to fix an issue where I was incorrectly suggesting that modifying __eq__ on the instance would affect the == evaluation as suggested by @user2357112supportsMonica).
Normally, you would do this by overriding the __eq__ method of the type(s) you would like to guard.
Unfortunately for you, this cannot be done for built-in types, notably str and bytes, so code like this:
foo = b'foo'
bytes.__eq__ = ... # a custom equal function
# str.__eq__ = ... # if it were 'foo' == foo (or `type(foo)`)
if foo == 'foo':
    print("They match!")
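would just throw (wording as of CPython 3.8; it differs in other versions):
TypeError: can't set attributes of built-in/extension type 'bytes'
Assigning on an instance instead (foo.__eq__ = ...) is what gives AttributeError: 'bytes' object attribute '__eq__' is read-only.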
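To check what your own interpreter does, a minimal sketch like the following can help. The helper try_patch is a made-up name for this example. On the CPython versions I am aware of, assigning on the type is rejected with a TypeError and assigning on an instance with an AttributeError, but the exact messages differ between versions, so the sketch only reports the exception class.

```python
def try_patch(target):
    """Try to assign a new __eq__ on target.

    Returns the name of the exception class raised, or None if the
    assignment succeeded.
    """
    try:
        target.__eq__ = lambda *args: True
    except (TypeError, AttributeError) as exc:
        return type(exc).__name__
    return None

print(try_patch(bytes))   # patching the built-in type itself
print(try_patch(b'foo'))  # patching a single bytes instance
```

Either way the built-in comparison stays untouched, which is why the guard has to live in your own code.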
You may need to manually guard the comparison with something like:
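def str_eq_bytes(x, y):
    """Raise TypeError on a str/bytes comparison; otherwise return False."""
    if isinstance(x, str) and isinstance(y, bytes):
        raise TypeError("Comparison between `str` and `bytes` detected.")
    elif isinstance(x, bytes) and isinstance(y, str):
        raise TypeError("Comparison between `bytes` and `str` detected.")
    # Falsy result, so `str_eq_bytes(a, b) or a == b` falls through to ==.
    return False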
to be used as follows:
foo = 'foo'
if str_eq_bytes(foo, 'foo') or foo == 'foo':
    print("They match!")
# They match!
foo = 'bar'
if str_eq_bytes(foo, 'foo') or foo == 'foo':
    print("They match!")
# <nothing gets printed>
foo = b'foo'
if str_eq_bytes(foo, 'foo') or foo == 'foo':
    print("They match!")
TypeError: Comparison between `bytes` and `str` detected.
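If repeating the `or foo == 'foo'` part at every call site feels error-prone, one possible variation is to fold the check and the comparison into a single function. This is only a sketch: the name strict_eq is invented here, and treating bytearray like bytes is my own assumption, not something the guard above does.

```python
def strict_eq(x, y):
    """Return x == y, but raise TypeError when one side is str and the
    other is bytes or bytearray."""
    binary = (bytes, bytearray)  # assumption: guard bytearray as well
    mixed = (isinstance(x, str) and isinstance(y, binary)) or (
        isinstance(x, binary) and isinstance(y, str)
    )
    if mixed:
        raise TypeError(
            f"Comparison between `{type(x).__name__}` and "
            f"`{type(y).__name__}` detected."
        )
    return x == y

print(strict_eq('foo', 'foo'))  # True
print(strict_eq('bar', 'foo'))  # False
try:
    strict_eq(b'foo', 'foo')
except TypeError as exc:
    print(exc)  # Comparison between `bytes` and `str` detected.
```

The drawback is the same as before: it only protects comparisons that are explicitly routed through the helper.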
The other option would be to hack on your own fork of Python and override __eq__ there.
Note that PyPy also does not allow you to override methods of built-in types.
There is an option, -b, that you can pass to the Python interpreter to make it emit a warning (or, with -bb, an error) when comparing bytes with str.
> python --help
usage: /bin/python [option] ... [-c cmd | -m mod | file | -] [arg] ...
Options and arguments (and corresponding environment variables):
-b     : issue warnings about str(bytes_instance), str(bytearray_instance)
         and comparing bytes/bytearray with str. (-bb: issue errors)
With -bb, the comparison raises a BytesWarning as an exception, as seen here:
> python -bb -i
Python 3.8.0
Type "help", "copyright", "credits" or "license" for more information.
>>> v1 = b'foo'
>>> v2 = 'foo'
>>> v1 == v2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
BytesWarning: Comparison between bytes and string
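To see the effect of the flag without leaving a script (for example in a CI check), the comparison can be run in a child interpreter with and without -bb. This is a rough sketch that assumes a CPython 3.7+ interpreter that still supports the -b option; run_comparison is a made-up helper name.

```python
import subprocess
import sys

def run_comparison(*flags):
    """Run `b'foo' == 'foo'` in a fresh interpreter with the given flags."""
    return subprocess.run(
        [sys.executable, *flags, "-c", "print(b'foo' == 'foo')"],
        capture_output=True,
        text=True,
    )

plain = run_comparison()        # default: silently evaluates to False
strict = run_comparison("-bb")  # -bb: the comparison raises BytesWarning

print(plain.returncode, plain.stdout.strip())
print(strict.returncode, strict.stderr.strip().splitlines()[-1])

# Inside a running program, sys.flags.bytes_warning tells you which mode
# is active: 0 (off), 1 (-b) or 2 (-bb).
print(sys.flags.bytes_warning)
```

Running the test suite under `python -bb` is probably the most practical way to use this during a migration, since it needs no changes to the code being tested.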