We can convert numbers to strings using the built-in str() function. We pass either a number or a variable holding a number into the parentheses, and str() returns that numeric value as a string. For example, str(12) returns '12'; the quotes around the 12 signify that the value is no longer an integer but a string.
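As a minimal sketch of the conversion described above (the variable names and values are illustrative, not from the original):

```python
# Convert numeric values to strings with the built-in str().
count = 12     # an int
price = 3.5    # a float

count_text = str(count)
price_text = str(price)

print(repr(count_text))   # '12'  (the quotes show it is now a string)
print(repr(price_text))   # '3.5'
print(type(count_text))   # <class 'str'>

# Strings concatenate with +; "Items: " + count would raise TypeError.
message = "Items: " + count_text
print(message)            # Items: 12
```

The explicit conversion matters because Python does not implicitly mix str and int when concatenating.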
Using the decode() method
Python bytes objects provide a built-in decode() method, which converts bytes to a string. It takes the name of the encoding to use, which defaults to 'utf-8'.
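A small round-trip sketch of decode(), assuming UTF-8 data; the sample text is made up for illustration:

```python
# Round trip between text and bytes using UTF-8.
original = "caf\u00e9"              # "cafe" with an accented e (4 characters)
data = original.encode("utf-8")     # str -> bytes

print(data)                         # b'caf\xc3\xa9'
print(len(original), len(data))     # 4 5  (the accented letter takes 2 bytes)

text = data.decode("utf-8")         # bytes -> str
print(text == original)             # True
```

The length difference shows why bytes and text are distinct types: one character may occupy several bytes.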
You had it nearly right in the last line. You want
str(bytes_string, 'utf-8')
because the type of bytes_string is bytes, the same as the type of b'abc'.
Call decode() on a bytes instance to get the text which it encodes.
text = bytes_string.decode()
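The two spellings in the answers above should give the same result for valid UTF-8 input. A short sketch, reusing the bytes_string name from the answer with an assumed sample value:

```python
bytes_string = b"abc"   # sample value, assumed for illustration

via_str = str(bytes_string, "utf-8")    # str() with an explicit encoding
via_decode = bytes_string.decode()      # decode() defaults to UTF-8

print(type(bytes_string))    # <class 'bytes'>
print(via_str, via_decode)   # abc abc
print(via_str == via_decode) # True
```

Using a name such as text or via_decode for the result, rather than str, avoids shadowing the built-in str type.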
How to filter (skip) non-UTF8 characters from array?
To address this comment in @uname01's post and the OP, ignore the errors:
Code
>>> b'\x80abc'.decode("utf-8", errors="ignore")
'abc'
Details
From the docs, here are more examples using the same errors parameter:
>>> b'\x80abc'.decode("utf-8", "replace")
'\ufffdabc'
>>> b'\x80abc'.decode("utf-8", "backslashreplace")
'\\x80abc'
>>> b'\x80abc'.decode("utf-8", "strict")
Traceback (most recent call last):
...
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte
The errors argument specifies the response when the input string can’t be converted according to the encoding’s rules. Legal values for this argument are 'strict' (raise a UnicodeDecodeError exception), 'replace' (use U+FFFD, REPLACEMENT CHARACTER), or 'ignore' (just leave the character out of the Unicode result).
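The handlers shown above can be compared side by side. This is a sketch using the same input bytes as the docs examples; the results dictionary is just a convenience for this illustration:

```python
raw = b"\x80abc"   # 0x80 is not a valid UTF-8 start byte

# Try each non-raising handler and collect the decoded text.
results = {}
for handler in ("ignore", "replace", "backslashreplace"):
    results[handler] = raw.decode("utf-8", errors=handler)
    print(handler, ascii(results[handler]))

# 'strict' is the default and raises instead of returning text.
try:
    raw.decode("utf-8", errors="strict")
except UnicodeDecodeError as exc:
    print("strict raised:", exc.reason)
```

Which handler fits depends on the data: 'ignore' silently loses information, while 'replace' and 'backslashreplace' leave a visible marker where decoding failed.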
UPDATED: to not have the b prefix and the quotes at the start and end.
How to convert bytes to strings as seen, even in weird situations.
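As your data may contain bytes that are not valid in the 'utf-8' encoding, it can be simpler to use just str without any additional parameters. This returns the printable representation of the bytes object, such as b'...', so slicing with [2:-1] strips the leading b' and the trailing quote: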
some_bad_bytes = b'\x02-\xdfI#)'
text = str(some_bad_bytes)[2:-1]
print(text)
Output: \x02-\xdfI#)
If you add the 'utf-8' parameter for these specific bytes, you will get a UnicodeDecodeError.
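Since a Python 3 str is Unicode text, text is now an ordinary string made up only of ASCII characters and escape sequences, so it can be encoded as utf-8 with no concern.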
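To make this behaviour concrete, the sketch below checks that decoding these bytes as UTF-8 fails, then compares the slicing approach with the 'backslashreplace' handler from the docs examples. The comparison is offered as an alternative to consider, not as the answer's method, and the two results are not identical:

```python
some_bad_bytes = b'\x02-\xdfI#)'

# Strict UTF-8 decoding fails because 0xdf is not followed by a continuation byte.
try:
    str(some_bad_bytes, 'utf-8')
    failed = False
except UnicodeDecodeError:
    failed = True
print(failed)            # True

# Slicing the repr escapes every non-printable byte, including 0x02.
sliced = str(some_bad_bytes)[2:-1]
print(sliced)            # \x02-\xdfI#)

# backslashreplace escapes only the undecodable byte; 0x02 stays a real control character.
escaped = some_bad_bytes.decode('utf-8', errors='backslashreplace')
print(ascii(escaped))    # '\x02-\xdfI#)'
print(sliced == escaped) # False
```

The slicing approach always yields printable ASCII, whereas the handler keeps every byte that does decode as real text.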