
Python bytearray ignoring encoding?

Tags: python

I've got a chunk of code that reads binary data from a string buffer (a StringIO object) and tries to convert it to a bytearray object. It throws an error whenever a byte value is greater than 127, which the ASCII encoding can't handle, even though I'm trying to override the encoding:

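import zlib
from StringIO import StringIO  # Python 2

# Read a compressed chunk from the file and decompress it
file = open(filename, 'r+b')
file.seek(offset)
chunk = file.read(length)
chunk = zlib.decompress(chunk)
chunk = StringIO(chunk)  # wrap the raw bytes in a string buffer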

d = bytearray(chunk.read(10), encoding="iso8859-1", errors="replace")

Running that code gives me:

  d = bytearray(chunk.read(10), encoding="iso8859-1", errors="replace")
UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 in position 3: ordinal not in range(128)

Obviously 240 (0xf0 in decimal) can't fit in the ASCII range, but that's why I'm explicitly setting the encoding. The encoding argument seems to be ignored.

MidnightLightning asked Dec 22 '22 14:12


2 Answers

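When a string is converted to another encoding, its original encoding is assumed to be ASCII if it is a str, or Unicode if it is a unicode object. Passing an encoding to bytearray makes Python encode the string, and for a str that means first decoding it as ASCII, which is where the 'ascii' codec error comes from. When creating the bytearray, the encoding parameter is required only if the string is unicode. Just don't specify an encoding and you will get the results you want.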
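A minimal sketch of that rule, written for Python 3 as an assumption (the question uses Python 2, where str plays the role of Python 3's bytes and unicode the role of Python 3's str). The sample payload and the use of io.BytesIO in place of StringIO are illustrative and not taken from the question:

```python
import io
import zlib

# Hypothetical payload with bytes above 127, compressed as in the question.
raw = bytes([0x01, 0x02, 0x03, 0xF0, 0xFF, 0x7F])
compressed = zlib.compress(raw)

# Binary buffer; io.BytesIO stands in for the question's StringIO.
chunk = io.BytesIO(zlib.decompress(compressed))

# Byte data needs no encoding argument: each byte is copied as-is.
d = bytearray(chunk.read(10))

# Text data is the only case where an encoding is required.
t = bytearray("\u00f0", encoding="iso8859-1")
```

Here d keeps 0xF0 at index 3 untouched, because no codec is involved when the source is already a sequence of bytes.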

kindall answered Jan 07 '23 23:01


I am not quite sure what the problem is.

StringIO is for string IO, not for binary IO. If you want to get a bytearray representing the whole content of the file, use:

with open('filename', 'r') as file:
    bytes = bytearray(file.read())

If you want to get a string containing only the ASCII characters in that file, use:

with open('filename', 'r') as file:
    asciis = file.read().decode('ascii', 'ignore')

(If you run it on Windows, you will probably need the binary flag, i.e. mode 'rb', when opening the file.)
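Both variants can be sketched together as follows, assuming Python 3, where reading raw bytes requires mode 'rb' on every platform. The temporary file and its contents are hypothetical and exist only to make the example self-contained:

```python
import os
import tempfile

# Create a hypothetical sample file containing two non-ASCII bytes.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc\xf0\xffdef")
    path = tmp.name

try:
    # Open in binary mode so the bytes are read unchanged.
    with open(path, "rb") as f:
        data = f.read()
finally:
    os.remove(path)

# Whole content as a mutable byte sequence.
raw_bytes = bytearray(data)

# Text containing only the ASCII characters; other bytes are dropped.
ascii_text = data.decode("ascii", "ignore")
```

The 'ignore' error handler silently discards the bytes 0xF0 and 0xFF, so ascii_text holds only the six ASCII letters while raw_bytes still holds all eight bytes.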

Hyperboreus answered Jan 07 '23 23:01