
python 3.3: struct.pack won't accept strings

I'm trying to use struct.pack to write a padded string to a file but it seems with the 3.x interpreters this doesn't work anymore. An example of how I'm using it:

mystring = anotherString+" sometext here"
output = struct.pack("30s", mystring);

This seems to be okay in earlier versions of Python, but with 3 it produces an error demanding a bytes object. The docs seem to imply that it is supposed to convert any string to a UTF-8 bytes object without complaint (and I don't care if a multi-byte character happens to be truncated):

http://docs.python.org/release/3.1.5/library/struct.html: "The c, s and p conversion codes operate on bytes objects, but packing with such codes also supports str objects, which are encoded using UTF-8."

Am I misreading the docs? And how are others using struct.pack with strings?

asked Jun 20 '13 at 15:06 by akai


1 Answer

Yes, up to and including Python 3.1, struct.pack() would erroneously encode strings to UTF-8 bytes implicitly; this was fixed in Python 3.2. See issue 10783.

The conclusion was that the implicit conversion was a Bad Idea, and it was reverted while the developers still had a chance to do so:

I prefer to break the API today than having to maintain a broken API for 10 or 20 years :-) And we have a very small user base using Python 3, it's easier to change it now, than in the next release.

This is also documented in the porting section of the 3.2 What's New guide:

struct.pack() now only allows bytes for the s string pack code. Formerly, it would accept text arguments and implicitly encode them to bytes using UTF-8. This was problematic because it made assumptions about the correct encoding and because a variable-length encoding can fail when writing to fixed length segment of a structure.
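The failure mode described in that note can be reproduced with a short sketch. The sample text and the 5-byte field width are made up for illustration; the point is only that a fixed-width field can cut a multi-byte UTF-8 character in half.

```python
import struct

# Illustrative input: each "é" takes 2 bytes in UTF-8, so this is 6 bytes.
text = "ééé"
encoded = text.encode("utf-8")

# A 5-byte field truncates the data in the middle of the last character.
packed = struct.pack("5s", encoded)

# Strict decoding of the truncated field fails on the dangling lead byte.
try:
    packed.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

# Lenient decoding drops the incomplete trailing byte instead of raising.
lenient = packed.decode("utf-8", errors="ignore")

print(packed, decoded_ok, lenient)
```

If truncated characters are acceptable, as in the question, decoding with errors="ignore" (or errors="replace") on the reading side avoids the exception.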

You need to explicitly encode your strings before packing.
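A minimal sketch of what that looks like for the code in the question. The helper name pack_padded and the value of another_string are hypothetical, and UTF-8 is assumed as the encoding; pick whichever encoding your file format actually requires.

```python
import struct


def pack_padded(text, size=30, encoding="utf-8"):
    """Encode text explicitly, then pack it into a fixed-size field.

    The "s" format pads short values with NUL bytes and truncates long ones.
    pack_padded is a hypothetical helper, not part of the struct module.
    """
    return struct.pack("%ds" % size, text.encode(encoding))


another_string = "prefix"  # placeholder for the question's anotherString
my_string = another_string + " sometext here"

output = pack_padded(my_string)

# Reading the field back: strip the NUL padding, then decode.
round_trip = output.rstrip(b"\x00").decode("utf-8")

print(len(output), round_trip)
```

For pure ASCII text, my_string.encode("ascii") works just as well and makes the one-byte-per-character assumption explicit.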

answered Oct 13 '22 at 01:10 by Martijn Pieters