Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Serializing Python bytestrings to JSON, preserving ordinal character values

Tags:

python

json

I have some binary data produced as base-256 bytestrings in Python (2.x). I need to read these into JavaScript, preserving the ordinal value of each byte (char) in the string. If you'll allow me to mix languages, I want to encode a string s in Python such that ord(s[i]) == s.charCodeAt(i) after I've read it back into JavaScript.

The cleanest way to do this seems to be to serialize my Python strings to JSON. However, json.dump doesn't like my bytestrings, despite fiddling with the ensure_ascii and encoding parameters. Is there a way to encode bytestrings to Unicode strings that preserves ordinal character values? Otherwise I think I need to encode the characters above the ASCII range into JSON-style \u1234 escapes; but a codec like this does not seem to be among Python's codecs.

Is there an easy way to serialize Python bytestrings to JSON, preserving char values, or do I need to write my own encoder?

like image 280
Doctor J Avatar asked Apr 21 '10 02:04

Doctor J


2 Answers

Is there a way to encode bytestrings to Unicode strings that preserves ordinal character values?

The byte -> unicode transformation is called decode, not encode. But yes, decoding with a codec such as iso-8859-1 should indeed "preserve ordinal character values" as you wish.

like image 112
Alex Martelli Avatar answered Oct 05 '22 22:10

Alex Martelli


Could you just use Base64? (Python base64 module, Javascript has several implementations, one of which is here.)

No reason to use escaped ASCII or UTF-8 unless your data is almost all text.

like image 28
Mike DeSimone Avatar answered Oct 06 '22 00:10

Mike DeSimone