What charset to use for JSON with Base64-encoded binary data?

Which character encoding (UTF-8, UTF-16, or UTF-32) is the most space-efficient for JSON that mostly carries Base64-encoded binary data?

{ "data": "jA0EAwMCxamDRMfOGV5gyZPnyX1BB" }
asked Mar 17 '23 18:03 by Sebastian Barth


1 Answer

Base64 output consists solely of ASCII characters, so if the bulk of your JSON is Base64-encoded data, the most space-efficient encoding is UTF-8. UTF-8 encodes ASCII characters (code points U+0000–U+007F) in one byte each, whereas UTF-16 and UTF-32 use two and four bytes per character, respectively.
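To make the size difference concrete, here is a minimal Python sketch that measures one JSON document under each encoding. The payload, the `data` key, and the `encoded_sizes` helper are made up for illustration; the `-le` codec variants are used so that no byte order mark is added to the totals.

```python
import base64
import json


def encoded_sizes(text):
    """Return the byte length of `text` under each Unicode encoding.

    The little-endian codec names are used so Python does not prepend
    a byte order mark, which would add 2 or 4 bytes to the totals.
    """
    return {enc: len(text.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}


# Hypothetical binary payload: 1024 arbitrary bytes.
payload = bytes(range(256)) * 4

# Base64 yields ASCII-only text, so the whole JSON document is ASCII.
doc = json.dumps({"data": base64.b64encode(payload).decode("ascii")})

sizes = encoded_sizes(doc)
for name, size in sizes.items():
    print(f"{name}: {size} bytes")
```

Because every character in the document is ASCII, the UTF-16 result should come out at exactly twice the UTF-8 size and the UTF-32 result at four times. Base64 itself already inflates the raw bytes by about a third, so a wider encoding multiplies an overhead that is already there.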

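Beyond size, UTF-8 is the safer choice: it is the default encoding for JSON, and not all tools support the others. (RFC 8259, which obsoletes RFC 7159, goes further and requires UTF-8 for JSON exchanged between systems that are not part of a closed ecosystem.) From RFC 7159: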

8.1 Character Encoding

JSON text SHALL be encoded in UTF-8, UTF-16, or UTF-32. The default encoding is UTF-8, and JSON texts that are encoded in UTF-8 are interoperable in the sense that they will be read successfully by the maximum number of implementations; there are many implementations that cannot successfully read texts in other encodings (such as UTF-16 and UTF-32).

answered Apr 25 '23 04:04 by Jordan Running
