
Default encoding of HTTP POST request with JSON body

What's the default encoding of an HTTP POST request when the Content-Type is "application/json" and no explicit charset is given?

It seems the two specs are in conflict:

  • JSON spec says that "JSON text SHALL be encoded in Unicode. The default encoding is UTF-8."
  • HTTP spec says that "When no explicit charset parameter is provided by the sender, media subtypes of the "text" type are defined to have a default charset value of "ISO-8859-1" when received via HTTP."
Kwang Yul Seo asked Apr 21 '15 02:04


1 Answer

The application/json media type is formally defined in RFC 7158, The JavaScript Object Notation (JSON) Data Interchange Format (which obsoletes RFC 4627), and is registered with IANA as having no required or optional parameters (thus, charset is not defined for application/json).

Section 8.1, "Character Encoding", says:

JSON text SHALL be encoded in UTF-8, UTF-16, or UTF-32. The default encoding is UTF-8, and JSON texts that are encoded in UTF-8 are interoperable in the sense that they will be read successfully by the maximum number of implementations; there are many implementations that cannot successfully read texts in other encodings (such as UTF-16 and UTF-32).

Implementations MUST NOT add a byte order mark to the beginning of a JSON text. In the interests of interoperability, implementations that parse JSON texts MAY ignore the presence of a byte order mark rather than treating it as an error.

application/... media types are typically defined as binary formats. It is very easy for a JSON parser to differentiate between UTF-8, UTF-16, and UTF-32 just by looking at the first few bytes, so there is no need for a BOM (which is not allowed, as noted above) or an explicit charset (which is not defined).
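
As a rough illustration of that "first few bytes" check, here is a minimal Python sketch. It follows the null-byte pattern heuristic described in the older RFC 4627 (section 3) and assumes the text has no BOM and begins with an ASCII character such as { or [ or a quote. The function name detect_json_encoding is hypothetical, not part of any library.

```python
import json


def detect_json_encoding(data: bytes) -> str:
    """Guess the Unicode encoding of a BOM-less JSON text.

    Assumption: the first character is ASCII, so the position of zero
    bytes in the first four bytes reveals the encoding.
    """
    head = data[:4]
    if len(head) >= 4:
        if head[0] == 0 and head[1] == 0 and head[2] == 0:
            return "utf-32-be"  # pattern: 00 00 00 xx
        if head[1] == 0 and head[2] == 0 and head[3] == 0:
            return "utf-32-le"  # pattern: xx 00 00 00
    if len(head) >= 2:
        if head[0] == 0:
            return "utf-16-be"  # pattern: 00 xx
        if head[1] == 0:
            return "utf-16-le"  # pattern: xx 00
    return "utf-8"  # no zero bytes near the start


# Usage: decode a raw request body whose Content-Type carries no charset.
body = json.dumps({"name": "café"}, ensure_ascii=False).encode("utf-16-le")
encoding = detect_json_encoding(body)
payload = json.loads(body.decode(encoding))
```

In practice a hand-rolled check like this is rarely needed: Python's own json.loads, for example, accepts bytes encoded in UTF-8, UTF-16, or UTF-32 directly (since version 3.6).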

Remy Lebeau answered Nov 15 '22 03:11