This is a follow-up to "Are HTTP headers case-sensitive?".
In the HTTP Content-Type header, I have seen character set names expressed in both upper- and lower-case form. For example, for the UTF-8 character set:
Content-Type: text/html; charset=UTF-8
Content-Type: text/html; charset=utf-8
Here are some mixed-case variants (the latter two certainly not being likely in the real world):
Content-Type: text/html; charset=Utf-8
Content-Type: text/html; charset=UtF-8
Content-Type: text/html; charset=uTf-8
Are all forms equally valid? Or, are the client and server applications that ignore the case of the character set name merely being flexible? Alternatively, are those applications that recognize only one representation non-compliant?
The type, subtype, and parameter names are not case sensitive.
All HTTP headers are converted from ISO-8859-1 (the character set for HTTP headers as defined in the RFC publications) to UTF-8 in the metadata (and vice versa for requests). All HTTP header keys are converted to lowercase in both directions (since HTTP header keys are defined to be case-insensitive).
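The conversion described above can be sketched roughly as follows. This is only an illustration under assumptions: the function name normalize_headers and the input shape (a list of raw byte pairs) are hypothetical and do not come from any particular library.

```python
from typing import Dict, Iterable, Tuple


def normalize_headers(raw_headers: Iterable[Tuple[bytes, bytes]]) -> Dict[str, str]:
    """Decode raw header bytes as ISO-8859-1 and lowercase the keys.

    Hypothetical helper: the name and input shape are assumptions
    made for this sketch.
    """
    normalized: Dict[str, str] = {}
    for key, value in raw_headers:
        # Header names are case-insensitive, so fold them to lowercase.
        name = key.decode("iso-8859-1").lower()
        # Values keep their case; only the byte encoding is interpreted.
        # A later header with the same name overwrites an earlier one here.
        normalized[name] = value.decode("iso-8859-1")
    return normalized


headers = normalize_headers([
    (b"Content-Type", b"text/html; charset=UTF-8"),
    (b"X-Name", b"caf\xe9"),
])
```

The decoded values are ordinary Python strings; calling .encode("utf-8") on one yields the UTF-8 bytes. Note that only the header name is lowercased, so the charset value keeps whatever case the sender used.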
HTTP headers are case-insensitive. To simplify your code, URL Loading System canonicalizes certain header field names into their standard form. For example, if the server sends a content-length header, it's automatically adjusted to be Content-Length.
The value for charset is case-insensitive. The charset attribute specifies the character encoding used by the document. This is a character encoding declaration. If the attribute is present, its value must be an ASCII case-insensitive match for the string "utf-8".
[Here is the result of my research.]
RFC 2616 clause 3.4 says the following:
HTTP character sets are identified by case-insensitive tokens. The complete set of tokens is defined by the IANA Character Set registry [19].
charset = token
The IANA Character Set registry is now maintained on the IANA website. At the very top of that document, under Note, the second paragraph reads:
The character set names may be up to 40 characters taken from the printable characters of US-ASCII. However, no distinction is made between use of upper and lower case letters.
Conclusion: These two references indicate that case does not matter when specifying a character set name.
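In practice, a receiver can honor this by comparing the charset value without regard to case. Here is a minimal sketch using Python's standard library; the helper names charset_of and same_charset are made up for illustration.

```python
from email.message import Message
from typing import Optional


def charset_of(content_type: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() coerces the charset name to lower case.
    return msg.get_content_charset()


def same_charset(a: str, b: str) -> bool:
    """Compare two charset names while ignoring letter case."""
    # Charset names are drawn from US-ASCII, so lower() is sufficient.
    return a.lower() == b.lower()


variants = [
    "text/html; charset=UTF-8",
    "text/html; charset=utf-8",
    "text/html; charset=uTf-8",
]
charsets = {charset_of(v) for v in variants}
```

All three variants collapse to the single name utf-8, which is the behavior the two references call for.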