I tried to find this in the relevant RFC, IETF RFC 3986, but couldn't figure it out.
Do URIs for HTTP allow Unicode, or non-ASCII of any kind?
Can you please cite the section and the RFC that supports your answer?
NB: For those who might think this is not programming related - it is. It's related to an ISAPI filter I'm building.
Addendum
I've read section 2.5 of RFC 3986. But RFC 2616, which I believe is the current HTTP protocol, predates 3986, and for that reason I'd suppose it cannot be compliant with 3986. Furthermore, if and when the HTTP RFC is updated, there will still be the issue of rationalization. In other words, does an HTTP URI support ALL of the RFC 3986 provisos, including whatever is needed to include non-US-ASCII characters?
URLs can only be sent over the Internet using the ASCII character set. Since URLs often contain characters outside the ASCII set, the URL has to be converted into a valid ASCII format. URL encoding replaces each unsafe or non-ASCII character with one or more "%XX" sequences: a "%" followed by two hexadecimal digits.
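As a rough illustration of the conversion described above, here is a minimal Python sketch using the standard library's `urllib.parse.quote`. The helper name `percent_encode_path` and the sample path are assumptions made for this example; the choice of UTF-8 as the byte encoding is `quote`'s default, not something the quoted text specifies.

```python
from urllib.parse import quote, unquote


def percent_encode_path(path: str) -> str:
    """Percent-encode a URL path (hypothetical helper for illustration).

    Characters outside the unreserved ASCII set are first encoded as
    UTF-8 bytes, then each byte is written as "%" plus two hex digits.
    The "/" separator is kept as-is.
    """
    return quote(path, safe="/")


encoded = percent_encode_path("/wiki/café")
print(encoded)           # ASCII-only form of the path
print(unquote(encoded))  # decodes back to the original text
```

The encoded form contains only ASCII characters, so it can be placed in an HTTP request line; the receiver reverses the step with `unquote`.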
JSON allows for both escaped or non-escaped non-ascii characters. It'd be useful for this document to include guidance on which style is preferred, or if there is no preference.
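To make the two JSON styles concrete, here is a small sketch using Python's standard `json` module. The sample string is an assumption for illustration; the sketch only shows that both styles decode to the same value and does not say which one any particular document should prefer.

```python
import json

text = "café"

# Default: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps(text)

# ensure_ascii=False keeps the character as-is in the output string.
literal = json.dumps(text, ensure_ascii=False)

print(escaped)
print(literal)

# Both forms parse back to the same Python string.
print(json.loads(escaped) == json.loads(literal))
```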
Non-ASCII filenames are stored in a special format called “Unicode”. But in some cases, Unicode offers multiple ways to write things that look exactly the same to humans.
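The "multiple ways to write the same thing" point can be sketched with Python's standard `unicodedata` module. The helper name `same_after_nfc` is hypothetical, and NFC is only one of several normalization forms; which form a given file system or tool applies is not stated in the text above.

```python
import unicodedata

composed = "\u00e9"     # "é" as a single code point
decomposed = "e\u0301"  # "e" followed by a combining acute accent


def same_after_nfc(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC (illustrative helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(composed == decomposed)                # the raw strings differ
print(len(composed), len(decomposed))        # different code point counts
print(same_after_nfc(composed, decomposed))  # equal once normalized
```

The same consideration applies to URLs: two visually identical paths can percent-encode to different byte sequences unless they are normalized first.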
http://en.wikipedia.org/wiki/Internationalized_domain_name
No, they are not allowed in raw form; non-ASCII characters have to be percent-encoded. Just check the ABNF in RFC 3986 (Appendix A collects it).
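A quick way to see what the ABNF admits is to check a string against the character classes it defines (unreserved, gen-delims, sub-delims, and percent-encoded triplets). The sketch below is only a character-level check under that reading, not a full parser of the RFC 3986 grammar; the function name and the example URLs are assumptions for illustration.

```python
import re
import string

# Character classes from the RFC 3986 ABNF.
UNRESERVED = string.ascii_letters + string.digits + "-._~"
GEN_DELIMS = ":/?#[]@"
SUB_DELIMS = "!$&'()*+,;="
ALLOWED = set(UNRESERVED + GEN_DELIMS + SUB_DELIMS)

# pct-encoded = "%" HEXDIG HEXDIG
PCT_ENCODED = re.compile(r"%[0-9A-Fa-f]{2}")


def uses_only_rfc3986_characters(uri: str) -> bool:
    """Return True if every character is one the ABNF can produce.

    Well-formed percent-encoded triplets are removed first; any leftover
    "%" or non-ASCII character then fails the membership test. This does
    not verify that the characters appear in valid positions.
    """
    remainder = PCT_ENCODED.sub("", uri)
    return all(ch in ALLOWED for ch in remainder)


print(uses_only_rfc3986_characters("http://example.com/caf%C3%A9"))
print(uses_only_rfc3986_characters("http://example.com/café"))
```

A check like this could serve as a cheap pre-filter before handing a request URL to a real URI parser.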