 

Illegal characters in HTTP headers

I'm creating an HttpUrlConnection and need to set multiple custom headers.

I'd like to do something along the lines of the following, but the contents of the header map need to come from a single string. Are there any characters that are illegal, or at least extremely rare, in both HTTP header names and HTTP header values?

HashMap<String, String> headers = new HashMap<String, String>();

// TODO: How can I fill the headers map reliably from a single string?

HttpURLConnection c = (HttpURLConnection) url.openConnection();
for(Map.Entry<String, String> e : headers.entrySet())
    c.setRequestProperty(e.getKey(), e.getValue());

Solution for now

It doesn't seem like any HTTP header names contain spaces (they usually use a dash instead?), so I can separate the name from the value with a single space. As for separating the name-value pairs from each other, it seems I'm screwed, since the value can contain pretty much anything according to the given answer. So I've just picked a character I'm pretty sure will never be used: §. If it turns out it is actually needed, I'll just have to adjust my code :p

Header1 Value1§Header2 Value2§Header3 Value3
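A minimal sketch of how a string in this format could be turned into the headers map. The class and method names (HeaderStringParser, parse) are hypothetical and not part of the original post. The sketch assumes pairs are separated by § and that the first space in each pair separates the name from the value, so values may contain further spaces.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper, not from the original post.
final class HeaderStringParser {
    // The section sign, written as an escape so the source file encoding does not matter.
    private static final String PAIR_SEPARATOR = "\u00A7";

    // Splits "Name Value<sep>Name Value" into an insertion-ordered map.
    static Map<String, String> parse(String packed) {
        Map<String, String> headers = new LinkedHashMap<>();
        if (packed == null || packed.isEmpty()) {
            return headers;
        }
        for (String pair : packed.split(Pattern.quote(PAIR_SEPARATOR))) {
            if (pair.isEmpty()) {
                continue; // tolerate a doubled or trailing separator
            }
            // Header names cannot contain spaces, so the first space ends the name.
            int space = pair.indexOf(' ');
            if (space <= 0) {
                throw new IllegalArgumentException("Missing name/value separator in: " + pair);
            }
            headers.put(pair.substring(0, space), pair.substring(space + 1));
        }
        return headers;
    }
}
```

The resulting map can be passed straight to the setRequestProperty loop from the question. A value that itself contains § would be split in the wrong place, which is the known limitation of this scheme.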
Svish asked Sep 26 '13 12:09




1 Answer

The relevant BNF from RFC 7230 is:

field-name = token

token = 1*tchar

tchar = "!" / "#" / "$" / "%" / "&" / "'" / "*" / "+" / "-" / 
        "." / "^" / "_" / "`" / "|" / "~" / DIGIT / ALPHA

The character set is visible US-ASCII.

RFC 7230 is more recent than your question, but in the relevant particulars, it does not change what was formerly said by RFC 2616.
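The grammar above can be checked mechanically before a name is handed to setRequestProperty. The following is a sketch only; FieldNameValidator and its methods are hypothetical names, and the symbol list is copied from the tchar rule quoted above.

```java
// Hypothetical helper that checks a header name against the token rule.
final class FieldNameValidator {
    // The punctuation characters listed in the tchar rule.
    private static final String TCHAR_SYMBOLS = "!#$%&'*+-.^_`|~";

    // tchar = one of the symbols above / DIGIT / ALPHA (ASCII only)
    static boolean isTchar(char c) {
        return (c >= '0' && c <= '9')
            || (c >= 'A' && c <= 'Z')
            || (c >= 'a' && c <= 'z')
            || TCHAR_SYMBOLS.indexOf(c) >= 0;
    }

    // field-name = token, and token = 1*tchar, so at least one character is required.
    static boolean isValidFieldName(String name) {
        if (name == null || name.isEmpty()) {
            return false;
        }
        for (int i = 0; i < name.length(); i++) {
            if (!isTchar(name.charAt(i))) {
                return false;
            }
        }
        return true;
    }
}
```

Note that space, colon, and § all fail this check, so any of them is safe as a delimiter after a field name. None of them is excluded from field values by this rule.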

There's a very strong convention for field names that is much more restrictive than what the RFC allows, and it is enforced to varying degrees by various implementations. Field names usually consist of a sequence of words made up of ASCII letters and digits, with the first letter (only) of each word capitalised. The words are separated by a single hyphen.

So if, for example, "HttpUrlConnection" were supposed to be an HTTP header name (rather than a Java token), you'd call it 'Http-Url-Connection'.

I dimly remember once tracking a bug down to some implementation being strict enough not to admit multiple capitals in one word (which happened to be an acronym). I.e. it pays to follow this more restricted format very strictly.
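As an illustration of that convention, here is a rough sketch that rewrites a name into hyphen-separated words with only the first letter of each word capitalised. HeaderNameStyle, normalize, and fromCamelCase are hypothetical names. The camel-case splitting is deliberately naive: it starts a new word at every uppercase letter.

```java
import java.util.Locale;

// Hypothetical helper illustrating the conventional header name style.
final class HeaderNameStyle {
    // "content-type" or "x-HTTP-method" style input becomes Word-Word-Word.
    static String normalize(String name) {
        StringBuilder out = new StringBuilder();
        for (String word : name.split("-")) {
            if (word.isEmpty()) {
                continue; // skip empty pieces from doubled or leading hyphens
            }
            if (out.length() > 0) {
                out.append('-');
            }
            out.append(word.substring(0, 1).toUpperCase(Locale.ROOT));
            out.append(word.substring(1).toLowerCase(Locale.ROOT));
        }
        return out.toString();
    }

    // Inserts a hyphen before every uppercase letter except the first character,
    // then applies normalize().
    static String fromCamelCase(String camel) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < camel.length(); i++) {
            char c = camel.charAt(i);
            if (i > 0 && Character.isUpperCase(c)) {
                out.append('-');
            }
            out.append(c);
        }
        return normalize(out.toString());
    }
}
```

With this sketch, fromCamelCase("HttpUrlConnection") gives "Http-Url-Connection", and normalize("x-HTTP-method") lowers the acronym to give "X-Http-Method", which avoids multiple capitals in one word.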

  • Non-ASCII character sets play no part in field names, though they may be used in field values.

  • Escaping in field names is not supported by the standard. Escaping of values is not the concern of the HTTP or MIME standards, but you could choose to reuse the standard URL encoding method for encoding a set of name-value pairs.
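The URL-encoding suggestion could look roughly like the sketch below, which packs the pairs as name=value joined by & using java.net.URLEncoder and unpacks them with java.net.URLDecoder. HeaderCodec and its methods are hypothetical names, and UTF-8 is an assumed charset. Because = and & inside names or values are percent-encoded, no delimiter collisions can occur.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: packs header pairs into one URL-encoded string and back.
final class HeaderCodec {
    private static final String UTF8 = "UTF-8"; // assumed charset

    static String encode(Map<String, String> headers) {
        StringBuilder out = new StringBuilder();
        try {
            for (Map.Entry<String, String> e : headers.entrySet()) {
                if (out.length() > 0) {
                    out.append('&');
                }
                out.append(URLEncoder.encode(e.getKey(), UTF8))
                   .append('=')
                   .append(URLEncoder.encode(e.getValue(), UTF8));
            }
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex); // UTF-8 is always available
        }
        return out.toString();
    }

    static Map<String, String> decode(String packed) {
        Map<String, String> headers = new LinkedHashMap<>();
        try {
            for (String pair : packed.split("&")) {
                if (pair.isEmpty()) {
                    continue; // empty input or stray separator
                }
                int eq = pair.indexOf('=');
                if (eq < 0) {
                    throw new IllegalArgumentException("Missing '=' in: " + pair);
                }
                headers.put(URLDecoder.decode(pair.substring(0, eq), UTF8),
                            URLDecoder.decode(pair.substring(eq + 1), UTF8));
            }
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex);
        }
        return headers;
    }
}
```

The trade-off compared with a hand-picked delimiter is that the packed string is less readable, but any character, including § and spaces, survives the round trip.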

mc0e answered Oct 07 '22 11:10