
How to change encoding of UserAgent in HttpWebRequest?

Today I ran into a problem with the encoding of the User-Agent header when I tried to send a POST request with HttpWebRequest.

Normally, a User-Agent consists of Latin letters and punctuation. However, I need to simulate the web requests of an iOS app whose UA contains some Unicode (specifically Chinese) characters.

Using Fiddler to capture the raw request, I found that the app used Unicode encoding in its UA. I couldn't POST it from C#; I got this error:

You may say I shouldn't try to send Unicode in the User-Agent, but it is really important for my project. For now I can only simulate the app's request without those UA bytes.

How can I change the encoding of the UA?

Patrick Wu asked Nov 03 '22 17:11



1 Answer

According to the standards (RFC 2616 (HTTP/1.1), sec. 2.2, 3.8, 14.43, and RFC 2047 (MIME, part 3), sec. 4, 5), you can't use any encoding other than ISO-8859-1 for an HTTP header field like User-Agent.
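To see which User-Agent strings fall outside that limit, a quick check is to try encoding the value as ISO-8859-1. This is only a minimal sketch in Python; the helper name fits_iso_8859_1 is made up for illustration and is not part of any library.

```python
def fits_iso_8859_1(value: str) -> bool:
    """Return True if every character of value can be encoded as ISO-8859-1."""
    try:
        value.encode("iso-8859-1")
    except UnicodeEncodeError:
        return False
    return True


# A plain Latin UA passes; the euro sign (U+20AC) is not in ISO-8859-1.
print(fits_iso_8859_1("Mozilla/5.0 (iPhone)"))
print(fits_iso_8859_1("Million-€-Browser"))
```

Any value for which this returns False, such as a UA containing Chinese characters, cannot be sent as-is under the rule above.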

However, you can apply the encoding scheme of RFC 2047 to map Unicode strings onto ISO-8859-1 strings. In a nutshell, you wrap your text with a charset identifier and replace each Unicode code point with the hex values of its octet sequence in the chosen encoding.

Example:

User-Agent: Million-€-Browser becomes User-Agent: =?utf-8?q?Million-=e2=82=ac-Browser?=, with e2 82 ac being the UTF-8 octet sequence of the euro symbol.
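The substitution can be sketched in a few lines of Python. This is an illustration of the RFC 2047 "Q" encoding under some assumptions, not a complete implementation: the names SAFE and q_encode_word are hypothetical, the safe-character set is deliberately conservative, and the 75-character limit that RFC 2047 places on a single encoded-word is not enforced. The sketch writes hex digits in upper case, as RFC 2047 recommends; that is equivalent to the lower-case form in the example. A receiving server is not obliged to decode such a value, so whether this helps in a given case has to be tested.

```python
# Octets in this set are copied unchanged; every other octet becomes =XX.
# The set is a conservative choice for this sketch: letters, digits, !*+-/
SAFE = frozenset(
    b"abcdefghijklmnopqrstuvwxyz"
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    b"0123456789!*+-/"
)


def q_encode_word(text: str, charset: str = "utf-8") -> str:
    """Wrap text as a single RFC 2047 'Q' encoded-word."""
    out = []
    for octet in text.encode(charset):
        if octet in SAFE:
            out.append(chr(octet))
        elif octet == 0x20:
            out.append("_")  # RFC 2047 lets "_" stand for a space
        else:
            out.append(f"={octet:02X}")  # hex value of the octet
    return f"=?{charset}?q?{''.join(out)}?="


print("User-Agent: " + q_encode_word("Million-€-Browser"))
# prints: User-Agent: =?utf-8?q?Million-=E2=82=AC-Browser?=
```

The result contains only ASCII characters, so it can be assigned to a header value that rejects non-Latin text.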

collapsar answered Nov 10 '22 01:11
