
UriBuilder().Query will wrongly encode non-ASCII characters

I am working on an ASP.NET MVC 4 web application targeting .NET 4.5. I have the following code, which uses WebClient:

using (var client = new WebClient())
{
    var query = HttpUtility.ParseQueryString(string.Empty);
    query["model"] = Model;
    // code goes here for other parameters....

    string apiurl = System.Web.Configuration.WebConfigurationManager.AppSettings["ApiURL"];
    var url = new UriBuilder(apiurl);
    url.Query = query.ToString();

    string xml = client.DownloadString(url.ToString());
    XmlDocument doc = new XmlDocument();
    // code goes here ....
}

I have noticed a problem when one of the parameters contains non-ASCII characters such as £ or ¬.

The final query string encodes any non-ASCII character (such as £) incorrectly (as %u00a3). I read about this problem, and it seems I can replace:

url.Query = query.ToString(); 

with

url.Query = Uri.EscapeUriString(HttpUtility.UrlDecode(query.ToString()));

The latter approach encodes £ as %C2%A3, which is the correct encoded value.

The problem with url.Query = Uri.EscapeUriString(HttpUtility.UrlDecode(query.ToString())); is that if one of the parameter values contains &, the URL ends up in the form &operation=AddAsset&assetName=&.... The API then assumes I am passing an empty assetName parameter rather than the value &.

EDIT

Let me summarize my problem again. I want to be able to pass the following three kinds of characters inside my URL to a third-party API:

  1. Standard characters such as A, B, a, b, 1, 2, 3 ...

  2. Non-ASCII characters such as £ and ¬.

  3. Special characters that have a meaning in URL encoding, such as & and +.

I tried the following two approaches:

Approach A:

using (var client = new WebClient())
{
    var query = HttpUtility.ParseQueryString(string.Empty);
    query["model"] = Model;
    // code goes here for other parameters....

    string apiurl = System.Web.Configuration.WebConfigurationManager.AppSettings["ApiURL"];
    var url = new UriBuilder(apiurl);
    url.Query = query.ToString();

    string xml = client.DownloadString(url.ToString());
    XmlDocument doc = new XmlDocument();
    // code goes here ....
}

With this approach I can pass values such as & and +, since they are URL-encoded. Non-ASCII characters, however, are encoded in the non-standard %uXXXX form rather than as UTF-8 bytes. So if I have a £ value, the code above encodes it as %u00a3, and the third-party API saves it as %u00a3 instead of £.

Approach B:

I use:

url.Query = Uri.EscapeUriString(HttpUtility.UrlDecode(query.ToString()));  

instead of

url.Query = query.ToString(); 

Now I can pass non-ASCII characters such as £, since they are encoded correctly as UTF-8. But I cannot pass values such as &, because the third-party API will misread my URL. For example, if I want to pass assetName=&, my URL will look as follows:

&operation=Add&assetName=& 

So the third-party API will assume I am passing an empty assetName, while I am trying to pass & as its value.

How can I pass both non-ASCII characters and characters such as & and +?

asked Aug 10 '16 15:08 by john Gu


1 Answer

You could use System.Net.Http.FormUrlEncodedContent instead.

It works with a Dictionary for the name/value pairs and, unlike the NameValueCollection returned by HttpUtility.ParseQueryString, it does not map characters such as £ to an unhelpful escape (%u00a3, in your case).

FormUrlEncodedContent takes the dictionary in its constructor. When you read the string out of it, the dictionary keys and values come back properly URL-encoded.

It will correctly and uniformly handle both of the cases you were having trouble with:

  • £ (a non-ASCII character, which has to be percent-encoded as its UTF-8 bytes in order to be transported)
  • & (which, as you say, has meaning in the URL as a parameter separator, so a value cannot contain it literally and it has to be encoded as well).
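To make the two cases concrete, here is a minimal sketch that escapes individual values with Uri.EscapeDataString. This is a different technique from FormUrlEncodedContent, shown only to illustrate what a correct per-value encoding looks like; the class and method names (EscapingSketch, EscapeValue) are hypothetical. By contrast, Uri.EscapeUriString is designed for whole URIs and leaves reserved characters such as & untouched, which is why Approach B loses the value.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class EscapingSketch
{
    // Escapes a single query-string value: non-ASCII characters become
    // percent-encoded UTF-8 bytes, and reserved characters such as & and +
    // are escaped too, so they cannot be mistaken for separators.
    public static string EscapeValue(string value)
    {
        return Uri.EscapeDataString(value);
    }

    public static void Main()
    {
        Console.WriteLine(EscapeValue("£")); // %C2%A3
        Console.WriteLine(EscapeValue("&")); // %26
        Console.WriteLine(EscapeValue("+")); // %2B
    }
}
```

Note that the escaping is applied to each value separately, before the pairs are joined with &. Escaping the already-joined query string cannot tell a separator from a value.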

Here's a code example using FormUrlEncodedContent which shows that the various kinds of items you mentioned (represented by item1, item2 and item3) now end up correctly URL-encoded:

String item1 = "£";
String item2 = "&";
String item3 = "xyz";

Dictionary<string,string> queryDictionary = new Dictionary<string, string>()
{
    {"item1", item1},
    {"item2", item2},
    {"item3", item3}
};

var queryString = new System.Net.Http.FormUrlEncodedContent(queryDictionary)
        .ReadAsStringAsync().Result;

queryString will contain item1=%C2%A3&item2=%26&item3=xyz.
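To connect this back to the code in the question, the encoded string can be assigned to UriBuilder.Query in place of query.ToString(). The following is a rough sketch under some assumptions: the ApiUrlSketch class, the BuildUrl method and the example.com URL are hypothetical, and the project is assumed to reference the System.Net.Http assembly.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;

// Hypothetical helper, for illustration only.
public static class ApiUrlSketch
{
    // Builds the request URL from a base URL and name/value pairs,
    // letting FormUrlEncodedContent do the encoding.
    public static string BuildUrl(
        string baseUrl,
        IEnumerable<KeyValuePair<string, string>> parameters)
    {
        using (var content = new FormUrlEncodedContent(parameters))
        {
            // The content is already in memory, so reading it is cheap.
            string query = content.ReadAsStringAsync().Result;

            // Assign the query without a leading '?'.
            var builder = new UriBuilder(baseUrl) { Query = query };
            return builder.Uri.AbsoluteUri;
        }
    }

    public static void Main()
    {
        var parameters = new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>("operation", "AddAsset"),
            new KeyValuePair<string, string>("assetName", "£&")
        };

        // "http://example.com/api" is a placeholder, not the real API URL.
        Console.WriteLine(BuildUrl("http://example.com/api", parameters));
    }
}
```

The resulting string could then be passed to client.DownloadString in place of url.ToString(). A list of pairs is used here rather than a Dictionary only to make the parameter order explicit.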

answered Sep 18 '22 18:09 by DWright