
URL Encoding using C#

Tags: c#, .net, urlencode

I have an application which sends a POST request to the VB forum software and logs someone in (without setting cookies or anything).

Once the user is logged in I create a variable that creates a path on their local machine.

c:\tempfolder\date\username

The problem is that some usernames cause an "Illegal chars" exception. For example, if my username were mas|fenix, it would throw an exception.

Path.Combine( _
    Environment.GetFolderPath(System.Environment.SpecialFolder.CommonApplicationData), _
    DateTime.Now.ToString("ddMMyyhhmm") + "-" + form1.username)

I don't want to remove the character from the string, but a folder with their username is also created through FTP on a server. This leads to my second question: if I am creating a folder on the server, can I leave the "illegal chars" in? I only ask because the server is Linux-based, and I am not sure whether Linux accepts them.

EDIT: It seems that URL encode is NOT what I want.. Here's what I want to do:

old username = mas|fenix
new username = mas%xxfenix

Where %xx is the ASCII value or any other value that would easily identify the character.

asked Feb 22 '09 18:02 by masfenix



2 Answers

I've been experimenting with the various methods .NET provides for URL encoding. Perhaps the following table will be useful (it is the output of a test app I wrote):

Unencoded UrlEncoded UrlEncodedUnicode UrlPathEncoded EscapedDataString EscapedUriString HtmlEncoded HtmlAttributeEncoded HexEscaped
A         A          A                 A              A                 A                A           A                    %41
B         B          B                 B              B                 B                B           B                    %42

a         a          a                 a              a                 a                a           a                    %61
b         b          b                 b              b                 b                b           b                    %62

0         0          0                 0              0                 0                0           0                    %30
1         1          1                 1              1                 1                1           1                    %31

[space]   +          +                 %20            %20               %20              [space]     [space]              %20
!         !          !                 !              !                 !                !           !                    %21
"         %22        %22               "              %22               %22              &quot;      &quot;               %22
#         %23        %23               #              %23               #                #           #                    %23
$         %24        %24               $              %24               $                $           $                    %24
%         %25        %25               %              %25               %25              %           %                    %25
&         %26        %26               &              %26               &                &amp;       &amp;                %26
'         %27        %27               '              '                 '                &#39;       &#39;                %27
(         (          (                 (              (                 (                (           (                    %28
)         )          )                 )              )                 )                )           )                    %29
*         *          *                 *              %2A               *                *           *                    %2A
+         %2b        %2b               +              %2B               +                +           +                    %2B
,         %2c        %2c               ,              %2C               ,                ,           ,                    %2C
-         -          -                 -              -                 -                -           -                    %2D
.         .          .                 .              .                 .                .           .                    %2E
/         %2f        %2f               /              %2F               /                /           /                    %2F
:         %3a        %3a               :              %3A               :                :           :                    %3A
;         %3b        %3b               ;              %3B               ;                ;           ;                    %3B
<         %3c        %3c               <              %3C               %3C              &lt;        &lt;                 %3C
=         %3d        %3d               =              %3D               =                =           =                    %3D
>         %3e        %3e               >              %3E               %3E              &gt;        >                    %3E
?         %3f        %3f               ?              %3F               ?                ?           ?                    %3F
@         %40        %40               @              %40               @                @           @                    %40
[         %5b        %5b               [              %5B               %5B              [           [                    %5B
\         %5c        %5c               \              %5C               %5C              \           \                    %5C
]         %5d        %5d               ]              %5D               %5D              ]           ]                    %5D
^         %5e        %5e               ^              %5E               %5E              ^           ^                    %5E
_         _          _                 _              _                 _                _           _                    %5F
`         %60        %60               `              %60               %60              `           `                    %60
{         %7b        %7b               {              %7B               %7B              {           {                    %7B
|         %7c        %7c               |              %7C               %7C              |           |                    %7C
}         %7d        %7d               }              %7D               %7D              }           }                    %7D
~         %7e        %7e               ~              ~                 ~                ~           ~                    %7E

Ā         %c4%80     %u0100            %c4%80         %C4%80            %C4%80           Ā           Ā                    [OoR]
ā         %c4%81     %u0101            %c4%81         %C4%81            %C4%81           ā           ā                    [OoR]
Ē         %c4%92     %u0112            %c4%92         %C4%92            %C4%92           Ē           Ē                    [OoR]
ē         %c4%93     %u0113            %c4%93         %C4%93            %C4%93           ē           ē                    [OoR]
Ī         %c4%aa     %u012a            %c4%aa         %C4%AA            %C4%AA           Ī           Ī                    [OoR]
ī         %c4%ab     %u012b            %c4%ab         %C4%AB            %C4%AB           ī           ī                    [OoR]
Ō         %c5%8c     %u014c            %c5%8c         %C5%8C            %C5%8C           Ō           Ō                    [OoR]
ō         %c5%8d     %u014d            %c5%8d         %C5%8D            %C5%8D           ō           ō                    [OoR]
Ū         %c5%aa     %u016a            %c5%aa         %C5%AA            %C5%AA           Ū           Ū                    [OoR]
ū         %c5%ab     %u016b            %c5%ab         %C5%AB            %C5%AB           ū           ū                    [OoR]

The columns represent encodings as follows:

  • UrlEncoded: HttpUtility.UrlEncode

  • UrlEncodedUnicode: HttpUtility.UrlEncodeUnicode

  • UrlPathEncoded: HttpUtility.UrlPathEncode

  • EscapedDataString: Uri.EscapeDataString

  • EscapedUriString: Uri.EscapeUriString

  • HtmlEncoded: HttpUtility.HtmlEncode

  • HtmlAttributeEncoded: HttpUtility.HtmlAttributeEncode

  • HexEscaped: Uri.HexEscape

NOTES:

  1. HexEscape can only handle characters with a code point of 255 or below. It therefore throws an ArgumentOutOfRangeException for the Latin Extended-A characters (e.g. Ā), shown as [OoR] in the table.

  2. This table was generated in .NET 4.0 (see Levi Botelho's comment below that says the encoding in .NET 4.5 is slightly different).
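The limitation in note 1 can be sidestepped by percent-encoding a character's UTF-8 bytes rather than calling Uri.HexEscape. The following is a minimal sketch and was not part of the test app; the class name PercentEncoder and the method name EncodeChar are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Text;

public static class PercentEncoder
{
    // Percent-encodes one character as its UTF-8 byte sequence, so that
    // characters above code point 255 (which Uri.HexEscape rejects) still work.
    public static string EncodeChar(char c)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(new[] { c });
        var sb = new StringBuilder(bytes.Length * 3);
        foreach (byte b in bytes)
        {
            // "X2" formats the byte as two uppercase hexadecimal digits.
            sb.Append('%').Append(b.ToString("X2"));
        }
        return sb.ToString();
    }
}
```

For Ā this yields %C4%80, the same bytes the EscapedDataString column shows. Be aware that for characters between 128 and 255 the result differs from Uri.HexEscape, which emits the code point itself as a single %XX rather than the UTF-8 bytes.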

EDIT:

I've added a second table with the encodings for .NET 4.5. See this answer: https://stackoverflow.com/a/21771206/216440

EDIT 2:

Since people seem to appreciate these tables, I thought you might like the source code that generates the table, so you can play around yourselves. It's a simple C# console application, which can target either .NET 4.0 or 4.5:

using System;
using System.Collections.Generic;
using System.Text;
// Need to add a Reference to the System.Web assembly.
using System.Web;

namespace UriEncodingDEMO2
{
    class Program
    {
        static void Main(string[] args)
        {
            EncodeStrings();

            Console.WriteLine();
            Console.WriteLine("Press any key to continue...");
            Console.Read();
        }

        public static void EncodeStrings()
        {
            string stringToEncode = "ABCD" + "abcd"
            + "0123" + " !\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~" + "ĀāĒēĪīŌōŪū";

            // Need to set the console encoding to display non-ASCII characters correctly (e.g. the
            //  Latin Extended-A characters such as ĀāĒē...).
            Console.OutputEncoding = Encoding.UTF8;

            // Will also need to set the console font (in the console Properties dialog) to a font
            //  that displays the extended character set correctly.
            // The following fonts all display the extended characters correctly:
            //  Consolas
            //  DejaVu Sans Mono
            //  Lucida Console

            // Also, in the console Properties, set the Screen Buffer Size and the Window Size
            //  Width properties to at least 140 characters, to display the full width of the
            //  table that is generated.

            Dictionary<string, Func<string, string>> columnDetails =
                new Dictionary<string, Func<string, string>>();
            columnDetails.Add("Unencoded", (unencodedString => unencodedString));
            columnDetails.Add("UrlEncoded",
                (unencodedString => HttpUtility.UrlEncode(unencodedString)));
            columnDetails.Add("UrlEncodedUnicode",
                (unencodedString => HttpUtility.UrlEncodeUnicode(unencodedString)));
            columnDetails.Add("UrlPathEncoded",
                (unencodedString => HttpUtility.UrlPathEncode(unencodedString)));
            columnDetails.Add("EscapedDataString",
                (unencodedString => Uri.EscapeDataString(unencodedString)));
            columnDetails.Add("EscapedUriString",
                (unencodedString => Uri.EscapeUriString(unencodedString)));
            columnDetails.Add("HtmlEncoded",
                (unencodedString => HttpUtility.HtmlEncode(unencodedString)));
            columnDetails.Add("HtmlAttributeEncoded",
                (unencodedString => HttpUtility.HtmlAttributeEncode(unencodedString)));
            columnDetails.Add("HexEscaped",
                (unencodedString
                    =>
                    {
                        // Uri.HexEscape can only handle characters up to code point 255, so for
                        //  the Latin Extended-A characters, such as Ā, it will throw an
                        //  ArgumentOutOfRangeException.
                        try
                        {
                            return Uri.HexEscape(unencodedString.ToCharArray()[0]);
                        }
                        catch
                        {
                            return "[OoR]";
                        }
                    }));

            char[] charactersToEncode = stringToEncode.ToCharArray();
            string[] stringCharactersToEncode = Array.ConvertAll<char, string>(charactersToEncode,
                (character => character.ToString()));
            DisplayCharacterTable<string>(stringCharactersToEncode, columnDetails);
        }

        private static void DisplayCharacterTable<TUnencoded>(TUnencoded[] unencodedArray,
            Dictionary<string, Func<TUnencoded, string>> mappings)
        {
            // Header row: one column per encoder, in insertion order.
            foreach (string key in mappings.Keys)
            {
                Console.Write(key.Replace(" ", "[space]") + " ");
            }
            Console.WriteLine();

            // One row per character, each cell padded to its column header's width.
            foreach (TUnencoded unencodedObject in unencodedArray)
            {
                string stringCharToEncode = unencodedObject.ToString();
                foreach (string columnHeader in mappings.Keys)
                {
                    int columnWidth = columnHeader.Length + 1;
                    Func<TUnencoded, string> encoder = mappings[columnHeader];
                    string encodedString = encoder(unencodedObject);

                    // ASSUMPTION: Column header will always be wider than encoded string.
                    Console.Write(encodedString.Replace(" ", "[space]").PadRight(columnWidth));
                }
                Console.WriteLine();
            }
        }
    }
}


answered Oct 13 '22 20:10 by Simon Tewsi


You should encode only the user name or other part of the URL that could be invalid. URL encoding a URL can lead to problems since something like this:

string url = HttpUtility.UrlEncode("http://www.google.com/search?q=Example"); 

will yield:

http%3a%2f%2fwww.google.com%2fsearch%3fq%3dExample

This is obviously not going to work well. Instead, you should encode ONLY the value of the key/value pair in the query string, like this:

string url = "http://www.google.com/search?q=" + HttpUtility.UrlEncode("Example"); 
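The same idea can be wrapped in a small helper. This is only a sketch under stated assumptions: QueryBuilder and BuildSearchUrl are hypothetical names, and it swaps HttpUtility.UrlEncode for Uri.EscapeDataString so that no reference to System.Web is needed. As the table in the other answer shows, the two differ in detail (for instance, EscapeDataString encodes a space as %20 rather than +).

```csharp
using System;

public static class QueryBuilder
{
    // Builds a URL with a single "q" parameter, escaping only the value.
    // The base URL is left untouched so its ':' and '/' survive.
    public static string BuildSearchUrl(string baseUrl, string query)
    {
        return baseUrl + "?q=" + Uri.EscapeDataString(query);
    }
}
```

Calling QueryBuilder.BuildSearchUrl("http://www.google.com/search", "C# & .NET") returns http://www.google.com/search?q=C%23%20%26%20.NET, so the '#' and '&' in the value can no longer be mistaken for URL syntax.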

Hopefully that helps. Also, as teedyay mentioned, you'll still need to make sure illegal file-name characters are removed or else the file system won't like the path.
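If removing the characters is not acceptable, one alternative is to replace them with a %xx escape, as the question's edit asks for. The sketch below is an assumption-laden illustration, not an established API: FileNameEscaper and Escape are hypothetical names, and the character list is hard-coded to the set Windows rejects in file names (plus '%' itself, so the result stays unambiguous) rather than taken from Path.GetInvalidFileNameChars, whose result varies by platform.

```csharp
using System;
using System.Text;

public static class FileNameEscaper
{
    // Assumption: characters Windows does not allow in file names,
    // plus '%' so an escaped name can be decoded without ambiguity.
    private const string CharsToEscape = "<>:\"/\\|?*%";

    public static string Escape(string name)
    {
        var sb = new StringBuilder(name.Length);
        foreach (char c in name)
        {
            // Control characters (below 32) are escaped as well.
            bool needsEscape = c < 32 || CharsToEscape.IndexOf(c) >= 0;
            // Uri.HexEscape returns "%XX" for characters up to 255,
            // which covers everything escaped here.
            sb.Append(needsEscape ? Uri.HexEscape(c) : c.ToString());
        }
        return sb.ToString();
    }
}
```

With this, FileNameEscaper.Escape("mas|fenix") gives mas%7Cfenix, and Uri.UnescapeDataString turns it back into mas|fenix. The sketch does not handle other Windows restrictions such as reserved device names or trailing dots and spaces.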

answered Oct 13 '22 19:10 by Dan Herbert