
Shortest encoding for a Guid for use in a URL

Mads Kristensen got one down to 00amyWGct0y_ze4lIsj2Mw

Can it go smaller than that?

mcintyre321 asked Aug 14 '09 16:08



2 Answers

It looks like only 73 characters can be used unescaped in a URL. If that's the case, you could convert the 128-bit number to base 73 and get a 21-character string.

If you can find 85 legal characters, you can get down to 20 characters.
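
The character counts above can be checked with a little arithmetic: an alphabet of k symbols needs the smallest n with k^n >= 2^128. The sketch below is only an illustration. The 73-symbol alphabet `ALPHABET_73` and the helpers `min_length`, `encode` and `decode` are hypothetical, since the answer does not list which 73 characters it has in mind, and whether each punctuation mark is really safe depends on where in the URL the value is placed.

```python
import string
import uuid

# Hypothetical 73-symbol alphabet: 62 alphanumerics plus 11 punctuation marks.
# The answer does not list its 73 characters, so this choice is an assumption.
ALPHABET_73 = string.ascii_letters + string.digits + "-._~!*'()$,"


def min_length(alphabet_size: int, bits: int = 128) -> int:
    """Return the smallest n with alphabet_size ** n >= 2 ** bits."""
    n, capacity = 0, 1
    while capacity < 1 << bits:
        capacity *= alphabet_size
        n += 1
    return n


def encode(value: int, alphabet: str, length: int) -> str:
    """Write value as a fixed-width number in base len(alphabet)."""
    base = len(alphabet)
    digits = []
    for _ in range(length):
        value, remainder = divmod(value, base)
        digits.append(alphabet[remainder])
    if value:
        raise ValueError("value does not fit in the requested length")
    return "".join(reversed(digits))


def decode(text: str, alphabet: str) -> int:
    """Inverse of encode: read text as a number in base len(alphabet)."""
    base = len(alphabet)
    value = 0
    for ch in text:
        value = value * base + alphabet.index(ch)
    return value


guid = uuid.UUID("c9a646d3-9c61-4cb7-bfcd-ee2522c8f633")  # sample GUID
width = min_length(len(ALPHABET_73))
short = encode(guid.int, ALPHABET_73, width)
print(width, min_length(85), min_length(64))  # prints "21 20 22"
print(uuid.UUID(int=decode(short, ALPHABET_73)) == guid)  # prints "True"
```

Going from 64 to 73 symbols saves one character, and going to 85 saves one more, so the gain over plain base64 is small compared with the extra care needed in picking the alphabet.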

retracile answered Oct 05 '22 07:10


A GUID looks like this: c9a646d3-9c61-4cb7-bfcd-ee2522c8f633. That's 32 hex digits, each encoding 4 bits, so 128 bits in total.

A base64 encoding uses 6 bits per symbol, which is easy to achieve with URL-safe characters and gives a 22-character encoded string. As others have noted, you could use 73 URL-safe symbols and encode the GUID as a base-73 number to get 21 characters.
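
As a minimal sketch of the 22-character base64 form, the following uses Python's URL-safe base64 alphabet (`-` and `_` in place of `+` and `/`) and drops the two `=` padding characters. The function names `guid_to_short` and `short_to_guid` are made up for this example. It also assumes the big-endian byte order of Python's `uuid.UUID.bytes`; .NET's `Guid.ToByteArray` orders the first three fields differently, so the same GUID would not necessarily produce the same string there.

```python
import base64
import uuid


def guid_to_short(g: uuid.UUID) -> str:
    """Encode the 16 GUID bytes as URL-safe base64 without '=' padding."""
    return base64.urlsafe_b64encode(g.bytes).rstrip(b"=").decode("ascii")


def short_to_guid(s: str) -> uuid.UUID:
    """Restore the padding that was stripped, then decode back to a GUID."""
    padded = s + "=" * (-len(s) % 4)
    return uuid.UUID(bytes=base64.urlsafe_b64decode(padded))


guid = uuid.UUID("c9a646d3-9c61-4cb7-bfcd-ee2522c8f633")
short = guid_to_short(guid)
print(len(short))                    # prints "22"
print(short_to_guid(short) == guid)  # prints "True"
```

Sixteen bytes always encode to 24 base64 characters including two padding characters, which is why the stripped form has a fixed length of 22 and the padding can be restored without ambiguity.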

Paul Dixon answered Oct 05 '22 07:10