URL safe UUIDs in the smallest number of characters

Ideally I want something like example.com/resources/äFg4вNгё5: the minimum number of visible characters, never mind that they have to be percent-encoded before being transmitted over HTTP.

Can you suggest a scheme that efficiently encodes 128-bit UUIDs into the smallest number of visible characters, without producing characters that break URLs?

Jesvin Jose asked Jul 11 '12 11:07


2 Answers

Base-64 is good for this.

{098ef7bc-a96c-43a9-927a-912fc7471ba2}

could be encoded as

vPeOCWypqUOSepEvx0cbog

The usual equals signs at the end can be dropped, since their only purpose is to pad the string length to a multiple of 4. And instead of + and /, you can use URL-safe characters. Pick two from: - . _ ~ (the URL-safe alphabet in RFC 4648 §5 uses - and _).
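
As a rough check of the example above, here is a minimal Python sketch that reproduces the encoded string. It assumes the sample was produced with the .NET byte order for GUIDs (first three fields little-endian), which Python exposes as `bytes_le`; that assumption is not stated in the answer, but it is the byte order that yields the string shown.

```python
import base64
import uuid

# Sample GUID from the answer above.
guid = uuid.UUID("098ef7bc-a96c-43a9-927a-912fc7471ba2")

# Assumption: bytes_le matches the byte order used to produce the sample
# (first three fields little-endian, as in .NET's Guid.ToByteArray()).
raw = guid.bytes_le

# URL-safe Base64 swaps '+' and '/' for '-' and '_'; the '==' padding is stripped.
encoded = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")

print(encoded)       # vPeOCWypqUOSepEvx0cbog
print(len(encoded))  # 22
```

The length follows from the arithmetic: 128 bits at 6 bits per character needs 22 characters, compared with 32 for plain hexadecimal.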

More information:

  • RFC 4648
  • Storing UUID as base64 String (Java)
  • guid to base64, for URL (C#)
  • Short GUID (C#)

Markus Jarderot answered Oct 21 '22 10:10


I use a URL-safe Base64 string. The following Python 2 code does this*.

The last line of the function removes the '=' or '==' padding that Base64 encoding appends. The padding makes it harder to put the string into a URL and is only needed for decoding, which does not need to be done here.

import base64
import uuid

# get a UUID - URL safe, Base64
def get_a_Uuid():
    r_uuid = base64.urlsafe_b64encode(uuid.uuid4().bytes)
    return r_uuid.replace('=', '')

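The code above does not work in Python 3, where urlsafe_b64encode returns bytes, so calling replace('=', '') with str arguments raises a TypeError. This is what I'm doing instead:

def get_a_Uuid():
    # decode the bytes to str before stripping the padding
    r_uuid = base64.urlsafe_b64encode(uuid.uuid4().bytes).decode("utf-8")
    return r_uuid.replace('=', '')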

* This follows the standards: base64.urlsafe_b64encode implements RFC 3548 and RFC 4648 (see https://docs.python.org/2/library/base64.html). Stripping the == padding from Base64-encoded data of known length is allowed (see RFC 4648 §3.2). UUIDs/GUIDs are specified in RFC 4122; §4.1 (Format) states "The UUID format is 16 octets". The Base64 function encodes these 16 octets.
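
If the short form ever does need to be turned back into a UUID, the stripped padding can be restored because the length is known. The following is a minimal Python 3 sketch of that round trip; the helper names `encode_uuid` and `decode_uuid` are hypothetical and not part of the answer above.

```python
import base64
import uuid

def encode_uuid(u):
    """Return the 22-character URL-safe Base64 form of a UUID, without padding."""
    return base64.urlsafe_b64encode(u.bytes).rstrip(b"=").decode("ascii")

def decode_uuid(s):
    """Restore the stripped '==' padding and rebuild the UUID."""
    # 16 octets always encode to 22 characters plus '==', so exactly
    # two padding characters have to be added back before decoding.
    return uuid.UUID(bytes=base64.urlsafe_b64decode(s + "=="))

original = uuid.uuid4()
short = encode_uuid(original)
restored = decode_uuid(short)
print(short, restored == original)
```

Python's decoder expects the padding to be present, which is why it is added back explicitly here rather than passing the 22-character string directly.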

Chris Dutrow answered Oct 21 '22 09:10