
Output UUID in Go as a short string

Is there a built-in way, or a reasonably standard package, that allows you to convert a standard UUID into a short string that would enable shorter URLs?

I.e. taking advantage of using a larger range of characters such as [A-Za-z0-9] to output a shorter string.

I know we can use base64 to encode the bytes, as follows, but I'm after something that creates a string that looks like a "word", i.e. no + and /:

id = base64.StdEncoding.EncodeToString(myUuid.Bytes())
Jay asked Jun 21 '16 01:06



3 Answers

As suggested here, if you just want a fairly random string to use as a slug, it is better not to bother with UUIDs at all.

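You can simply use Go's standard library to make random strings of the desired length. The snippet below uses crypto/rand rather than math/rand: before Go 1.20 the global math/rand source produced the same sequence on every run unless it was seeded explicitly, and math/rand's Read has been deprecated since Go 1.20. Keep in mind that 4 bytes are only 32 bits, so collisions become likely once you have generated tens of thousands of strings; use more bytes if that matters.

import (
	"crypto/rand"
	"encoding/hex"
)

b := make([]byte, 4) // 4 random bytes encode to 8 hex characters
if _, err := rand.Read(b); err != nil {
	panic(err) // the system random source failed
}
s := hex.EncodeToString(b)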
Karlom answered Oct 20 '22 17:10


A universally unique identifier (UUID) is a 128-bit value, which is 16 bytes. For human-readable display, many systems use a canonical format using hexadecimal text with inserted hyphen characters, for example:

123e4567-e89b-12d3-a456-426655440000

This has length 16*2 + 4 = 36. You may choose to omit the hyphens, which gives you:

fmt.Printf("%x\n", uuid)
fmt.Println(hex.EncodeToString(uuid))

// Output: 32 chars
123e4567e89b12d3a456426655440000
123e4567e89b12d3a456426655440000

You may choose to use base32 encoding (which encodes 5 bits with 1 symbol in contrast to hex encoding which encodes 4 bits with 1 symbol):

fmt.Println(base32.StdEncoding.EncodeToString(uuid))

// Output: 26 chars plus 6 padding chars
CI7EKZ7ITMJNHJCWIJTFKRAAAA======

Trim the trailing = signs when transmitting, so this will always be 26 chars. Note that you have to append "======" again before decoding the string with base32.StdEncoding.DecodeString().

If this is still too long for you, you may use base64 encoding (which encodes 6 bits with 1 symbol):

fmt.Println(base64.RawURLEncoding.EncodeToString(uuid))

// Output: 22 chars
Ej5FZ-ibEtOkVkJmVUQAAA

Note that base64.RawURLEncoding produces a base64 string (without padding) which is safe for URL inclusion, because the 2 extra chars in the symbol table (beyond [0-9a-zA-Z]) are - and _, both of which are safe to include in URLs.

Unfortunately for you, the base64 string may contain 2 extra chars beyond [0-9a-zA-Z]. So read on.

Interpreted, escaped string

If you are averse to these 2 extra characters, you may choose to turn your base64 string into an interpreted, escaped string similar to the interpreted string literals in Go. For example, if you want to insert a backslash in an interpreted string literal, you have to double it, because backslash is a special character that introduces an escape sequence, e.g.:

fmt.Println("One backslash: \\") // Output: "One backslash: \"

We may choose to do something similar to this. We have to designate a special character: let it be 9.

Reasoning: base64.RawURLEncoding uses the charset A..Za..z0..9-_, so 9 is the alphanumeric character with the highest code (61 decimal = 111101b). See the advantage below.

So whenever the base64 string contains a 9, replace it with 99. And whenever the base64 string contains one of the extra characters, use a sequence instead:

9  =>  99
-  =>  90
_  =>  91

This is a simple replacement table which can be captured by a value of strings.Replacer:

var escaper = strings.NewReplacer("9", "99", "-", "90", "_", "91")

And using it:

fmt.Println(escaper.Replace(base64.RawURLEncoding.EncodeToString(uuid)))

// Output:
Ej5FZ90ibEtOkVkJmVUQAAA

This will slightly increase the length as sometimes a sequence of 2 chars will be used instead of 1 char, but the gain will be that only [0-9a-zA-Z] chars will be used, as you wanted. The average length will be less than 1 additional character: 23 chars. Fair trade.

Logic: For simplicity, let's assume all possible UUIDs have equal probability (a UUID is not completely random because some bits encode the version and variant, so this is not strictly the case, but let's set this aside as this is just an estimate). The last base64 symbol carries only 2 bits of the UUID followed by 4 zero bits, so it can only be A, Q, g or w and will never be a replaceable char (that's why we chose the special char to be 9 instead of, say, A). The other 21 chars may each turn into a 2-char sequence. The chance of one being replaceable is 3 / 64 = 0.047, so on average 21*3/64 = 0.98 chars turn into a 2-char sequence, and this is equal to the average number of extra characters.
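The two claims in this estimate can be checked with a few lines of code. The sketch below is only an illustration under the same uniform-distribution assumption; lastSymbols and expectedExtra are hypothetical helper names, not part of any package:

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// lastSymbols tries every value of the final UUID byte and returns the
// distinct final base64 symbols, in the order they are first seen.
func lastSymbols() string {
	seen := make(map[byte]bool)
	var out []byte
	var id [16]byte
	for v := 0; v < 256; v++ {
		id[15] = byte(v)
		s := base64.RawURLEncoding.EncodeToString(id[:])
		c := s[len(s)-1]
		if !seen[c] {
			seen[c] = true
			out = append(out, c)
		}
	}
	return string(out)
}

// expectedExtra is the average number of extra characters: 21 symbols,
// each with a 3-in-64 chance of being one of '9', '-' or '_'.
func expectedExtra() float64 {
	return 21 * 3.0 / 64
}

func main() {
	fmt.Println(lastSymbols())             // AQgw
	fmt.Printf("%.2f\n", expectedExtra()) // 0.98
}
```

None of the four possible final symbols is 9, - or _, which is why only 21 positions enter the estimate.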

To decode, use an inverse decoding table captured by the following strings.Replacer:

var unescaper = strings.NewReplacer("99", "9", "90", "-", "91", "_")

Example code to decode an escaped base64 string:

fmt.Println("Verify decoding:")
s := escaper.Replace(base64.RawURLEncoding.EncodeToString(uuid))
dec, err := base64.RawURLEncoding.DecodeString(unescaper.Replace(s))
fmt.Printf("%x, %v\n", dec, err)

Output:

123e4567e89b12d3a456426655440000, <nil>
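For convenience, the escaper and unescaper can be wrapped in a pair of helper functions. This is a minimal sketch rather than a definitive implementation; the names shorten and expand and the use of a [16]byte array for the UUID are assumptions made for this example:

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

var (
	escaper   = strings.NewReplacer("9", "99", "-", "90", "_", "91")
	unescaper = strings.NewReplacer("99", "9", "90", "-", "91", "_")
)

// shorten encodes id as unpadded URL-safe base64 and then escapes the
// result so that it contains only [0-9a-zA-Z].
func shorten(id [16]byte) string {
	return escaper.Replace(base64.RawURLEncoding.EncodeToString(id[:]))
}

// expand reverses shorten and rejects input that does not decode to
// exactly 16 bytes.
func expand(s string) ([16]byte, error) {
	var id [16]byte
	b, err := base64.RawURLEncoding.DecodeString(unescaper.Replace(s))
	if err != nil {
		return id, err
	}
	if len(b) != len(id) {
		return id, fmt.Errorf("decoded %d bytes, want %d", len(b), len(id))
	}
	copy(id[:], b)
	return id, nil
}

func main() {
	// The sample UUID 123e4567-e89b-12d3-a456-426655440000.
	id := [16]byte{0x12, 0x3e, 0x45, 0x67, 0xe8, 0x9b, 0x12, 0xd3,
		0xa4, 0x56, 0x42, 0x66, 0x55, 0x44, 0x00, 0x00}

	s := shorten(id)
	back, err := expand(s)
	fmt.Println(s, back == id, err) // Ej5FZ90ibEtOkVkJmVUQAAA true <nil>
}
```

Decoding is unambiguous because every 9 in the escaped string starts a 2-char sequence (99, 90 or 91), and the Replacer consumes the string from left to right.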

Try all the examples on the Go Playground.

icza answered Oct 20 '22 17:10


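Another option is math/big, which can print an integer in base 62 using only [0-9a-zA-Z]. While base64 has a constant output of 22 characters, math/big produces at most 22 characters and can get down to 2 characters or fewer, depending on the input, because leading zero bytes do not contribute any digits. For random UUIDs the result will almost always be 21 or 22 characters long: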

package main

import (
   "encoding/base64"
   "fmt"
   "math/big"
)

type uuid [16]byte

func (id uuid) encode() string {
   return new(big.Int).SetBytes(id[:]).Text(62)
}

func main() {
   var id uuid
   for n := len(id); n > 0; n-- {
      id[n - 1] = 0xFF
      s := base64.RawURLEncoding.EncodeToString(id[:])
      t := id.encode()
      fmt.Printf("%v %v\n", s, t)
   }
}

Result:

AAAAAAAAAAAAAAAAAAAA_w 47
AAAAAAAAAAAAAAAAAAD__w h31
AAAAAAAAAAAAAAAAAP___w 18owf
AAAAAAAAAAAAAAAA_____w 4GFfc3
AAAAAAAAAAAAAAD______w jmaiJOv
AAAAAAAAAAAAAP_______w 1hVwxnaA7
AAAAAAAAAAAA_________w 5k1wlNFHb1
AAAAAAAAAAD__________w lYGhA16ahyf
AAAAAAAAAP___________w 1sKyAAIxssts3
AAAAAAAA_____________w 62IeP5BU9vzBSv
AAAAAAD______________w oXcFcXavRgn2p67
AAAAAP_______________w 1F2si9ujpxVB7VDj1
AAAA_________________w 6Rs8OXba9u5PiJYiAf
AAD__________________w skIcqom5Vag3PnOYJI3
AP___________________w 1SZwviYzes2mjOamuMJWv
_____________________w 7N42dgm5tFLK9N8MT7fHC7
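Because the base-62 output has a variable length, decoding has to left-pad the value back to 16 bytes. The following sketch shows one way to do that, assuming Go 1.15 or later for big.Int.FillBytes; encode62 and decode62 are hypothetical names chosen for this example:

```go
package main

import (
	"fmt"
	"math/big"
)

// encode62 renders the 16 bytes as a base-62 integer. Leading zero bytes
// do not produce any digits, so the length varies with the input.
func encode62(id [16]byte) string {
	return new(big.Int).SetBytes(id[:]).Text(62)
}

// decode62 parses a base-62 string and restores the full 16 bytes.
func decode62(s string) ([16]byte, error) {
	var id [16]byte
	n, ok := new(big.Int).SetString(s, 62)
	// Reject malformed input, negative numbers and values wider than 128 bits.
	if !ok || n.Sign() < 0 || n.BitLen() > 128 {
		return id, fmt.Errorf("invalid base-62 UUID: %q", s)
	}
	n.FillBytes(id[:]) // big-endian, zero-extended on the left
	return id, nil
}

func main() {
	var id [16]byte
	id[15] = 0xFF

	s := encode62(id)
	back, err := decode62(s)
	fmt.Println(s, back == id, err) // 47 true <nil>
}
```

The BitLen check matters because FillBytes panics when the number does not fit into the destination slice.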

https://golang.org/pkg/math/big

Zombo answered Oct 20 '22 16:10