Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Convert 128-bit hexadecimal string to base-36 string

Tags:

c++

I have a 128-bit number in hexadecimal stored in a string (from md5, security isn't a concern here) that I'd like to convert to a base-36 string. If it were a 64-bit or less number I'd convert it to a 64-bit integer then use an algorithm I found to convert integers to base-36 strings but this number is too large for that so I'm kind of at a loss for how to approach this. Any guidance would be appreciated.

Edit: After Roland Illig pointed out the hassle of saying 0/O and 1/l over the phone and not gaining much data density over hex I think I may end up staying with hex. I'm still curious though if there is a relatively simple way to convert an hex string of arbitrary length to a base-36 string.

like image 595
eco Avatar asked Nov 27 '10 21:11

eco


People also ask

How to convert hex string to number?

To convert a hexadecimal string to a numberUse the ToInt32(String, Int32) method to convert the number expressed in base-16 to an integer. The first argument of the ToInt32(String, Int32) method is the string to convert. The second argument describes what base the number is expressed in; hexadecimal is base 16.

How to convert hex in Python?

hex() function is one of the built-in functions in Python3, which is used to convert an integer number into it's corresponding hexadecimal form. Syntax : hex(x) Parameters : x - an integer number (int object) Returns : Returns hexadecimal string.

How to convert hex string to int in Python?

The most common and effective way to convert hex into an integer in Python is to use the type-casting function int() . This function accepts two arguments: one mandatory argument, which is the value to be converted, and a second optional argument, which is the base of the number format with the default as 10 .

Can you convert string to hex in python?

To convert Python String to hex, use the inbuilt hex() method. The hex() is a built-in method that converts the integer to a corresponding hexadecimal string. For example, use the int(x, base) function with 16 to convert a string to an integer.


2 Answers

A base-36 encoding requires 6 bits to store each token. Same as base-64 but not using 28 of the available tokens. Solving 36^n >= 2^128 yields n >= log(2^128) / log(36) or 25 tokens to encode the value.

A base-64 encoding also requires 6 bits, all possible token values are used. Solving 64^n >= 2^128 yields n >= log(2^128) / log(64) or 22 tokens to encode the value.

Calculating the base-36 encoding requires dividing by powers of 36. No easy shortcuts, you need a division algorithm that can work with 128-bit values. The base-64 encoding is much easier to compute since it is a power of 2. Just take 6 bits at a time and shift by 6, in total 22 times to consume all 128 bits.

Why do you want to use base-36? Base-64 encoders are standard. If you really have a constraint on the token space (you shouldn't, ASCII rulez) then at least use a base-32 encoding. Or any power of 2, base-16 is hex.

like image 138
Hans Passant Avatar answered Oct 11 '22 05:10

Hans Passant


If the only thing that is missing is the support for 128 bit unsigned integers, here is the solution for you:

#include <stdio.h>
#include <inttypes.h>

typedef struct {
        uint32_t v3, v2, v1, v0;
} uint128;

static void
uint128_divmod(uint128 *out_div, uint32_t *out_mod, const uint128 *in_num, uint32_t in_den)
{
        uint64_t x = 0;

        x = (x << 32) + in_num->v3;
        out_div->v3 = x / in_den;
        x %= in_den;
        x = (x << 32) + in_num->v2;
        out_div->v2 = x / in_den;
        x %= in_den;
        x = (x << 32) + in_num->v1;
        out_div->v1 = x / in_den;
        x %= in_den;
        x = (x << 32) + in_num->v0;
        out_div->v0 = x / in_den;
        x %= in_den;

        *out_mod = x;
}

int
main(void)
{
        uint128 x = { 0x12345678, 0x12345678, 0x12345678, 0x12345678 };
        uint128 result;
        uint32_t mod;

        uint128_divmod(&result, &mod, &x, 16);
        fprintf(stdout, "%08"PRIx32" %08"PRIx32" %08"PRIx32" %08"PRIx32" rest %08"PRIx32"\n", result.v3, result.v2, result.v1, result.v0, mod);

        return 0;
}

Using this function you can repeatedly compute the mod-36 result, which leads you to the number encoded as base-36.

like image 30
Roland Illig Avatar answered Oct 11 '22 06:10

Roland Illig