Convert 128-bit hexadecimal string to base-36 string

Tags:

c++

I have a 128-bit number in hexadecimal stored in a string (from md5, security isn't a concern here) that I'd like to convert to a base-36 string. If it were a 64-bit or less number I'd convert it to a 64-bit integer then use an algorithm I found to convert integers to base-36 strings but this number is too large for that so I'm kind of at a loss for how to approach this. Any guidance would be appreciated.

Edit: After Roland Illig pointed out the hassle of saying 0/O and 1/l over the phone and not gaining much data density over hex I think I may end up staying with hex. I'm still curious though if there is a relatively simple way to convert an hex string of arbitrary length to a base-36 string.

595

asked Nov 27 '10 21:11

eco

2 Answers

A base-36 encoding requires 6 bits to store each token. Same as base-64 but not using 28 of the available tokens. Solving 36^n >= 2^128 yields n >= log(2^128) / log(36) or 25 tokens to encode the value.

A base-64 encoding also requires 6 bits, all possible token values are used. Solving 64^n >= 2^128 yields n >= log(2^128) / log(64) or 22 tokens to encode the value.

Calculating the base-36 encoding requires dividing by powers of 36. No easy shortcuts, you need a division algorithm that can work with 128-bit values. The base-64 encoding is much easier to compute since it is a power of 2. Just take 6 bits at a time and shift by 6, in total 22 times to consume all 128 bits.

Why do you want to use base-36? Base-64 encoders are standard. If you really have a constraint on the token space (you shouldn't, ASCII rulez) then at least use a base-32 encoding. Or any power of 2, base-16 is hex.

138

answered Oct 11 '22 05:10

Hans Passant

If the only thing that is missing is the support for 128 bit unsigned integers, here is the solution for you:

#include <stdio.h>
#include <inttypes.h>

typedef struct {
        uint32_t v3, v2, v1, v0;
} uint128;

static void
uint128_divmod(uint128 *out_div, uint32_t *out_mod, const uint128 *in_num, uint32_t in_den)
{
        uint64_t x = 0;

        x = (x << 32) + in_num->v3;
        out_div->v3 = x / in_den;
        x %= in_den;
        x = (x << 32) + in_num->v2;
        out_div->v2 = x / in_den;
        x %= in_den;
        x = (x << 32) + in_num->v1;
        out_div->v1 = x / in_den;
        x %= in_den;
        x = (x << 32) + in_num->v0;
        out_div->v0 = x / in_den;
        x %= in_den;

        *out_mod = x;
}

int
main(void)
{
        uint128 x = { 0x12345678, 0x12345678, 0x12345678, 0x12345678 };
        uint128 result;
        uint32_t mod;

        uint128_divmod(&result, &mod, &x, 16);
        fprintf(stdout, "%08"PRIx32" %08"PRIx32" %08"PRIx32" %08"PRIx32" rest %08"PRIx32"\n", result.v3, result.v2, result.v1, result.v0, mod);

        return 0;
}

Using this function you can repeatedly compute the mod-36 result, which leads you to the number encoded as base-36.

answered Oct 11 '22 06:10

Roland Illig

Related questions
                            
                                Solving linear systems of equations
                            
                                Strange "Could not deduce template argument for 'T'" error
                            
                                Compiling Quantlib via SWIG for C#
                            
                                C/C++ - posix_memalign()
                            
                                How does boost::ptr_vector deep copy the underlying objects?
                            
                                How to detect header changes in make dependency list
                            
                                Thread related issues and debugging them
                            
                                Difference between double comparisons in gtest (C++) and nunit (C#)
                            
                                C++ and Java objects communication
                            
                                Arguments for and against supporting std::wstring exclusively in cross-platform library
                            
                                Unqualified lookup and (maybe-)dependent base classes
                            
                                Qt: Best way to implement "oscilloscope-like" realtime-plotting
                            
                                How do I read a FIFO/named pipe line by line from a C++/Qt Linux app?
                            
                                Assigning a depth to each node
                            
                                Sending a C++ struct over UDP in Java
                            
                                How to get all properties/variables of a class at runtime/dynamically in C++
                            
                                How does stringstream work internally?
                            
                                Finding the angles for the X, Y and Z axis in 3D - OpenGL/C++
                            
                                C++ equivalent of C# 4.0's "dynamic" keyword?
                            
                                memory management & std::allocator

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With