Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Compressing a Hex String in JavaScript/NodeJS

My app generates links, which contain hex string like: 37c1fbcabbc31f2f8d2ad31ceb91cd8d0d189ca5963dc6d353188d3d5e75b8b3e401d4e74e9b3e02efbff0792cda5c4620cb3b1f84aeb47b8d2225cd40e761a5. I would really like to make them shorter, like the solution mentioned for Ruby in Compressing a hex string in Ruby/Rails.

Is there a way to do this in JavaScript/NodeJS?

like image 574
Tarandeep Gill Avatar asked Aug 02 '12 09:08

Tarandeep Gill


3 Answers

You could use toString and parseInt method, that basically are doing the same thing of the methods you mentioned in the link:

var hexString = "4b3fc1400";
var b36 = parseInt(hexString, 16).toString(36); // "9a29mgw"

And to convert it back, you just need to do the opposite:

hexString = parseInt(b36, 36).toString(16); // "4b3fc1400"

The only problem with your string, is that is too big to be threat as number in JavaScript. You should split them in chunk. JavaScript's numbers are accurate up to 2^53 (plus sign), so the max positive number you can handle is 0x20000000000000 (in hexadecimal, that is 9007199254740992 in decimal); you can use the accuracy to handle the chunk:

var hexString = "37c1fbcabbc31f2f8d2ad31ceb91cd8d0d189ca5963dc6d353188d3d5e75b8b3e401d4e74e9b3e02efbff0792cda5c4620cb3b1f84aeb47b8d2225cd40e761a5"

var b36 = "", b16 = "";

var chunk, intChunk;

// 14 is the length of 0x20000000000000 (2^53 in base 16)

for (var i = 0, max = 14; i < hexString.length; i += max) {
    chunk = hexString.substr(i, max);
    intChunk = parseInt(chunk, 16);

    if (intChunk.toString(16) !== chunk) {
        intChunk = parseInt(hexString.substr(i, max - 1), 16);
        i -= 1;
    }

    b36 += intChunk.toString(36)
}

// 11 is the length of 2gosa7pa2gv (2^53 in base 36)

for (var i = 0, max = 11; i < b36.length; i += max ) {
    chunk = b36.substr(i, max);
    intChunk = parseInt(chunk, 36);

    if (intChunk.toString(36) !== chunk) {
        intChunk = parseInt(b36.substr(i, max - 1), 36);
        i -= 1;
    }

    b16 += intChunk.toString(16)
}

console.log(hexString);
console.log(b36);
console.log(b16);

Update: You could also use a base 62 instead of 36 to compress more, but notice that JS supports up to base 36, so you need to implement that personal notation manually (I believe there are already some implementation around).

like image 168
ZER0 Avatar answered Nov 17 '22 20:11

ZER0


node int-encoder does this, using the strategy already mentioned.

it also supports large numbers

npm install int-encoder

var en = require('int-encoder');

//simple integer conversion
en.encode(12345678); // "ZXP0"
en.decode('ZXP0'); // 12345678

//convert big hex number using optional base argument
en.encode('e6c6b53d3c8160b22dad35a0f705ec09', 16); // 'hbDcW9aE89tzLYjDgyzajJ'
en.decode('hbDcW9aE89tzLYjDgyzajJ', 16); // 'e6c6b53d3c8160b22dad35a0f705ec09'
like image 20
alzclarke Avatar answered Nov 17 '22 22:11

alzclarke


The simplest and fastest thing to do is define a set of 64 safe characters for use in the URL, such as A-Z, a-z, 0-9, _, and $. Then encode every three hex digits (4 bits each) into two safe characters (6 bits each). This requires no multiplication and division, and it can be used on arbitrarily long strings.

You will need to pick a 65th character to use at the end of the string to indicate if the last four-bit piece is used or not. Otherwise you will have an ambiguity for character strings with an even number of characters. Let's call it 2n. Then there are either 3n-1 or 3n hex digits encoded within, but there is no way to tell which. You can follow the sequence with a special character to indicate one of those cases. E.g. a '.' (period).

Note: The last few characters picked here for the set differ from Base64 encoding, since URLs have their own definition of safe punctuation characters. See RFC 1738.

like image 2
Mark Adler Avatar answered Nov 17 '22 21:11

Mark Adler