Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Byte Array to Uint64 as a String

Let's think about the following situation.

The Go routine creates a byte array where packs a Uint64 number 5577006791947779410 in 8 bytes Big Endian [77, 101, 130, 33, 7, 252, 253, 82].

In JavaScript code I receive these bytes as Uint8Array. We know that JavaScript doesn't currently support Uint64 as safe numeric type and cannot perform bitwise operations on integers larger than 32 bits, so things like buf[0] << 56 will never work.

So what is the process of decoding these bytes directly to numeric string "5577006791947779410"?

P.S. I know there are plenty of libraries for working with big integers in JavaScript, but generally they are huge and provide lots of mathematical operations, which I don't need here. I am looking for a simple modern straightforward solution for just decoding BE-packed Uint64 and Int64 bytes to numeric string. Do you have anything in mind?

like image 776
VisioN Avatar asked Aug 02 '17 08:08

VisioN


2 Answers

EDIT: For converting (U)int64 I would now definitely recommend @LS_DEV's solution. I would use my solution only when having an unknown or larger amount of bytes.

I started with https://stackoverflow.com/a/21668344/3872370 and modified it:

function Int64ToString(bytes, isSigned) {
  const isNegative = isSigned && bytes.length > 0 && bytes[0] >= 0x80;
  const digits = [];
  bytes.forEach((byte, j) => {
    if(isNegative)
      byte = 0x100 - (j == bytes.length - 1 ? 0 : 1) - byte;
    for(let i = 0; byte > 0 || i < digits.length; i++) {
      byte += (digits[i] || 0) * 0x100;
      digits[i] = byte % 10;
      byte = (byte - digits[i]) / 10;
    }
  });
  return (isNegative ? '-' : '') + digits.reverse().join('');
}

const tests = [
  {
    inp: [77, 101, 130, 33, 7, 252, 253, 82],
    signed: false,
    expectation: '5577006791947779410'
  },
  {
    inp: [255, 255, 255, 255, 255, 255, 255, 255],
    signed: true,
    expectation: '-1'
  },
];

tests.forEach(test => {
  const result = Int64ToString(test.inp, test.signed);
  console.log(`${result} ${result !== test.expectation ? '!' : ''}=== ${test.expectation}`);
});

At first the sign gets calculated by checking if the topmost bit is set (bytes[0] > 128). For negative numbers the bits have to be negated (255 - byte) and 1 has to be added to the number (therefore 256 instead of 255 for the last byte).

The basic idea of the forEach loop is to split each byte into its decimal digits (byte % 10 and calculating the overhead (byte - digits[i]) / 10 resp. Math.floor(byte / 10) for the next digit). For the next byte one has to add the shifted result of the last bytes' digits (byte += digits[i] * 256 resp. digits[i] << 8).

That code is optimized for shortness, simplicity and flexibility. If you are working with strings instead of bytes or numbers and don't want to use any libraries it appears that conversion performance doesn't really matter. Otherwise the function could be optimized for performance: Up to four bytes could be treated simultaneously, one only has to replace the 0x100 and 0x80, additionally (with only two byte groups remaining in the case of an (U)Int64) the forEach loop can be unrolled. Grouping the decimal digits probably won't increase performance since the resulting strings would have to be padded with zeros, introducing the need of removing leading zeros in the end result.

like image 160
Stephan Avatar answered Oct 15 '22 12:10

Stephan


Another approach: divide problem in two uint32 to keep calculations manageable.

Consider lower and higher uint32 (l and h). Full number could be written as h*0x100000000+l. Considering decimal, one could also consider lower 9 digits and remaining higher digits (ld and hd): ld=(h*0x100000000+l)%1000000000 and hd=(h*0x100000000+l)/1000000000. With some arithmetic and algebra operators properties, one can break those operation into safe "half" 64bit operations and compose string at ending.

function int64_to_str(a, signed) {
  const negative = signed && a[0] >= 128;
  const H = 0x100000000, D = 1000000000;
  let h = a[3] + a[2] * 0x100 + a[1] * 0x10000 + a[0]*0x1000000;
  let l = a[7] + a[6] * 0x100 + a[5] * 0x10000 + a[4]*0x1000000;
  if(negative) {
     h = H - 1 - h;
     l = H - l;
  }
  const hd = Math.floor(h * H / D + l / D);
  const ld = (((h % D) * (H % D)) % D + l) % D;
  const ldStr = ld + '';
  return (negative ? '-' : '') +
         (hd != 0 ? hd + '0'.repeat(9 - ldStr.length) : '') + ldStr;
}

let result = int64_to_str([77, 101, 130, 33, 7, 252, 253, 82], false);
let expectation = '5577006791947779410';
console.log(result + ' ' + (result === expectation ? '===' : '!==') + ' ' + expectation);

result = int64_to_str([255, 255, 255, 255, 255, 255, 255, 255], true);
expectation = '-1';
console.log(result + ' ' + (result === expectation ? '===' : '!==') + ' ' + expectation);

As detailed in the comments that algorithm works even though (h % D) * (H % D) can get larger than Number.MAX_SAFE_INTEGER, because the lost bits were nevertheless zero.

like image 37
LS_ᴅᴇᴠ Avatar answered Oct 15 '22 10:10

LS_ᴅᴇᴠ