Given the length of an unencoded string, what single formula reveals the length of that string after base-64 encoding?

I am trying to ascertain if there is a standard arithmetical formula which, given the length of an unencoded string, will reveal the length of that string when it has been base-64 encoded.

Here is a list of strings and their base-64 encodings:

A : QQ==
AB : QUI=
ABC : QUJD
ABCD : QUJDRA==
ABCDE : QUJDREU=
ABCDEF : QUJDREVG
ABCDEFG : QUJDREVGRw==
ABCDEFGH : QUJDREVGR0g=
ABCDEFGHI : QUJDREVGR0hJ
ABCDEFGHIJ : QUJDREVGR0hJSg==
ABCDEFGHIJK : QUJDREVGR0hJSks=
ABCDEFGHIJKL : QUJDREVGR0hJSktM

Here are the string lengths of the original strings and the lengths of their base-64 encoded strings (not including the = signs sometimes appended to the end of the encoding):

1 : 2
2 : 3
3 : 4
4 : 6
5 : 7
6 : 8
7 : 10
8 : 11
9 : 12
10 : 14
11 : 15
12 : 16

What single formula, when applied to the numbers on the left, results in the numbers on the right?

Rounin - Glory to UKRAINE asked Jan 25 '23 20:01

1 Answer

The function at https://stackoverflow.com/a/57945696/230983 does exactly what Rounin needs. However, Base64 encodes bytes, not characters, so if you want to support Unicode characters you cannot rely on the string's length property, which counts UTF-16 code units. You need something else to count the number of bytes. A simple way to solve this is to use a Blob, which stores strings as UTF-8:

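/**
 * Compute the number of Base64 characters (excluding "=" padding)
 * required by the specified string when it is encoded as UTF-8
 *
 * @param {String} str
 * @returns {Number}
 */
function detectB64CharsLength(str) {
  // A Blob stores strings as UTF-8, so blob.size is the byte count
  const blob = new Blob([str]);
  // Every 3 bytes become 4 Base64 characters; round up for a partial group
  return Math.ceil((blob.size * 4) / 3);
}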

/**
 * A dirty hack for encoding Unicode characters to Base64
 * 
 * @link https://developer.mozilla.org/en-US/docs/Web/API/WindowBase64/Base64_encoding_and_decoding#The_Unicode_Problem
 * @param {String} data
 * @returns {String}
 */
function utoa(data) {
  return btoa(unescape(encodeURIComponent(data)));
}

// Run some tests and make sure everything is ok.
// Logs each string, its predicted unpadded length, and its actual (padded) encoding
['a', 'ab', 'ββ', '😀'].forEach(v => {
  console.log(v, detectB64CharsLength(v), utoa(v));
});
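
If a Blob is not convenient, the same byte count can probably be obtained with TextEncoder, which is available in browsers and in Node.js. The sketch below is not part of the linked answer and its helper names (utf8ByteLength, base64UnpaddedLength, base64PaddedLength) are made up for illustration. It shows both the unpadded length asked about in the question, ceil(4n / 3), and the padded length, 4 * ceil(n / 3), where n is the number of bytes:

```javascript
// Hypothetical helper names, chosen for this sketch only.

// Count the UTF-8 bytes of a string.
function utf8ByteLength(str) {
  return new TextEncoder().encode(str).length;
}

// Base64 length without "=" padding: every 3 bytes become 4 characters,
// and a trailing group of 1 or 2 bytes needs 2 or 3 characters.
function base64UnpaddedLength(byteCount) {
  return Math.ceil((byteCount * 4) / 3);
}

// Base64 length with "=" padding: always a multiple of 4.
function base64PaddedLength(byteCount) {
  return 4 * Math.ceil(byteCount / 3);
}

// Log string, byte count, unpadded length, padded length.
for (const s of ['A', 'AB', 'ABC', 'ABCD', 'ββ', '😀']) {
  const n = utf8ByteLength(s);
  console.log(s, n, base64UnpaddedLength(n), base64PaddedLength(n));
}
```

For the ASCII strings in the question the byte count equals the string length, so base64UnpaddedLength reproduces the right-hand column of the table (1 : 2, 2 : 3, 3 : 4, 4 : 6, and so on).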
Victor answered Jan 28 '23 10:01