
Base64: What is the worst possible increase in space usage?

People also ask

How much bigger does Base64 increase size?

Encoded size increase: the Base64 version of a string or file will be at least 133% the size of its source (a ~33% increase). The increase may be larger if the encoded data is small.

How much overhead does Base64 add?

Base64 is also widely used for sending e-mail attachments. This is required because SMTP – in its original form – was designed to transport 7-bit ASCII characters only. This encoding causes an overhead of 33–37% (33% by the encoding itself; up to 4% more by the inserted line breaks).

How big can a Base64 string be?

Length of data: the output is an ASCII string. Base64 uses 4 ASCII characters to encode 24 bits (3 bytes) of data. To encode, it splits the three bytes into four 6-bit numbers. A 6-bit number can represent 64 possible values.



Base64 encodes each set of three bytes into four bytes. In addition the output is padded to always be a multiple of four.

This means that the size of the base-64 representation of a string of size n is:

ceil(n / 3) * 4

So, for a 16 kB (16 * 1024 byte) array, the base-64 representation will be ceil(16*1024/3)*4 = 21848 bytes long, which is about 21.3 KiB (or 21.8 kB in decimal units).

A rough approximation would be that the size of the data is increased to 4/3 of the original.
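A quick way to sanity-check the formula is to compare it with a real encoder. The sketch below uses Python's standard base64 module; the function name base64_len is my own and purely illustrative.

```python
import base64
import math


def base64_len(n: int) -> int:
    """Length of padded Base64 output for n input bytes (no line breaks)."""
    return math.ceil(n / 3) * 4


# Compare the formula with what the standard library actually produces.
for n in (0, 1, 2, 3, 4, 1000, 16 * 1024):
    assert base64_len(n) == len(base64.b64encode(bytes(n)))

print(base64_len(16 * 1024))  # 21848
```

Note that a single input byte already costs 4 output bytes, so the relative increase is largest for very small inputs.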


From Wikipedia

Note that given an input of n bytes, the output will be (n + 2 - ((n + 2) % 3)) / 3 * 4 bytes long, so that the number of output bytes per input byte converges to 4 / 3 or 1.33333 for large n.

So 16 kB * 4 / 3 comes to a little over 21.3 kB, or 21848 bytes to be exact.
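As a rough check, the quoted expression can be compared against the ceil(n / 3) * 4 form and the ratio printed for a few sizes. This is only a sketch; wiki_len and ceil_len are names I made up for the two expressions.

```python
def wiki_len(n: int) -> int:
    """Output length per the formula quoted from Wikipedia (integer math)."""
    return (n + 2 - ((n + 2) % 3)) // 3 * 4


def ceil_len(n: int) -> int:
    """Same quantity written as ceil(n / 3) * 4, using integer arithmetic."""
    return (n + 2) // 3 * 4


# The two expressions agree for every input size checked.
assert all(wiki_len(n) == ceil_len(n) for n in range(100_000))

# Output bytes per input byte: worst for tiny inputs, approaching 4/3.
for n in (1, 2, 3, 10, 100, 16 * 1024):
    print(n, wiki_len(n), round(wiki_len(n) / n, 4))
```

For n = 1 the ratio is 4, which is the worst case for the encoding itself when line breaks are not counted.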

Hope this helps


16 kB is 131,072 bits. Base64 packs each 24-bit buffer into four 6-bit characters, so you need ceil(131,072 / 24) = 5,462 buffers, which gives 5,462 * 4 = 21,848 bytes.
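To make the 24-bit packing concrete, here is a minimal hand-rolled encoder, checked against Python's base64 module. It is an illustration only, assuming the standard alphabet with '=' padding; encode_group and encode are hypothetical helper names, not part of any library.

```python
import base64

ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_group(chunk: bytes) -> str:
    """Encode one group of 1 to 3 bytes into 4 Base64 characters."""
    pad = 3 - len(chunk)
    # Build the 24-bit buffer, filling missing bytes with zeros.
    buf = int.from_bytes(chunk + b"\x00" * pad, "big")
    # Split the buffer into four 6-bit indices, most significant first.
    chars = [ALPHABET[(buf >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
    # Characters that only cover padding bytes are replaced by '='.
    return "".join(chars[: 4 - pad]) + "=" * pad


def encode(data: bytes) -> str:
    """Encode data group by group, three input bytes at a time."""
    return "".join(encode_group(data[i : i + 3]) for i in range(0, len(data), 3))


sample = b"Hello, Base64!"
assert encode(sample) == base64.b64encode(sample).decode("ascii")

groups = -(-(16 * 1024 * 8) // 24)  # ceiling division: 24-bit groups needed
print(groups, groups * 4)  # 5462 21848
```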


Since the question was about the worst possible increase, I must add that the output usually has a line break about every 80 characters. So if you save Base64-encoded data to a text file, each line costs 2 extra bytes on Windows (\r\n) and 1 extra byte on Linux (\n).

The increase from the actual encoding has been described above.
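The effect of line breaks can be estimated with a few lines of code. This is a sketch under my own assumptions: wrapped_len is a hypothetical helper, the width of 80 is just the rough figure mentioned above, and a break is assumed to be written between lines but not after the last one.

```python
def wrapped_len(n: int, width: int, newline_len: int) -> int:
    """Size of Base64 text for n input bytes, wrapped at `width` characters.

    Assumes a break is written between lines only, not after the last one.
    """
    encoded = (n + 2) // 3 * 4
    if encoded == 0:
        return 0
    breaks = (encoded - 1) // width
    return encoded + breaks * newline_len


n = 16 * 1024
plain = wrapped_len(n, width=80, newline_len=0)
for label, nl in (("LF", 1), ("CRLF", 2)):
    total = wrapped_len(n, width=80, newline_len=nl)
    print(label, total, f"{(total - plain) / plain:.2%} extra")
```

Tools that also end the file with a final line break would add one more newline on top of this.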


This is a reference for my future self. Since the question is about the worst case, we should take line breaks into account. While RFC 1421 defines the maximum line length as 64 chars, RFC 2045 (MIME) states there can be at most 76 chars in one line.

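The latter is what the C# library implements. So in a Windows environment, where a line break is 2 chars (\r\n), we get this: Length = Floor(Ceiling(N/3) * 4 * 78 / 76)

Note: the floor is there because, in my test with C#, no line break follows the last line, even when it is exactly 76 chars long. The formula is still only an estimate: it can come out 1 char too high when the last line is more than half full, and 2 chars too high when the last line is exactly 76 chars.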

I can check it by running the following code:

using System;
// Encode 16 KiB of zero bytes with a line break after every 76 chars.
byte[] bytes = new byte[16 * 1024];
Console.WriteLine(Convert.ToBase64String(bytes, Base64FormattingOptions.InsertLineBreaks).Length);

The answer for 16 kBytes encoded to base64 with 76-char lines: 22422 chars
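For comparison, the floor-based estimate can be put next to an exact count. This is a sketch that assumes a \r\n after every full 76-char line except the last one; estimate_crlf and exact_crlf are names I made up.

```python
def estimate_crlf(n: int) -> int:
    """The Floor(Ceiling(N/3) * 4 * 78 / 76) estimate, in integer math."""
    return (n + 2) // 3 * 4 * 78 // 76


def exact_crlf(n: int) -> int:
    """Exact length, assuming CRLF after every full 76-char line except the last."""
    encoded = (n + 2) // 3 * 4
    return encoded + 2 * ((encoded - 1) // 76) if encoded else 0


print(estimate_crlf(16 * 1024), exact_crlf(16 * 1024))  # 22422 22422
print(estimate_crlf(57), exact_crlf(57))  # 78 76
print(estimate_crlf(30), exact_crlf(30))  # 41 40
```

Under that assumption the two agree for 16 kB, while inputs such as 57 or 30 bytes show where the estimate runs slightly high.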

For Linux-style line breaks (\n, 1 char) I assume it would be Length = Floor(Ceiling(N/3) * 4 * 77 / 76), but I haven't got around to testing it on .NET Core yet.
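The 1-char line break case can at least be checked outside .NET. The sketch below uses Python's base64.encodebytes, which wraps at 76 characters with \n and also appends a final \n, so that trailing newline is dropped before comparing. It says nothing about what .NET does on Linux.

```python
import base64

data = bytes(16 * 1024)

# encodebytes() wraps at 76 characters with "\n" and also ends with "\n".
wrapped = base64.encodebytes(data)
without_trailing = wrapped[:-1] if wrapped.endswith(b"\n") else wrapped

estimate = (len(data) + 2) // 3 * 4 * 77 // 76
print(len(without_trailing), estimate)  # 22135 22135
```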