Given data encoded as a Base64-encoded string, can I somehow calculate the actual length of the raw data that has been encoded only by looking at the length of the Base64-encoded string?
I don't want to traverse the string if not necessary (this also includes string operations on the trailing characters of the encoded string to check for padding).
Each Base64 digit represents exactly 6 bits of data. So, three 8-bit bytes of the input string/binary file (3×8 bits = 24 bits) can be represented by four 6-bit Base64 digits (4×6 = 24 bits). This means that the Base64 version of a string or file will be at least 133% the size of its source (a ~33% increase).
Length of data: the output is an ASCII string. Base64 uses 4 ASCII characters to encode 24 bits (3 bytes) of data.
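The 3-bytes-to-4-characters rule above can be turned into a formula for the encoded length. The following is a minimal sketch, assuming standard padded Base64 without line breaks; the function name `encoded_length` is made up for illustration.

```python
import base64


def encoded_length(raw_len: int) -> int:
    """Length of the padded Base64 text for raw_len input bytes."""
    # Every group of up to 3 input bytes becomes 4 output characters,
    # so round the number of groups up before multiplying by 4.
    return 4 * ((raw_len + 2) // 3)


# Cross-check the formula against the standard library encoder.
for n in range(0, 8):
    assert encoded_length(n) == len(base64.b64encode(bytes(n)))
```

Because the group count is rounded up, inputs of 1, 2, and 3 bytes all produce 4 characters, which is why the mapping cannot be inverted from the length alone.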
Although Base64 is a relatively efficient way of encoding binary data, it still increases the file size by roughly 33%. This not only increases your bandwidth bill, but also increases the download time.
Base64 is also widely used for sending e-mail attachments. This is required because SMTP – in its original form – was designed to transport 7-bit ASCII characters only. This encoding causes an overhead of 33–37% (33% by the encoding itself; up to 4% more by the inserted line breaks).
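The 33–37% figure can be checked with a little arithmetic. The sketch below assumes lines of 76 characters, each terminated by a 2-byte CRLF (including the last one), which is the usual MIME layout; the helper name `mime_encoded_length` and its parameters are hypothetical.

```python
import base64


def mime_encoded_length(raw_len: int, line_len: int = 76, eol_len: int = 2) -> int:
    """Size of Base64 output once it is wrapped into fixed-width lines."""
    base = 4 * ((raw_len + 2) // 3)            # plain padded Base64 length
    lines = (base + line_len - 1) // line_len  # number of wrapped lines (rounded up)
    return base + lines * eol_len              # assumes every line ends with a line break


# Python's base64.encodebytes wraps at 76 characters with a 1-byte "\n",
# so it should agree with the formula when eol_len=1.
for n in (0, 1, 56, 57, 58, 1000):
    assert mime_encoded_length(n, eol_len=1) == len(base64.encodebytes(bytes(n)))

# Overhead ratio for a 1 MB payload with CRLF line endings.
ratio = mime_encoded_length(1_000_000) / 1_000_000
print(f"{ratio:.3f}")
```

For large inputs the ratio approaches (4/3) × (78/76), a little under 1.37, which matches the upper end of the quoted range.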
The exact length cannot be calculated unless you look at the padding. Without looking at the padding, the best you can do is calculate an upper bound for the length by multiplying the encoded-string length by 3/4 (for padded Base64, the encoded length is guaranteed to be exactly divisible by 4).
The upper bound calculated this way will be either N, N+1, or N+2, where N is the length of the raw data.
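To make the difference between the bound and the exact value concrete, here is a minimal sketch. It assumes padded Base64 with no line breaks or whitespace; both function names are invented for this example.

```python
import base64


def decoded_length_upper_bound(encoded_len: int) -> int:
    """Upper bound on the raw length, using only the encoded length."""
    return encoded_len * 3 // 4


def decoded_length_exact(encoded: str) -> int:
    """Exact raw length; this has to inspect the last two characters."""
    if encoded.endswith("=="):
        padding = 2
    elif encoded.endswith("="):
        padding = 1
    else:
        padding = 0
    return len(encoded) * 3 // 4 - padding


for n in range(0, 10):
    encoded = base64.b64encode(bytes(n)).decode("ascii")
    bound = decoded_length_upper_bound(len(encoded))
    assert bound - n in (0, 1, 2)            # bound is N, N+1 or N+2
    assert decoded_length_exact(encoded) == n
```

Checking for padding only touches the final two characters, so it is a constant-time operation and not a traversal of the whole string.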