I am noticing that whenever I base64 encode a string, a "=" is appended at the end. Can I remove this character and then reliably decode it later by adding it back, or is this dangerous? In other words, is the "=" always appended, or only in certain cases?
I want my encoded string to be as short as possible, that's why I want to know if I can always remove the "=" character and just add it back before decoding.
Q Does a base64 string always end with = ? Q Why does an = get appended at the end? A: As a short answer: The last character ( = sign) is added only as a complement (padding) in the final process of encoding a message with a special number of characters.
Need for Padding in Base64:It is mandatory for a proper conversion into Base64 that the resulting data must be converted into sequences of 24 bits each. However, at times, it happens that this length is not satisfied, i.e., a few bits might not be there, or the total bits of the encoded data are fewer than 24.
Although Base64 is a relatively efficient way of encoding binary data it will, on average still increase the file size for more than 25%. This not only increases your bandwidth bill, but also increases the download time.
With padding, a base64 string always has a length that is a multiple of 4 (if it doesn't, the string has been corrupted for sure) and thus code can easily process that string in a loop that processes 4 characters at a time (always converting 4 input characters to three or less output bytes).
The =
is padding. <!------------>
Wikipedia says
An additional pad character is allocated which may be used to force the encoded output into an integer multiple of 4 characters (or equivalently when the unencoded binary text is not a multiple of 3 bytes) ; these padding characters must then be discarded when decoding but still allow the calculation of the effective length of the unencoded text, when its input binary length would not be a multiple of 3 bytes (the last non-pad character is normally encoded so that the last 6-bit block it represents will be zero-padded on its least significant bits, at most two pad characters may occur at the end of the encoded stream).
If you control the other end, you could remove it when in transport, then re-insert it (by checking the string length) before decoding.
Note that the data will not be valid Base64 in transport.
Also, Another user pointed out (relevant to PHP users):
Note that in PHP base64_decode will accept strings without padding, hence if you remove it to process it later in PHP it's not necessary to add it back. – Mahn Oct 16 '14 at 16:33
So if your destination is PHP, you can safely strip the padding and decode without fancy calculations.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With