
Remove trailing "=" when base64 encoding

Tags:

base64

I am noticing that whenever I base64 encode a string, a "=" is appended at the end. Can I remove this character and then reliably decode it later by adding it back, or is this dangerous? In other words, is the "=" always appended, or only in certain cases?

I want my encoded string to be as short as possible, which is why I want to know whether I can always remove the "=" character and just add it back before decoding.

Steve N asked Dec 20 '10 18:12




1 Answer

The = is padding.

Wikipedia says:

An additional pad character is allocated which may be used to force the encoded output into an integer multiple of 4 characters (or equivalently when the unencoded binary text is not a multiple of 3 bytes) ; these padding characters must then be discarded when decoding but still allow the calculation of the effective length of the unencoded text, when its input binary length would not be a multiple of 3 bytes (the last non-pad character is normally encoded so that the last 6-bit block it represents will be zero-padded on its least significant bits, at most two pad characters may occur at the end of the encoded stream).
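To make the quoted rule concrete, here is a small Python sketch (Python is my choice for illustration; the question does not name a language). The helper `padding_count` is a hypothetical name, not a library function. It shows that the "=" is not always appended: the number of pad characters depends only on the input length modulo 3.

```python
import base64


def padding_count(data: bytes) -> int:
    """Return how many '=' characters standard Base64 appends for this input."""
    encoded = base64.b64encode(data).decode("ascii")
    return len(encoded) - len(encoded.rstrip("="))


# Inputs of 1 to 6 bytes: the padding cycles 2, 1, 0, 2, 1, 0.
for n in range(1, 7):
    data = b"a" * n
    print(n, base64.b64encode(data).decode("ascii"), padding_count(data))
```

So an input whose length is a multiple of 3 bytes gets no "=" at all, and no input ever gets more than two.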

If you control the other end, you could remove the padding for transport, then re-insert it before decoding: the string length tells you how many = characters are missing, since the padded length must be a multiple of 4.
Note that the data will not be valid Base64 while in transport.
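A minimal sketch of that strip-and-restore round trip, again in Python as an illustrative choice. The function names `encode_unpadded` and `decode_unpadded` are hypothetical helpers, and the sketch assumes the standard Base64 alphabet and that you control both ends.

```python
import base64


def encode_unpadded(data: bytes) -> str:
    """Base64-encode and drop the trailing '=' padding."""
    return base64.b64encode(data).decode("ascii").rstrip("=")


def decode_unpadded(text: str) -> bytes:
    """Restore the padding from the string length, then decode."""
    if len(text) % 4 == 1:
        # No byte sequence encodes to a length of 1 mod 4, so the input is damaged.
        raise ValueError("invalid unpadded Base64 length")
    # Pad up to the next multiple of 4: adds 0, 1, or 2 '=' characters.
    padding = "=" * (-len(text) % 4)
    return base64.b64decode(text + padding)


original = b"any carnal pleas"
short = encode_unpadded(original)
assert decode_unpadded(short) == original
```

The saving is at most two characters per string, so this is mainly worthwhile for short values such as tokens or identifiers.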

Also, another user pointed out (relevant to PHP users):

Note that in PHP base64_decode will accept strings without padding, hence if you remove it to process it later in PHP it's not necessary to add it back. – Mahn Oct 16 '14 at 16:33

So if your destination is PHP, you can safely strip the padding and decode without fancy calculations.

SLaks answered Oct 18 '22 10:10
