
What character replacements should be performed to make base 64 encoding URL safe?

In looking at URL-safe base 64 encoding, I've found that it is poorly standardized. Despite the large number of built-in functions that PHP has, there isn't one for URL-safe base 64 encoding. On the manual page for base64_encode(), most of the comments suggest wrapping that function with strtr():

function base64_url_encode($input)
{
     return strtr(base64_encode($input), '+/=', '-_,');
}

The only Perl module I could find in this area is MIME::Base64::URLSafe (source), which performs the following replacement internally:

sub encode ($) {
    my $data = encode_base64($_[0], '');
    $data =~ tr|+/=|\-_|d;
    return $data;
}

Unlike the PHP function above, this Perl version drops the '=' (equals) character entirely rather than replacing it with ',' (comma). Equals is only a padding character, so the Perl module restores it as needed on decode, but the difference makes the two implementations incompatible.
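To make the incompatibility concrete, here is a minimal Python sketch that mimics both conventions on the same input. The function names are hypothetical and the code only imitates the PHP and Perl snippets above; it is not taken from either library. The lenient decoder is one possible way to accept both forms, assuming ',' and '=' only ever appear as trailing padding.

```python
import base64

def encode_comma_padded(data: bytes) -> str:
    """Imitate the PHP strtr() variant: '+/=' becomes '-_,'."""
    standard = base64.b64encode(data).decode("ascii")
    return standard.translate(str.maketrans("+/=", "-_,"))

def encode_unpadded(data: bytes) -> str:
    """Imitate the Perl variant: '+/' becomes '-_' and '=' is dropped."""
    return base64.urlsafe_b64encode(data).decode("ascii").rstrip("=")

def decode_lenient(text: str) -> bytes:
    """Accept either form: strip trailing ',' or '=', then re-pad with '='."""
    core = text.rstrip(",=")
    return base64.urlsafe_b64decode(core + "=" * (-len(core) % 4))

# Two bytes chosen so that the standard encoding contains '+', '/' and '='.
sample = b"\xfb\xff"
comma_form = encode_comma_padded(sample)  # ends in ','
bare_form = encode_unpadded(sample)       # no padding at all
```

The two encoders produce different strings for the same bytes, so a decoder written for one convention has to normalize the padding before it can read the other.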

Finally, the Python function urlsafe_b64encode(s) keeps the '=' padding. That prompted someone to post the following functions, which strip and restore the padding and which show up prominently in Google results for 'python base64 url safe':

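from base64 import urlsafe_b64encode, urlsafe_b64decode

def uri_b64encode(s):
    # s is a bytes object; drop only the trailing '=' padding.
    return urlsafe_b64encode(s).rstrip(b'=')

def uri_b64decode(s):
    # Restore the padding to a multiple of four (adds nothing if already aligned).
    return urlsafe_b64decode(s + b'=' * (-len(s) % 4))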
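The padding arithmetic is easy to get wrong, so here is a small sketch that checks it in isolation. The helper names are hypothetical. The point is that the number of '=' characters needed is -n % 4, which is 0 when the length n is already a multiple of four, whereas the expression 4 - n % 4 gives 4 in that case, and a strict decoder may reject the surplus padding.

```python
import base64

def strip_padding(encoded: bytes) -> bytes:
    """Remove trailing '=' padding from a base 64 string."""
    return encoded.rstrip(b"=")

def restore_padding(unpadded: bytes) -> bytes:
    """Append just enough '=' to reach a multiple of four characters."""
    return unpadded + b"=" * (-len(unpadded) % 4)

# Check that stripping and restoring reproduces the padded form
# for inputs of every length from 0 to 6 bytes.
round_trips = []
for size in range(7):
    data = bytes(range(size))
    padded = base64.urlsafe_b64encode(data)
    round_trips.append(restore_padding(strip_padding(padded)) == padded)
```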

The desire here is to have a string that can be included in a URL without further encoding, hence the ditching or translation of the characters '+', '/', and '='. Since there isn't a defined standard, what is the right way?

Drew Stephens asked Sep 11 '09 17:09



2 Answers

There does appear to be a standard: RFC 3548, Section 4, Base 64 Encoding with URL and Filename Safe Alphabet:

This encoding is technically identical to the previous one, except for the 62:nd and 63:rd alphabet character, as indicated in table 2.

+ and / should be replaced by - (minus) and _ (understrike) respectively. Any incompatible libraries should be wrapped so they conform to RFC 3548.

Note that this requires that you URL encode the (pad) = characters, but I prefer that over URL encoding the + and / characters from the standard base64 alphabet.
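As a rough illustration of that approach, the following Python sketch encodes with the URL-safe alphabet, keeps the padding, and then percent-encodes the result so that only '=' needs escaping. The function names are hypothetical; urlsafe_b64encode and urllib.parse.quote are standard library calls.

```python
import base64
from urllib.parse import quote, unquote

def encode_for_url(data: bytes) -> str:
    """URL-safe alphabet with padding kept, then percent-encoded."""
    token = base64.urlsafe_b64encode(data).decode("ascii")
    # '-' and '_' are unreserved and pass through; '=' becomes '%3D'.
    return quote(token, safe="")

def decode_from_url(component: str) -> bytes:
    """Reverse of encode_for_url: undo percent-encoding, then decode."""
    return base64.urlsafe_b64decode(unquote(component))

# Two bytes whose standard encoding would contain '+', '/' and '='.
sample = b"\xfb\xff"
component = encode_for_url(sample)
```

With the standard alphabet, '+' and '/' would also have to be escaped, which is the trade-off described above.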

Grant Wagner answered Sep 21 '22 00:09


I don't think there is a right or wrong answer, but the most popular encoding is

'+/=' => '-_.'

This is widely used by Google and Yahoo (who call it Y64). Most of the URL-safe encoders I have used in Java and Ruby support this character set.
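For reference, a minimal Python sketch of the '+/=' => '-_.' mapping as stated above. The function names are hypothetical, and whether any particular Google or Yahoo service uses exactly this table is taken from the answer rather than verified here.

```python
import base64

# Translation tables for the mapping '+/=' <=> '-_.'
TO_URL = bytes.maketrans(b"+/=", b"-_.")
FROM_URL = bytes.maketrans(b"-_.", b"+/=")

def encode_dot_padded(data: bytes) -> str:
    """Standard base 64, then swap in the URL-friendly characters."""
    return base64.b64encode(data).translate(TO_URL).decode("ascii")

def decode_dot_padded(text: str) -> bytes:
    """Map the characters back to the standard alphabet, then decode."""
    return base64.b64decode(text.encode("ascii").translate(FROM_URL))

# Two bytes whose standard encoding is '+/8='.
sample = b"\xfb\xff"
encoded = encode_dot_padded(sample)
```

Because '.' replaces '=' one for one, the output keeps its length as a multiple of four and decoding needs no padding arithmetic.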

ZZ Coder answered Sep 18 '22 00:09