In looking at URL safe base 64 encoding, I've found it to be a very non-standard thing. Despite the copious number of built in functions that PHP has, there isn't one for URL safe base 64 encoding. On the manual page for base64_encode()
, most of the comments suggest using that function, wrapped with strtr()
:
function base64_url_encode($input)
{
return strtr(base64_encode($input), '+/=', '-_,');
}
The only Perl module I could find in this area is MIME::Base64::URLSafe (source), which performs the following replacement internally:
sub encode ($) {
my $data = encode_base64($_[0], '');
$data =~ tr|+/=|\-_|d;
return $data;
}
Unlike the PHP function above, this Perl version drops the '=' (equals) character entirely, rather than replacing it with ',' (comma) as PHP does. Equals is a padding character, so the Perl module replaces them as needed upon decode, but this difference makes the two implementations incompatible.
Finally, the Python function urlsafe_b64encode(s) keeps the '=' padding around, prompting someone to put up this function to remove the padding which shows prominently in Google results for 'python base64 url safe':
from base64 import urlsafe_b64encode, urlsafe_b64decode
def uri_b64encode(s):
return urlsafe_b64encode(s).strip('=')
def uri_b64decode(s):
return urlsafe_b64decode(s + '=' * (4 - len(s) % 4))
The desire here is to have a string that can be included in a URL without further encoding, hence the ditching or translation of the characters '+', '/', and '='. Since there isn't a defined standard, what is the right way?
Base64 only contains A–Z , a–z , 0–9 , + , / and = . So the list of characters not to be used is: all possible characters minus the ones mentioned above. For special purposes .
By consisting only of ASCII characters, base64 strings are generally url-safe, and that's why they can be used to encode data in Data URLs.
Base64 is a commonly used encoding scheme originally designed as a way to represent binary data in an ASCII text format. Like almost every aspect of computer technology today, base64 if not used properly, can result is increased security risk.
There does appear to be a standard, it is RFC 3548, Section 4, Base 64 Encoding with URL and Filename Safe Alphabet:
This encoding is technically identical to the previous one, except for the 62:nd and 63:rd alphabet character, as indicated in table 2.
+
and /
should be replaced by - (minus)
and _ (understrike)
respectively. Any incompatible libraries should be wrapped so they conform to RFC 3548.
Note that this requires that you URL encode the (pad) =
characters, but I prefer that over URL encoding the +
and /
characters from the standard base64 alphabet.
I don't think there is right or wrong. But most popular encoding is
'+/=' => '-_.'
This is widely used by Google, Yahoo (they call it Y64). The most url-safe version of encoders I used on Java, Ruby supports this character set.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With