I'm trying to Base64-encode a UTF-8 string containing Thai characters. I'm using the browser's built-in btoa
function. It works for ASCII text, but Thai characters cause it to throw an INVALID_CHARACTER_ERR: DOM Exception 5
exception.
Here's a sample that fails (the character that looks like an "n" is Thai):
btoa('aก')
What do I need to do to Base64-encode non-ASCII strings?
Base64 itself can encode arbitrary bytes: its whole point is to take any byte sequence and reduce it to printable ASCII characters. At the byte level there is no such thing as invalid input for encoding; only decoding can fail, on malformed input. The error here comes from the string interface of btoa, not from Base64.
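btoa() accepts a string in which each character represents one 8-bit byte. If you pass a string containing a character that cannot be represented in 8 bits (any code point above U+00FF), it throws an InvalidCharacterError. This limitation may be why btoa is sometimes described as deprecated; in browsers it is still part of the HTML Standard, although Node.js marks its btoa as legacy.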
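To make the 8-bit restriction concrete, here is a minimal sketch of a check you could run before calling btoa. The helper name isBinaryString is hypothetical (it is not a browser API); it only tests whether every UTF-16 code unit in the string fits in a single byte.

```javascript
// Hypothetical helper, not part of any browser API.
// Returns true when every UTF-16 code unit is in the range 0x00..0xFF,
// which is the input range the paragraph above describes for btoa().
function isBinaryString(str) {
  for (let i = 0; i < str.length; i++) {
    if (str.charCodeAt(i) > 0xff) {
      return false;
    }
  }
  return true;
}

console.log(isBinaryString('a'));   // plain ASCII
console.log(isBinaryString('aก'));  // contains U+0E01, which needs more than 8 bits
```

Note that a string such as 'é' (U+00E9) passes this check, but btoa would then encode the single byte 0xE9 rather than the two-byte UTF-8 form, so passing the check does not mean the result is UTF-8.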
Encoding and Decoding Strings with Base64
btoa() and atob() are two Base64 helper functions that are a core part of the HTML specification and available in all modern browsers.
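var Base64 = {
  // Convert the string to UTF-8 bytes (one character per byte), then Base64-encode.
  encode: function(s) {
    return btoa(unescape(encodeURIComponent(s)));
  },
  // Base64-decode to a byte string, then interpret those bytes as UTF-8.
  decode: function(s) {
    return decodeURIComponent(escape(atob(s)));
  }
};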
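The Base64 object above relies on escape and unescape, which are deprecated. As a possible alternative, the sketch below does the same UTF-8 round trip with TextEncoder and TextDecoder. The function names utf8ToBase64 and base64ToUtf8 are made up for illustration, and the code assumes an environment where btoa, atob, TextEncoder and TextDecoder are all available as globals.

```javascript
// Hypothetical helper names; assumes btoa/atob/TextEncoder/TextDecoder exist globally.

// Encode a JavaScript string as Base64 of its UTF-8 bytes.
function utf8ToBase64(text) {
  const bytes = new TextEncoder().encode(text); // Uint8Array of UTF-8 bytes
  let binary = '';
  for (const b of bytes) {
    binary += String.fromCharCode(b); // one character per byte, as btoa expects
  }
  return btoa(binary);
}

// Decode Base64 back to a JavaScript string, treating the bytes as UTF-8.
function base64ToUtf8(b64) {
  const binary = atob(b64); // one character per byte
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}

const encoded = utf8ToBase64('aก');
console.log(encoded);
console.log(base64ToUtf8(encoded));
```

Building the intermediate string one byte at a time is slow for large inputs; it is used here only because it is easy to follow.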
Unfortunately, btoa/atob went unspecified for a long time (they are now defined in the HTML Standard), and the implementations in Firefox and WebKit both fail on multibyte characters. The built-in functions cannot be changed to support multibyte characters, because their input and output strings would necessarily change.
It would seem your only option is to roll your own Base64 encode and decode routines.
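If you do go the hand-rolled route, one way to structure it is to work on bytes rather than strings and leave the string-to-UTF-8 step to TextEncoder and TextDecoder. The following is only a sketch under that assumption; the names ALPHABET, encodeBytes and decodeToBytes are invented for this example, and the decoder does minimal validation.

```javascript
// Standard Base64 alphabet (RFC 4648).
const ALPHABET =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Hypothetical helper: Base64-encode an array-like of bytes (values 0..255).
function encodeBytes(bytes) {
  let out = '';
  for (let i = 0; i < bytes.length; i += 3) {
    const has1 = i + 1 < bytes.length;
    const has2 = i + 2 < bytes.length;
    // Pack up to three bytes into one 24-bit group, padding with zeros.
    const triple =
      (bytes[i] << 16) | ((has1 ? bytes[i + 1] : 0) << 8) | (has2 ? bytes[i + 2] : 0);
    out += ALPHABET[(triple >> 18) & 63];
    out += ALPHABET[(triple >> 12) & 63];
    out += has1 ? ALPHABET[(triple >> 6) & 63] : '=';
    out += has2 ? ALPHABET[triple & 63] : '=';
  }
  return out;
}

// Hypothetical helper: decode Base64 text into a Uint8Array.
function decodeToBytes(b64) {
  const clean = b64.replace(/=+$/, ''); // drop trailing padding
  const bytes = [];
  let buffer = 0; // holds bits that have not yet formed a full byte
  let bits = 0;   // number of valid bits currently in buffer
  for (const ch of clean) {
    const value = ALPHABET.indexOf(ch);
    if (value === -1) {
      throw new Error('Invalid Base64 character: ' + ch);
    }
    buffer = ((buffer << 6) | value) & 0xfff; // never need more than 12 bits
    bits += 6;
    if (bits >= 8) {
      bits -= 8;
      bytes.push((buffer >> bits) & 0xff);
    }
  }
  return Uint8Array.from(bytes);
}

// Usage: go through UTF-8 bytes so that Thai text survives the round trip.
const thaiBytes = new TextEncoder().encode('aก');
const thaiBase64 = encodeBytes(thaiBytes);
console.log(thaiBase64);
console.log(new TextDecoder().decode(decodeToBytes(thaiBase64)));
```

Keeping the Base64 routines byte-oriented avoids the question of what a "character" is; the choice of UTF-8 is made explicitly at the call site.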