Need to escape non-ASCII characters in JavaScript

Tags: javascript

Is there any function to do the following?

var specialStr = 'ipsum áá éé lore';
var encodedStr = someFunction(specialStr);
// then encodedStr should be like 'ipsum \u00E1\u00E1 \u00E9\u00E9 lore'

I need to encode the characters that fall outside the ASCII range, and I need to do it with the notation shown above. I don't know what it is called. Is it Unicode, maybe?

Hanoi asked Sep 21 '11 12:09




2 Answers

This should do the trick:

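// Left-pad a hex string with zeros to the four digits that \uXXXX requires.
function padWithLeadingZeros(string) {
    return new Array(5 - string.length).join("0") + string;
}

// Build the escape sequence for one UTF-16 code unit, e.g. 225 -> "\u00e1".
function unicodeCharEscape(charCode) {
    return "\\u" + padWithLeadingZeros(charCode.toString(16));
}

// Escape every code unit above the ASCII range (0-127); leave the rest as-is.
function unicodeEscape(string) {
    return string.split("")
                 .map(function (char) {
                     var charCode = char.charCodeAt(0);
                     return charCode > 127 ? unicodeCharEscape(charCode) : char;
                 })
                 .join("");
}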

For example:

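var assert = require("assert"); // Node.js built-in assertion module

var specialStr = 'ipsum áá éé lore';
var encodedStr = unicodeEscape(specialStr);

// Note that the hex digits come out in lowercase.
assert.equal("ipsum \\u00e1\\u00e1 \\u00e9\\u00e9 lore", encodedStr);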
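The question asked for uppercase hex digits (\u00E1 rather than \u00e1). A minimal regex-based sketch of the same idea that produces them could look like this. The name escapeNonAscii is made up for illustration and is not part of the answer above:

```javascript
// Hypothetical helper; a sketch, not the original answer's code.
function escapeNonAscii(str) {
    // Without the "u" flag the regex works on UTF-16 code units, so each
    // code unit outside 0x00-0x7F is replaced on its own.
    return str.replace(/[^\x00-\x7F]/g, function (c) {
        var hex = c.charCodeAt(0).toString(16).toUpperCase();
        // Left-pad to the four hex digits that \uXXXX requires.
        return "\\u" + ("0000" + hex).slice(-4);
    });
}

console.log(escapeNonAscii("ipsum áá éé lore"));
```

For the sample string from the question (with precomposed á and é) this should log ipsum \u00E1\u00E1 \u00E9\u00E9 lore. Characters outside the Basic Multilingual Plane come out as two \uXXXX escapes, one per surrogate half, which is how JavaScript string literals represent them anyway.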
Domenic answered Oct 05 '22 21:10



If you need \xNN hex escapes rather than \uNNNN Unicode escapes, and your characters all lie at or below U+00FF, then you can simplify @Domenic's answer to:

"aäßåfu".replace(/./g, function(c){return c.charCodeAt(0)<128?c:"\\x"+c.charCodeAt(0).toString(16)})

returns: "a\xe4\xdf\xe5fu"
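One caveat: a \x escape carries exactly two hex digits, so it can only represent code units up to 0xFF. For something like "€" (U+20AC) the one-liner would emit \x20ac, which JavaScript reads back as a space followed by "ac". A hedged sketch that keeps \x where it fits and falls back to \u otherwise (escapeNonAsciiCompact is a hypothetical name):

```javascript
// Hypothetical helper; a sketch under the assumption that mixed \x and \u
// output is acceptable to whatever consumes the string.
function escapeNonAsciiCompact(str) {
    return str.replace(/[^\x00-\x7F]/g, function (c) {
        var code = c.charCodeAt(0);
        var hex = code.toString(16);
        // 0x80-0xFF always gives exactly two hex digits, so \x needs no padding.
        // Anything larger is padded to the four digits that \u requires.
        return code <= 0xFF ? "\\x" + hex : "\\u" + ("0000" + hex).slice(-4);
    });
}

console.log(escapeNonAsciiCompact("aäßåfu €"));
```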
Max Murphy answered Oct 05 '22 21:10
