I have an array containing strings with special unicode characters:
var a = [
["a", 33],
["h\u016B", 44],
["s\u00EF", 51],
...
];
When I loop over this array:
for (i=0;i<a.length;i++) {
document.write(a[i][0] + "<br />");
}
It prints characters with accents:
a
hū
sï
...
and I want:
a
h\u016B
s\u00EF
...
How can I achieve this in JavaScript?
In JavaScript, identifiers and string literals can contain Unicode escape sequences. The general syntax is \uXXXX, where each X is one hexadecimal digit (four digits in total). For example, the letter o can be written as '\u006F'.
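A small sketch of why the loop in the question prints accented characters: the escape sequence is resolved when the source is parsed, so the string never contains a backslash. The variable names here are illustrative only.

```javascript
// An escape sequence and the literal character yield the same string value.
var escaped = "\u006F";
var literal = "o";
console.log(escaped === literal); // true

// "h\u016B" holds two characters: "h" and U+016B.
console.log("h\u016B".length); // 2

// Doubling the backslash keeps the escape as visible text instead.
var visible = "h\\u016B";
console.log(visible.length); // 7
console.log(visible); // h\u016B
```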
JavaScript has no N prefix for string literals; that convention comes from SQL, where N'...' marks a Unicode (national character) literal. Every JavaScript string literal is already a Unicode string, whether or not it contains non-ASCII characters.
While a JavaScript source file can have any kind of encoding, JavaScript will then convert it internally to UTF-16 before executing it. JavaScript strings are all UTF-16 sequences, as the ECMAScript standard says: When a String contains actual textual data, each element is considered to be a single UTF-16 code unit.
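To illustrate the UTF-16 point, here is a minimal sketch comparing a character inside the Basic Multilingual Plane with one outside it. The sample characters are chosen only for illustration, and `codePointAt` assumes an ES2015-capable engine.

```javascript
var bmp = "\u016B";          // U+016B fits in a single UTF-16 code unit
var astral = "\uD83D\uDE00"; // U+1F600 needs a surrogate pair (two code units)

console.log(bmp.length);    // 1
console.log(astral.length); // 2

// charCodeAt returns one code unit; codePointAt combines the surrogate pair.
console.log(astral.charCodeAt(0).toString(16));  // d83d
console.log(astral.codePointAt(0).toString(16)); // 1f600
```

This is why an escaper built on charCodeAt emits two \uXXXX sequences for such characters.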
Unicode is a standard encoding system that is used to represent characters from almost all languages. Every Unicode character is encoded using a unique integer code point between 0 and 0x10FFFF . A Unicode string is a sequence of zero or more code points.
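As a rough illustration of "a sequence of code points", the sketch below lists the code points of a sample string. It assumes an ES2015 engine (for `Array.from` and `codePointAt`); the sample string is made up for this example.

```javascript
var s = "a\u016B\uD83D\uDE00"; // "a", U+016B, and U+1F600

// Array.from splits the string by code point, not by UTF-16 code unit.
var codePoints = Array.from(s).map(function (ch) {
  return ch.codePointAt(0);
});

console.log(codePoints); // [ 97, 363, 128512 ]
console.log(s.length);   // 4, because U+1F600 occupies two code units
```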
Something like this?
/* Creates an uppercase hex string with at least `length` digits from a given number */
function fixedHex(number, length) {
    var str = number.toString(16).toUpperCase();
    while (str.length < length)
        str = "0" + str;
    return str;
}

/* Creates a Unicode-escaped literal from the string */
function unicodeLiteral(str) {
    var i;
    var result = "";
    for (i = 0; i < str.length; ++i) {
        var code = str.charCodeAt(i);
        /* Escape everything outside printable ASCII (32 to 126) */
        if (code > 126 || code < 32)
            result += "\\u" + fixedHex(code, 4);
        else
            result += str.charAt(i);
    }
    return result;
}

var a = [
    ["a", 33],
    ["h\u016B", 44],
    ["s\u00EF", 51]
];
var i;
for (i = 0; i < a.length; i++) {
    document.write(unicodeLiteral(a[i][0]) + "<br />");
}
This prints:
a
h\u016B
s\u00EF
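The same idea can be sketched more compactly with a regular expression. This is an alternative under the same assumption as above (anything outside printable ASCII, code units 32 to 126, gets escaped); `escapeNonAscii` is a hypothetical name, not part of any library.

```javascript
// Hypothetical helper: replaces every UTF-16 code unit outside
// printable ASCII with its \uXXXX escape, written out as text.
function escapeNonAscii(str) {
  return str.replace(/[^\x20-\x7E]/g, function (ch) {
    var hex = ch.charCodeAt(0).toString(16).toUpperCase();
    // Left-pad to four hex digits.
    return "\\u" + ("0000" + hex).slice(-4);
  });
}

console.log(escapeNonAscii("a"));       // a
console.log(escapeNonAscii("h\u016B")); // h\u016B
console.log(escapeNonAscii("s\u00EF")); // s\u00EF
```

Because the regular expression has no u flag, it matches one code unit at a time, so a character outside the Basic Multilingual Plane comes out as two escapes (its surrogate pair).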
If you have a Unicode character and want its escape sequence as visible text, you can do this:
var x = "h\u016B";
// here the non-ASCII character is the second one (index 1)
var uniChar = x.charCodeAt(1).toString(16); // "16b"
uniChar = uniChar.toUpperCase(); // "16B"
// pad to four hex digits instead of hard-coding a single "0"
while (uniChar.length < 4) uniChar = "0" + uniChar; // "016B"
uniChar = "\\u" + uniChar; // the six characters \u016B
x = x.charAt(0) + uniChar; // x now prints as h\u016B