How to print literal unicode string in Javascript?

I have an array containing strings with special unicode characters:

var a = [
    ["a", 33],  
    ["h\u016B", 44],
    ["s\u00EF", 51],
    ...
];

When I loop over this array:

for (var i = 0; i < a.length; i++) {
    document.write(a[i][0] + "<br />");
}

It prints characters with accents:

a
hū
sï
...

and I want:

a
h\u016B
s\u00EF
...

How can I achieve this in Javascript?

Jérôme Verstrynge asked Jun 07 '12 17:06

2 Answers

Something like this?

/* Returns the number as an uppercase hex string, zero-padded to at least `length` digits */
function fixedHex(number, length){
    var str = number.toString(16).toUpperCase();
    while(str.length < length)
        str = "0" + str;
    return str;
}

/* Returns the string with every non-printable or non-ASCII UTF-16 code unit written as a \uXXXX escape */
function unicodeLiteral(str){
    var i;
    var result = "";
    for( i = 0; i < str.length; ++i){
        var code = str.charCodeAt(i);
        /* Printable ASCII (32 to 126) is kept as-is; everything else is escaped */
        if(code > 126 || code < 32)
            result += "\\u" + fixedHex(code, 4);
        else
            result += str.charAt(i);
    }

    return result;
}

var a = [
    ["a", 33],  
    ["h\u016B", 44],
    ["s\u00EF", 51]
];

var i;
for (i=0;i<a.length;i++) {
    document.write(unicodeLiteral(a[i][0]) + "<br />");
}

Result

a
h\u016B
s\u00EF
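
The same idea can probably be written more compactly with a regular expression. This is only a sketch: the name escapeNonAscii is made up for illustration, and it assumes that "printable ASCII" (code units 0x20 to 0x7E) is the range you want to leave untouched.

```javascript
// Hypothetical helper (not part of the answer above): replaces every
// UTF-16 code unit outside printable ASCII with a \uXXXX escape.
function escapeNonAscii(str) {
    // [^\x20-\x7E] matches any single code unit outside 0x20..0x7E
    return str.replace(/[^\x20-\x7E]/g, function (ch) {
        var hex = ch.charCodeAt(0).toString(16).toUpperCase();
        // Left-pad with zeros to exactly four hex digits
        return "\\u" + ("0000" + hex).slice(-4);
    });
}

console.log(escapeNonAscii("a"));       // a
console.log(escapeNonAscii("h\u016B")); // h\u016B
console.log(escapeNonAscii("s\u00EF")); // s\u00EF
```

Because the pattern has no u flag, it works on code units just like charCodeAt does, so the output should match the loop-based version for the same input.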

Zeta answered Sep 19 '22 09:09


If you have a Unicode character and want it written out as an escape sequence, you can do this:

var x = "h\u016B";
// here the non-ASCII character is the second one (index 1)
var uniChar = x.charCodeAt(1).toString(16); // "16b"
uniChar = uniChar.toUpperCase(); // it is now "16B"
while (uniChar.length < 4) uniChar = "0" + uniChar; // pad to four hex digits: "016B"
uniChar = "\\u" + uniChar; // the six characters \u016B
x = x.charAt(0) + uniChar; // x is now h\u016B, which prints as you wish
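
One thing worth checking is what happens with characters outside the Basic Multilingual Plane. charCodeAt returns UTF-16 code units, so such a character comes out as two \uXXXX escapes (a surrogate pair), which is still a valid JavaScript string literal. The sketch below illustrates that and does a round trip through JSON.parse as a sanity check. The name toEscapes is hypothetical, and the round trip only works here because the sample input contains no quotes or backslashes.

```javascript
// Hypothetical helper: escapes each UTF-16 code unit outside printable ASCII.
function toEscapes(str) {
    var out = "";
    for (var i = 0; i < str.length; i++) {
        var code = str.charCodeAt(i);
        if (code < 32 || code > 126) {
            // Zero-pad the uppercase hex value to four digits
            out += "\\u" + ("0000" + code.toString(16).toUpperCase()).slice(-4);
        } else {
            out += str.charAt(i);
        }
    }
    return out;
}

var smiley = "\uD83D\uDE00";                  // U+1F600, stored as two code units
var escaped = toEscapes("h\u016B " + smiley);
console.log(escaped);                          // h\u016B \uD83D\uDE00

// Round trip: JSON string syntax accepts the same \uXXXX escapes.
// Assumption: the escaped text contains no quotes or backslashes of its own.
var restored = JSON.parse('"' + escaped + '"');
console.log(restored === "h\u016B " + smiley); // true
```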

zeacuss answered Sep 20 '22 09:09