The following doesn't seem correct:
"🚀".charCodeAt(0); // returns 55357 in both Firefox and Chrome
That's the Unicode character named ROCKET (U+1F680), so the decimal value should be 128640.
This is for a Unicode app I'm writing. It seems most, but not all, characters from Unicode 6 come back as 55357.
How can I fix it? Thanks.
JavaScript strings use UTF-16 encoding; see this article for details:
Characters outside the BMP, e.g. U+1D306 tetragram for centre (𝌆), can only be encoded in UTF-16 using two 16-bit code units: 0xD834 0xDF06. This is called a surrogate pair. Note that a surrogate pair only represents a single character.
The first code unit of a surrogate pair is always in the range from 0xD800 to 0xDBFF, and is called a high surrogate or a lead surrogate.
The second code unit of a surrogate pair is always in the range from 0xDC00 to 0xDFFF, and is called a low surrogate or a trail surrogate.
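As a quick sketch (the helper names here are mine, not built-ins), you can test a code unit against those ranges directly:

// Illustrative helpers for the surrogate ranges described above
function isLeadSurrogate(codeUnit) {
    return codeUnit >= 0xD800 && codeUnit <= 0xDBFF;
}
function isTrailSurrogate(codeUnit) {
    return codeUnit >= 0xDC00 && codeUnit <= 0xDFFF;
}

isLeadSurrogate("🚀".charCodeAt(0));  // true (0xD83D)
isTrailSurrogate("🚀".charCodeAt(1)); // true (0xDE80)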
You can decode the surrogate pair like this:
codePoint = (text.charCodeAt(0) - 0xD800) * 0x400 + (text.charCodeAt(1) - 0xDC00) + 0x10000
Complete code can be found in the Mozilla documentation for charCodeAt.
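Here is a minimal sketch of that formula as a function (getCodePoint is just an illustrative name; the robust version with full edge-case handling is in the MDN page mentioned above):

// Return the code point starting at index i, combining a surrogate
// pair into a single value when one is present.
function getCodePoint(text, i) {
    var high = text.charCodeAt(i);
    if (high >= 0xD800 && high <= 0xDBFF && i + 1 < text.length) {
        var low = text.charCodeAt(i + 1);
        if (low >= 0xDC00 && low <= 0xDFFF) {
            return (high - 0xD800) * 0x400 + (low - 0xDC00) + 0x10000;
        }
    }
    return high; // not a surrogate pair; the code unit is the code point
}

getCodePoint("🚀", 0); // 128640 (0x1F680)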
Tried this out:
> "🚀".charCodeAt(0);
55357
> "🚀".charCodeAt(1);
56960
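Plugging those two code units into the formula above gives the expected value, and if your environment supports ES2015, String.prototype.codePointAt does the same combination for you:

(55357 - 0xD800) * 0x400 + (56960 - 0xDC00) + 0x10000; // 128640

// ES2015+: codePointAt reads the whole surrogate pair at index 0
"🚀".codePointAt(0); // 128640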