How to get the Unicode code point for a character in Javascript?

I'm using a barcode scanner to read a barcode on my website (the website is made in OpenUI5).

The scanner works like a keyboard that types the characters it reads. At the end and the beginning of the typing it uses a special character. These characters are different for every type of scanner.

Some possible characters are:

In my code I use if (oModelScanner.oData.scanning && oEvent.key == "\u2584") to check if the input from the scanner is ▄.

Is there any way to get the code from that character in the \uHHHH style? (with the HHHH being the hexadecimal code for the character)

I tried charCodeAt, but that returns the decimal code.

The codePointAt examples all convert the character into a decimal code point, so I need the reverse of that.

asked Dec 28 '17 by Jungkook

People also ask

How do you find the Unicode of a character?

To insert a Unicode character, type the character code, press ALT, and then press X. For example, to type a dollar symbol ($), type 0024, press ALT, and then press X.

How do you represent Unicode in JavaScript?

In JavaScript, identifiers and string literals can be expressed in Unicode via a Unicode escape sequence. The general syntax is \uXXXX, where each X is a hexadecimal digit. For example, the letter o is denoted as '\u006F' in Unicode.
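A quick illustration of the escape syntax described above: the escape sequence and the literal character produce the same string.

```javascript
// A \uXXXX escape and the literal character are indistinguishable as strings.
const escaped = "\u006F"; // four hex digits after \u
console.log(escaped === "o"); // true
console.log("\u2584");        // ▄ (the lower half block from the question)
```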

What is a code point in Unicode?

A code point is a number assigned to represent an abstract character in a system for representing text (such as Unicode). In Unicode, a code point is expressed in the form "U+1234" where "1234" is the assigned number. For example, the character "A" is assigned a code point of U+0041.
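The U+ notation above maps directly to JavaScript; a minimal sketch of going from a character to its U+ form:

```javascript
// codePointAt returns the numeric code point; toString(16) gives its hex form.
const cp = "A".codePointAt(0);
console.log(cp); // 65
console.log("U+" + cp.toString(16).toUpperCase().padStart(4, "0")); // U+0041
```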

What is code point in JavaScript?

In JavaScript, codePointAt() is a string method used to retrieve the Unicode code point (which may not be representable in a single UTF-16 code unit) for the character at a specific position in a string.
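The difference from charCodeAt shows up for characters outside the Basic Multilingual Plane; a small demonstration using 💙 (U+1F499) as an example:

```javascript
const heart = "💙"; // U+1F499, stored as a surrogate pair in UTF-16
console.log(heart.length);         // 2 (two UTF-16 code units)
console.log(heart.charCodeAt(0));  // 55357 (the high surrogate only)
console.log(heart.codePointAt(0)); // 128153 (the full code point, 0x1F499)
```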


2 Answers

JavaScript strings have a codePointAt method that gives you the integer representing the Unicode code point. To format that integer as a four-hexadecimal-digit sequence, you need its base-16 (hexadecimal) representation (as in the answer by Nikolay Spasov).

var hex = "▄".codePointAt(0).toString(16);
var result = "\\u" + "0000".substring(0, 4 - hex.length) + hex;
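On ES2017+ engines, String.prototype.padStart can replace the manual "0000".substring padding; a minimal sketch of the same conversion:

```javascript
// Assumes padStart is available (ES2017+); produces the same result as the
// substring-based padding above.
const hex = "▄".codePointAt(0).toString(16); // "2584"
const result = "\\u" + hex.padStart(4, "0");
console.log(result); // \u2584
```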

However, it would probably be easier to check directly whether your key's code point matches the expected one:

oEvent.key.codePointAt(0) === '▄'.codePointAt(0);

Note that "symbol equality" can actually be trickier: some symbols are encoded as surrogate pairs (you can see them as the combination of two halves, each written as a four-hexadecimal-digit sequence).
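A quick illustration of such a pair, using 𝄞 (U+1D11E, the musical G clef) as an example:

```javascript
// U+1D11E lies outside the BMP, so UTF-16 stores it as two 16-bit code units.
const clef = "\u{1D11E}";
console.log(clef.length);             // 2 (two code units)
console.log(clef === "\uD834\uDD1E"); // true — the two surrogate halves
console.log([...clef].length);        // 1 — string iteration is code-point aware
```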

For this reason I would recommend using a specialized library.

You'll find more details in the very relevant article by Mathias Bynens.

answered Sep 19 '22 by laurent


If you want to print the multiple code points of a character, e.g., an emoji, you can do this:

const facepalm = "🤦🏼‍♂️";
const codePoints = Array.from(facepalm)
  .map((v) => v.codePointAt(0).toString(16))
  .map((hex) => "\\u{" + hex + "}");
console.log(codePoints);

["\u{1f926}", "\u{1f3fc}", "\u{200d}", "\u{2642}", "\u{fe0f}"]

If you are wondering about the components and the length of 🤦🏼‍♂️, check out this article.
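Going the other way, String.fromCodePoint reverses codePointAt: joining the five code points listed above reproduces the original emoji sequence.

```javascript
// Rebuild the emoji from its individual code points.
const parts = [0x1f926, 0x1f3fc, 0x200d, 0x2642, 0xfe0f];
const rebuilt = String.fromCodePoint(...parts);
console.log(rebuilt === "🤦🏼‍♂️");          // true
console.log(rebuilt.length);               // 7 UTF-16 code units
console.log(Array.from(rebuilt).length);   // 5 code points
```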

answered Sep 18 '22 by Sanghyun Lee