
How to split Unicode string to characters in JavaScript

For a long time, we used a naive approach to split strings in JS:

someString.split('');

But the popularity of emoji forced us to change this approach: emoji characters (and other non-BMP characters) like 😂 are made of two "characters" (UTF-16 code units).

String.fromCodePoint(128514).split(''); // array of 2 characters; can't embed due to StackOverflow limitations
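To make the problem concrete, here is a minimal sketch of what the naive split produces for that code point. The variable names are only illustrative; the point is that `split('')` cuts between UTF-16 code units, so the result is two lone surrogates rather than one emoji.

```javascript
// U+1F602 lies outside the BMP, so UTF-16 stores it as a surrogate pair.
const laugh = String.fromCodePoint(128514);

// split('') cuts between UTF-16 code units, not between code points.
const units = laugh.split('');
const unitsHex = units.map(u => u.charCodeAt(0).toString(16));

console.log(laugh.length); // 2
console.log(unitsHex);     // [ 'd83d', 'de02' ]
```

Neither half is a valid character on its own, which is why the pieces usually render as replacement glyphs.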

So what is the modern, correct, and performant approach to this task?

asked Feb 05 '16 11:02 by Ginden




1 Answer

Using spread syntax in an array literal:

const str = "🌍🤖😸🎉";
console.log([...str]);
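Spread works here because the string iterator walks the string by code point rather than by UTF-16 code unit. The sketch below compares it with `split('')` and with `Array.from`, which uses the same iterator; the sample string and variable names are illustrative assumptions.

```javascript
// "a", U+1F602 (an emoji outside the BMP), "b"
const sample = "a\u{1F602}b";

const byUnit = sample.split('');        // splits per UTF-16 code unit
const bySpread = [...sample];           // uses the string iterator (per code point)
const byArrayFrom = Array.from(sample); // same iterator as spread

console.log(byUnit.length);      // 4
console.log(bySpread.length);    // 3
console.log(byArrayFrom.length); // 3
```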

Using for...of:

function split(str) {
  const arr = [];
  // for...of iterates the string by code point, so surrogate pairs stay together
  for (const char of str) {
    arr.push(char);
  }
  return arr;
}

const str = "🌍🤖😸🎉";
console.log(split(str));
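One caveat the answer does not cover: both approaches split by code point, so a visible character built from several code points (a flag, or an emoji with a modifier) still comes apart. A possible sketch for that case uses `Intl.Segmenter` with grapheme granularity. `splitGraphemes` is a hypothetical helper name, and the fallback to code-point splitting is an assumption for runtimes that lack `Intl.Segmenter`.

```javascript
// Hypothetical helper: split into user-perceived characters (grapheme clusters).
function splitGraphemes(text) {
  // Assumption: fall back to code-point splitting where Intl.Segmenter is missing.
  if (typeof Intl === 'undefined' || typeof Intl.Segmenter !== 'function') {
    return [...text];
  }
  const segmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });
  // Each item yielded by segment() carries the matched text in its `segment` property.
  return Array.from(segmenter.segment(text), part => part.segment);
}

// Two regional indicator symbols (U and A) that render as a single flag.
const flag = "\u{1F1FA}\u{1F1E6}";

console.log([...flag].length);            // 2 code points
console.log(splitGraphemes(flag).length); // 1 where Intl.Segmenter is available
```

Whether this is worth the extra cost depends on the input; for text without combining sequences, code-point splitting gives the same result.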
answered Sep 27 '22 20:09 by Omkar76