Why is "👍".length === 2?

And how does any textarea in my browser handle something that JavaScript counts as 2 characters but renders as one?

For example:

"👍".length
// -> 2

More examples here: https://jsbin.com/zazexenigi/edit?js,console

asked Jul 13 '16 by filype



2 Answers

JavaScript uses UTF-16 to represent strings internally.

Unicode defines 1,112,064 possible characters (the valid code points, once the surrogate range is excluded). UTF-16 stores each code point as one or two 16-bit code units, i.e. two bytes each. A single 16-bit code unit can only distinguish 65,536 values, so code points above U+FFFF have to be represented with two code units, known as a surrogate pair(*).
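You can see the two code units directly. The 👍 emoji is code point U+1F44D, which UTF-16 encodes as the surrogate pair 0xD83D 0xDC4D:

"👍".length;                      // -> 2 (two UTF-16 code units)
"👍".charCodeAt(0).toString(16);  // -> "d83d" (high surrogate)
"👍".charCodeAt(1).toString(16);  // -> "dc4d" (low surrogate)
"👍".codePointAt(0).toString(16); // -> "1f44d" (the actual code point)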

A string's length property (it is a property, not a method) returns the number of code units in the string, not the number of characters.

MDN explains this well on the page about String.prototype.length:

This property returns the number of code units in the string. UTF-16, the string format used by JavaScript, uses a single 16-bit code unit to represent the most common characters, but needs to use two code units for less commonly-used characters, so it's possible for the value returned by length to not match the actual number of characters in the string.
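If you want the number of code points instead, a minimal sketch using the string iterator, which walks the string code point by code point rather than code unit by code unit:

[..."👍"].length;        // -> 1
Array.from("👍").length; // -> 1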

(*): Code points in the supplementary planes, i.e. the range U+010000 – U+10FFFF, take four bytes (two 16-bit code units) in UTF-16, but this doesn't change the answer: some characters need more than 2 bytes to be represented, so they need more than one code unit.
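The mapping to a surrogate pair is simple arithmetic; here is a sketch of the standard UTF-16 encoding algorithm for a supplementary code point:

// Encode a code point above 0xFFFF as a UTF-16 surrogate pair
function toSurrogatePair(codePoint) {
  const offset = codePoint - 0x10000;      // 20 bits remain
  const high = 0xD800 + (offset >> 10);    // top 10 bits
  const low  = 0xDC00 + (offset & 0x3FF);  // bottom 10 bits
  return [high, low];
}

toSurrogatePair(0x1F44D).map(n => n.toString(16)); // -> ["d83d", "dc4d"]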

One caveat about testing this in the console: String.fromCharCode deals in code units, not code points, and truncates its argument to 16 bits. So the following prints 1 not because U+3FFFF fits in one code unit (it needs 21 bits, hence two code units), but because the argument is silently reduced to 0xFFFF. Use String.fromCodePoint to build the string properly, and you get a length of 2:

console.log(String.fromCharCode(0x03FFFF).length);  // -> 1 (argument truncated to 0xFFFF)
console.log(String.fromCodePoint(0x03FFFF).length); // -> 2 (a real surrogate pair)
answered by rpadovani


I believe rpadovani answered your "why" question best, but for an implementation that will get you a proper glyph count in this situation, Lodash has tackled this problem in its toArray module.

For example,

_.toArray('12👪').length; // --> 3

Or, if you want to knock a few arbitrary characters off a string, you can manipulate and rejoin the array, like:

_.toArray("👪trimToEightGlyphs").splice(0,8).join(''); // --> '👪trimToE'
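On modern engines you can get the same counts without Lodash, since spreading a string iterates by code point; note that a glyph built from several code points (e.g. a ZWJ emoji sequence) would still count as more than one element:

[..."12👪"].length;                              // -> 3
[..."👪trimToEightGlyphs"].slice(0, 8).join(''); // -> '👪trimToE'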
answered by Evan Rusackas