JavaScript remove ZERO WIDTH SPACE (unicode 8203) from string

Q: How do you remove zero width space from a string?

To remove zero-width space characters from a JavaScript string, call the string's replace method with a regular expression that matches the zero-width characters and replaces each one with an empty string. Zero-width characters in Unicode include U+200B ZERO WIDTH SPACE and U+200C ZERO WIDTH NON-JOINER.
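As a minimal sketch of that approach, the snippet below strips only the two code points named above. The helper name stripZeroWidth is made up for illustration, and the character class would need extending if other invisible characters are present.

```javascript
// Hypothetical helper: removes U+200B (zero width space) and
// U+200C (zero width non-joiner), the two code points named above.
function stripZeroWidth(text) {
  // The g flag replaces every occurrence, not just the first one.
  return text.replace(/[\u200B\u200C]/g, "");
}

const dirty = "o\u200Bm\u200C"; // 4 UTF-16 code units, 2 of them invisible
const clean = stripZeroWidth(dirty);

console.log(dirty.length); // 4
console.log(clean.length); // 2
console.log(clean === "om"); // true
```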

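Q: How do you tell whether a character is zero-width?

A zero-width character such as U+200B is a format character: it affects the layout of text or the operation of text processes, but is not normally rendered. Its Unicode general category is "Cf" (Other, Format). The value 15 sometimes quoted alongside it appears to be the numeric value of the Format member of .NET's UnicodeCategory enumeration, not a property of the character itself. The code point 0x200B is named "zero width space".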
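In JavaScript, one way to test for the "Cf" category is a Unicode property escape. This is a sketch that assumes an engine supporting ES2018 property escapes (the u flag); the function name isFormatCharacter is invented for the example.

```javascript
// Hypothetical helper: true when the single character passed in has
// Unicode general category Cf (Other, Format).
// Assumes ES2018 Unicode property escapes are available.
function isFormatCharacter(ch) {
  return /^\p{Cf}$/u.test(ch);
}

console.log(isFormatCharacter("\u200B")); // true  (zero width space)
console.log(isFormatCharacter(" "));      // false (ordinary space, category Zs)
console.log(isFormatCharacter("a"));      // false (lowercase letter, category Ll)
```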

Q: How do you write a zero width Unicode character?

The zero-width space character is encoded in Unicode as U+200B ZERO WIDTH SPACE. In HTML it can be written with the named character references &ZeroWidthSpace;, &NegativeVeryThinSpace;, &NegativeThinSpace;, &NegativeMediumSpace;, or &NegativeThickSpace;, or with the numeric references &#8203; (decimal) or &#x200B; (hexadecimal).
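In JavaScript source the same character can be written without typing it literally. The sketch below shows three equivalent spellings and the decimal/hex relationship between 8203 and 200B; the variable names are illustrative only.

```javascript
// Three ways to produce U+200B in JavaScript without an invisible literal.
const fromEscape = "\u200B";                    // four-digit hex escape
const fromBraces = "\u{200B}";                  // ES2015 code point escape
const fromDecimal = String.fromCodePoint(8203); // decimal code point

console.log(fromEscape === fromBraces && fromBraces === fromDecimal); // true

// The decimal and hexadecimal forms name the same number.
console.log(parseInt("200B", 16));              // 8203
console.log((8203).toString(16).toUpperCase()); // "200B"
```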

Q: What is U+200B?

U+200B is a Unicode non-printing space. It's meant to assist typographers in doing page layouts, and it's extremely useful in certain languages that don't use the Roman alphabet.

Tags:

javascript

regex

unicode

I'm writing some javascript that processes website content. My efforts are being thwarted by SharePoint text editor's tendency to put the "zero width space" character in the text when the user presses backspace. The character's unicode value is 8203, or B200 in hexadecimal. I've tried to use the default "replace" function to get rid of it. I've tried many variants, none of them worked:

var a = "o​m"; //the invisible character is between o and m

var b = a.replace(/\u8203/g,'');
      = a.replace(/\uB200/g,'');
      = a.replace("\\uB200",'');

and so on and so forth. I've tried quite a few variations on this theme. None of these expressions work (tested in Chrome and Firefox). The only thing that works is typing the actual character in the expression:

var b = a.replace("",''); //it's there, believe me

This poses potential problems. The character is invisible, so that line in itself doesn't make sense. I can get around that with comments. But if the code is ever reused and the file is saved using a non-Unicode encoding (or when it's deployed to SharePoint, where there's no guarantee the encoding won't get messed up), it will stop working. Is there a way to write this using the Unicode notation instead of the character itself?

[My ramblings about the character]

In case you haven't met this character, (and you probably haven't, seeing as it's invisible to the naked eye, unless it broke your code and you discovered it while trying to locate the bug) it's a real a-hole that will cause certain types of pattern matching to malfunction. I've caged the beast for you:

[] <- careful, don't let it escape.

If you want to see it, copy those brackets into a text editor and then iterate your cursor through them. You'll notice you'll need three steps to pass what seems like 2 characters, and your cursor will skip a step in the middle.
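If stepping a cursor through the text is too fiddly, a short script can list the code points instead. This is only a sketch; the helper name codePoints is made up, and the sample string uses an escape so the hidden character is visible in the source.

```javascript
// Hypothetical helper: lists every code point in a string as U+XXXX,
// which makes invisible characters show up.
function codePoints(text) {
  return Array.from(text, (ch) =>
    "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  ).join(" ");
}

// Brackets with a zero width space caged between them.
console.log(codePoints("[\u200B]")); // U+005B U+200B U+005D
```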


asked Jun 13 '14 12:06

Shaggydog

1 Answer

The number in a unicode escape should be in hex, and the hex for 8203 is 200B (which is indeed a Unicode zero-width space), so:

var b = a.replace(/\u200B/g,'');

Live Example:

var a = "o\u200Bm"; // U+200B sits between o and m, written as an escape so it stays visible
var b = a.replace(/\u200B/g,'');
console.log("a.length = " + a.length);      // 3
console.log("a === 'om'? " + (a === 'om')); // false
console.log("b.length = " + b.length);      // 2
console.log("b === 'om'? " + (b === 'om')); // true
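For completeness, here is a sketch of why the three attempts in the question leave the string unchanged. Each pattern is valid JavaScript, but each one looks for something other than U+200B. The variable names are illustrative.

```javascript
const a = "o\u200Bm"; // o, zero width space, m

// \u8203 reads the digits as hex, so it names U+8203, a CJK ideograph.
const attempt1 = a.replace(/\u8203/g, "");
// \uB200 has the hex digits in the wrong order; it names a Hangul syllable.
const attempt2 = a.replace(/\uB200/g, "");
// A string pattern with an escaped backslash searches for the six
// literal characters \uB200, not for any single character.
const attempt3 = a.replace("\\uB200", "");
// 8203 in decimal is 200B in hex, so this one matches.
const fixed = a.replace(/\u200B/g, "");

console.log(attempt1.length, attempt2.length, attempt3.length); // 3 3 3
console.log(fixed.length);   // 2
console.log("\\uB200".length); // 6
```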


answered Sep 22 '22 15:09

T.J. Crowder
