I'm trying to get a &zwnj; with innerHTML.
The output should be
This div contains a zero-width&zwnj;non-joiner, a non-breaking&nbsp;space &amp; an ampersand
But the output is:
This div contains a zero-width‌non-joiner, a non-breaking&nbsp;space &amp; an ampersand
How can I get the &zwnj;?
alert(document.getElementsByTagName('div')[0].innerHTML)
<div>This div contains a zero-width&zwnj;non-joiner, a non-breaking&nbsp;space &amp; an ampersand</div>
Fiddle: https://jsfiddle.net/yst1Lanv/
The ZWNJ is encoded in Unicode as U+200C ZERO WIDTH NON-JOINER (&zwnj;).
The related zero-width joiner (ZWJ) has the code point U+200D ZERO WIDTH JOINER (&zwj;). In the InScript keyboard layout for Indian languages, it is typed by the key combination Ctrl+Shift+1. However, many layouts use the position of QWERTY's ']' key for this character.
The \u200c character is ZERO WIDTH NON-JOINER.
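Because the character is invisible, it can help to confirm that it is really present in a string. The following is a minimal sketch; the variable names `sample`, `hasZwnj`, and `codePointHex` are made up for illustration.

```javascript
// Sketch: confirm that a string contains the raw ZWNJ character (U+200C).
// `sample` is an illustrative string, not taken from a real page.
const sample = 'zero-width\u200cnon-joiner';

// The character prints as nothing, but it is still part of the string.
const hasZwnj = sample.includes('\u200c');

// Look up the code point of the character and show it in hexadecimal.
const codePointHex = '\u200c'.codePointAt(0).toString(16);

console.log(hasZwnj);      // true
console.log(codePointHex); // 200c
```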
In Python, use the str.replace() method to remove zero-width non-joiner characters from a string, e.g. result = my_str.replace('\u200c', '').
You can search for it using its Unicode escape, \u200c, and then replace it with the string &zwnj;.
alert(document.getElementsByTagName('div')[0].innerHTML.replace(/\u200c/g, '&zwnj;'))
<div>This div contains a zero-width&zwnj;non-joiner, a non-breaking&nbsp;space &amp; an ampersand</div>
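The same replacement can be wrapped in a small function so it can be tried without a browser. This is only a sketch: `encodeZwnj` is a hypothetical helper name, and `serialized` is a hand-written stand-in for the string innerHTML would return for the div above (with &nbsp; and &amp; already serialized as entities and the ZWNJ left as a raw character).

```javascript
// Hypothetical helper: re-encode raw ZWNJ characters as the &zwnj; entity.
function encodeZwnj(html) {
  return html.replace(/\u200c/g, '&zwnj;');
}

// Stand-in for the innerHTML value: the ZWNJ is a raw character here.
const serialized =
  'This div contains a zero-width\u200cnon-joiner, ' +
  'a non-breaking&nbsp;space &amp; an ampersand';

const encoded = encodeZwnj(serialized);
console.log(encoded);
```

In the browser, the argument would be document.getElementsByTagName('div')[0].innerHTML instead of the hand-written string.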
Your character is in the extracted (innerHTML) text, just not encoded as its HTML entity.
If you want, you can replace the character with its entity:
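// Note: the regex below contains a literal (invisible) U+200C character between the slashes.
alert(document.getElementsByTagName('div')[0].innerHTML.replace(/‌/g, '&zwnj;'));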
<div>This div contains a zero-width&zwnj;non-joiner, a non-breaking&nbsp;space &amp; an ampersand</div>
Yong Quan posted nicer code than mine. If you want your app to be more maintainable, use the Unicode escape. My regex above is pretty confusing because the character in it is invisible; this is easier to read:
.replace(/\u200c/g, '&zwnj;')
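If more than one invisible character needs the same treatment, the escape-based approach extends naturally to a lookup table. The sketch below is an assumption-laden illustration, not part of either answer: `INVISIBLE_ENTITIES` and `encodeInvisible` are hypothetical names, and the table covers only U+200C (&zwnj;) and U+200D (&zwj;).

```javascript
// Hypothetical lookup table: invisible character -> named HTML entity.
const INVISIBLE_ENTITIES = {
  '\u200c': '&zwnj;', // ZERO WIDTH NON-JOINER
  '\u200d': '&zwj;',  // ZERO WIDTH JOINER
};

// Replace every character listed in the table with its entity.
function encodeInvisible(html) {
  return html.replace(/[\u200c\u200d]/g, (ch) => INVISIBLE_ENTITIES[ch]);
}

console.log(encodeInvisible('a\u200cb\u200dc')); // a&zwnj;b&zwj;c
```

Keeping the escapes in one table means the invisible characters never appear literally in the source file, which is the readability point made above.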