How can I replace diacritics (ă,ş,ţ etc) with their "normal" form (a,s,t) in javascript?
You can use the JavaScript replace() method to replace characters in a string. However, when the pattern is a plain string or a regex without flags, replace() only replaces the first occurrence. To replace every occurrence, add the global (g) modifier to the regex.
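As a quick illustration of the difference, here is a minimal sketch. The sample string and variable names are only placeholders for the example.

```javascript
// A string pattern only touches the first match.
const input = 'banana';
const firstOnly = input.replace('a', 'o');

// A regex with the g flag touches every match.
const everyMatch = input.replace(/a/g, 'o');

console.log(firstOnly);  // bonana
console.log(everyMatch); // bonono
```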
A pattern such as replace(/[^a-z0-9]/gi, '') simply strips every character that is not an ASCII letter or digit. However, a more intuitive solution (at least for the user) would be to replace accented characters with their "plain" equivalent, e.g. turn á, à into a, and ç into c, etc.
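Note that the following one-liner is Java, not JavaScript. In JavaScript a string pattern passed to replaceAll() is matched literally, so it would not behave as a regex:
string = string.replaceAll("[^\\p{ASCII}]", "");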
In modern browsers and Node.js you can use Unicode normalization to decompose those characters into a base letter plus combining marks, and then remove the marks with a filtering regex.
str.normalize('NFKD').replace(/[^\w]/g, '')
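To see why this works, it can help to look at what normalize() produces for a single letter. The sketch below uses ă (U+0103) purely as an example; the variable names are not part of any API.

```javascript
// ă (U+0103) decomposes into "a" followed by a combining breve (U+0306).
const decomposed = '\u0103'.normalize('NFKD');
const baseLetter = decomposed[0];
const markHex = decomposed.codePointAt(1).toString(16);

console.log(decomposed.length); // 2
console.log(baseLetter);        // a
console.log(markHex);           // 306

// \w without the u flag only matches [A-Za-z0-9_], so the mark is filtered out.
const stripped = decomposed.replace(/[^\w]/g, '');
console.log(stripped);          // a
```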
If you want to keep characters such as whitespace, dots, dashes and slashes, extend the regex to allow them. Escape the dash (or put it last) so it is not read as a character range:
str.normalize('NFKD').replace(/[^\w\s.\-_\/]/g, '')
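One thing to watch in a character class: a hyphen between two characters is read as a range, so .-_ would match everything from "." to "_" (digits, uppercase letters, @, = and more). A minimal sketch with the hyphen placed last, using a hypothetical helper name:

```javascript
// Hypothetical helper: strips diacritics but keeps word characters,
// whitespace, dots, slashes and dashes. The hyphen is last in the class,
// so it is a literal dash and not a range operator.
function stripDiacriticsKeepPunctuation(str) {
  return str.normalize('NFKD').replace(/[^\w\s.\/-]/g, '');
}

console.log(stripDiacriticsKeepPunctuation('ţară_nouă/în-sat v1.0'));
// tara_noua/in-sat v1.0

// Characters outside the allowed set are still removed.
console.log(stripDiacriticsKeepPunctuation('a@b=c')); // abc
```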
var str = 'áàâäãéèëêíìïîóòöôõúùüûñçăşţ';
var asciiStr = str.normalize('NFKD').replace(/[^\w]/g, '');
console.info(str, asciiStr);
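NOTES: This method does not work for characters that have no Unicode decomposition, e.g. ø and ł. They are not split into a base letter plus a combining mark, so the filtering regex removes them entirely instead of turning them into o and l.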
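One possible workaround for such letters is a small lookup table applied after normalization. This is only a sketch: the FALLBACKS table and the toAscii name are made up for the example, and the table is far from complete.

```javascript
// Hypothetical fallback table for letters that NFKD leaves untouched.
const FALLBACKS = {
  'ø': 'o', 'Ø': 'O',
  'ł': 'l', 'Ł': 'L',
  'đ': 'd', 'Đ': 'D',
  'æ': 'ae', 'Æ': 'AE',
  'ß': 'ss'
};

function toAscii(str) {
  return str
    .normalize('NFKD')
    // Drop the combining marks produced by the decomposition.
    .replace(/[\u0300-\u036f]/g, '')
    // Map any remaining non-ASCII character, or drop it if unknown.
    .replace(/[^\x00-\x7f]/g, function (ch) { return FALLBACKS[ch] || ''; });
}

console.log(toAscii('Łódź'));       // Lodz
console.log(toAscii('smørrebrød')); // smorrebrod
```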