 

Replacing diacritics in JavaScript


How can I replace diacritics (ă, ş, ţ, etc.) with their "normal" form (a, s, t) in JavaScript?

Paul Grigoruta asked May 14 '09 14:05





1 Answer

In modern browsers and Node.js you can use Unicode normalization to decompose each accented character into a base letter plus combining marks, then remove the marks with a filtering regex.

str.normalize('NFKD').replace(/[^\w]/g, '')

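If you want to keep characters such as whitespace, dots, dashes, and slashes, extend the regex to allow them. The dash must be escaped (or placed last) inside the character class; otherwise it is read as a range.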

str.normalize('NFKD').replace(/[^\w\s.\-_\/]/g, '')

var str = 'áàâäãéèëêíìïîóòöôõúùüûñçăşţ';
var asciiStr = str.normalize('NFKD').replace(/[^\w]/g, '');
console.info(str, asciiStr);
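
The whitelist regex above also deletes spaces and punctuation. One alternative is to remove only the combining marks that normalization produces and leave everything else untouched. This is a minimal sketch, not part of the original answer: the name stripDiacritics is hypothetical, and NFD is swapped in for NFKD so that compatibility characters such as ligatures are not rewritten.

```javascript
// Hypothetical helper: strips combining marks only, keeps spaces and punctuation.
function stripDiacritics(input) {
  return input
    .normalize('NFD')                  // split e.g. "é" into "e" + U+0301
    .replace(/[\u0300-\u036f]/g, '');  // drop the Combining Diacritical Marks block
}

console.info(stripDiacritics('Crème brûlée, ţară, ştiinţă!'));
// prints "Creme brulee, tara, stiinta!"
```

The range \u0300-\u036f covers the common combining accents for Latin text; scripts that use marks outside that block would need a wider range.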

NOTE: This method does not work for characters that have no Unicode decomposition into a base letter plus a combining mark, e.g. ø and ł.
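
For such characters, one possible workaround is a small lookup table applied after normalization. The sketch below is an assumption rather than part of the original answer: the FALLBACKS table and the toAscii name are hypothetical, and the table is illustrative, not exhaustive.

```javascript
// Hypothetical fallback table for letters that normalize() leaves unchanged.
const FALLBACKS = {
  'ø': 'o', 'Ø': 'O', 'ł': 'l', 'Ł': 'L',
  'đ': 'd', 'Đ': 'D', 'ß': 'ss', 'æ': 'ae', 'Æ': 'AE',
};

function toAscii(input) {
  return input
    .normalize('NFKD')                 // decompose what can be decomposed
    .replace(/[\u0300-\u036f]/g, '')   // remove the combining marks
    // Map any remaining non-ASCII character through the table, or drop it.
    .replace(/[^\x00-\x7f]/g, (ch) => FALLBACKS[ch] ?? '');
}

console.info(toAscii('Łódź, Tromsø'));
// prints "Lodz, Tromso"
```

Characters missing from the table are removed, which mirrors the behavior of the filtering regex in the answer; extend the table for the languages you need to support.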

pakopa answered Nov 02 '22 16:11
