
Regex to remove non-letter characters but keep accented letters

I have strings in Spanish and other languages that may contain generic special characters such as (), *, etc., which I need to remove. The problem is that they may also contain language-specific characters such as ñ, á, ó, í, and those need to remain. I am trying to do it with a regular expression in the following way:

var desired = stringToReplace.replace(/[^\w\s]/gi, '');

Unfortunately, it removes all special characters, including the language-specific ones. I am not sure how to avoid that. Could someone suggest a fix?

devjs11 asked Dec 01 '11 11:12





1 Answer

I would suggest using Steven Levithan's excellent XRegExp library and its Unicode plug-in.

Here's an example that strips every character that is neither whitespace nor a Latin-script letter from a string: http://jsfiddle.net/b3awZ/1/

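// Requires XRegExp with its Unicode scripts add-on loaded.
// Match runs of characters that are neither whitespace nor Latin-script letters.
var regex = XRegExp("[^\\s\\p{Latin}]+", "g");
var str = "¿Me puedes decir la contraseña de la Wi-Fi?";
var replaced = XRegExp.replace(str, regex, "");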
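
For comparison, here is a minimal sketch of the same idea without XRegExp, assuming a JavaScript engine that supports native Unicode property escapes (ES2018 or later). It is a different technique from the library approach above, and `stripNonLetters` is a hypothetical helper name used only for illustration. It also shows why the pattern from the question fails: in JavaScript, `\w` covers only `[A-Za-z0-9_]`.

```javascript
// Sketch only: native Unicode property escapes (ES2018+), not XRegExp.
// stripNonLetters is a hypothetical helper name, not part of any library.
function stripNonLetters(input) {
  // \p{L} matches any Unicode letter, \p{M} matches combining marks
  // (so decomposed accents survive), and \s keeps whitespace.
  // The "u" flag is required for \p{...} to be recognized.
  return input.replace(/[^\p{L}\p{M}\s]+/gu, "");
}

const original = "¿Me puedes decir la contraseña de la Wi-Fi?";

// Pattern from the question: \w is ASCII-only, so ñ is removed too.
const withAsciiWord = original.replace(/[^\w\s]/gi, "");
console.log(withAsciiWord); // "Me puedes decir la contrasea de la WiFi"

// Unicode-aware pattern: accented letters are kept.
const withUnicodeLetters = stripNonLetters(original);
console.log(withUnicodeLetters); // "Me puedes decir la contraseña de la WiFi"
```

Note that `\p{L}` keeps letters from every script, not just Latin. To stay closer to the XRegExp pattern, `\p{Script=Latin}` could be used in its place. Either way digits are removed, unlike with `\w`; adding `\p{N}` to the character class would keep them.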

See also this answer by Steven Levithan himself:

Regular expression Spanish and Arabic words

Tim Down answered Oct 25 '22 18:10
