Which regular expression can I use to match (allow) any kind of letter from any language?
I need to match any letter including any diacritics (e.g., á, ü, ñ) and exclude any kind of symbol (math symbols, currency signs, dingbats, box-drawing characters, etc.) and punctuation characters.
I'm using ASP.NET MVC 2 with .NET 4. I’ve tried this annotation in my view model
[RegularExpression(@"\p{L}*", ...
and this one
[RegularExpression(@"\p{L}\p{M}*", ...
but client-side validation rejects accented characters.
UPDATE: Thank you for all your answers. Your suggestions work but only for .NET, and the problem here is that it also uses the regex for client-side validation with JavaScript.
I had to go with
[^0-9_\|°¬!#\$%/\\\(\)\?¡¿\+\{\}\[\]:\.\,;@ª^\*<>=&]
which is very ugly and does not cover all scenarios but is the closest thing to what I need.
To match a character that has a special meaning in regex, prefix it with a backslash ( \ ) to form an escape sequence. E.g., \. matches "." ; \+ matches "+" ; and \( matches "(" . To match "\" (backslash) itself, use \\ .
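As a minimal sketch of the escaping rule, here is how it looks in JavaScript, the language the client-side validation runs in. The variable names are only illustrative.

```javascript
// Each metacharacter is escaped with a backslash so it matches itself literally.
const dot = /\./;        // a literal "."
const plus = /\+/;       // a literal "+"
const paren = /\(/;      // a literal "("
const backslash = /\\/;  // a literal "\"

console.log(dot.test("a.b"));            // true
console.log(dot.test("ab"));             // false
console.log(plus.test("1+1"));           // true
console.log(paren.test("f(x)"));         // true
console.log(backslash.test("C:\\temp")); // true (the string is C:\temp)

// Unescaped, "." is a wildcard and matches any character except a line break.
console.log(/./.test("ab"));             // true
```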
Regular expressions are easy to learn, self-contained (the syntax is rarely changed or updated), very powerful, and language-agnostic: they can be applied to text in any natural language and are supported by the majority of programming languages.
[A-Za-z] matches any letter of the basic Latin alphabet (both lowercase and uppercase). It does not match accented letters such as á or ñ.
Use square brackets [] to match any one character in a set. Use \w to match any single alphanumeric character: 0-9, a-z, A-Z, and _ (underscore). In JavaScript \w is limited to those ASCII characters, while in .NET it also matches Unicode letters by default. Use \d to match any single digit. Use \s to match any single whitespace character.
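A small JavaScript sketch of these classes, with illustrative names, shows why they do not help with the accented input from the question: in JavaScript both [A-Za-z] and \w cover ASCII only.

```javascript
// Anchored so that the whole input must consist of the class.
const asciiLetters = /^[A-Za-z]+$/;
const wordChars = /^\w+$/;
const digit = /\d/;
const whitespace = /\s/;

console.log(asciiLetters.test("Hello"));    // true
console.log(asciiLetters.test("ni\u00f1o")); // false: ñ is outside A-Z and a-z
console.log(wordChars.test("user_42"));     // true
console.log(wordChars.test("ni\u00f1o"));    // false: \w is ASCII-only in JavaScript
console.log(digit.test("a1"));              // true
console.log(whitespace.test("a b"));        // true
```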
\p{L}*
should match "any kind of letter from any language". It should work; I used it in an i18n-proof uppercase/lowercase recognition regex in .NET.
Your problem is more likely that the pattern is not anchored: an unanchored regex only has to match somewhere in the input, so a string is accepted as long as some part of it matches, even if the rest contains other characters.
By adding ^ as a prefix and $ as a suffix, the whole string has to comply with your regex. So this probably works:
^\p{L}*$
Regexbuddy explains:
^ : Assert position at the beginning of the string
\p{L} : A character with the Unicode property 'letter' (any kind of letter from any kind of language)
* : Between zero and unlimited times, as many as possible (greedy)
$ : Assert position at the end of the string
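For the JavaScript side mentioned in the question's update, the following is a hedged sketch rather than a drop-in fix. It assumes an engine that supports Unicode property escapes (ES2018 or later, which postdates the question) and a validator that lets you supply the u flag, and none of that is confirmed by the original post. The variable names are illustrative.

```javascript
// Without the "u" flag, JavaScript treats \p as a plain "p", so the pattern
// matches the literal text "p{L}" instead of Unicode letters.
const legacy = /^\p{L}*$/;

// With the "u" flag (ES2018+), \p{L} means "any Unicode letter".
const lettersOnly = /^\p{L}*$/u;

// Letters plus combining marks, for input that uses decomposed accents.
const lettersAndMarks = /^[\p{L}\p{M}]*$/u;

// Same as lettersOnly but without anchors.
const unanchored = /\p{L}*/u;

console.log(legacy.test("Jos\u00e9"));           // false
console.log(legacy.test("p{L}"));                // true
console.log(lettersOnly.test("Jos\u00e9"));      // true
console.log(lettersOnly.test("Jos\u00e91"));     // false: the digit is rejected
console.log(lettersOnly.test("Jose\u0301"));     // false: U+0301 is a mark, not a letter
console.log(lettersAndMarks.test("Jose\u0301")); // true
console.log(unanchored.test("123"));             // true: matches the empty string
console.log(lettersOnly.test(""));               // true: * allows zero letters
```

Replacing * with + would reject the empty string, if an empty value should not pass validation.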