Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Foreign language characters in Regular expression in C#

In C# code, I am trying to pass chinese characters: " 中文ABC123".

When I use alphanumeric in general using "^[a-zA-Z0-9\s]+$",

it doesn't pass for "中文ABC123" and regex validation fails.

What other expressions do I need to add for C#?

like image 605
user2683269 Avatar asked Jan 26 '15 18:01

user2683269


1 Answers

To match any letter character from any language use:

\p{L}

If you also want to match numbers:

[\p{L}\p{Nd}]+

\p{L} ... matches a character of the unicode category letter.
                it is the short form for [\p{Ll}\p{Lu}\p{Lt}\p{Lm}\p{Lo}]
                  \p{Ll} ... matches lowercase letters. (abc)
                  \p{Lu} ... matches uppercase letters. (ABC)
                  \p{Lt} ... matches titlecase letters.
                  \p{Lm} ... matches modifier letters.
                  \p{Lo} ... matches letters without case. (中文)

\p{Nd} ... matches a character of the unicode category decimal digit.

Just replace: ^[a-zA-Z0-9\s]+$ with ^[\p{L}0-9\s]+$

like image 110
Andie2302 Avatar answered Sep 27 '22 21:09

Andie2302