Regular expressions with the Cyrillic alphabet?

Tags:

c#

regex

latin

I am currently writing validation for user input. I am using regular expressions to do so, working in C#.

Password = @"(?!^[0-9]*$)(?!^[a-zA-Z]*$)^([a-zA-Z0-9]{6,18})$"

Validate Alpha Numeric = [^a-zA-Z0-9ñÑáÁéÉíÍóÓúÚüÜ¡¿{0}]

The above work fine with the Latin alphabet, but how can I extend them to work with the Cyrillic alphabet?

amateur asked Feb 16 '13 02:02


3 Answers

The basic approach to covering ranges of characters using regular expressions is to construct an expression of the form [A-Za-z], where A is the first letter of the range, and Z is the last letter of the range.

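The problem is that there is no such thing as "the" Cyrillic alphabet: the set of letters differs slightly from language to language. To cover the Russian version of Cyrillic, use [А-Яа-яЁё]; the letters Ё and ё have to be listed separately because they fall outside the А-Я and а-я ranges in Unicode. You would need a different set for, say, Serbian, because the last letter of the Serbian Cyrillic alphabet is Ш, not Я, and it has letters such as Ђ and Џ that also lie outside those ranges.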

Another approach is to list all characters one-by-one. Simply find an authoritative reference for the alphabet that you want to put in a regexp, and put all characters for it into a pair of square brackets:

[АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯабвгдеёжзийклмнопрстуфхцчшщъыьэюя]
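
To illustrate the difference between a plain range and a class that also lists Ё and ё, here is a minimal C# sketch. The class name `RussianLetters` and the sample words are made up for illustration; only `System.Text.RegularExpressions.Regex` comes from .NET.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RussianLetters
{
    // Range-only class: covers U+0410-U+044F, which leaves out Ё (U+0401) and ё (U+0451).
    public static readonly Regex RangeOnly = new Regex(@"^[А-Яа-я]+$");

    // The same ranges with Ё and ё added explicitly.
    public static readonly Regex WithYo = new Regex(@"^[А-Яа-яЁё]+$");

    public static void Main()
    {
        // Hypothetical sample inputs: two Russian words and one Latin word.
        foreach (var word in new[] { "привет", "ёлка", "hello" })
        {
            Console.WriteLine(
                $"{word}: range only = {RangeOnly.IsMatch(word)}, with Ёё = {WithYo.IsMatch(word)}");
        }
    }
}
```

With these inputs, "ёлка" is rejected by the range-only pattern and accepted once Ё and ё are listed, while "hello" is rejected by both.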
Sergey Kalinichenko answered Sep 17 '22 23:09


You can use Unicode categories and named blocks if you need to allow characters of a particular script or a particular type:

@"\p{IsCyrillic}+" // characters from the Cyrillic block (U+0400-U+04FF)
@"[\p{Lu}\p{Ll}]+" // upper- and lowercase letters in any language (\p{L} covers every letter)

In your case, maybe "not whitespace" would be enough: @"[^\s]+", or maybe "word character" (which includes digits and underscores): @"\w+".
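
As a rough sketch of how these classes behave in whole-string validation, the following anchors each pattern with ^ and $. The class and method names (`UnicodeClasses`, `IsCyrillicOnly`, `IsLettersOnly`) are hypothetical helpers, not part of .NET.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UnicodeClasses
{
    // True when the whole string consists of characters from the Cyrillic block (U+0400-U+04FF).
    public static bool IsCyrillicOnly(string s) => Regex.IsMatch(s, @"^\p{IsCyrillic}+$");

    // True when the whole string consists of letters (Unicode category L) from any script.
    public static bool IsLettersOnly(string s) => Regex.IsMatch(s, @"^\p{L}+$");

    public static void Main()
    {
        Console.WriteLine(IsCyrillicOnly("Привет"));      // Cyrillic only
        Console.WriteLine(IsCyrillicOnly("Привет1"));     // contains a digit
        Console.WriteLine(IsLettersOnly("ПриветHello"));  // mixed scripts, letters only
        Console.WriteLine(IsLettersOnly("Привет 1"));     // contains a space and a digit
    }
}
```

Note that \p{IsCyrillic} is a block test rather than a letter test, so it accepts Ё and ё but also the few non-letter characters in that block.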

Alexei Levenkov answered Sep 17 '22 23:09


Password = @"(?!^[0-9]*$)(?!^[А-Яа-яЁё]*$)^([А-Яа-яЁё0-9]{6,18})$"

Validate Alpha Numeric = [^а-яА-ЯёЁ0-9ñÑáÁéÉíÍóÓúÚüÜ¡¿{0}]
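
If passwords should accept Latin and Russian letters together, the password pattern above could be combined along these lines. This is only a sketch under that assumption: the `PasswordRule` class, the `Letters` constant, and the `IsValid` method are hypothetical names, and the rule itself (6 to 18 characters, not all digits, not all letters) is taken from the pattern in the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PasswordRule
{
    // Letter ranges allowed in a password: Latin plus Russian Cyrillic, with Ё and ё listed explicitly.
    private const string Letters = "a-zA-Zа-яА-ЯёЁ";

    // Reject all-digit and all-letter input, then require 6-18 allowed characters.
    private static readonly Regex Pattern = new Regex(
        @"(?!^[0-9]*$)(?!^[" + Letters + @"]*$)^([" + Letters + @"0-9]{6,18})$");

    public static bool IsValid(string password) => Pattern.IsMatch(password);

    public static void Main()
    {
        // Hypothetical sample passwords.
        foreach (var candidate in new[] { "пароль123", "abcабв1", "123456", "пароль", "ab1" })
        {
            Console.WriteLine($"{candidate}: {IsValid(candidate)}");
        }
    }
}
```

Here "пароль123" and "abcабв1" pass, while "123456" (digits only), "пароль" (letters only), and "ab1" (too short) are rejected.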
KJW answered Sep 17 '22 23:09