
Matching Cyrillic symbols in C#

Tags: c#, regex

I have a huge file of code with many lines like this:

Enterprise::TextMessageBox::Show(String::Format(S"Възникнал е проблем:\n\n{0}", e->Message), S"Грешка");

What I'm trying to do is find every string of Cyrillic symbols in the code and replace it with other text that I provide. My problem is that I can't seem to write a good enough expression to catch these lines. Another problem is that sometimes a line contains only one such string, but other times it contains two or more.

Every such string has the same form and looks like this:

S"some Cyrillic symbols"

I tried to do it with the Regex class, but I can't seem to write a good enough pattern to match the strings.

Jordan asked Oct 28 '11 07:10



1 Answer

You can match on Unicode properties. Try something like this:

Regex TheRegex = new Regex(@"S""[\p{IsCyrillic}\p{P}\p{N}\s]*""");

\p{IsCyrillic} matches any character in the Cyrillic Unicode block (U+0400 to U+04FF)

\p{P} is the Unicode category for punctuation

\p{N} is the Unicode category for any kind of numeric character in any script

\s matches a whitespace character
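To handle lines that contain more than one such string, the pattern can be applied with Regex.Matches, which returns every non-overlapping match. A minimal sketch, assuming a made-up input line and a hypothetical class name (CyrillicLiteralDemo); only the pattern itself comes from above:

```csharp
using System;
using System.Text.RegularExpressions;

class CyrillicLiteralDemo
{
    static void Main()
    {
        // The pattern from above: S"..." whose body consists only of
        // Cyrillic letters, punctuation, digits and whitespace.
        Regex theRegex = new Regex(@"S""[\p{IsCyrillic}\p{P}\p{N}\s]*""");

        // Hypothetical input line with two literals (not from the real file).
        string line = @"Show(S""Грешка"", S""Внимание!"");";

        // Matches walks the whole line, so it does not matter whether
        // the line holds one literal or several.
        foreach (Match m in theRegex.Matches(line))
        {
            Console.WriteLine("{0} at index {1}", m.Value, m.Index);
        }
    }
}
```

Regex.Replace accepts the same pattern if the goal is to substitute the literals rather than list them.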

See MSDN for more information about Unicode categories, and also regular-expressions.info.
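One caveat with the character class above: it contains no Latin letters, so a literal with escapes such as \n (as in the sample line from the question) will not match in full, and because the double quote is itself punctuation (\p{P}), the class can run past a closing quote when two quoted strings sit close together. A possible variant, offered only as a sketch, is to accept any string body and merely require at least one Cyrillic character inside it. The class name, the TEXT_n placeholders and the counter below are hypothetical; the real replacement text would come from wherever you keep it:

```csharp
using System;
using System.Text.RegularExpressions;

class CyrillicLiteralReplace
{
    // Assumed variant, not the pattern from the answer:
    //   S"                       opening of the literal
    //   (?=[^"]*\p{IsCyrillic})  require a Cyrillic character before the next quote
    //   (?:[^"\\]|\\.)*          body: any non-quote character, or a backslash escape
    //   "                        closing quote
    static readonly Regex CyrillicLiteral =
        new Regex(@"S""(?=[^""]*\p{IsCyrillic})(?:[^""\\]|\\.)*""");

    static void Main()
    {
        // The sample line from the question, written as a verbatim string so
        // that \n stays as the two source characters backslash and n.
        string line = @"Show(String::Format(S""Възникнал е проблем:\n\n{0}"", e->Message), S""Грешка"");";

        int counter = 0;
        // The evaluator runs once per match; here it substitutes a numbered
        // placeholder where the provided text would go.
        string result = CyrillicLiteral.Replace(line, m => "TEXT_" + (++counter));

        Console.WriteLine(result);
        // prints: Show(String::Format(TEXT_1, e->Message), TEXT_2);
    }
}
```

The lookahead stops at the first double quote, so a literal whose only Cyrillic text follows an escaped quote (\") would be missed; that case would need a more careful pattern.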

stema answered Sep 19 '22 07:09
