
Custom character class in C# regex

Tags: c#, regex

Is there any way to define a custom character class in a C# regex?

In flex it is done in a very obvious way:

DIGIT    [0-9]
%%
{DIGIT}+    {printf( "An integer: %s (%d)\n", yytext, atoi( yytext ) );}

http://westes.github.io/flex/manual/Simple-Examples.html#Simple-Examples

As explained in this answer, in PHP defining a custom character class works like this:

(?(DEFINE)(?<a>[acegikmoqstz@#&]))\g<a>(?:.*\g<a>){2}

Is there a way to achieve this result in C# without repeating the full character class definition each time it is used?

asked Oct 21 '25 05:10 by PiotrB


2 Answers

Custom character classes aren't supported in C# but you may be able to use named blocks and character class subtraction to get a similar effect.

.NET supports a large number of Unicode general categories (for example \p{Sm} for math symbols) and named blocks (for example \p{IsGreek} for Greek characters). One of them may already match your requirements.

Character class subtraction allows you to exclude the characters in one class or block from the characters in a broader class. The syntax is:

[ base_group -[ excluded_group ]]

The following example, copied from the linked documentation, matches any character in the range \u0000-\uFFFF except whitespace (\s), punctuation (\p{P}), characters in the Greek block (\p{IsGreek}) and the NEXT LINE control character (\x85):

[\u0000-\uFFFF-[\s\p{P}\p{IsGreek}\x85]]
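
As a rough illustration of how subtraction behaves from C#, here is a minimal sketch. The class and field names (SubtractionDemo, Consonants, Filtered) are made up for this example, and the ^...+$ anchoring is an assumption added so that the whole input has to come from the resulting class:

```csharp
using System.Text.RegularExpressions;

public static class SubtractionDemo
{
    // Base group: lowercase ASCII letters. Excluded group: the vowels.
    // Anchored so the entire input must consist of the remaining letters.
    public static readonly Regex Consonants =
        new Regex(@"^[a-z-[aeiou]]+$");

    // The documentation's class, anchored the same way (anchors are an
    // assumption of this sketch, not part of the documented pattern).
    public static readonly Regex Filtered =
        new Regex(@"^[\u0000-\uFFFF-[\s\p{P}\p{IsGreek}\x85]]+$");
}
```

With these definitions, SubtractionDemo.Consonants.IsMatch("rhythm") should succeed while "regex" should not, and SubtractionDemo.Filtered should reject input containing a space, a comma or a Greek letter. Note that this still does not give the class a reusable name inside the pattern itself.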

answered Oct 22 '25 19:10 by Panagiotis Kanavos


Nope, not supported in C#. This link will give you a nice overview of the .NET Regex engine. Note that nothing really stops you from defining variables and using them to construct your Regex string:

var digit = "[0-9]";
var regex = new Regex(digit + "[A-Z]");
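
Applied to the PHP pattern from the question, the same idea might look like the sketch below. The names SharedClass, A and ThreeFromClass are hypothetical; only the character class body and the overall pattern shape are taken from the question:

```csharp
using System.Text.RegularExpressions;

public static class SharedClass
{
    // The class is written once and reused; the body is copied from the
    // (?<a>...) definition in the question's PHP pattern.
    private const string A = "[acegikmoqstz@#&]";

    // Counterpart of \g<a>(?:.*\g<a>){2}: the input must contain at least
    // three characters from the class (on one line, since . skips \n).
    public static readonly Regex ThreeFromClass =
        new Regex(A + "(?:.*" + A + "){2}");
}
```

Plain concatenation is used here instead of an interpolated string because in $"..." the quantifier {2} would have to be written as {{2}}. The substitution happens when the pattern string is built, so the regex engine only ever sees the expanded pattern.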

answered Oct 22 '25 18:10 by Haney