
Multiplication with .NET regular expressions

In the spirit of polygenelubricants' efforts to do silly things with regular expressions, I am currently trying to get the .NET regex engine to multiply for me.

This has, of course, no practical value and is meant as a purely theoretical exercise.

So far, I've arrived at this monster, which should check whether the number of 1s multiplied by the number of 2s equals the number of 3s in the string.

Regex regex = new Regex(
@"
^
(1(?<a>))*  # increment a for each 1
(2(?<b>))*  # increment b for each 2
    (?(a)   # if a > 0
        (                   
            (?<-a>)             # decrement a
            (3(?<c-b>))*        # match 3's, decrementing b and incrementing c until
                                # there are no 3's left or b is zero
            (?(b)(?!))          # if b != 0, fail
            (?<b-c>)*           # b = c, c = 0
        )
    )*      # repeat
(?(a)(?!))  # if a != 0, fail
(?(c)(?!))  # if c != 0, fail
$
", RegexOptions.IgnorePatternWhitespace);

Unfortunately, it's not working, and I am at a loss as to why. I commented it to show what I think the engine should be doing, but I may be off here. Examples of output:

regex.IsMatch("123");   // true, correct
regex.IsMatch("22");    // true, correct
regex.IsMatch("12233"); // false, incorrect
regex.IsMatch("11233"); // true, correct
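
For comparison, the counter logic described in the pattern's comments can be sketched in plain C#. This is only an illustration of the intended behavior, not a fix for the regex; the class and method names (MultiplicationCheck, CountsMultiply) are hypothetical and do not appear in the original post.

```csharp
using System;

public static class MultiplicationCheck
{
    // Hypothetical helper: mirrors the counters a, b and c from the
    // commented pattern using plain integers instead of capture stacks.
    public static bool CountsMultiply(string s)
    {
        int i = 0, a = 0, b = 0, c = 0;
        while (i < s.Length && s[i] == '1') { i++; a++; }  // increment a for each 1
        while (i < s.Length && s[i] == '2') { i++; b++; }  // increment b for each 2

        while (a > 0)                                      // if a > 0 ... repeat
        {
            a--;                                           // decrement a
            while (b > 0 && i < s.Length && s[i] == '3')   // match 3s, moving b into c
            {
                i++; b--; c++;
            }
            if (b != 0) return false;                      // if b != 0, fail
            b = c;                                         // b = c, c = 0
            c = 0;
        }

        return i == s.Length;                              // $: nothing may be left over
    }

    public static void Main()
    {
        foreach (string input in new[] { "123", "22", "12233", "11233" })
            Console.WriteLine($"{input}: {CountsMultiply(input)}");
    }
}
```

Under this reading all four inputs above should be accepted, so "12233" is the case where the regex disagrees with the intended logic.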

Any thoughts are welcome!

Jens asked Sep 24 '10 09:09



1 Answer

I'm pretty sure the problem is in this line:

(?<b-c>)*

From what I can tell, with no text to match inside that group, the regex engine refuses to run the loop more than once. I slimmed the regex down to the following:

(1(?<a>))*
(?(a)(?<-a>))*
(?(a)(?!))

This passes on 1 but fails on 111. I also tried (?<-a>)*, with no difference. However, changing it to

(1(?<a>))*
(?(a)((?<-a>)(2(?<b>))(?<-b>)))*
(?(a)(?!))

passes on both 12 and 111222. So once each iteration consumes some text instead of matching "", the loop repeats as expected.

Getting back to your original regex, my guess is that (?<b-c>)* only matches zero or one time, which explains why a string with one 2 works but a string with more than one fails.

A string of 11 also fails, by the same logic: every iteration of the outer loop matches "", so the loop most likely runs only once, which leaves a nonzero and causes (?(a)(?!)) to fail.
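
To try the two reduced patterns side by side, a small harness along these lines may help. It is a sketch under assumptions: the ^ and $ anchors are added here (the reduced patterns above are shown without them), and the names EmptyLoopRepro, EmptyIterations, ConsumingIterations and Test are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EmptyLoopRepro
{
    // Assumption: anchored with ^ and $ as in the question's pattern.
    // Each iteration of the second loop only pops a and consumes no text.
    const string EmptyIterations = @"^
        (1(?<a>))*
        (?(a)(?<-a>))*
        (?(a)(?!))
        $";

    // Same idea, but each iteration also consumes one 2 from the input.
    const string ConsumingIterations = @"^
        (1(?<a>))*
        (?(a)((?<-a>)(2(?<b>))(?<-b>)))*
        (?(a)(?!))
        $";

    public static bool Test(string pattern, string input) =>
        Regex.IsMatch(input, pattern, RegexOptions.IgnorePatternWhitespace);

    public static void Main()
    {
        // Inputs taken from the observations above.
        foreach (string input in new[] { "1", "111" })
            Console.WriteLine($"empty iterations, {input}: {Test(EmptyIterations, input)}");

        foreach (string input in new[] { "12", "111222" })
            Console.WriteLine($"consuming iterations, {input}: {Test(ConsumingIterations, input)}");
    }
}
```

If the explanation above is right, only the "111" case of the first pattern should come back false.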

Joel Rondeau answered Oct 26 '22 16:10