Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Matching a^n A^n with regular expression

Tags:

regex

We're learning about the difference between regular languages and regex, and the teacher explained that the language

a^n b^n

is not regular, but she said that most regex flavors can match

a^n A^n

and she gave this problem for our extra credit homework problem. We've been struggling for a couple of days now, and could really use some guidance.

like image 225
user282886 Avatar asked Dec 22 '22 01:12

user282886


1 Answers

The teacher gave a HUGE hint by limiting the alphabet to {a, A}. The key to solving this problem is realizing that in case-insensitive mode, a matches A and vice versa. The other component of the problem is matching on backreference.

This pattern will match a{n}A{n} for some n (as seen on rubular.com):

^(?=(a*)A*$)\1(?i)\1$

How it works

The pattern works as follows:

  • ^(?=(a*)A*$) - Anchored at the beginning of the string, we use positive lookahead to see if we can match (a*)A* until the end of the string
    • If we can succesfully make this assertion, then \1 captures the sequence of a{n}
    • If the assertion fails, then there's no way the string can be a{n}A{n}, since it's not even a*A*
  • We then match \1(?i)\1$, that is, the a{n} captured by \1, then in case-insensitive mode (?i), we match a{n} again until the end of the string
  • Thus, since:
    • The string is a*A*
    • \1 is a{n}
    • \1(?i)\1 can only be a{n}A{n}

Related questions

  • Case sensitive and insensitive in the same pattern
  • How does the regular expression ‘(?<=#)[^#]+(?=#)’ work?

References

  • regular-expressions.info/Lookaround, Grouping and backreferences
  • Modifiers (esp. Turning Modes On and Off for Only Part of The Regular Expression)

See also

  • Why is {a^nb^n | n >= 0} not regular?
  • Wikipedia/Regular expression: Patterns for non-regular languages
like image 140
polygenelubricants Avatar answered Jan 02 '23 12:01

polygenelubricants