Look Behind Regex

Tags: c#, regex

I'm just starting out with regex (until now I have always used ones I found on the net).

I need a regex that handles the following input:

Input: AAABBBCCC
Index: 012345678

The regex matches would be:

  • AA from 0,1
  • AA from 1,2 (even though the A from 1 is already consumed)
  • BB from 3,4
  • BB from 4,5 (even though the B from 4 is already consumed)
  • CC from 6,7
  • CC from 7,8 (even though the C from 7 is already consumed)

The regex I have now is (A{2}|B{2}|C{2}). This is a simplified version of my real problem, where I have different working regexes in place of the As, Bs and Cs.

I think I should use some kind of lookbehind operator, but neither ((A{2}|B{2}|C{2})$1) nor (?<=(A{2}|B{2}|C{2})) works.

Here's an example.

Note: My problem is in C#, if that matters.

asked Oct 20 '25 10:10 by RMalke


1 Answer

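You do need lookaround, but I'd use a positive lookahead assertion rather than a lookbehind. A lookahead consumes no characters, so the engine can report a match at every position and overlapping pairs are all found: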

(?=(([ABC])\2))

Your match results will be in match.Groups[1] of each match object. (The match itself is empty, because the lookahead is zero-width.)
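As a rough illustration of reading group 1 from each match in C#, here is a minimal sketch. The class name OverlapDemo and the helper FindPairs are hypothetical names chosen for this example; only Regex.Matches, Match.Index and Match.Groups come from System.Text.RegularExpressions.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class OverlapDemo
{
    // Hypothetical helper: returns (start index, matched pair) for every
    // overlapping pair of identical letters A, B or C.
    public static List<(int Index, string Value)> FindPairs(string input)
    {
        var results = new List<(int Index, string Value)>();

        // The lookahead is zero-width, so each Match is an empty string;
        // the two-letter pair is captured in group 1.
        foreach (Match m in Regex.Matches(input, @"(?=(([ABC])\2))"))
        {
            results.Add((m.Index, m.Groups[1].Value));
        }
        return results;
    }

    public static void Main()
    {
        foreach (var (index, value) in FindPairs("AAABBBCCC"))
        {
            Console.WriteLine($"{value} from {index},{index + 1}");
        }
    }
}
```

For the input AAABBBCCC this should list the six pairs from the question, starting at indexes 0, 1, 3, 4, 6 and 7.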

Explanation:

(?=       # Look ahead to check that the following matches:
 (        # Match and capture in group number 1:
  (       # Match and capture in group number 2:
   [ABC]  # Any letter A, B or C
  )       # End of capturing group 2
  \2      # Now match that same letter again.
 )        # End of group 1. It now contains AA, BB or CC
)         # End of lookahead assertion

A simpler solution:

(?=(AA|BB|CC))
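Since the question mentions separate working regexes for the As, Bs and Cs, the same wrapping should work with arbitrary alternatives. The sketch below assumes that; BuildOverlapPattern and AlternationDemo are hypothetical names, and the sub-patterns A{2}, B{2}, C{2} are the ones from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AlternationDemo
{
    // Hypothetical helper: joins the alternatives with | and wraps them in a
    // capturing group inside a positive lookahead.
    public static string BuildOverlapPattern(params string[] alternatives)
    {
        return "(?=(" + string.Join("|", alternatives) + "))";
    }

    public static void Main()
    {
        // Produces (?=(A{2}|B{2}|C{2}))
        string pattern = BuildOverlapPattern("A{2}", "B{2}", "C{2}");

        foreach (Match m in Regex.Matches("AAABBBCCC", pattern))
        {
            // Group 1 is the outer capture, whichever alternative matched.
            Console.WriteLine($"{m.Groups[1].Value} at {m.Index}");
        }
    }
}
```

If the real sub-patterns contain capturing groups of their own, group 1 is still the outer capture, but the inner groups are numbered after it.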
answered Oct 21 '25 22:10 by Tim Pietzcker