
regex - multiple matches after a specific word

Tags: regex, pcre

Simplified example: consider the string aabaabaabaabaacbaabaabaabaa

I want to match all aa occurrences only after the c in the middle, using one regex expression.

The closest I've come is c.*\Kaa, but it matches only the last aa (or, with the ungreedy flag, only the first aa after the c).

I'm using the regex101 website for testing.

ormxmi asked Sep 16 '25 03:09


2 Answers

You can use

(?:\G(?!^)|c).*?\Kaa

Details:

  • (?:\G(?!^)|c) - either the end of the previous successful match (\G(?!^)) or (|) a c char
  • .*? - any zero or more chars other than line break chars, as few as possible
  • \K - forget the text matched so far
  • aa - an aa string.
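Python's standard re module supports neither \G nor \K, so the pattern above cannot be used there as written. As a rough sketch of the same idea (start at the first c, then collect every aa after it), one could locate the c first and let finditer scan from that offset. The helper name aa_after_first_c is hypothetical, and unlike .*? in the PCRE pattern this sketch does not stop at line breaks.

```python
import re

def aa_after_first_c(text):
    """Return the start offsets of every 'aa' found after the first 'c'."""
    c_index = text.find("c")
    if c_index == -1:
        return []  # no 'c' in the text, so nothing qualifies
    pattern = re.compile(r"aa")
    # The pos argument makes the scan begin just past the 'c'.
    return [m.start() for m in pattern.finditer(text, c_index + 1)]

print(aa_after_first_c("aabaabaabaabaacbaabaabaabaa"))  # prints [16, 19, 22, 25]
```

The third-party regex module does accept \G and \K, so the original pattern may work there unchanged.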
Wiktor Stribiżew answered Sep 18 '25 17:09


If it is known that the string contains exactly one 'c', just match

aa(?!.*c)


(?!.*c) is a negative lookahead that asserts that 'c' does not appear later in the string.
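This pattern uses only a lookahead, so it can be tried outside PCRE as well. A minimal sketch with Python's re module, using the sample string from the question (the variable names are illustrative):

```python
import re

text = "aabaabaabaabaacbaabaabaabaa"

# Each 'aa' is kept only if no 'c' appears anywhere after it on the same line.
starts = [m.start() for m in re.finditer(r"aa(?!.*c)", text)]

print(starts)  # prints [16, 19, 22, 25]
```

Every aa before the c is rejected because the lookahead can still reach the c from there.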


If it is not known whether the string contains zero, one, or more than one 'c', and 'aa' is to be matched if and only if the string contains at least one 'c' and no 'c' follows that 'aa' later in the string, one can try the regular expression

^.*c\K|(?!^)aa


The regular expression can be broken down as follows.

^      # match the beginning of the string
.*     # match zero or more chars, as many as possible
c      # match 'c'
\K     # reset match pointer in string and discard all previously
       # matched characters
|      # or
(?!^)  # negative lookahead asserts current string position is not
       # at the beginning of the string
aa     # match 'aa'

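Note that if the string contains no 'c', the first alternative never matches, but the second alternative, (?!^)aa, can still match any 'aa' that does not start at the beginning of the string, so the presence of a 'c' should be checked separately. When a 'c' is present, the first match is the empty string just after the last 'c'.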
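The requirement stated for this case (match 'aa' only when the string contains a 'c' and no 'c' follows) can also be sketched without \K. The version below is a swapped-in approach, not the regex above: it finds the last 'c' with rfind and scans from there. The helper name aa_after_last_c is hypothetical.

```python
import re

def aa_after_last_c(text):
    """Return the start offsets of every 'aa' that has no 'c' after it.

    Returns an empty list when the text contains no 'c' at all.
    """
    last_c = text.rfind("c")
    if last_c == -1:
        return []  # zero 'c' characters: nothing should match
    pattern = re.compile(r"aa")
    # Scanning from just past the last 'c' guarantees no 'c' follows a match.
    return [m.start() for m in pattern.finditer(text, last_c + 1)]

print(aa_after_last_c("aacaacaa"))  # prints [6]
print(aa_after_last_c("aabaab"))    # prints []
```

Checking for the 'c' explicitly keeps the zero-'c' case from depending on how the alternation behaves.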

Cary Swoveland answered Sep 18 '25 18:09