Greedy operator result is differ in positive and negative lookahead?

Question

I'm confused about the greedy operator on positive and negative look ahead.

script for positive lookahead

foreach (<DATA>){
$_ = m/AAA.+(?=BBB)/g;
print "$&
";
} 
__DATA__
AAA 1121 BBB
AAA 443  CCC
AAA 4431 BBB
ABC 321  EACA
AAA 321  BBB
ACD 431 MAKN
AAA 751  ABC

It's output

AAA 1121 

AAA 4431 

AAA 321

Negative Lookahead

foreach (<DATA>){
$_ = m/AAA.+(?!BBB)/g; 
print "$&
";
}

It's output

AAA 1121 BBB
AAA 443  CCC
AAA 4431 BBB

AAA 321  BBB

AAA 751  ABC

In the execution of the negative lookahed do not consider the (?!BBB). Because I'm using the greedy operator preceding the (?!BBB). In that case, postive look ahead greed operator considers the (?=BBB). Which is why gives the different result?

I can easily achieve the OP by the code $_ = m/AAA\s\d+(?!.+BBB)/g;.

But i don't know what is the execution of my code?

Lucas Trzesniewski · Accepted Answer

Let's consider the first example:

AAA 1121 BBB
\_/\_______/^
 |     |    |
 |     |    +--- this (the empty string right there) satisfies (?!BBB)
 |     |
 |     +-------- matched by .+
 |     
 +-------------- matched by AAA

This is because the greedy .+ consumes 1121 BBB including the BBB. After it has consumed the rest of the line, (?!BBB) is checked against the remaining empty string. And this empty string satisfies (?!BBB) since it's not followed by BBB?

Negative lookahead

The algorithm executes like the following. ^ is the current position (there's a current position in the string and a current position in the pattern _{(kind of)}).

Initial state:

AAA 1121 BBB          AAA.+(?!BBB)
^                     ^

Match AAA

AAA 1121 BBB          AAA.+(?!BBB)
   ^                     ^

Match .+

AAA 1121 BBB          AAA.+(?!BBB)
            ^              ^

Check (?!BBB)

AAA 1121 BBB          AAA.+(?!BBB)
            ^                     ^

No BBB match at this position => Success!
```
AAA 1121 BBB
\__________/
```

Positive lookahead

Now, let's see why exactly AAA.+(?=BBB) yields a match:

Initial state:

AAA 1121 BBB          AAA.+(?=BBB)
^                     ^

Match AAA

AAA 1121 BBB          AAA.+(?=BBB)
   ^                     ^

Match .+

AAA 1121 BBB          AAA.+(?=BBB)
            ^              ^

Check (?=BBB)
```
AAA 1121 BBB          AAA.+(?=BBB)
            ^              ^
```
No BBB match at this position => Backtrack (consume one less char by .+)
Check (?=BBB)
```
AAA 1121 BBB          AAA.+(?=BBB)
           ^               ^
```
No BBB match at this position => Backtrack (consume one less char by .+)
Check (?=BBB)
```
AAA 1121 BBB          AAA.+(?=BBB)
          ^                ^
```
No BBB match at this position => Backtrack (consume one less char by .+)

Check (?=BBB)

AAA 1121 BBB          AAA.+(?=BBB)
         ^                        ^

We do have a BBB match at this position => Success!
```
AAA 1121 BBB
\_______/
```

ikegami · Answer

There is no difference in how it works in your two cases, and .+ is greedy in both of your cases.

When matching AAA.+(?=BBB) against AAA 1121 BBB, the most .+ can match starting after AAA is <spc>1121<spc>. Anything longer will cause (?=BBB) to fail.

When matching AAA.+(?!BBB) against AAA 1121 BBB, the most .+ can match starting after AAA is <spc>1121<spc>BBB. Being the rest of the string, it can't possibly match anything longer.

Note that the end of the string isn't followed by BBB, so (?!BBB) matches at the end of the string.

(?:(?!STRING).)* is to STRING as [^CHAR]* is to CHAR.

I'd go with

say $1 if /^(AAA\s+\S+)\s+(?:(?!BBB)\s)*\z/;

On second thought, I'd go with

my @F = split;
say "$F[0] $F[1]" if $F[0] eq 'AAA' && $F[2] ne 'BBB';

Greedy operator result is differ in positive and negative lookahead?

Tags:

regex

perl

mkHun

2 Answers

Negative lookahead

Positive lookahead

Lucas Trzesniewski

ikegami

Recent Activity

Donate For Us

Greedy operator result is differ in positive and negative lookahead?

Tags:

regex

perl

mkHun

2 Answers

Negative lookahead

Positive lookahead

Lucas Trzesniewski

ikegami

Related questions

Recent Activity

Donate For Us