I'm learning Perl and noticed a rather peculiar quirk -- attempting to match one of multiple regex conditions in a while loop makes the loop run forever:
#!/usr/bin/perl
my $hivar = "this or that";
while ($hivar =~ m/this/ig || $hivar =~ m/that/ig) {
print "$&\n";
}
The output of this program is:
this
that
that
that
that
[...]
I'm wondering why this is? Are there any workarounds that are less clumsy than this:
#!/usr/bin/perl
my $hivar = "this or that";
while ($hivar =~ m/this|that/ig) {
print "$&\n";
}
This is a simplification of a real-world problem I am encountering, and while I am interested in this from a practical standpoint, I would also like to know what is triggering this behavior behind the scenes. This is a question that doesn't seem to be very Google-compatible.
Thanks!
Tom
The thing is that there's a hidden value associated with each string, not with each match, that controls where a /g match will attempt to continue. It is accessible through pos($string). What happens is:
1. pos($hivar) is 0. /this/ matches at position 0 and sets pos($hivar) to 4. The second match isn't attempted because the or operator is already true. $& becomes "this" and gets printed.
2. pos($hivar) is 4. /this/ fails to match because there's no "this" at position 4 or beyond. The failing match resets pos($hivar) to 0.
3. /that/ matches at position 8 and sets pos($hivar) to 12. $& becomes "that" and gets printed.
4. pos($hivar) is 12. /this/ fails to match because there's no "this" at position 12 or beyond. The failing match resets pos($hivar) to 0.
5. /that/ matches at position 8 and sets pos($hivar) to 12. $& becomes "that" and gets printed.

Steps 4 and 5 then repeat indefinitely.
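To watch this happen, here is a minimal sketch that runs the loop from the question but records pos($hivar) after every successful match. The @trace array and the $iterations cap are additions of mine, not part of the original code; the cap is only there so the script terminates.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $hivar = "this or that";
my @trace;            # one entry per loop iteration (illustrative only)
my $iterations = 0;   # safety cap: the loop would otherwise never end

while ($hivar =~ m/this/ig || $hivar =~ m/that/ig) {
    # pos() reports where the next /g match on $hivar would start
    push @trace, sprintf("matched '%s', pos is now %d", $&, pos($hivar));
    last if ++$iterations >= 4;
}

print "$_\n" for @trace;
```

After the first line, every entry should report the same "that" match with the same pos value, which is the repetition described in the steps above.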
Adding the c regex flag (which tells the engine not to reset pos on a failed match) solves the problem in the example code you provided, but it might or might not be the ideal solution to a more complex problem.
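As a rough illustration of both halves of that sentence, the sketch below wraps the two-regex loop with /gc in a helper. The scan function and the second input string are hypothetical, made up for this example. With the original string the loop terminates and finds both words; with the words swapped, the first regex moves pos past the other word, so that word is never reported.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: run the two-regex loop with /gc and collect matches.
sub scan {
    my ($text) = @_;   # fresh copy, so pos($text) starts out undefined
    my @found;
    # /c keeps pos($text) where it is when a match fails
    while ($text =~ m/this/igc || $text =~ m/that/igc) {
        push @found, $&;
    }
    return @found;
}

my @a = scan("this or that");   # both words found, loop ends
my @b = scan("that or this");   # "that" lies before pos and is skipped

print "first:  @a\n";
print "second: @b\n";
```

If the order of the words in the input is not known in advance, the single alternation m/this|that/ig from the question avoids this, since it scans the string once from left to right.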