As a common example, say we want to match some word pattern $word_pattern, but there may be whitespace surrounding it. This is a very common use of regex. Normally people will write:
/\s*$word_pattern\s*/
But that is inefficient in case of failure, isn't it? Shouldn't the efficient code be:
/(?>\s*)$word_pattern\s*/
But I never see that actually written...
Addition: yes, I have now benchmarked it. Since one of the responders may have issues with whitespace here, I did not use whitespace in the test.
So now I have a very long file a.txt (1 GB) filled entirely with the character a.
And then
perl -ne 'print !/a*b/' < a.txt
perl -ne 'print !/(?>a*)b/' < a.txt
Both take a significant, but the SAME, amount of time (over and above the time it takes to read in the file itself).
I don't understand that at all. Can someone explain how that can be? The Perl documentation clearly says that in the first case there would be backtracking going on.
"Inefficient", no, but less efficient both in case of failure and in case of success. You can see a real difference once the amount of data is large enough.
(?>\s*) or \s*+ have two consequences:
You can read this thread on the subject: http://www.perlmonks.org/?node_id=664545