This question is inspired by this other one.
Comparing s/,(\d)/$1/
to s/,(?=\d)//
: the former uses a capture group to replace only the digit but not the comma, the latter uses a lookahead to determine whether the comma is succeeded by a digit. Why is the latter sometimes faster, as discussed in this answer?
The two approaches do different things and have different kinds of overhead costs. When you capture, perl has to make a copy of the captured text. Look-ahead matches without consuming; it has to mark the location where it starts. You can see what's happening by using the re 'debug'
pragma:
use re 'debug';
my $capture = qr/,(\d)/;
Compiling REx ",(\d)" Final program: 1: EXACT (3) 3: OPEN1 (5) 5: DIGIT (6) 6: CLOSE1 (8) 8: END (0) anchored "," at 0 (checking anchored) minlen 2 Freeing REx: ",(\d)"
use re 'debug';
my $lookahead = qr/,(?=\d)/;
Compiling REx ",(?=\d)" Final program: 1: EXACT (3) 3: IFMATCH[0] (8) 5: DIGIT (6) 6: SUCCEED (0) 7: TAIL (8) 8: END (0) anchored "," at 0 (checking anchored) minlen 1 Freeing REx: ",(?=\d)"
I'd expect look-ahead to be faster than capturing in most cases, but as noted in the other thread regex performance can be data dependent.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With