 

Efficient way to remove text after a regular expression match

Tags:

regex

perl

Let's say I'd like to remove a space after every word. In reality, the regular expression before the space is more complex.

$text =~ s/(\w+) /$1/g;

works as expected, but I don't like the need for the $1, because it doesn't seem to be very efficient to match something, remove it and insert it again. I tried a positive lookahead, but this doesn't work:

$text =~ s/(?=\w+) //g;

I understand that it doesn't work because the "position" does not change with this lookahead. Is there another way to get rid of the $1?

André asked Jan 31 '26 05:01


1 Answer

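s/// doesn't modify the original string in place; it builds up a new one, so it's gonna copy the prefix anyway.[1] The capture itself can slow things down, but I think that's been improved. That said, \K does exactly what you want: everything matched to the left of \K is kept out of the part that gets replaced, so only the space is removed.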

$text =~ s/\w\K //g;
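
To check that the \K form behaves like the capture-based substitution from the question, here is a minimal sketch. The sample string and the variable names are made up for illustration, and \K needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical sample input; any text with words followed by spaces will do.
my $sample = "foo bar baz";

# Capture-based version from the question: match the word, then re-insert it via $1.
(my $with_capture = $sample) =~ s/(\w+) /$1/g;

# \K version: whatever matched before \K is kept, so only the space is replaced.
(my $with_keep = $sample) =~ s/\w\K //g;

print "$with_capture\n";  # foobarbaz
print "$with_keep\n";     # foobarbaz
```

Both substitutions should give the same result. The \K form just avoids the capture group and the $1 in the replacement.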

  1. You can see the original scalar doesn't change until every substitution is complete:

    $ perl -e'$_ = "a-b-c-d"; s{-}{ CORE::say; "+" }eg; CORE::say;'
    a-b-c-d
    a-b-c-d
    a-b-c-d
    a+b+c+d
    
ikegami answered Feb 02 '26 21:02


