I have a text file that contains a bunch of sentences. The sentences use whitespace (spaces, tabs, newlines) to separate words consisting of letters and/or digits. I want to find the word "123" or "-123" and insert a dot (.) before the digits begin, so that every occurrence of "123" or "-123" is converted to ".123" or "-.123" respectively.
I was trying this with the following:
$line =~ s/(\s+-*123\s+)/getNewWord($1)/ge
Here $line contains a line read from the file, and the function getNewWord puts the dot (.) at the appropriate place in the matched word.
But it doesn't work when there are two consecutive "123" words, as in " 123 123 ". Once the first "123" has been replaced by " .123 ", the space following it has already been consumed by that match, so the second "123" is not matched: the regex engine has no preceding whitespace left to match for that word.
Can anyone help me with this? Thanks!
I agree with MRAB (and have +1'd his/her answer), but there's no real need for the getNewWord function. I'd change the entire statement to something like one of these:
$line =~ s/((?:^|\s)-?)(123)(?=\s|$)/$1.$2/g;
$line =~ s/(?:^|(?<=\s))(-?)(123)(?=\s|$)/$1.$2/g;
$line =~ s/(?:^|(?<=\s)|(?<=\s-))(?=123(?:\s|$))/./g;
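Another option is \K (available since Perl 5.10), which keeps everything matched before it out of the replacement. It might be slightly faster (no explicit capture), and it allows input without leading/trailing whitespace: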
$ echo '123 -123 -123 123' | perl -pe's/(?:^|\s+)\K(?=-?123\b)/./g'
.123 .-123 .-123 .123
To put the . after the - instead:
$ echo '123 -123 -123 123' | perl -pe's/(?:^|\s+)-*\K(?=123\b)/./g'
.123 -.123 -.123 .123