I'm trying to use a perl one-liner to update some code that spans multiple lines and am seeing some strange behavior. Here's a simple text file that shows the problem I'm seeing:
ABCD START
 STOP EFGH
I expected the following to work but it doesn't end up replacing anything:
perl -pi -e 's/START\s+STOP/REPLACE/s' input.txt
After some experimenting I found that the \s+ in the original regex will match the newline but not any of the whitespace on the 2nd line, and adding a second \s+ doesn't work either. So for now I'm using a workaround: an intermediate regex that only removes the newline:
perl -pi -e 's/START\s+/START/s' input.txt
This creates the following intermediate file:
ABCD START STOP EFGH
Then I can run the original regex (although the /s is no longer needed):
perl -pi -e 's/START\s+STOP/REPLACE/s' input.txt
This creates the final, desired file:
ABCD REPLACE EFGH
It seems like the intermediate step should not be necessary. Am I missing something?
Solution. Use /m, /s, or both as pattern modifiers. /s lets . match newline (normally it doesn't). If the string had more than one line in it, then /foo.
I think this will work, using the /s modifier, which mnemonically means to "treat string as a single line". This changes the behaviour of "." to match newline characters as well. In order to match the beginning of this comment to the end, we add the /s modifier like this: $str =~ s/<!
The Multiline option, or the m inline option, enables the regular expression engine to handle an input string that consists of multiple lines. It changes the interpretation of the ^ and $ language elements so that they match the beginning and end of a line, instead of the beginning and end of the input string.
The m flag indicates that a multiline input string should be treated as multiple lines. For example, if m is used, ^ and $ change from matching at only the start or end of the entire string to the start or end of any line within the string. You cannot change this property directly.
perl -p processes the file one line at a time. The regex you have is correct, but it is only ever matched against one line at a time, so it never sees START and STOP in the same string.
A simple strategy, assuming the file will fit in memory, is to read the whole thing (do this without -p):
$/ = undef; $file = <>; $file =~ s/START\s+STOP/REPLACE/sg; print $file;
Note, I have added the /g modifier to specify global replacement.
As a shortcut for all that extra boilerplate, you can use your existing script with the -0777 option:
perl -0777pi -e 's/START\s+STOP/REPLACE/sg' input.txt
Adding /g is still needed if you may need to make multiple replacements within the file.
A hiccup that you might run into, although not with this regex: if the regex were START.+STOP, and a file contains multiple START/STOP pairs, greedy matching of .+ will eat everything from the first START to the last STOP. You can use non-greedy matching (match as little as possible) with .+? instead.
If you want to use the ^ and $ anchors for line boundaries anywhere in the string, then you also need the /m regex modifier.
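You were close. You need either -00 (paragraph mode, which reads the file in chunks separated by blank lines) or -0777 (slurp mode, which reads the whole file at once):
perl -0777 -pi -e 's/START\s+STOP/REPLACE/' input.txt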