perl regex not matching string with newline character \n

Q: How do I add a new line in regex?

"\n" matches a newline character.

Q: Does . match a newline in regex?

By default in most regex engines, . doesn't match newline characters, so matching stops at the end of each logical line. If you want . to match everything, including newlines, you need to enable "dot-matches-all" mode in your regex engine of choice (for example, pass the re.DOTALL flag in Python, or add the /s modifier in Perl and PCRE).

Q: What is \w in Perl regex?

A \w matches a single alphanumeric character (an alphabetic character, or a decimal digit) or _ , not a whole word. Use \w+ to match a string of Perl-identifier characters (which isn't the same as matching an English word).

Tags: regex, perl

I'm trying to use perl (v5.14.2) via a bash shell (GNU Bash-4.2) in Kubuntu (GNU/Linux) to search and replace a string that includes a newline character, but I'm not succeeding yet.

Here's the text file I'm searching:

<!-- filename: prac1.html -->

hello
kitty

blah blah blah

When I use a text editor's (Kate's) search-and-replace functionality or when I use a regex tester (http://regexpal.com/), I can easily get this regex to work:

hello\nkitty

But when using perl in the command line, none of the following commands have worked:

perl -p -i -e 's,hello\nkitty,newtext,' prac1.html
perl -p -i -e 's,hello.kitty,newtext,s' prac1.html
perl -p -i -e 's,hello.*kitty,newtext,s' prac1.html
perl -p -i -e 's,hello[\S\s]kitty,newtext,' prac1.html
perl -p -i -e 's,hello[\S\s]*kitty,newtext,' prac1.html

Actually, I got desperate and tried many other patterns, including all of these (different permutations in the "single-line" and "multi-line" modes):

perl -p -i -e 's,hello\nkitty,newtext,' prac1.html
perl -p -i -e 's,hello.kitty,newtext,' prac1.html
perl -p -i -e 's,hello\nkitty,newtext,s' prac1.html
perl -p -i -e 's,hello.kitty,newtext,s' prac1.html
perl -p -i -e 's,hello\nkitty,newtext,m' prac1.html
perl -p -i -e 's,hello.kitty,newtext,m' prac1.html
perl -p -i -e 's,hello\nkitty,newtext,ms' prac1.html
perl -p -i -e 's,hello.kitty,newtext,ms' prac1.html

perl -p -i -e 's,hello[\S\s]kitty,newtext,' prac1.html
perl -p -i -e 's,hello[\S\s]*kitty,newtext,' prac1.html
perl -p -i -e 's,hello$[\S\s]^kitty,newtext,' prac1.html
perl -p -i -e 's,hello$[\S\s]*^kitty,newtext,' prac1.html
perl -p -i -e 's,hello[\S\s]kitty,newtext,s' prac1.html
perl -p -i -e 's,hello[\S\s]*kitty,newtext,s' prac1.html
perl -p -i -e 's,hello$[\S\s]^kitty,newtext,s' prac1.html
perl -p -i -e 's,hello$[\S\s]*^kitty,newtext,s' prac1.html
perl -p -i -e 's,hello[\S\s]kitty,newtext,m' prac1.html
perl -p -i -e 's,hello[\S\s]*kitty,newtext,m' prac1.html
perl -p -i -e 's,hello$[\S\s]^kitty,newtext,m' prac1.html
perl -p -i -e 's,hello$[\S\s]*^kitty,newtext,m' prac1.html
perl -p -i -e 's,hello[\S\s]kitty,newtext,ms' prac1.html
perl -p -i -e 's,hello[\S\s]*kitty,newtext,ms' prac1.html
perl -p -i -e 's,hello$[\S\s]^kitty,newtext,ms' prac1.html
perl -p -i -e 's,hello$[\S\s]*^kitty,newtext,ms' prac1.html

(I also tried using \r \r\n \R \f \D etc., and global mode as well.)

Can anyone spot the issue or suggest a solution?


asked Feb 16 '13 01:02

zeroparallax

1 Answer

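With -p, Perl reads the file one line at a time, so the pattern is only ever tested against "hello\n" or "kitty\n" separately and can never see both lines at once. Changing the input record separator (a newline by default) fixes this: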

perl -i -p00e 's,hello\nkitty,newtext,' prac1.html

From perldoc perlrun:

-0[octal/hexadecimal]

specifies the input record separator ($/) as an octal or hexadecimal number. If there are no digits, the null character is the separator. Other switches may precede or follow the digits. For example, if you have a version of find which can print filenames terminated by the null character, you can say this:
find . -name '*.orig' -print0 | perl -n0e unlink
The special value 00 will cause Perl to slurp files in paragraph mode. Any value 0400 or above will cause Perl to slurp files whole, but by convention the value 0777 is the one normally used for this purpose.

answered Sep 23 '22 20:09

Gilles Quenot
