Match over multiple lines perl regular expression

Tags:

perl

I have a file like this:

01 00 01 14 c0 00 01 10 01 00 00 16 00 00 00 64
00 00 00 65 00 00 01 07 40 00 00 22 68 61 6c 2e
6f 70 65 6e 65 74 2e 63 6f 6d 3b 30 30 30 30 30
30 30 30 32 3b 30 00 00 00 00 01 08 40 00 00 1e
68 61 6c 2e 6f 70 65 6e 65 74 2d 74 65 6c 65 63
6f 6d 2e 6c 61 6e 00 00 00 00 01 28 40 00 00 21
72 65 61 6c 6d 31 2e 6f 70 65 6e 65 74 2d 74 65
6c 65 63 6f 6d 2e 6c 61 6e 00 00 00 00 00 01 25
40 00 00 1e 68 61 6c 2e 6f 70 65 6e 65 74 2d 74
65 6c 65 63 6f 6d 2e 6c 61 6e 00 00 00 00 01 1b
40 00 00 20 72 65 61 6c 6d 2e 6f 70 65 6e 65 74
2d 74 65 6c 65 63 6f 6d 2e 6c 61 6e 00 00 01 02
40 00 00 0c 01 00 00 16 00 00 01 a0 40 00 00 0c
00 00 00 01 00 00 01 9f 40 00 00 0c 00 00 00 00
00 00 01 16 40 00 00 0c 00 00 00 00 00 00 01 bb
40 00 00 28 00 00 01 c2 40 00 00 0c 00 00 00 00
00 00 01 bc 40 00 00 13 31 39 37 37 31 31 31 32
32 33 31 00

I am reading the file and then finding certain octets and replacing them with tags:

while (my $line = <FH>) {    # read one line at a time into $line
    $line =~ s/(00 00 00 64)/<incr4>    /g;
    $line =~ s/(00 00 00 65)/<incr4>    /g;
    $line =~ s/(30 30 30 30 30 32)/<incr6ascii:999999:0>/g;
    $line =~ s/(31 31 32 32 33 31)/<incr6ascii:999999:0>/g;
    print OUTPUT $line;
}

So, for example, 00 00 00 64 would be replaced by the <incr4> tag. This was working fine, but it can no longer match patterns that run over multiple lines. For example, the pattern 31 31 32 32 33 31 starts at the end of one line and finishes on the next, and the regular expression does not catch it. I tried the /m and /s pattern modifiers to ignore newlines, but they did not match it either. The only workaround I can come up with is to read the whole file into a string using:

undef $/;
my $whole_file = <FH>;
my $line = $whole_file;
$line =~ s/(00 00 00 64)/<incr4>    /g;
$line =~ s/(00 00 00 65)/<incr4>    /g;
$line =~ s/(30 30 30 30 30 32)/<incr6ascii:999999:0>/g;
$line =~ s/(31 31 32 32 33 31)/<incr6ascii:999999:0>/g;
print OUTPUT $line;

This works in that the tags get inserted correctly, but the structure of the file is radically altered: it is all dumped out on a single line. I would like to retain the structure of the file as it appears here. Any ideas as to how I might do this?

/john

asked May 17 '10 20:05

John

1 Answer

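The /m and /s modifiers only change how ^, $ and . behave; they do not make a literal space in a pattern match a newline. The trick here is to slurp the file and join the octets with \s+, the class of all whitespace characters, which matches newlines as well as spaces: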

my $file = do {local (@ARGV, $/) = 'filename.txt'; <>}; # slurp file

my %tr = (  # setup a translation table
    '00 00 00 64'       => '<incr4>',
    '00 00 00 65'       => '<incr4>',
    '30 30 30 30 30 32' => '<incr6ascii:999999:0>',
    '31 31 32 32 33 31' => '<incr6ascii:999999:0>',
);

for (keys %tr) {
    my $re = join '\s+' => split;  # construct new regex

    $file =~ s{($re)}{
       $1 =~ /\n/ ? "\n$tr{$_}" : $tr{$_}  # if octets contained \n, add \n
    }ge  # match multiple times, execute the replacement block as perl code
}
print $file;
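
To see the effect without touching a file, here is a minimal sketch that applies the same idea to a string holding the last two lines of the dump. The helper name tag_octets and the variables $dump and $out are made up for this illustration and are not part of the code above; the replacement logic is the same, so a match that spans a line break puts the tag at the start of the next line.

```perl
use strict;
use warnings;

# Hypothetical helper (name chosen for this sketch): apply an
# "octets => tag" table to a hex dump held in a string.
sub tag_octets {
    my ($text, %table) = @_;
    for my $octets (sort keys %table) {
        # Let any run of whitespace, including a newline, separate the octets.
        my $re = join '\s+', split ' ', $octets;
        $text =~ s{($re)}{
            my $matched = $1;
            # If the match crossed a line break, keep that break before the tag.
            $matched =~ /\n/ ? "\n$table{$octets}" : $table{$octets}
        }ge;
    }
    return $text;
}

# The last two lines of the sample dump from the question.
my $dump = "00 00 01 bc 40 00 00 13 31 39 37 37 31 31 31 32\n32 33 31 00\n";
my $out  = tag_octets($dump, '31 31 32 32 33 31' => '<incr6ascii:999999:0>');
print $out;
```

The output still has two lines: the first ends after 37 37 31 and the second starts with the tag. If keeping the column alignment matters as well, the tags could be padded with spaces the way the original substitutions did, but that is a separate choice.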

answered Oct 05 '22 18:10

Eric Strom
