I have a file like this:
01 00 01 14 c0 00 01 10 01 00 00 16 00 00 00 64
00 00 00 65 00 00 01 07 40 00 00 22 68 61 6c 2e
6f 70 65 6e 65 74 2e 63 6f 6d 3b 30 30 30 30 30
30 30 30 32 3b 30 00 00 00 00 01 08 40 00 00 1e
68 61 6c 2e 6f 70 65 6e 65 74 2d 74 65 6c 65 63
6f 6d 2e 6c 61 6e 00 00 00 00 01 28 40 00 00 21
72 65 61 6c 6d 31 2e 6f 70 65 6e 65 74 2d 74 65
6c 65 63 6f 6d 2e 6c 61 6e 00 00 00 00 00 01 25
40 00 00 1e 68 61 6c 2e 6f 70 65 6e 65 74 2d 74
65 6c 65 63 6f 6d 2e 6c 61 6e 00 00 00 00 01 1b
40 00 00 20 72 65 61 6c 6d 2e 6f 70 65 6e 65 74
2d 74 65 6c 65 63 6f 6d 2e 6c 61 6e 00 00 01 02
40 00 00 0c 01 00 00 16 00 00 01 a0 40 00 00 0c
00 00 00 01 00 00 01 9f 40 00 00 0c 00 00 00 00
00 00 01 16 40 00 00 0c 00 00 00 00 00 00 01 bb
40 00 00 28 00 00 01 c2 40 00 00 0c 00 00 00 00
00 00 01 bc 40 00 00 13 31 39 37 37 31 31 31 32
32 33 31 00
I am reading the file and then finding certain octets and replacing them with tags:
while(<FH>){
$line =~ s/(00 00 00 64)/<incr4> /g;
$line =~ s/(00 00 00 65)/<incr4> /g;
$line =~ s/(30 30 30 30 30 32)/<incr6ascii:999999:0>/g;
$line =~ s/(31 31 32 32 33 31)/<incr6ascii:999999:0>/g;
print OUTPUT $line;
}
So for example, 00 00 00 64
would be replaced by the <incr4>
tag. This was working fine, but it doesn't seem to able to match over multiple lines any more. For example the pattern 31 31 32 32 33 31
runs over multiple lines, and the regular expression doesn't seem to catch it. I tried using /m /s pattern modifiers to ignore new lines but they didn't match it either. The only way around it I can come up with, is to read the whole file into a string using:
undef $/;
my $whole_file = <FH>;
my $line = $whole_file;
$line =~ s/(00 00 00 64)/<incr4> /g;
$line =~ s/(00 00 00 65)/<incr4> /g;
$line =~ s/(30 30 30 30 30 32)/<incr6ascii:999999:0>/g;
$line =~ s/(31 31 32 32 33 31)/<incr6ascii:999999:0>/g;
print OUTPUT $line;
This works, the tags get inserted correctly, but the structure of the file is radically altered. It is all dumped out on a single line. I would like to retain the structure of the file as it appears here. Any ideas as to how I might do this?
/john
Use /m , /s , or both as pattern modifiers. /s lets . match newline (normally it doesn't). If the string had more than one line in it, then /foo. *bar/s could match a "foo" on one line and a "bar" on a following line.
The Special Character Classes in Perl are as follows: Digit \d[0-9]: The \d is used to match any digit character and its equivalent to [0-9]. In the regex /\d/ will match a single digit. The \d is standardized to “digit”.
The operator =~ associates the string with the regex match and produces a true value if the regex matched, or false if the regex did not match.
The trick here is to match the class of all space like characters \s
:
my $file = do {local (@ARGV, $/) = 'filename.txt'; <>}; # slurp file
my %tr = ( # setup a translation table
'00 00 00 64' => '<incr4>',
'00 00 00 65' => '<incr4>',
'00 30 30 30 30 32' => '<incr6ascii:999999:0>',
'31 31 32 32 33 31' => '<incr6ascii:999999:0>',
);
for (keys %tr) {
my $re = join '\s+' => split; # construct new regex
$file =~ s{($re)}{
$1 =~ /\n/ ? "\n$tr{$_}" : $tr{$_} # if octets contained \n, add \n
}ge # match multiple times, execute the replacement block as perl code
}
print $file;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With