How do I extract lines between two line delimiters in Perl?

I have an ASCII log file with some content I would like to extract. I've never taken time to learn Perl properly, but I figure this is a good tool for this task.

The file is structured like this:

... 
... some garbage 
... 
... garbage START
what i want is 
on different
lines 
END 
... 
... more garbage ...
next one START 
more stuff I want, again
spread 
through 
multiple lines 
END 
...
more garbage

So, I'm looking for a way to extract the lines between each START and END delimiter strings. How can I do this?

So far, I've only found examples of how to print a line containing the START string, and other documentation that is only loosely related to what I'm looking for.

jbatista asked Jul 31 '09 14:07


4 Answers

You want the flip-flop operator (also known as the range operator), which is written as two dots (..):

#!/usr/bin/env perl
use strict;
use warnings;

while (<>) {
  if (/START/../END/) {
    next if /START/ || /END/;
    print;
  }
}

Replace the call to print with whatever you actually need to do (e.g., push the line onto an array, edit it, or format it). I'm next-ing past the lines that contain START or END themselves, but you may not want that behavior. See this article for a discussion of this operator and other useful Perl special variables.
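As a rough sketch of the "push the line into an array" variant: in scalar context the flip-flop returns a sequence number (1 on the first line of a range, with "E0" appended on the last), so the delimiter lines can be skipped without matching them a second time. The helper name extract_blocks and the array-of-arrays layout are assumptions made for illustration, not part of the answer above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a list of array refs, one per START..END block,
# each holding the lines found strictly between the two delimiter lines.
sub extract_blocks {
    my ($fh) = @_;
    my @blocks;
    while (my $line = <$fh>) {
        # Flip-flop in scalar context: empty string outside a range,
        # 1 on the START line, 2, 3, ... inside, and "NE0" on the END line.
        my $seq = ($line =~ /START/) .. ($line =~ /END/);
        next unless $seq;                            # outside any block
        if ($seq == 1) { push @blocks, []; next; }   # START line opens a block
        next if $seq =~ /E0$/;                       # END line closes it
        chomp $line;
        push @{ $blocks[-1] }, $line;
    }
    return @blocks;
}
```

It takes any readable filehandle, for example \*STDIN or one returned by open, and each element of the result can then be joined, edited, or formatted as needed.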

Telemachus answered Nov 16 '22 21:11


From perlfaq6's answer to How can I pull out lines between two patterns that are themselves on different lines?


You can use Perl's somewhat exotic .. operator (documented in perlop):

perl -ne 'print if /START/ .. /END/' file1 file2 ...

If you wanted text and not lines, you would use

perl -0777 -ne 'print "$1\n" while /START(.*?)END/gs' file1 file2 ...

But if you want nested occurrences of START through END, you'll run up against the problem described in the question in this section on matching balanced text.
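The "text and not lines" one-liner can also be written as a small function, which makes it easier to see what gets captured. This is only a sketch: the name extract_text is hypothetical, and it assumes the whole file is already in one string (which is what -0777 arranges on the command line; in a script, localizing $/ to undef before reading has the same effect).

```perl
use strict;
use warnings;

# Hypothetical helper: returns the text between each START and END
# in a string that holds the entire file contents.
sub extract_text {
    my ($content) = @_;
    # /s lets "." match newlines; ".*?" is non-greedy, so each match stops
    # at the nearest END instead of running on to the last one in the file.
    # In list context, /g returns every captured group.
    my @chunks = $content =~ /START(.*?)END/gs;
    return @chunks;
}
```

Unlike the line-based version, the captured text begins immediately after START, so it usually includes the rest of that line and its newline.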

Here's another example of using ..:

while (<>) {
    $in_header =   1  .. /^$/;
    $in_body   = /^$/ .. eof;
# now choose between them
} continue {
    $. = 0 if eof;  # fix $.
}

brian d foy answered Nov 16 '22 22:11


How can I grab multiple lines after a matching line in Perl?

How about that one? There, the END pattern is ^$ (an empty line); you can change it to your own END string.

I am also a novice, but the solutions there offer quite a few methods. Let me know more specifically how what you want differs from the question linked above.
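To illustrate swapping the end pattern, here is a minimal sketch that takes both patterns as arguments and tracks its state in a plain flag. The name lines_between and its parameters are made up for this example and do not come from the linked question.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the lines that follow a line matching
# $start_re, stopping at the next line that matches $end_re.
# The matching lines themselves are not included.
sub lines_between {
    my ($fh, $start_re, $end_re) = @_;
    my $inside = 0;    # true while we are between the two patterns
    my @lines;
    while (my $line = <$fh>) {
        chomp $line;
        if (!$inside) {
            $inside = 1 if $line =~ $start_re;
            next;
        }
        if ($line =~ $end_re) { $inside = 0; next; }
        push @lines, $line;
    }
    return @lines;
}
```

Calling it with qr/^$/ as the end pattern stops at the first empty line; passing qr/START/ and qr/END/ instead handles the log file from the question.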

Dirk answered Nov 16 '22 23:11


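use strict;
use warnings;

my $f = 0;      # true while we are between START and END
while (<>) {
    chomp;      # strip record separator
    if (/END/) { $f = 0; }
    if (/START/) {
        s/.*START//;    # keep only the text that follows START on this line
        $f = 1;
    }
    print $_ . "\n" if $f;
}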

try to write some code next time round

ghostdog74 answered Nov 16 '22 21:11