How to print lines between two patterns, inclusive or exclusive (in sed, AWK or Perl)?

I have a file like the following and I would like to print the lines between two given patterns PAT1 and PAT2.

1
2
PAT1
3    - first block
4
PAT2
5
6
PAT1
7    - second block
PAT2
8
9
PAT1
10    - third block

I have read "How to select lines between two marker patterns which may occur multiple times with awk/sed", but I am curious to see all the possible combinations of this, either including or excluding the patterns.

How can I print all lines between two patterns?

fedorqui 'SO stop harming' asked Aug 16 '16 10:08



2 Answers

Print lines between PAT1 and PAT2

$ awk '/PAT1/,/PAT2/' file
PAT1
3    - first block
4
PAT2
PAT1
7    - second block
PAT2
PAT1
10    - third block

Or, using variables:

awk '/PAT1/{flag=1} flag; /PAT2/{flag=0}' file 

How does this work?

  • /PAT1/ matches lines containing this text, and /PAT2/ does the same for PAT2.
  • /PAT1/{flag=1} sets the flag when the text PAT1 is found in a line.
  • /PAT2/{flag=0} unsets the flag when the text PAT2 is found in a line.
  • flag is a pattern with the default action, which is to print $0: if flag equals 1, the line is printed. This way, it prints every line from the moment PAT1 occurs until the next PAT2 is seen. It also prints the lines from the last match of PAT1 up to the end of the file if no PAT2 follows.
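As a minimal sketch of the explanation above, the following recreates the sample input from the question in a temporary file and checks that the flag version prints the same lines as the range version. The names sample, range_out and flag_out are only illustrative, and the awk program is the same as the one-liner, just split over several lines with comments.

```shell
# Recreate the sample input from the question in a temporary file.
sample=$(mktemp)
printf '%s\n' 1 2 PAT1 '3    - first block' 4 PAT2 5 6 \
    PAT1 '7    - second block' PAT2 8 9 PAT1 '10    - third block' > "$sample"

# Range form: print from each PAT1 line through the next PAT2 line.
range_out=$(awk '/PAT1/,/PAT2/' "$sample")

# Flag form, one rule per line.
flag_out=$(awk '
    /PAT1/ { flag = 1 }   # start of a block: switch printing on
    flag                  # bare pattern: print $0 while flag is non-zero
    /PAT2/ { flag = 0 }   # end of a block: switch printing off, after the line was printed
' "$sample")

printf '%s\n' "$flag_out"
rm -f "$sample"
```

Because the PAT2 rule comes after the bare flag pattern, the closing line is still printed before the flag is cleared.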

Print lines between PAT1 and PAT2 - not including PAT1 and PAT2

$ awk '/PAT1/{flag=1; next} /PAT2/{flag=0} flag' file
3    - first block
4
7    - second block
10    - third block

This uses next to skip the line that contains PAT1, so that line is never printed.

This call to next can be dropped by reshuffling the blocks: awk '/PAT2/{flag=0} flag; /PAT1/{flag=1}' file.

Print lines between PAT1 and PAT2 - including PAT1

$ awk '/PAT1/{flag=1} /PAT2/{flag=0} flag' file
PAT1
3    - first block
4
PAT1
7    - second block
PAT1
10    - third block

Placing flag at the very end means it is evaluated after both assignments, so it reflects whatever was just set on the current line: the PAT1 line is printed and the PAT2 line is not.

Print lines between PAT1 and PAT2 - including PAT2

$ awk 'flag; /PAT1/{flag=1} /PAT2/{flag=0}' file
3    - first block
4
PAT2
7    - second block
PAT2
10    - third block

Placing flag at the very beginning means it is evaluated with the value left by the previous lines, so the closing pattern is printed but the starting one is not.
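The four variants above differ only in where the bare flag pattern sits relative to the two assignments. A small sketch to compare them side by side on a reduced input; the five-line input and the variable names are made up for illustration, and paste is used only to join each result onto one line.

```shell
# Reduced input: one line before, one block, one line after.
sample=$(mktemp)
printf '%s\n' a PAT1 b PAT2 c > "$sample"

# Same awk programs as in the answer; paste joins the output lines with spaces.
both=$(awk '/PAT1/{flag=1} flag; /PAT2/{flag=0}' "$sample" | paste -sd ' ' -)
neither=$(awk '/PAT1/{flag=1; next} /PAT2/{flag=0} flag' "$sample" | paste -sd ' ' -)
start_only=$(awk '/PAT1/{flag=1} /PAT2/{flag=0} flag' "$sample" | paste -sd ' ' -)
end_only=$(awk 'flag; /PAT1/{flag=1} /PAT2/{flag=0}' "$sample" | paste -sd ' ' -)

printf 'PAT1 and PAT2 included: %s\n' "$both"
printf 'both excluded:          %s\n' "$neither"
printf 'only PAT1 included:     %s\n' "$start_only"
printf 'only PAT2 included:     %s\n' "$end_only"
rm -f "$sample"
```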

Print lines between PAT1 and PAT2 - excluding lines from the last PAT1 to the end of file if no other PAT2 occurs

This is based on a solution by Ed Morton.

awk 'flag{
        if (/PAT2/)
           {printf "%s", buf; flag=0; buf=""}
        else
            buf = buf $0 ORS
     }
     /PAT1/ {flag=1}' file

As a one-liner:

$ awk 'flag{ if (/PAT2/){printf "%s", buf; flag=0; buf=""} else buf = buf $0 ORS}; /PAT1/{flag=1}' file
3    - first block
4
7    - second block

# note the lack of third block, since no other PAT2 happens after it

This keeps all the selected lines in a buffer that starts being populated once PAT1 is found. It keeps being filled with the following lines until PAT2 is found. At that point, it prints the stored content and empties the buffer. If the file ends before a closing PAT2 appears, the buffer is never printed, which is why the last block is missing.
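To see the difference the buffer makes, here is a small sketch that runs the plain flag version and the buffered version on an input whose last block is never closed. The input lines and the variable names plain and buffered are illustrative only.

```shell
# One closed block followed by a block that never sees PAT2.
sample=$(mktemp)
printf '%s\n' PAT1 closed PAT2 PAT1 unterminated > "$sample"

# Plain flag version: prints as it goes, so the unterminated block is printed too.
plain=$(awk '/PAT1/{flag=1; next} /PAT2/{flag=0} flag' "$sample")

# Buffered version: lines are held in buf and printed only when PAT2 closes the block.
buffered=$(awk 'flag{ if (/PAT2/){printf "%s", buf; flag=0; buf=""} else buf = buf $0 ORS}; /PAT1/{flag=1}' "$sample")

printf 'plain:\n%s\n\nbuffered:\n%s\n' "$plain" "$buffered"
rm -f "$sample"
```

The plain version should print both "closed" and "unterminated", while the buffered one prints only "closed".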

4 revs answered Oct 07 '22 05:10


What about the classic sed solution?

Print lines between PAT1 and PAT2 - include PAT1 and PAT2

sed -n '/PAT1/,/PAT2/p' FILE 

Print lines between PAT1 and PAT2 - exclude PAT1 and PAT2

GNU sed
sed -n '/PAT1/,/PAT2/{/PAT1/!{/PAT2/!p}}' FILE 
Any sed [1]
sed -n '/PAT1/,/PAT2/{/PAT1/!{/PAT2/!p;};}' FILE 

or even (Thanks Sundeep):

GNU sed
sed -n '/PAT1/,/PAT2/{//!p}' FILE 
Any sed
sed -n '/PAT1/,/PAT2/{//!p;}' FILE 
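The short form relies on the empty regular expression //, which sed treats as "the last regular expression that was applied". A sketch of why this works, assuming that usual behaviour: on the first line of the range the last expression tested is PAT1, and on every later line it is PAT2, so //!p skips exactly the two delimiters. The input and variable names below are made up for illustration.

```shell
sample=$(mktemp)
printf '%s\n' 1 PAT1 2 3 PAT2 4 > "$sample"

# Long form: explicitly skip lines matching PAT1 or PAT2 inside the range.
long_form=$(sed -n '/PAT1/,/PAT2/{/PAT1/!{/PAT2/!p;};}' "$sample")

# Short form: // reuses whichever of the two range patterns was tested last.
short_form=$(sed -n '/PAT1/,/PAT2/{//!p;}' "$sample")

printf 'long:\n%s\nshort:\n%s\n' "$long_form" "$short_form"
rm -f "$sample"
```

Both commands should print only the lines 2 and 3.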

Print lines between PAT1 and PAT2 - include PAT1 but not PAT2

The following includes just the range start:

GNU sed
sed -n '/PAT1/,/PAT2/{/PAT2/!p}' FILE 
Any sed
sed -n '/PAT1/,/PAT2/{/PAT2/!p;}' FILE 

Print lines between PAT1 and PAT2 - include PAT2 but not PAT1

The following includes just the range end:

GNU sed
sed -n '/PAT1/,/PAT2/{/PAT1/!p}' FILE 
Any sed
sed -n '/PAT1/,/PAT2/{/PAT1/!p;}' FILE 

[1] Note about BSD/Mac OS X sed

A command like this:

sed -n '/PAT1/,/PAT2/{/PAT1/!{/PAT2/!p}}' FILE 

would emit an error on BSD/Mac OS X sed:

▶ sed -n '/PAT1/,/PAT2/{/PAT1/!{/PAT2/!p}}' FILE
sed: 1: "/PAT1/,/PAT2/{/PAT1/!{/ ...": extra characters at the end of p command

For this reason, this answer has been edited to include both a GNU version and a portable ("Any sed") version of each one-liner.
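Another way to avoid the problem, sketched here as an alternative and not taken from the answer itself, is to write the script over several lines so that no command is followed directly by a closing brace. The sample input and the variable name multiline are illustrative.

```shell
sample=$(mktemp)
printf '%s\n' 1 PAT1 2 3 PAT2 4 > "$sample"

# Each command and each closing brace sits on its own line,
# so the p command is terminated by a newline instead of ";".
multiline=$(sed -n '/PAT1/,/PAT2/{
  /PAT1/!{
    /PAT2/!p
  }
}' "$sample")

printf '%s\n' "$multiline"
rm -f "$sample"
```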

hek2mgl answered Oct 07 '22 05:10