
Delete lines before and after a match in bash (with sed or awk)?

Tags:

shell

sed

awk

I'm trying to delete two lines either side of a pattern match from a file full of transactions, i.e. find the match, delete the two lines before it, the two lines after it, and the match itself, then write the result back to the original file.

So the input data is

D28/10/2011
T-3.48
PINITIAL BALANCE
M
^

and my command is

sed -i '/PINITIAL BALANCE/,+2d' test.txt

However, this only deletes the pattern match and the two lines after it. I can't work out a logical way to delete all 5 lines of data from the original file using sed.

juliushibert asked Aug 03 '12 10:08



3 Answers

An awk one-liner may do the job:

awk '/PINITIAL BALANCE/{for(x=NR-2;x<=NR+2;x++)d[x];}{a[NR]=$0}END{for(i=1;i<=NR;i++)if(!(i in d))print a[i]}' file

test:

kent$  cat file
######
foo
D28/10/2011
T-3.48
PINITIAL BALANCE
M
x
bar
######
this line will be kept
here
comes
PINITIAL BALANCE
again
blah
this line will be kept too
########

kent$  awk '/PINITIAL BALANCE/{for(x=NR-2;x<=NR+2;x++)d[x];}{a[NR]=$0}END{for(i=1;i<=NR;i++)if(!(i in d))print a[i]}' file
######
foo
bar
######
this line will be kept
this line will be kept too
########

Some explanation:

  awk '/PINITIAL BALANCE/{for(x=NR-2;x<=NR+2;x++)d[x];}   # on a match, record line numbers NR-2 .. NR+2 as keys of array "d"
      {a[NR]=$0}   # save every line in array "a", indexed by line number
      END{for(i=1;i<=NR;i++)if(!(i in d))print a[i]}   # finally, print only the lines whose number is not in "d"
      ' file   # your input file
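
The question also asks for the result to be written back to the original file, which the one-liner above does not do by itself. A minimal sketch of one way to do that, assuming a temporary file is acceptable; the function name delete_around and its arguments are hypothetical, not part of the original answer:

```shell
# Hypothetical helper: delete every line matching PATTERN, plus the two
# lines before and the two lines after it, and write the result back to FILE.
# Usage: delete_around PATTERN FILE
delete_around() {
    pattern=$1
    file=$2
    tmp=$(mktemp) || return 1
    awk -v pat="$pattern" '
        $0 ~ pat { for (x = NR - 2; x <= NR + 2; x++) d[x] }  # mark line numbers to drop
        { a[NR] = $0 }                                        # buffer every line
        END { for (i = 1; i <= NR; i++) if (!(i in d)) print a[i] }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"   # overwrite only if awk succeeded
    status=$?
    rm -f "$tmp"
    return $status
}
```

Calling delete_around 'PINITIAL BALANCE' test.txt would then rewrite test.txt. Copying the temporary file back with cat, rather than moving it over the original, keeps the original file's permissions.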
Kent answered Oct 11 '22 13:10


sed will do it:

sed '/\n/!N;/\n.*\n/!N;/\n.*\n.*PINITIAL BALANCE/{$d;N;N;d};P;D'

It works this way:

  • if sed has only one line in the pattern space, it appends the next one
  • if there are only two, it appends a third
  • if the pattern space matches LINE + LINE + LINE containing PINITIAL BALANCE, it appends the two following lines, deletes everything, and starts the next cycle (if the match is on the last line of input, it just deletes)
  • if not, it prints the first line of the pattern space, deletes it, and starts the next cycle without clearing the rest of the pattern space
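
The steps above can be tried on the question's sample data. This is only a sketch: the surrounding "foo" and "bar" lines are made up for illustration, and it assumes GNU sed, where N at end of input prints the pattern space before exiting.

```shell
# Sample record from the question, wrapped in hypothetical "foo"/"bar" lines.
input=$(printf '%s\n' foo D28/10/2011 T-3.48 'PINITIAL BALANCE' M '^' bar)

# Keep a three-line window in the pattern space; when the third line
# matches, pull in two more lines and delete all five.
result=$(printf '%s\n' "$input" |
    sed '/\n/!N;/\n.*\n/!N;/\n.*\n.*PINITIAL BALANCE/{$d;N;N;d};P;D')

printf '%s\n' "$result"
```

With GNU sed this should print only foo and bar.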

To handle the pattern appearing on the first line, modify the script:

sed '1{/PINITIAL BALANCE/{N;N;d}};/\n/!N;/\n.*\n/!N;/\n.*\n.*PINITIAL BALANCE/{$d;N;N;d};P;D'

However, it fails if another PINITIAL BALANCE occurs among the lines that are about to be deleted. Then again, some of the other solutions fail in this case too =)
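
The overlapping case can be reproduced with a small made-up input. This is an illustrative sketch, again assuming GNU sed; the single-letter lines are hypothetical filler.

```shell
# Two matches on consecutive lines: the second match sits inside the
# block that the first match deletes, so it never gets its own window.
input=$(printf '%s\n' a b 'PINITIAL BALANCE' 'PINITIAL BALANCE' c d e)

result=$(printf '%s\n' "$input" |
    sed '/\n/!N;/\n.*\n/!N;/\n.*\n.*PINITIAL BALANCE/{$d;N;N;d};P;D')

printf '%s\n' "$result"
```

Here line "d" is within two lines of the second match, yet it survives: the output is d and e rather than just e.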

rush answered Oct 11 '22 13:10


For such a task, I would probably reach for a more advanced tool like Perl:

perl -ne 'push @x, $_;
          if (@x > 4) {
              if ($x[2] =~ /PINITIAL BALANCE/) { undef @x }
              else { print shift @x }
          }
          END { print @x }' input-file > output-file

This removes 5 lines from the input file: the two lines before the match, the matched line, and the two lines after it. To change the total number of lines removed, modify @x > 4 (which removes 5 lines); to change which line of the window is matched, modify $x[2] (which tests the third buffered line and therefore removes the two lines before the match).

choroba answered Oct 11 '22 13:10