How can I print 2 lines if the second line contains the same match as the first line?

Let's say I have a file with several million lines, organized like this:

@1:N:0:ABC
XYZ

@1:N:0:ABC
ABC

I am trying to write a one-line grep/sed/awk command that prints both lines if the string after the last colon on the first line (ABC in this example) is found in the second line.

When I try to use grep -A1 -P with a pattern like '(?<=:)[A-Z]{3}' and pipe the matches onward, I get stuck. I think my creativity is failing me here.

Ryan Ward asked Dec 13 '22 17:12


2 Answers

With awk

$ awk -F: 'NF==1 && $0 ~ s{print p ORS $0} {s=$NF; p=$0}' ip.txt
@1:N:0:ABC
ABC
  • -F: use : as delimiter, makes it easy to get last column
  • s=$NF; p=$0 save last column value and entire line for printing later
  • NF==1 if line doesn't contain :
  • $0 ~ s if line contains the last column data saved previously
    • if search data can contain regex meta characters, use index($0,s) instead to search literally
  • note that this code assumes the input has a line containing : followed by a line that doesn't contain :
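
As a rough sketch of the index($0,s) variant mentioned above, here is the same one-liner with the regex test swapped for a literal substring test. The sample data is made up for illustration (it is not from the question); the last field A.C contains a regex metacharacter, so $0 ~ s would also accept ABC, while index() should not.

```shell
# Hypothetical sample data: the last field "A.C" contains a regex metacharacter.
sample='@1:N:0:A.C
ABC
@2:N:0:A.C
xA.Cx'

# index($0, s) is non-zero only when s occurs literally in the current line,
# so "A.C" matches "xA.Cx" but not "ABC".
result=$(printf '%s\n' "$sample" |
  awk -F: 'NF==1 && index($0, s) {print p ORS $0} {s=$NF; p=$0}')

printf '%s\n' "$result"
```

With this input only the second pair (@2:N:0:A.C and xA.Cx) is printed.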


With GNU sed (might work with other versions too, syntax might differ though)

$ sed -nE '/:/{N; /.*:(.*)\n.*\1/p}' ip.txt
@1:N:0:ABC
ABC
  • /:/ if line contains :
  • N add next line to pattern space
  • /.*:(.*)\n.*\1/ capture string after last : and check if it is present in next line

Again, this assumes input like that shown in the question. It won't work for cases like:

@1:N:0:ABC
@1:N:0:XYZ
XYZ
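
To make the limitation concrete, the sketch below runs both commands from this answer on exactly that three-line input (fed through a pipe instead of ip.txt, which is just a convenience here). It assumes GNU sed. The sed command consumes the two header lines as one pair, so XYZ arrives alone; the awk command only remembers the most recent line, so it still finds the pair.

```shell
# The problem case from above: two header lines in a row.
sample='@1:N:0:ABC
@1:N:0:XYZ
XYZ'

# sed joins line 1 with line 2, finds no match, then sees XYZ on its own (no ':').
sed_result=$(printf '%s\n' "$sample" | sed -nE '/:/{N; /.*:(.*)\n.*\1/p}')

# awk overwrites s and p on every line, so line 2 is the saved line when XYZ arrives.
awk_result=$(printf '%s\n' "$sample" |
  awk -F: 'NF==1 && $0 ~ s{print p ORS $0} {s=$NF; p=$0}')

printf 'sed: [%s]\nawk: [%s]\n' "$sed_result" "$awk_result"
```

Here the sed result is empty, while the awk result is the @1:N:0:XYZ line followed by XYZ.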
Sundeep answered May 06 '23 10:05


This might work for you (GNU sed):

sed -n 'N;/.*:\(.*\)\n.*\1/p;D' file

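Use the -n option to suppress automatic printing, so that lines are printed only explicitly (grep-like). Read two lines into the pattern space (N) and print both (p) if the text after a : on the first line also appears in the second. Then always delete the first line (D) and repeat with the line that remains.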
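
Because D keeps the second line in the pattern space, every consecutive pair of lines gets checked, like a sliding window. A small sketch of that behaviour, assuming GNU sed and using made-up input with two header lines in a row (fed through a pipe rather than a file):

```shell
# Hypothetical input: two header lines in a row, then the matching line.
sample='@1:N:0:ABC
@1:N:0:XYZ
XYZ'

# Pair 1 (ABC header + XYZ header) does not match, D drops the ABC header,
# pair 2 (XYZ header + XYZ) matches and is printed.
result=$(printf '%s\n' "$sample" | sed -n 'N;/.*:\(.*\)\n.*\1/p;D')

printf '%s\n' "$result"
```

This prints the @1:N:0:XYZ line followed by XYZ.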

potong answered May 06 '23 10:05