
How to not print a line if it and the following line starts with the same pattern?

Tags: regex, sed, awk

I have a file.fa:

>ABC
TGTGTGT
AGAGAGA
TGTAGTA
>BDC
>DTR
>EDF
AGAGGTG
AGTGACA
CAGTGAC

I want to keep every line that does not start with ">", and keep a line that starts with ">" only if the line immediately after it does not start with ">":

>ABC
TGTGTGT
AGAGAGA
TGTAGTA
>EDF
AGAGGTG
AGTGACA
CAGTGAC

Looking at the answer for this post, I see that awk '/^>/{x=$0} !/^>/{if(x){print x;x=0;}}' file.fa prints out the header lines (with '>') that I want:

>ABC
>EDF

but how do I also get the lines of text without '>'?

hmg asked Jul 29 '21 19:07


1 Answer

Using sed:

$ sed '/^>/ { N; /\n>/ D; }' input.txt
>ABC
TGTGTGT
AGAGAGA
TGTAGTA
>EDF
AGAGGTG
AGTGACA
CAGTGAC

If a line starts with >, the N command reads the next line and appends it to the pattern space. If that appended line also starts with >, the D command deletes the first line of the pattern space and restarts the cycle on what remains, so the second header becomes the new line to examine without reading fresh input. Everything else is printed unchanged.

Shawn answered Nov 16 '22 02:11
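
For the awk route the question asks about, a minimal sketch (not part of the answer above) could hold each header line in a variable and print it only once a non-header line follows, then print every non-header line as usual. The function name filter_headers is hypothetical, and the block recreates the question's file.fa only so that it runs on its own.

```shell
# Recreate the sample input from the question.
cat > file.fa <<'EOF'
>ABC
TGTGTGT
AGAGAGA
TGTAGTA
>BDC
>DTR
>EDF
AGAGGTG
AGTGACA
CAGTGAC
EOF

# filter_headers is a hypothetical helper name, used here for illustration only.
filter_headers() {
  awk '
    /^>/    { x = $0; next }       # remember the latest header, print nothing yet
    x != "" { print x; x = "" }    # first non-header line after a header: emit that header
            { print }              # non-header lines are always printed
  ' "$1"
}

filter_headers file.fa
```

On the sample file this prints the eight lines wanted in the question. Because a header is emitted only when a non-header line follows it, a header on the very last line of the file is dropped as well.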