The following is a sample of a large file named AT5G60410.gff:
Chr5 TAIR10 gene 24294890 24301147 . + . ID=AT5G60410;Note=protein_coding_gene;Name=AT5G60410
Chr5 TAIR10 mRNA 24294890 24301147 . + . ID=AT5G60410.1;Parent=AT5G60410;Name=AT5G60410.1;Index=1
Chr5 TAIR10 protein 24295226 24300671 . + . ID=AT5G60410.1-Protein;Name=AT5G60410.1;Derives_from=AT5G60410.1
Chr5 TAIR10 exon 24294890 24295035 . + . Parent=AT5G60410.1
Chr5 TAIR10 five_prime_UTR 24294890 24295035 . + . Parent=AT5G60410.1
Chr5 TAIR10 exon 24295134 24295249 . + . Parent=AT5G60410.1
Chr5 TAIR10 five_prime_UTR 24295134 24295225 . + . Parent=AT5G60410.1
Chr5 TAIR10 CDS 24295226 24295249 . + 0 Parent=AT5G60410.1,AT5G60410.1-Protein;
Chr5 TAIR10 exon 24295518 24295598 . + . Parent=AT5G60410.1
I am having some trouble extracting specific lines from this file using grep. I want to extract all lines whose type, given in the third column, is "gene" or "exon". I was surprised when this did not work:
grep 'gene|exon' AT5G60410.gff
No results are returned. Where have I gone wrong?
To search a file for several patterns with grep, pass the patterns as a single argument enclosed in single quotes and separated by the pipe symbol, followed by the file name or path. In grep's default basic regular expressions an unescaped | is an ordinary character, so it is searched for literally. Either put a backslash before the pipe (\|, which GNU grep treats as alternation) or use grep -E, where | means "or" without any backslash.
To match a character that is special to grep -E literally, put a backslash ( \ ) in front of the character. When you don't need pattern matching at all, it is usually simpler to use grep -F, which treats the pattern as a fixed string.
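As a small illustration of the two options, the sketch below searches for the transcript ID AT5G60410.1, whose dot is a special character in a regular expression. The scratch file `sample.gff` and the variable names are made up for this example; the three records are copied from the question and are assumed to be tab-separated.

```shell
# Hypothetical scratch file holding three records from the question.
tmp=$(mktemp -d)
{
  printf 'Chr5\tTAIR10\tgene\t24294890\t24301147\t.\t+\t.\tID=AT5G60410;Note=protein_coding_gene;Name=AT5G60410\n'
  printf 'Chr5\tTAIR10\tmRNA\t24294890\t24301147\t.\t+\t.\tID=AT5G60410.1;Parent=AT5G60410;Name=AT5G60410.1;Index=1\n'
  printf 'Chr5\tTAIR10\texon\t24294890\t24295035\t.\t+\t.\tParent=AT5G60410.1\n'
} > "$tmp/sample.gff"

# Extended regex: the dot is escaped so it matches only a literal "."
escaped_count=$(grep -E -c 'AT5G60410\.1' "$tmp/sample.gff")

# Fixed string: nothing is special, so no escaping is needed
fixed_count=$(grep -F -c 'AT5G60410.1' "$tmp/sample.gff")

echo "escaped: $escaped_count, fixed: $fixed_count"
```

Both commands count the mRNA and exon records. Without the backslash, the dot in the -E pattern would match any single character rather than only a literal dot.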
The main difference between grep and egrep is how the pattern is interpreted: grep treats it as a basic regular expression by default, while egrep, a variant equivalent to grep -E, treats it as an extended regular expression. Both display the matching lines.
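The difference is easiest to see by running the same pattern both ways. This is a minimal sketch, assuming a hypothetical scratch file `sample.gff` built from three records in the question (tab-separated); the variable names are only for illustration.

```shell
# Hypothetical scratch file holding three records from the question.
tmp=$(mktemp -d)
{
  printf 'Chr5\tTAIR10\tgene\t24294890\t24301147\t.\t+\t.\tID=AT5G60410;Note=protein_coding_gene;Name=AT5G60410\n'
  printf 'Chr5\tTAIR10\tmRNA\t24294890\t24301147\t.\t+\t.\tID=AT5G60410.1;Parent=AT5G60410;Name=AT5G60410.1;Index=1\n'
  printf 'Chr5\tTAIR10\texon\t24294890\t24295035\t.\t+\t.\tParent=AT5G60410.1\n'
} > "$tmp/sample.gff"

# Basic regex (the default): | is an ordinary character, so grep looks
# for the literal text "gene|exon". grep exits non-zero when nothing
# matches, hence the "|| true".
basic_count=$(grep -c 'gene|exon' "$tmp/sample.gff" || true)

# Extended regex: | is alternation, so lines with "gene" or "exon" match.
extended_count=$(grep -E -c 'gene|exon' "$tmp/sample.gff")

echo "basic: $basic_count, extended: $extended_count"
```

The basic form finds nothing, which is the behaviour described in the question, while the extended form matches the gene and exon records.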
You need to escape the |. The following should do the job.
grep "gene\|exon" AT5G60410.gff
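Note that this pattern matches "gene" or "exon" anywhere on the line, including the attributes column (the gene record itself carries Note=protein_coding_gene). If only the third column should be tested, one option is awk. The sketch below assumes tab-separated columns; the file `sample.gff` is a made-up scratch file, and its second record is a hypothetical mRNA line, not from the original data, added to show the difference.

```shell
# Hypothetical scratch file: two records from the question plus one
# invented mRNA record whose attributes happen to contain "gene".
tmp=$(mktemp -d)
{
  printf 'Chr5\tTAIR10\tgene\t24294890\t24301147\t.\t+\t.\tID=AT5G60410;Note=protein_coding_gene;Name=AT5G60410\n'
  printf 'Chr5\tTAIR10\tmRNA\t24294890\t24301147\t.\t+\t.\tID=AT5G60410.1;Note=hypothetical_gene_model\n'
  printf 'Chr5\tTAIR10\texon\t24294890\t24295035\t.\t+\t.\tParent=AT5G60410.1\n'
} > "$tmp/sample.gff"

# Whole-line match: also counts the mRNA record because of its Note field.
grep_count=$(grep -E -c 'gene|exon' "$tmp/sample.gff")

# Column match: count only records whose third tab-separated field
# is exactly "gene" or "exon".
awk_count=$(awk -F'\t' '($3 == "gene" || $3 == "exon") { n++ } END { print n + 0 }' "$tmp/sample.gff")

echo "grep: $grep_count, awk: $awk_count"
```

To print the matching records instead of counting them, the awk program can be reduced to the condition alone: `awk -F'\t' '$3 == "gene" || $3 == "exon"' AT5G60410.gff`.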