
How to combine multiple grep commands?

Tags: linux, grep

I have a long text file (LONG.txt). I want to search it for three patterns and save the grep results to a new text file (SHORT.txt).

Patterns:

  1. AAAAA

  2. BBBBB

  3. CCCCC

NOTE:

When pattern AAAAA or BBBBB is found, I want to print only the line that contains AAAAA or BBBBB.

When pattern CCCCC is found, I want to print the line that contains CCCCC plus the next line.

Example:

LONG.txt:

bla bla 
bla bla 
bla bla 
something something AAAAA something something
bla bla 
bla bla 
something something CCCCC something something
bla bla 
bla bla 
bla bla 
bla bla 
bla bla 
bla bla 
something something BBBBB something something
bla bla 
bla bla 
bla bla 
something something AAAAA something something
bla bla 
something something AAAAA something something
bla bla 
something something BBBBB something something
bla bla 
bla bla 
bla bla 
something something CCCCC something something
bla bla
bla bla
bla bla

Output should be:

something something AAAAA something something
something something CCCCC something something
bla bla 
something something BBBBB something something
something something AAAAA something something
something something AAAAA something something
something something BBBBB something something
something something CCCCC something something
bla bla

What I tried is:

grep -B0 "AAAAA" LONG.txt > SHORT.txt
grep -B0 "BBBBB" LONG.txt > SHORT.txt
grep -B1 "CCCCC" LONG.txt > SHORT.txt

But this doesn't give me the desired output.

Jatin asked Jan 06 '23 20:01



2 Answers

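Your commands keep overwriting the file because each one uses a single arrow (>), which truncates SHORT.txt before writing, so only the output of the last grep survives. In addition, -B prints lines before a match; to get the line after a match you need -A.

Use a single arrow (>) for the first command and double arrows (>>) for the subsequent ones, so that they append to the file.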

grep "AAAAA" LONG.txt > SHORT.txt
grep "BBBBB" LONG.txt >> SHORT.txt
grep -A1 "CCCCC" LONG.txt >> SHORT.txt

The first two grep commands print just the line with the match and the last one prints the line and one line after.


Additional explanation of grep:

By default it returns just the matching lines. If you pass the -A flag with a number it will show the matching lines and that number of lines after. E.g. -A1 prints the matching line and the next line as per your request. Similarly, the -B flag prints lines before the match.

Remember: -A = After, -B = Before.
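To make the difference between the flags concrete, here is a minimal sketch on a throwaway file. The file name flags_demo.txt and the variable names are made up for this illustration and are not part of the question.

```shell
# Throwaway sample file; flags_demo.txt is a hypothetical name used only here.
printf '%s\n' 'one' 'two CCCCC' 'three' 'four' > flags_demo.txt

match_only=$(grep 'CCCCC' flags_demo.txt)       # just the matching line
with_after=$(grep -A1 'CCCCC' flags_demo.txt)   # matching line + 1 line after
with_before=$(grep -B1 'CCCCC' flags_demo.txt)  # 1 line before + matching line

printf '%s\n' "$with_after"
# two CCCCC
# three

printf '%s\n' "$with_before"
# one
# two CCCCC
```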


UPDATE

There's the additional requirement that the output lines keep the order in which they appear in the original file.

Here's a script to do it:

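# Collect the matches, each prefixed with its line number
grep -n "AAAAA" LONG.txt > SHORT.txt
grep -n "BBBBB" LONG.txt >> SHORT.txt
grep -n -A1 --no-group-separator "CCCCC" LONG.txt >> SHORT.txt
# Restore the original order; -u drops lines collected more than once
sort -n -u -o SHORT.txt SHORT.txt

# Strip the "N:" or "N-" prefix that grep -n added
sed -i 's/^[0-9]\+[:-]//' SHORT.txt

The main difference here is that the -n flag makes grep prefix every line with its line number (N: for a matching line, N- for a context line printed by -A), and sort -n then uses those numbers to restore the original order. The -u flag keeps only one copy of a line that was picked up by more than one grep, and --no-group-separator (a GNU grep option) stops grep from printing a -- line between separate CCCCC matches. Finally, sed removes the line-number prefix, so the output file contains only the original lines.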
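To see what the numbered intermediate output looks like before it is sorted and cleaned up, the following sketch runs grep -n -A1 on a small throwaway file. The file name numbered_demo.txt and the variable names are hypothetical, and GNU grep is assumed for --no-group-separator.

```shell
# Throwaway sample with two separate CCCCC matches (hypothetical file name).
printf '%s\n' 'one' 'two CCCCC' 'three' 'four' 'five CCCCC' 'six' > numbered_demo.txt

# Matching lines get "N:", context lines get "N-", and grep puts "--"
# between match groups that are not adjacent.
numbered=$(grep -n -A1 'CCCCC' numbered_demo.txt)
printf '%s\n' "$numbered"
# 2:two CCCCC
# 3-three
# --
# 5:five CCCCC
# 6-six

# GNU grep only: suppress the "--" separator lines.
clean=$(grep -n -A1 --no-group-separator 'CCCCC' numbered_demo.txt)
```

Without --no-group-separator, the -- lines would end up in SHORT.txt as well, since they carry no line number to sort by.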

njachowski answered Jan 09 '23 10:01



awk '/AAAAA|BBBBB|CCCCC/ {print; if ($0 ~ /CCCCC/ && (getline) > 0) print}' LONG.txt > SHORT.txt
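A variant of the same single-pass idea, sketched here without getline: a counter remembers that one more line is owed after a CCCCC match. This is only an illustration, not the answerer's own code; the file names sample.txt and sample_out.txt and the variable name extra are made up for the demo.

```shell
# Small throwaway input (hypothetical file name).
printf '%s\n' 'bla 1' 'x AAAAA x' 'bla 2' 'x CCCCC x' \
              'bla 3' 'bla 4' 'x BBBBB x' 'x CCCCC x' > sample.txt

awk '
  /CCCCC/       { print; extra = 1; next }  # print the match, owe one more line
  /AAAAA|BBBBB/ { print; extra = 0; next }  # print the match only
  extra > 0     { print; extra-- }          # the line owed after CCCCC
' sample.txt > sample_out.txt

cat sample_out.txt
# x AAAAA x
# x CCCCC x
# bla 3
# x BBBBB x
# x CCCCC x
```

For the question's files, replace sample.txt with LONG.txt and sample_out.txt with SHORT.txt. Because every line is still tested against the patterns, a CCCCC line that directly follows another CCCCC line also gets its own following line.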

gudok answered Jan 09 '23 09:01
