
How to use grep to extract multiple groups

Tags: regex, grep, pcre

Say I have this file data.txt:

a=0,b=3,c=5
a=2,b=0,c=4
a=3,b=6,c=7

I want to use grep to extract 2 columns corresponding to the values of a and c:

0 5
2 4
3 7

I know how to extract each column separately:

grep -oP 'a=\K([0-9]+)' data.txt
0
2
3

And:

grep -oP 'c=\K([0-9]+)' data.txt
5
4
7

But I can't figure out how to extract both groups at once. I tried the following, which didn't work:

grep -oP 'a=\K([0-9]+),.+c=\K([0-9]+)' data.txt
5
4
7
usual me asked Oct 15 '14 12:10



2 Answers

I am also curious whether grep alone can do this. \K resets the start of the reported match, discarding everything matched before it, so using it twice in the same expression does not help: only the text after the last \K is printed. grep -o also prints the whole match rather than individual capture groups, so this has to be done differently.

In the meantime, I would use sed:

sed -r 's/^a=([0-9]+).*c=([0-9]+)$/\1 \2/' file

This captures the digits after a= and c= on lines that start with a= and end immediately after the c= digits. Lines that do not fit this pattern are printed unchanged.

For your input, it returns:

0 5
2 4
3 7
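
If the file may also contain lines that do not fit the pattern, a variant of the same substitution can print only the lines where it succeeded. This is a minimal sketch, not part of the original answer: the temporary file and the extra "no match here" line are made up for illustration, and -E is used as the more portable spelling of -r.

```shell
# Sample input from the question plus one non-matching line (illustrative only).
tmp=$(mktemp)
printf '%s\n' 'a=0,b=3,c=5' 'a=2,b=0,c=4' 'a=3,b=6,c=7' 'no match here' > "$tmp"

# -n suppresses sed's default printing of every line;
# the trailing p prints a line only when the substitution succeeded.
out=$(sed -nE 's/^a=([0-9]+).*c=([0-9]+)$/\1 \2/p' "$tmp")
printf '%s\n' "$out"

rm -f "$tmp"
```

With this input, the non-matching line is dropped and only the three "a c" pairs are printed.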
fedorqui 'SO stop harming' answered Sep 20 '22 17:09


You could try the grep command below. Note, however, that grep -o prints each match on its own line, so the output will not be in the format shown in the question.

$ grep -oP 'a=\K([0-9]+)|c=\K([0-9]+)' file
0
5
2
4
3
7

To get that format, pipe grep's output to paste (or a similar command), which here joins every two consecutive lines with a space.

$ grep -oP 'a=\K([0-9]+)|c=\K([0-9]+)' file | paste -d' ' - -
0 5
2 4
3 7
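
Where grep was built without PCRE support (no -P), roughly the same pipeline can be sketched with extended regular expressions and cut. This is an assumption-laden variant rather than the answer's own method: it assumes every line contains exactly one a= and one c= value, in that order, and that no other key ends in "a" or "c" (a key such as "ba=1" would also match). The temporary file is only for illustration.

```shell
# Sample input from the question, written to a temporary file.
tmp=$(mktemp)
printf '%s\n' 'a=0,b=3,c=5' 'a=2,b=0,c=4' 'a=3,b=6,c=7' > "$tmp"

# 1. grep -oE prints each a=N or c=N match on its own line.
# 2. cut keeps only the part after the "=".
# 3. paste joins every two consecutive lines with a space.
out=$(grep -oE '(a|c)=[0-9]+' "$tmp" | cut -d= -f2 | paste -d' ' - -)
printf '%s\n' "$out"

rm -f "$tmp"
```

If a line is missing one of the two keys, the pairs produced by paste shift out of alignment, so this approach is only safe on regular input.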
Avinash Raj answered Sep 21 '22 17:09