grep with patterns from a file: print the pattern instead of the matched string

Tags: grep, bash, awk

I want to grep with patterns from a file containing regexes. When a pattern matches, grep prints the matched string but not the pattern. How can I get the pattern instead of the matched string?

pattern.txt

Apple (Ball|chocolate|fall) Donut
donut (apple|ball) Chocolate
Donut Gorilla Chocolate
Chocolate (English|Fall) apple gorilla
gorilla chocolate (apple|ball)
(ball|donut) apple

strings.txt

apple ball Donut
donut ball chocolate
donut Ball Chocolate
apple donut
chocolate ball Apple

This is the grep command:

grep -Eix -f pattern.txt strings.txt

This command prints the matched strings from strings.txt:

apple ball Donut
donut ball chocolate
donut Ball Chocolate

But I want to find which patterns from pattern.txt were used to match:

Apple (Ball|chocolate|fall) Donut
donut (apple|ball) Chocolate

Lines in pattern.txt can be lower case, upper case or mixed, may or may not contain regex elements, and can have any number of words and regex elements. The only regex constructs used are parentheses and the pipe.

I don't want to loop over pattern.txt and run grep once per line, as that is slow. Is there a way to make grep print the matching pattern, or its line number in the pattern file? Or is there another command that can do the job without being too slow?
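
For reference, the per-pattern loop mentioned above might look like the following sketch. It recreates the sample files in a scratch directory so it can be run as is; the helper name match_patterns_loop and the scratch directory are illustrative assumptions, not part of the original question.

```shell
# Sample data from the question, written to a scratch directory.
tmp=$(mktemp -d)
cat > "$tmp/pattern.txt" <<'EOF'
Apple (Ball|chocolate|fall) Donut
donut (apple|ball) Chocolate
Donut Gorilla Chocolate
Chocolate (English|Fall) apple gorilla
gorilla chocolate (apple|ball)
(ball|donut) apple
EOF
cat > "$tmp/strings.txt" <<'EOF'
apple ball Donut
donut ball chocolate
donut Ball Chocolate
apple donut
chocolate ball Apple
EOF

# match_patterns_loop is a hypothetical helper name (not from the question).
# It starts one grep process per pattern line, which is why it scales poorly.
match_patterns_loop() {
    # $1 = pattern file, $2 = strings file
    while IFS= read -r pat; do
        [ -n "$pat" ] || continue            # skip empty pattern lines
        # -q quiet, -E extended regex, -i ignore case, -x whole-line match
        if grep -Eixq -e "$pat" "$2"; then
            printf '%s\n' "$pat"             # print the pattern, not the matched line
        fi
    done < "$1"
}

result=$(match_patterns_loop "$tmp/pattern.txt" "$tmp/strings.txt")
printf '%s\n' "$result"
rm -rf "$tmp"
```

This gives the wanted patterns, but the cost grows with the number of pattern lines because every pattern rereads strings.txt.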

asked Aug 13 '18 12:08 by haru




1 Answer

I don't know of a way to do this with grep, but with GNU awk:

$ awk '
BEGIN { IGNORECASE = 1 }      # for case insensitivity
NR==FNR {                     # process pattern file
    a[$0]                     # hash the entries to a
    next                      # process next line
}
{                             # process strings file
    for(i in a)               # loop all pattern file entries
        if($0 ~ "^" i "$") {  # if the whole line matches (like grep -x)
            print i           # output the matching pattern file entry
            # delete a[i]     # uncomment to delete matched patterns from a
            # next            # uncomment to end searching after first match
        }
}' pattern.txt strings.txt

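With the sample files from the question, this outputs:

Apple (Ball|chocolate|fall) Donut
donut (apple|ball) Chocolate
donut (apple|ball) Chocolate

For each line in strings.txt the script loops over every pattern line, so a line that matches more than one pattern is reported once per pattern, and a pattern is printed again for every line it matches (hence the repeated second pattern). awk matching is case-sensitive by default; IGNORECASE is a GNU awk extension that makes it case-insensitive, like grep -i.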
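
Since the question also asks for the line number of the pattern, here is a minimal sketch of the same loop that prints the pattern's line number, the pattern and the line it matched. It is an assumption-laden variant, not the script above: it lowercases both sides with tolower() instead of using IGNORECASE so that it does not depend on GNU awk, which is only safe because the patterns contain nothing but words, parentheses and pipes. It also wraps each pattern in parentheses before anchoring, so the anchors would still apply to a pattern with a top-level pipe. The array names pat and low and the scratch directory are illustrative.

```shell
# Sample data from the question, written to a scratch directory.
tmp=$(mktemp -d)
cat > "$tmp/pattern.txt" <<'EOF'
Apple (Ball|chocolate|fall) Donut
donut (apple|ball) Chocolate
Donut Gorilla Chocolate
Chocolate (English|Fall) apple gorilla
gorilla chocolate (apple|ball)
(ball|donut) apple
EOF
cat > "$tmp/strings.txt" <<'EOF'
apple ball Donut
donut ball chocolate
donut Ball Chocolate
apple donut
chocolate ball Apple
EOF

result=$(awk '
NR == FNR {                             # first file: the patterns
    pat[FNR] = $0                       # original text, keyed by line number
    low[FNR] = "^(" tolower($0) ")$"    # lowercased and anchored like grep -x
    n = FNR                             # remember how many patterns there are
    next
}
{                                       # second file: the strings
    s = tolower($0)
    for (i = 1; i <= n; i++)            # test the line against every pattern
        if (s ~ low[i])
            printf "%d\t%s\t%s\n", i, pat[i], $0
}' "$tmp/pattern.txt" "$tmp/strings.txt")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Keeping the patterns in a numerically indexed array also makes the output order predictable, which for (i in a) does not guarantee.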

Also, if you want each matching pattern file entry to be output only once, you can delete it from a after its first match: uncomment delete a[i] after the print. That might also give you some performance advantage, since later lines are tested against fewer patterns.
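
A minimal sketch of that delete-after-first-match idea, again using tolower() rather than IGNORECASE so it should run on non-GNU awks too (an assumption that relies on the patterns containing only words, parentheses and pipes). The scratch directory and variable names are illustrative.

```shell
# Sample data from the question, written to a scratch directory.
tmp=$(mktemp -d)
cat > "$tmp/pattern.txt" <<'EOF'
Apple (Ball|chocolate|fall) Donut
donut (apple|ball) Chocolate
Donut Gorilla Chocolate
Chocolate (English|Fall) apple gorilla
gorilla chocolate (apple|ball)
(ball|donut) apple
EOF
cat > "$tmp/strings.txt" <<'EOF'
apple ball Donut
donut ball chocolate
donut Ball Chocolate
apple donut
chocolate ball Apple
EOF

result=$(awk '
NR == FNR {                              # first file: the patterns
    a[$0] = "^(" tolower($0) ")$"        # original pattern -> anchored lowercase regex
    next
}
{                                        # second file: the strings
    s = tolower($0)
    for (p in a)                         # loop over the patterns still unmatched
        if (s ~ a[p]) {
            print p                      # output the original pattern text
            delete a[p]                  # never test or print this pattern again
        }
}' "$tmp/pattern.txt" "$tmp/strings.txt")

printf '%s\n' "$result"
rm -rf "$tmp"
```

With the sample data each pattern is now printed at most once, which is the output the question asks for.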

answered Oct 05 '22 10:10 by James Brown