I'm running a grep to find any *.sql file that has the word select followed by the word customerName followed by the word from. This select statement can span many lines and can contain tabs and newlines.
I've tried a few variations on the following:
$ grep -liIr --include="*.sql" --exclude-dir="\.svn*" --regexp="select[a-zA-Z0-9+\n\r]*customerName[a-zA-Z0-9+\n\r]*from"
This, however, just runs forever. Can anyone help me with the correct syntax please?
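Using grep with the -P option. The problem with grep's default regular expressions is that grep matches the pattern against one line at a time, so a pattern cannot span several lines. While it's possible to chain several grep invocations to achieve the required result, it's more convenient to use the -P (or --perl-regexp) option, which makes grep interpret the pattern as a Perl-compatible regular expression (PCRE). On its own, -P still matches line by line; it has to be combined with -z for a pattern to cross newlines.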
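Applied to the question, the idea could look like the sketch below. It assumes GNU grep built with PCRE support, and it assumes each SQL statement ends with a semicolon, so that [^;] keeps a match inside a single statement. The sample directory and file names are made up for the demonstration.

```shell
# Hypothetical sample tree, created only so the command can be tried out.
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/.svn"
printf 'SELECT\n\tid,\n\tcustomerName\nFROM customers;\n' > "$demo/src/match.sql"
printf 'select id\nfrom orders;\n' > "$demo/src/nomatch.sql"
printf 'select customerName from customers;\n' > "$demo/.svn/ignored.sql"

# -r  recurse into the directory given as the last argument
# -l  print only the names of matching files
# -i  ignore case
# -z  read each file as one NUL-terminated record, so a match can cross newlines
# -P  PCRE syntax, needed here for the lazy quantifier *? and for \b
# [^;] matches any character except ';', newlines and tabs included
matches=$(grep -rliPz --include='*.sql' --exclude-dir='.svn' \
    '\bselect\b[^;]*?\bcustomerName\b[^;]*?\bfrom\b' "$demo")
printf '%s\n' "$matches"
```

Only src/match.sql should be listed: nomatch.sql has no customerName between select and from, and the copy under .svn is skipped by --exclude-dir. Note that the directory to search is passed explicitly as the last argument.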
Use the -A n option to print n lines after each match, -B n to print n lines before each match, and -C n to print n lines both above and below it.
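A small sketch of the three context options, using a made-up five-line file. Keep in mind that these options only add surrounding lines to the output; the pattern itself is still matched one line at a time.

```shell
# Hypothetical five-line input, used only to show the context options.
sample=$(mktemp)
printf 'one\ntwo\nthree\nfour\nfive\n' > "$sample"

after=$(grep -A 1 'three' "$sample")    # the match plus 1 line after it
before=$(grep -B 1 'three' "$sample")   # the match plus 1 line before it
around=$(grep -C 1 'three' "$sample")   # the match plus 1 line on each side

printf '%s\n' "$around"
# prints:
# two
# three
# four
```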
GNU grep supports three regular expression syntaxes: basic, extended, and Perl-compatible. In its simplest form, when no regular expression type is given, grep interprets search patterns as basic regular expressions. To interpret the pattern as an extended regular expression, use the -E (or --extended-regexp) option.
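To make the difference between the three syntaxes concrete, here is a minimal sketch that counts matching lines in a made-up four-line file. It assumes GNU grep with PCRE support for the -P line.

```shell
# Hypothetical input, used only to compare the three syntaxes.
sample=$(mktemp)
printf 'ab\nabb\nab+\na1\n' > "$sample"

bre=$(grep -c 'ab+' "$sample")     # basic: + is a literal plus sign
ere=$(grep -Ec 'ab+' "$sample")    # extended: + means "one or more b"
pcre=$(grep -Pc 'a\d' "$sample")   # Perl-compatible: \d matches a digit

printf 'BRE=%s ERE=%s PCRE=%s\n' "$bre" "$ere" "$pcre"
# prints: BRE=1 ERE=3 PCRE=1
```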
Without the need to install the grep variant pcregrep, you can do a multiline search with grep.
$ grep -Pzo "(?s)^(\s*)\N*main.*?{.*?^\1}" *.c
Explanation:
-P
Activates Perl-compatible regular expressions (PCRE) for grep, a powerful extension of regular expressions.
-z
Treats the input as a set of lines, each terminated by a zero byte (the ASCII NUL character) instead of a newline. That is, grep still knows where the ends of the lines are, but sees the input as one big line. Beware that this also adds a trailing NUL character to each match when used with -o.
-o
Prints only the matching part. Because we're using -z, the whole file is like a single big line, so without -o a match would print the entire file.
In regexp:
(?s)
Activates PCRE_DOTALL, which means that . matches any character, including a newline.
\N
Matches anything except a newline, even with PCRE_DOTALL activated.
.*?
Matches . in non-greedy mode, that is, it stops as soon as possible.
^
Matches the start of a line.
\1
Backreference to the first group (\s*). This is an attempt to match the same indentation as the line where the function starts.
As you can imagine, this search prints the main function in a C (*.c) source file.
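The same options can be tried on the SQL case from the question. The sketch below assumes GNU grep with PCRE support and a made-up sample file; it pipes the output through tr to turn the NUL byte that -z and -o append to each match into a newline.

```shell
# Hypothetical sample file with one matching and one non-matching statement.
sample=$(mktemp)
printf 'select id,\n  customerName\nfrom customers;\nselect 1;\n' > "$sample"

# With -z and -o, every printed match ends in a NUL byte instead of a newline.
nul_count=$(grep -Pzo 'select[^;]*?customerName[^;]*?from' "$sample" | tr -cd '\0' | wc -c)

# tr replaces that NUL with a newline so the result reads like normal text.
stmt=$(grep -Pzo 'select[^;]*?customerName[^;]*?from' "$sample" | tr '\0' '\n')
printf '%s\n' "$stmt"
# prints:
# select id,
#   customerName
# from
```

The pattern uses [^;] rather than (?s) with a dot, on the assumption that statements end with a semicolon, so a match cannot run on from one statement into the next.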
I am not very good with grep, but your problem can be solved with awk:
awk '/select/,/from/' *.sql
The above command prints every block of lines that starts at a line containing select and ends at the next line containing from. Now you need to verify whether the returned statements contain customerName. For this you can pipe the result into awk or grep again.
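One way to put the two steps together is sketched below, with made-up sample files. It runs awk once per file so that the file name is still known when grep reports a hit.

```shell
# Hypothetical sample files, created only for the demonstration.
demo=$(mktemp -d)
printf 'select id,\n  customerName\nfrom customers;\n' > "$demo/a.sql"
printf 'select id\nfrom orders;\n' > "$demo/b.sql"

found=$(
    for f in "$demo"/*.sql; do
        # awk prints the lines from one containing "select" through the next
        # one containing "from"; grep -q then checks them for customerName.
        if awk '/select/,/from/' "$f" | grep -qi 'customerName'; then
            printf '%s\n' "$f"
        fi
    done
)
printf '%s\n' "$found"
```

This is only a rough filter: the awk patterns are case-sensitive and match substrings, so a line containing, say, "selected" would also open a range.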