Regex (grep) for multi-line search needed [duplicate]

Tags: regex, linux, grep, cygwin

I'm running a grep to find any *.sql file that has the word select followed by the word customerName followed by the word from. This select statement can span many lines and can contain tabs and newlines.

I've tried a few variations on the following:

$ grep -liIr --include="*.sql" --exclude-dir="\.svn*" --regexp="select[a-zA-Z0-9+\n\r]*customerName[a-zA-Z0-9+\n\r]*from"

This, however, just runs forever. Can anyone help me with the correct syntax please?

517

asked Oct 06 '22 14:10

Ciaran Archer

2 Answers

Without the need to install the grep variant pcregrep, you can do a multiline search with grep.

$ grep -Pzo "(?s)^(\s*)\N*main.*?{.*?^\1}" *.c

Explanation:

-P activates Perl-compatible regular expressions (PCRE) in grep, a more powerful regex dialect.

-z treats the input as a set of lines, each terminated by a zero byte (the ASCII NUL character) instead of a newline. That is, grep still knows where the ends of the lines are, but sees the input as one big line. Beware that, when combined with -o, this also terminates each printed match with a NUL character.

-o prints only the matching part. Because -z makes the whole file a single big line, any match would otherwise print the entire file; -o prevents that.
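
To illustrate the -z and -o behaviour described above, here is a small sketch against a throwaway file. The file and its contents are made up for the demo, and it assumes GNU grep built with PCRE support. Because each match is written out with a trailing NUL, the output is piped through tr to turn those NULs into newlines:

```shell
# Throwaway demo file; the SQL is invented for illustration.
demo_file=$(mktemp)
printf 'select a,\n  customerName\nfrom t;\nupdate t set a = 1;\n' > "$demo_file"

# -z: the whole file is one record; -o: print only the matched part.
# Each match is terminated by a NUL byte, so tr converts it to a newline.
extracted=$(grep -Pzo '(?s)select.*?customerName.*?from' "$demo_file" | tr '\0' '\n')
printf '%s\n' "$extracted"

rm -f "$demo_file"
```

Only the three lines from select through from should be printed; the update statement is left out.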

In regexp:

(?s) activates PCRE_DOTALL, so . matches any character, including a newline.

\N matches any character except a newline, even with PCRE_DOTALL active.

.*? matches . in non-greedy mode, that is, it stops as soon as possible.

^ matches the start of a line.

\1 is a backreference to the first group (\s*). It is an attempt to match the closing brace at the same indentation as the line where the method starts.

As you can imagine, this search prints the main method in a C (*.c) source file.
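
Applied to the question, the same options can be combined with -l to list the *.sql files in which select, customerName and from appear in that order, even across line breaks. This is only a sketch: it assumes GNU grep with -P support, the demo directory and files are made up, and the pattern does not understand SQL, so it will also report a file where the three words belong to different statements.

```shell
# Hypothetical demo directory with two small SQL files.
demo_dir=$(mktemp -d)
printf 'SELECT id,\n\tcustomerName,\n\ttotal\nFROM orders;\n' > "$demo_dir/match.sql"
printf 'SELECT id,\n\ttotal\nFROM orders;\n' > "$demo_dir/nomatch.sql"

# -r recurse, -l list file names only, -i ignore case,
# -P Perl-compatible regex, -z treat each file as one NUL-terminated record.
# (?s) lets . cross newlines; .*? keeps each gap as short as possible.
matches=$(grep -rliPz --include='*.sql' \
  '(?s)\bselect\b.*?\bcustomerName\b.*?\bfrom\b' "$demo_dir")
printf '%s\n' "$matches"

rm -rf "$demo_dir"
```

Only match.sql should be listed. For a real tree, replace "$demo_dir" with the directory to search and add --exclude-dir as in the question.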

598

answered Oct 16 '22 11:10

albfan

I'm not very good with grep, but your problem can be solved with awk:

awk '/select/,/from/' *.sql

The command above prints every line from the first line matching select through the next line matching from. You then need to check whether the printed statement contains customerName; for that, pipe the result into grep or another awk.
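
A rough sketch of that two-step check, looping over the files so that the file name can be reported. The demo files are invented, and tolower() is my addition (not part of the command above) so that SELECT and FROM in upper case also start and end the range:

```shell
# Hypothetical demo files: one with customerName inside the select list,
# one where it only appears in a comment after the statement.
demo_dir=$(mktemp -d)
printf 'SELECT id,\n\tcustomerName\nFROM orders;\n' > "$demo_dir/match.sql"
printf 'SELECT id\nFROM orders;\n-- customerName is unused here\n' > "$demo_dir/nomatch.sql"

found=$(
  for f in "$demo_dir"/*.sql; do
    # awk prints each select..from range; grep -q reports whether
    # customerName (any case) occurs somewhere in that output.
    if awk 'tolower($0) ~ /select/, tolower($0) ~ /from/' "$f" | grep -qi 'customername'; then
      printf '%s\n' "$f"
    fi
  done
)
printf '%s\n' "$found"

rm -rf "$demo_dir"
```

Only match.sql should be reported, because in nomatch.sql the word falls outside the select..from range.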

202

answered Oct 16 '22 13:10

Amit
