
AWK: Access captured group from line pattern

Tags: regex, awk

With gawk, you can use the match function to capture parenthesized groups.

gawk 'match($0, pattern, ary) {print ary[1]}'

Example:

echo "abcdef" | gawk 'match($0, /b(.*)e/, a) {print a[1]}'

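This outputs cd.

Note that this requires gawk specifically: the three-argument form of match(), which fills an array with the captured groups, is a GNU extension and is not part of POSIX awk.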
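As a rough sketch of what the array holds when there is more than one group, the helper below (split_kv is a made-up name, and it assumes GNU awk is installed as gawk) prints both groups plus the start position and length that gawk records for the first group:

```shell
# split_kv is a hypothetical helper; it needs gawk because the
# three-argument match() is a GNU extension.
split_kv() {
  gawk 'match($0, /([a-z]+)=([0-9]+)/, m) {
    # m[0] is the whole match; m[1] and m[2] are the groups
    print m[1], m[2], m[1, "start"], m[1, "length"]
  }'
}

# usage: echo "size=42" | split_kv
```

For the input size=42 this should print size 42 1 4, since the first group starts at character 1 and is 4 characters long.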

For a portable alternative, you can achieve similar results with match() and substr().

Example:

echo "abcdef" | awk 'match($0, /b[^e]*/) {print substr($0, RSTART+1, RLENGTH-1)}'

This also outputs cd: the pattern matches bcd, and the substr() call drops the leading b.
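The same trick extends to any fixed-length prefix. A minimal sketch, assuming you want the digits that follow the literal word test (test_number is a made-up helper name):

```shell
# Hypothetical helper: prints the digits that follow the literal prefix "test".
# Uses only POSIX awk: match() sets RSTART and RLENGTH, substr() trims the prefix.
test_number() {
  awk 'match($0, /test[0-9]+/) {
    prefix = 4   # length of the literal "test"
    print substr($0, RSTART + prefix, RLENGTH - prefix)
  }'
}

printf 'run test123 ok\nno match here\n' | test_number
```

This prints 123 for the first line and nothing for the second. It only works because the prefix length is known in advance; for a variable-length prefix you would need a second match() on the matched text, or gawk's array form.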

That was a stroll down memory lane...

I replaced awk with perl a long time ago.

Apparently the standard AWK regular expression engine does not expose its capture groups.

You might consider using something like:

perl -n -e'/test(\d+)/ && print $1'

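The -n flag causes perl to loop over every input line, like awk does. Note that print $1 does not append a newline; add the -l flag or print "$1\n" if you want each match on its own line.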

This is something I need all the time so I created a bash function for it. It's based on glenn jackman's answer.

Definition

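Add this to your .bash_profile or similar. The arguments are double-quoted so that a pattern containing spaces is not split by the shell:

function regex { gawk 'match($0,/'"$1"'/, ary) {print ary['"${2:-0}"']}'; }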

Usage

Print the whole match for each line in the file:

$ cat filename | regex '.*'

Print the first capture group for each line in the file:

$ cat filename | regex '(.*)' 1
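One caveat: the function above splices its first argument directly into the awk program text between two slashes, so a pattern containing / breaks it. A possible variant, sketched here under the assumption that gawk is installed (regexv is a made-up name), passes the pattern and the group index as awk variables instead:

```shell
# regexv is a hypothetical name; assumes gawk is installed.
regexv() {
  # re and n are passed with -v, so the pattern never becomes part
  # of the awk program text and may contain slashes.
  gawk -v re="$1" -v n="${2:-0}" 'match($0, re, ary) { print ary[n] }'
}

# usage: printf 'a/b/c\n' | regexv '/(b)/' 1
```

The trade-off is that awk processes backslash escapes in -v assignments, so a backslash in the pattern generally has to be doubled.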

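You can use GNU awk. (gawk regular expressions have no lazy quantifiers, so the ? after .* below does not make it non-greedy; the command still works because the matching line contains a single $.)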

$ cat hta
RewriteCond %{HTTP_HOST} !^www\.mysite\.net$
RewriteRule (.*) http://www.mysite.net/$1 [R=301,L]

$ gawk 'match($0, /.*(http.*?)\$/, m) { print m[1]; }' < hta
http://www.mysite.net/
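If gawk is not available, roughly the same extraction can be sketched in portable awk with match() and substr(). The helper name url_before_dollar is made up for illustration, and the input is the same two lines as above:

```shell
# Hypothetical helper: prints the text from "http" up to, but not including,
# the first "$" that follows it. Uses only POSIX awk features.
url_before_dollar() {
  # [^$]* stops at the first "$"; inside a bracket expression "$" is literal.
  # RLENGTH - 1 drops the trailing "$" from the printed text.
  awk 'match($0, /http[^$]*\$/) { print substr($0, RSTART, RLENGTH - 1) }'
}

url_before_dollar <<'EOF'
RewriteCond %{HTTP_HOST} !^www\.mysite\.net$
RewriteRule (.*) http://www.mysite.net/$1 [R=301,L]
EOF
```

This prints http://www.mysite.net/ as well. The first line is skipped because the match is case-sensitive and it contains only the uppercase HTTP.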
