How to print matched regex pattern using awk?

This is the most basic form:

awk '/pattern/{ print $0 }' file

This asks awk to search for the pattern between the slashes (//) and then print each matching line. awk calls a line a record by default and refers to it as $0. It is worth reading the documentation for these basics.

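If you only want to print the matched word, loop over the fields. Note that this compares each field with the literal string yyy rather than matching a regex: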

awk '{for(i=1;i<=NF;i++){ if($i=="yyy"){print $i} } }' file
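If the word should be selected by a regex rather than by an exact string, the same loop can use the ~ operator. This is only a sketch: the function name print_matching_fields and the sample input are made up for illustration, and the regex is passed in through an awk variable.

```shell
# Hypothetical helper: print every whitespace-separated field that matches
# the regex given as the first argument. Input is read from stdin.
print_matching_fields() {
    awk -v re="$1" '{
        # Walk over all fields of the current record
        for (i = 1; i <= NF; i++)
            if ($i ~ re)      # regex match, not string equality
                print $i
    }'
}

# Made-up sample input
printf 'xxx yyy zzz\nayyyb ccc\n' | print_matching_fields 'yyy'
```

With this input it prints yyy and ayyyb, because the regex matches anywhere inside a field. Anchor it as ^yyy$ if only the exact word should match.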

It sounds like you are trying to emulate the behaviour of GNU grep's -o option. This will do that, provided you only want the first match on each line:

awk 'match($0, /regex/) {
    print substr($0, RSTART, RLENGTH)
}
' file

Here's an example, using GNU's awk implementation (gawk):

awk 'match($0, /a.t/) {
    print substr($0, RSTART, RLENGTH)
}
' /usr/share/dict/words | head
act
act
act
act
aft
ant
apt
art
art
art

Read about match, substr, RSTART and RLENGTH in the awk manual.

After that you may wish to extend this to deal with multiple matches on the same line.
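One possible way to handle several matches per line is to call match() repeatedly on the remainder of the line. The sketch below assumes a POSIX awk; the function name print_all_matches and the sample input are made up for illustration.

```shell
# Hypothetical helper: print every non-overlapping match of the regex
# given as the first argument, one match per output line.
print_all_matches() {
    awk -v re="$1" '{
        s = $0
        # match() sets RSTART and RLENGTH for the leftmost match in s
        while (match(s, re)) {
            if (RLENGTH == 0) break            # avoid looping forever on empty matches
            print substr(s, RSTART, RLENGTH)
            s = substr(s, RSTART + RLENGTH)    # continue after the match
        }
    }'
}

# Made-up sample input
printf 'act ant art\nno hit\n' | print_all_matches 'a.t'
```

This prints act, ant and art on separate lines. Because the line is cut down after each match, a regex anchored with ^ would be tested against each remainder, so this approach suits unanchored patterns best.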

gawk can get the matching part of every line using this as the action:

{ if (match($0,/your regexp/,m)) print m[0] }

match(string, regexp [, array])

If array is present, it is cleared, and then the zeroth element of array is set to the entire portion of string matched by regexp. If regexp contains parentheses, the integer-indexed elements of array are set to contain the portion of string matching the corresponding parenthesized subexpression.

http://www.gnu.org/software/gawk/manual/gawk.html#String-Functions

If Perl is an option, you can try this:

perl -lne 'print $1 if /(regex)/' file

To make the match case-insensitive, add the i modifier:

perl -lne 'print $1 if /(regex)/i' file

To print everything AFTER the match:

perl -lne 'if ($found){print} else{if (/regex(.*)/){print $1; $found++}}' textfile

To print the match and everything after the match:

perl -lne 'if ($found){print} else{if (/(regex.*)/){print $1; $found++}}' textfile

If you are only interested in the last line of input and you expect to find only one match (for example, part of the summary line of a shell command), you can also try this very compact code, adapted from How to print regexp matches using `awk`?:

$ echo "xxx yyy zzz" | awk '{match($0,"yyy",a)}END{print a[0]}'
yyy

Or the more complex version with a partial result:

$ echo "xxx=a yyy=b zzz=c" | awk '{match($0,"yyy=([^ ]+)",a)}END{print a[1]}'
b

Warning: the three-argument form of match() is a gawk extension. It is not available in mawk.
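Where gawk is not available, a similar result can usually be had with the two-argument match() and substr(), by skipping the known prefix by hand. This is a sketch under that assumption; the function name extract_yyy is made up, and the offset 4 is simply the length of the literal prefix yyy=.

```shell
# Hypothetical helper: print the value that follows "yyy=" on each line.
extract_yyy() {
    awk 'match($0, /yyy=[^ ]+/) {
        # RSTART and RLENGTH cover the whole "yyy=value" text;
        # skip the 4-character "yyy=" prefix to keep only the value
        print substr($0, RSTART + 4, RLENGTH - 4)
    }'
}

echo "xxx=a yyy=b zzz=c" | extract_yyy
```

This prints b, like the gawk version with a[1], but it runs on every input line that matches rather than only in the END block.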

Here is another nice solution using a lookbehind regex in grep instead of awk. It does not need gawk, but it does need a grep that supports the -P (Perl-compatible regex) and -o options, such as GNU grep:

$ echo "xxx=a yyy=b zzz=c" | grep -Po '(?<=yyy=)[^ ]+'
b
