How to use sed, awk, or gawk to print only what is matched?

Tags: regex, unix, sed, awk, gawk

I see lots of examples and man pages on how to do things like search-and-replace using sed, awk, or gawk.

But in my case, I have a regular expression that I want to run against a text file to extract a specific value. I don't want to do search-and-replace. This is being called from bash. Let's use an example:

Example regular expression:

.*abc([0-9]+)xyz.*

Example input file:

a b c abc12345xyz a b c

As simple as this sounds, I cannot figure out how to call sed/awk/gawk correctly. What I was hoping to do, from within my bash script, is:

myvalue=$( sed <...something...> input.txt )

Things I've tried include:

sed -e 's/.*([0-9]).*/\\1/g' example.txt # extracts the entire input file
sed -n 's/.*([0-9]).*/\\1/g' example.txt # extracts nothing

845

asked Nov 14 '09 08:11

Stéphane

2 Answers

My sed (Mac OS X) didn't work with +. I used * instead and added the p flag to print the match:

sed -n 's/^.*abc\([0-9]*\)xyz.*$/\1/p' example.txt

For matching at least one numeric character without +, I would use:

sed -n 's/^.*abc\([0-9][0-9]*\)xyz.*$/\1/p' example.txt
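To get the value into a bash variable, as the question asks, the command can be wrapped in a command substitution. This is a minimal sketch, assuming a throwaway file created with mktemp in place of example.txt; the names input and myvalue are only illustrative.

```shell
# Hypothetical stand-in for example.txt, created only for this demo.
input=$(mktemp)
printf 'a b c abc12345xyz a b c\n' > "$input"

# -n suppresses the default output; the p flag prints only lines where
# the substitution succeeded. \( \) and [0-9][0-9]* are basic regular
# expression syntax, so no extended-regex option is needed.
myvalue=$(sed -n 's/^.*abc\([0-9][0-9]*\)xyz.*$/\1/p' "$input")

echo "$myvalue"
rm -f "$input"
```

If no line matches, sed prints nothing and myvalue ends up empty, so a script may want to check for that case before using the value.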

161

answered Sep 21 '22 02:09

mouviciel

You can use sed to do this:

 sed -rn 's/.*abc([0-9]+)xyz.*/\1/gp'

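-n suppresses the automatic printing of every line
-r enables extended regular expressions, so you don't have to escape the capture group parens () (this is the GNU sed spelling; BSD/Mac OS X sed uses -E)
\1 the text matched by the first capture group
/g global match: replace every match on the line, not just the first
/p print the line if a substitution was made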
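The effect of -n together with p is easiest to see on input where only some lines match. A small sketch, assuming a sed that accepts -E for extended regular expressions (BSD/macOS sed and current GNU sed both do); the variable names only_matches and all_lines are illustrative.

```shell
# Two input lines: only the first contains abc<digits>xyz.
sample='a b c abc12345xyz a b c
no match here'

# With -n and the p flag, only lines where the substitution happened are printed.
only_matches=$(printf '%s\n' "$sample" | sed -En 's/.*abc([0-9]+)xyz.*/\1/p')

# Without -n (and without p), every line is printed, changed or not.
all_lines=$(printf '%s\n' "$sample" | sed -E 's/.*abc([0-9]+)xyz.*/\1/')

printf 'with -n and p:\n%s\n' "$only_matches"
printf 'without -n:\n%s\n' "$all_lines"
```

This also explains the two attempts in the question: without -n the whole file comes through, and with -n but no p nothing is printed at all.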

I wrote a tool for myself that makes this easier:

rip 'abc(\d+)xyz' '$1'

answered Sep 25 '22 02:09

Ilia Choly
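Since the question also mentions awk, here is one possible awk sketch for the same example. It is not taken from either answer. It relies on the standard match() function, which sets RSTART and RLENGTH, and it assumes the fixed three-character prefix abc and suffix xyz from the example pattern; the variable name value is illustrative.

```shell
value=$(printf 'a b c abc12345xyz a b c\n' |
  awk 'match($0, /abc[0-9]+xyz/) {
    # RSTART and RLENGTH describe the whole match, so skip the
    # 3-character prefix and drop the 3-character suffix.
    print substr($0, RSTART + 3, RLENGTH - 6)
  }')

echo "$value"
```

The offsets 3 and 6 are tied to this particular pattern and would need adjusting if the surrounding literals change length.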
