I see lots of examples and man pages on how to do things like search-and-replace using sed, awk, or gawk.
But in my case, I have a regular expression that I want to run against a text file to extract a specific value. I don't want to do search-and-replace. This is being called from bash. Let's use an example:
Example regular expression:
.*abc([0-9]+)xyz.*
Example input file:
a b c abc12345xyz a b c
As simple as this sounds, I cannot figure out how to call sed/awk/gawk correctly. What I was hoping to do, is from within my bash script have:
myvalue=$( sed <...something...> input.txt )
Things I've tried include:
sed -e 's/.*([0-9]).*/\\1/g' example.txt # extracts the entire input file sed -n 's/.*([0-9]).*/\\1/g' example.txt # extracts nothing
Any awk expression is valid as an awk pattern. The pattern matches if the expression's value is nonzero (if a number) or non-null (if a string). The expression is reevaluated each time the rule is tested against a new input record.
Both sed and awk allow processing streams of characters for tasks such as text transformation. The awk is more powerful and robust than sed. It is similar to a programming language.
Combining the Twoawk and sed are both incredibly powerful when combined. You can do this by using Unix pipes.
My sed
(Mac OS X) didn't work with +
. I tried *
instead and I added p
tag for printing match:
sed -n 's/^.*abc\([0-9]*\)xyz.*$/\1/p' example.txt
For matching at least one numeric character without +
, I would use:
sed -n 's/^.*abc\([0-9][0-9]*\)xyz.*$/\1/p' example.txt
You can use sed to do this
sed -rn 's/.*abc([0-9]+)xyz.*/\1/gp'
-n
don't print the resulting line-r
this makes it so you don't have the escape the capture group parens()
.\1
the capture group match/g
global match/p
print the resultI wrote a tool for myself that makes this easier
rip 'abc(\d+)xyz' '$1'
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With