
How to use sed, awk, or gawk to print only what is matched?

I see lots of examples and man pages on how to do things like search-and-replace using sed, awk, or gawk.

But in my case, I have a regular expression that I want to run against a text file to extract a specific value. I don't want to do search-and-replace. This is being called from bash. Let's use an example:

Example regular expression:

.*abc([0-9]+)xyz.* 

Example input file:

a b c abc12345xyz a b c 

As simple as this sounds, I cannot figure out how to call sed/awk/gawk correctly. What I was hoping to do, from within my bash script, is:

myvalue=$( sed <...something...> input.txt ) 

Things I've tried include:

sed -e 's/.*([0-9]).*/\\1/g' example.txt # extracts the entire input file
sed -n 's/.*([0-9]).*/\\1/g' example.txt # extracts nothing
Stéphane asked Nov 14 '09 08:11


2 Answers

My sed (Mac OS X) didn't work with +. I tried * instead and added the p flag to print the match:

sed -n 's/^.*abc\([0-9]*\)xyz.*$/\1/p' example.txt 

For matching at least one numeric character without +, I would use:

sed -n 's/^.*abc\([0-9][0-9]*\)xyz.*$/\1/p' example.txt 
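To connect this back to the myvalue=$( ... ) form from the question, here is a minimal sketch. The temporary file and the variable name input are only for illustration; the sed command itself is the one shown above.

```shell
#!/usr/bin/env bash
# Hypothetical scratch file holding the example input from the question.
input=$(mktemp)
printf 'a b c abc12345xyz a b c\n' > "$input"

# -n suppresses sed's default output; the trailing p prints a line only
# when the substitution succeeded, so non-matching lines produce nothing.
# [0-9][0-9]* stands in for [0-9]+ in basic regular expressions.
myvalue=$(sed -n 's/^.*abc\([0-9][0-9]*\)xyz.*$/\1/p' "$input")

echo "$myvalue"   # prints 12345
rm -f "$input"
```

If several lines of the file match, myvalue will hold one extracted value per line, so you may want to stop at the first match or loop over the result.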
mouviciel answered Sep 21 '22 02:09


You can use sed to do this:

 sed -rn 's/.*abc([0-9]+)xyz.*/\1/gp' 
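  • -n suppresses the automatic printing of every line
  • -r enables extended regular expressions, so you don't have to escape the capture group parens () or the + (this is the GNU sed spelling; on BSD/Mac OS X sed the equivalent flag is -E)
  • \1 the capture group match
  • /g global match (not strictly needed here, since the pattern already consumes the whole line)
  • /p print the line if a substitution was made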
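Since the question also mentions awk, here is a rough sketch of two equivalents, assuming your sed accepts -E (recent GNU and BSD/Mac OS X versions do) and your awk provides the POSIX match() function. The variable names line, from_sed and from_awk are made up for this example.

```shell
#!/usr/bin/env bash
line='a b c abc12345xyz a b c'

# -E selects extended regular expressions, so ( ) and + need no backslashes.
from_sed=$(printf '%s\n' "$line" | sed -nE 's/.*abc([0-9]+)xyz.*/\1/p')

# POSIX awk: match() sets RSTART and RLENGTH for the whole match;
# substr() then drops the 3-character "abc" prefix and "xyz" suffix.
from_awk=$(printf '%s\n' "$line" |
  awk 'match($0, /abc[0-9]+xyz/) { print substr($0, RSTART + 3, RLENGTH - 6) }')

echo "sed: $from_sed"
echo "awk: $from_awk"
```

The awk version relies on the literal prefix and suffix both being three characters long; with a different pattern the offsets would need adjusting.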

I wrote a tool for myself that makes this easier:

rip 'abc(\d+)xyz' '$1' 
Ilia Choly answered Sep 25 '22 02:09