Regexp in gawk matches multiple ways

Tags: match, awk

I have some text that I need to split up to extract one argument, and my gawk match() command does not behave as I expect. I have since written a less elegant workaround, but I would like to understand why.

So the string is blahblah|msgcontent1=HeaderUUIiewConsenFlagPSMessage|msgtype2=Blah002|msgcontent2=header

I want to output just the contents of msgcontent1=, so I ran:

echo "blahblah|msgcontent1=HeaderUUIiewConsenFlagPSMessage|msgtype2=Blah002|msgcontent2=header" | gawk '{ if (match($0,/msgcontent1=(.*)[|]/,a)) { print a[1]; } }'

The trouble is that instead of getting HeaderUUIiewConsenFlagPSMessage

I get everything from there to the last pipe of the string: HeaderUUIiewConsenFlagPSMessage|msgtype2=Blah002

I accept this is because the regexp /msgcontent1=(.*)[|]/ can match in multiple ways, but how do I make it match the way I want?

Andyj12 asked Dec 18 '22 11:12


2 Answers

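With your shown samples, please try the following. Written and tested in GNU awk, it prints only the text after msgcontent1= up to the first | that follows. The class [^|]* cannot cross a pipe, so the match stops there; 12 is the length of the msgcontent1= prefix that substr() skips.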

awk 'match($0,/msgcontent1=[^|]*/){print substr($0,RSTART+12,RLENGTH-12)}' Input_file

Or, with echo and awk:

echo "blahblah|msgcontent1=HeaderUUIiewConsenFlagPSMessage|msgtype2=Blah002|msgcontent2=header" |
awk 'match($0,/msgcontent1=[^|]*/){print substr($0,RSTART+12,RLENGTH-12)}'


With the FPAT variable in GNU awk:

awk -v FPAT='msgcontent1=[^|]*' '{sub(/.*=/,"",$1);print $1}' Input_file
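
The same negated class also fits the three-argument match() from the question. The original (.*) is greedy: awk takes the leftmost-longest match and has no lazy quantifier, so .* runs to the last pipe. Replacing (.*)[|] with ([^|]*) keeps the capture from crossing the first pipe. A minimal sketch, assuming gawk is installed; the function name extract_msgcontent1 is hypothetical and used only for illustration:

```shell
# Hypothetical helper: reads lines on stdin and prints the value that
# follows "msgcontent1=" up to (not including) the next "|".
# Requires gawk: the third (array) argument to match() is a GNU extension.
extract_msgcontent1() {
    gawk 'match($0, /msgcontent1=([^|]*)/, a) { print a[1] }'
}
```

Piping the sample string from the question into extract_msgcontent1 should print only HeaderUUIiewConsenFlagPSMessage. Unlike the original pattern, this one does not require a trailing pipe, so it should also work when the field is the last one on the line.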

RavinderSingh13 answered Jan 09 '23 05:01


This is your input:

s='blahblah|msgcontent1=HeaderUUIiewConsenFlagPSMessage|msgtype2=Blah002|msgcontent2=header'

You may use GNU awk like this to extract the value after msgcontent1=:

awk -F= -v RS='|' '$1 == "msgcontent1" {print $2}' <<< "$s"

HeaderUUIiewConsenFlagPSMessage

Or using this sed:

sed -E 's/^(.*\|)?msgcontent1=([^|]+).*/\2/' <<< "$s"

HeaderUUIiewConsenFlagPSMessage

Or using this GNU grep:

grep -oP '(^|\|)msgcontent1=\K[^|]+' <<< "$s"

HeaderUUIiewConsenFlagPSMessage
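
The RS-based awk command above compares the whole key, so a longer key that merely ends in msgcontent1 does not match by accident. A portable variant of that idea, as a sketch only: split each line on | and compare the text before the first = with the wanted key. The function name get_field and its key argument are hypothetical, introduced just for this example:

```shell
# Hypothetical helper: get_field KEY
# Reads |-delimited key=value lines on stdin and prints the value of the
# first field whose key is exactly KEY. Uses only POSIX awk features.
get_field() {
    awk -v key="$1" -F'|' '{
        for (i = 1; i <= NF; i++) {
            eq = index($i, "=")            # position of the first "=", 0 if none
            if (eq && substr($i, 1, eq - 1) == key) {
                print substr($i, eq + 1)   # everything after the first "="
                exit
            }
        }
    }'
}
```

With the sample string in s, printf '%s\n' "$s" | get_field msgcontent1 should print HeaderUUIiewConsenFlagPSMessage, and get_field msgcontent2 should print header. Splitting at the first = means a value that itself contains = is kept whole.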

anubhava answered Jan 09 '23 05:01