I'm trying to get the value of the value attribute in this XML line from the terminal, so I'm using sed.
abcs='<param name="abc" value="bob3" no_but_why="4"/>'
echo $abcs | sed -e 's/.*value="\(.*\)" .*/\1/'
echo $abcs | sed -e 's/.*value="\(.*\)".*/\1/'
The output is:
bob3
bob3" no_but_why="4
Why does the second command, the one without the space, print more than just what I wanted? Why would \1 be affected by that?
As you can see, the difference is that in the second regex the greedy pattern .* comes right after the " with no space in between.
It behaves differently because there is another double-quoted value after no_but_why= as well. In the second regex the greedy .* inside the capture group therefore runs on to the last " before /> and \1 ends up holding everything from bob3 through the 4.
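In your first regex, "\(.*\)" followed by a space matches only "bob3" because the closing quote must now be followed by a space, and the only quote after value=" that is followed by a space is the one right after bob3. That requirement stops .* from matching up to the last double quote in the input.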
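The first command is only correct by coincidence, which a small sketch can show. It assumes a hypothetical variant of the input with one more attribute (extra="5" is made up for illustration), so that a later quote is also followed by a space:

```shell
# Hypothetical variant of the input: the extra="5" attribute is invented for this example.
abcs2='<param name="abc" value="bob3" no_but_why="4" extra="5"/>'

# Same pattern as the first command. The greedy \(.*\) backtracks only as far as
# the LAST quote that is followed by a space, which is now the quote after 4.
greedy_out=$(printf '%s\n' "$abcs2" | sed -e 's/.*value="\(.*\)" .*/\1/')
printf '%s\n' "$greedy_out"
```

With this input the space no longer protects the match, and the command prints bob3" no_but_why="4 just like the second regex did.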
To avoid this, use a negated character class instead of greedy matching.
Consider these sed commands:
sed -e 's/.*value="\([^"]*\)" .*/\1/' <<< "$abcs"
bob3
sed -e 's/.*value="\([^"]*\)".*/\1/' <<< "$abcs"
bob3
Now both commands produce the same output, bob3, because the negated character class [^"]* matches only up to the next ", not up to the very last " in the input as .* does.
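If you need this for more than one attribute, the same negated character class can be wrapped in a small function. This is only a sketch: get_attr is a hypothetical helper name, and it assumes values are double-quoted, contain no embedded double quotes, and that the attribute name has no regex metacharacters in it.

```shell
# get_attr is a hypothetical helper, not part of sed or any standard tool.
# Usage: get_attr ATTRIBUTE_NAME LINE
get_attr() {
    # -n together with the p flag prints only when the substitution succeeds,
    # so a missing attribute gives empty output instead of the unchanged line.
    # [[:space:]] before the name keeps "value" from matching inside a longer name.
    printf '%s\n' "$2" | sed -n 's/.*[[:space:]]'"$1"'="\([^"]*\)".*/\1/p'
}

abcs='<param name="abc" value="bob3" no_but_why="4"/>'
get_attr value "$abcs"        # prints bob3
get_attr no_but_why "$abcs"   # prints 4
```

Because [^"]* can never cross a double quote, the result does not depend on what follows the closing quote.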