Print RegEx matches using SED in bash

Tags: regex, linux, bash, sed

I have an XML file that consists of a single line.

I am trying to extract the value of the "finalNumber" attribute from the file over PuTTY, rather than having to download a copy and search it in Notepad++.

I built a regular expression, tested it in an online tool, and tried using it in a sed command to duplicate grep functionality. The command runs but doesn't return anything.

RegEx:

(?<=finalNumber=")(.*?)(?=")

sed Command (returns nothing, expected 28, see file extract):

sed -n '/(?<=finalNumber=")(.*?)(?=")/p' file.xml

File Extract:

...argo:finalizedDate="2012-02-09T00:00:00.000Z" argo:finalNumber="28" argo:revenueMonth=""...

I feel like I am close (I could be wrong). Am I on the right lines, or is there a better way to achieve the output?

asked Jan 23 '13 12:01

Matthew Warman

2 Answers

Nothing wrong with good old grep here.

grep -E -o 'finalNumber="[0-9]+"' file.xml | grep -E -o '[0-9]+'

Use -E for extended regular expressions, and -o to print only the matching part.
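To try this out without the real file, here is a small sketch of the same pipeline. The sample file and its `<unit .../>` element are made up for illustration; only the three attributes come from the extract in the question. The last part is an optional variant that assumes GNU grep built with PCRE support (`-P`), which lets a lookbehind like the one in the question work; it is skipped if `-P` is not available.

```shell
# Build a one-line sample file (hypothetical content modelled on the extract).
sample_file=$(mktemp)
printf '%s\n' '<unit argo:finalizedDate="2012-02-09T00:00:00.000Z" argo:finalNumber="28" argo:revenueMonth=""/>' > "$sample_file"

# Stage 1 isolates finalNumber="28"; stage 2 keeps only the digits.
result=$(grep -E -o 'finalNumber="[0-9]+"' "$sample_file" | grep -E -o '[0-9]+')
echo "$result"

# Optional: a single grep with a PCRE lookbehind (GNU grep with -P only).
if printf 'x\n' | grep -q -P 'x' 2>/dev/null; then
    pcre_result=$(grep -o -P '(?<=finalNumber=")[^"]*' "$sample_file")
    echo "$pcre_result"
fi

rm -f "$sample_file"
```

The two-stage form only needs `-E` and `-o`, so it does not depend on PCRE support; `-o` itself is an extension, although GNU and BSD grep both provide it.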

answered Sep 20 '22 03:09

Perleone

Though you have already selected an answer, here is a way to do it in pure sed:

sed -n 's/^.*finalNumber="\([[:digit:]]\+\)".*$/\1/p' <test

Output:

28

This replaces the entire line with the matched number and prints it. Because p prints the whole pattern space, the substitution has to consume the entire line, which is what the leading ^.* and trailing .*$ are for.
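As for why the original attempt printed nothing: sed only understands POSIX basic regular expressions (or extended ones with `-E`), and neither has lookarounds like `(?<=...)` or lazy quantifiers like `.*?`. In addition, `/regex/p` prints the whole matching line rather than just the match. Below is a rough sketch of the substitution approach in two spellings. The sample file is made up for illustration, and `-E` is assumed to be available (it is in GNU and BSD sed).

```shell
# Build a one-line sample file (hypothetical content modelled on the extract).
sample_file=$(mktemp)
printf '%s\n' '<unit argo:finalizedDate="2012-02-09T00:00:00.000Z" argo:finalNumber="28" argo:revenueMonth=""/>' > "$sample_file"

# BRE form, as in the answer, but with the GNU-specific \+ written as the
# portable interval \{1,\}.
bre_result=$(sed -n 's/^.*finalNumber="\([[:digit:]]\{1,\}\)".*$/\1/p' "$sample_file")
echo "$bre_result"

# ERE form: with -E, ( ) and + are operators without backslashes.
ere_result=$(sed -E -n 's/^.*finalNumber="([[:digit:]]+)".*$/\1/p' "$sample_file")
echo "$ere_result"

rm -f "$sample_file"
```

Both commands replace the whole line with the captured digits, and `-n` together with the `p` flag prints a line only when the substitution actually happened.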

answered Sep 20 '22 03:09

SwiftMango
