Extraction of data from a simple XML file

Tags: grep, bash, xml, sed, awk

I have an XML file with the following contents:

<?xml version="1.0" encoding="utf-8"?> <job xmlns="http://www.sample.com/">programming</job> 

I need a way to extract the text between the <job ...> and </job> tags, "programming" in this case. This should be done at the Linux command prompt, using grep/sed/awk.

Zacky112 asked Feb 08 '10 14:02



2 Answers

Do you really have to use only those tools? They're not designed for XML processing, and although it's possible to get something that works OK most of the time, it will fail on edge cases, like encoding, line breaks, etc.
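To make the line-break edge case concrete, here is a minimal sketch, not a recommended solution. The helper name `extract_job_line` and the two temporary sample files are assumptions made up for this illustration; the sed expression is one common line-oriented way to pull out the element text.

```shell
# Hypothetical helper: line-oriented extraction with sed (this is not an XML parser).
extract_job_line() {
  # Print the text between <job ...> and </job>, but only when both tags
  # appear on the same input line.
  sed -n 's:.*<job[^>]*>\(.*\)</job>.*:\1:p' "$1"
}

tmpdir=$(mktemp -d)

# Case 1: the whole element on one line, as in the question.
printf '%s\n' '<?xml version="1.0" encoding="utf-8"?> <job xmlns="http://www.sample.com/">programming</job>' > "$tmpdir/one_line.xml"

# Case 2: an equivalent document with line breaks inside the element.
printf '%s\n' '<?xml version="1.0" encoding="utf-8"?>' '<job xmlns="http://www.sample.com/">' 'programming' '</job>' > "$tmpdir/multi_line.xml"

one_line_result=$(extract_job_line "$tmpdir/one_line.xml")
multi_line_result=$(extract_job_line "$tmpdir/multi_line.xml")

echo "one line:   [$one_line_result]"     # prints: one line:   [programming]
echo "multi line: [$multi_line_result]"   # prints: multi line: []

rm -rf "$tmpdir"
```

The second file is the same data as far as an XML parser is concerned (apart from whitespace), yet the line-oriented command finds nothing in it.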

I recommend xml_grep:

xml_grep 'job' jobs.xml --text_only 

Which gives the output:

programming 

On Ubuntu/Debian, xml_grep is in the xml-twig-tools package.

amarillion answered Sep 25 '22 03:09


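# Assumes <job ...> is the first tag on its line: cut splits on the first '>' and then the first '<'.
grep '<job' file_name | cut -f2 -d">" | cut -f1 -d"<"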
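If the XML declaration sits on the same line as the <job> element, splitting on the first ">" picks up the wrong field. A possible variant that does not depend on where the element sits within the line is sketched below. The function name `extract_job_text` and the temporary sample file are assumptions for illustration, and `grep -o` is a GNU/BSD extension rather than POSIX. It still only handles an element that opens and closes on one line and contains no nested tags.

```shell
# Hypothetical helper: print the text of each single-line <job>...</job> element.
extract_job_text() {
  # grep -o prints only the matched element, wherever it sits on the line;
  # sed then deletes the opening and closing tags, leaving the text.
  grep -o '<job[^>]*>[^<]*</job>' "$1" | sed 's/<[^>]*>//g'
}

# Sample input: declaration and element on the same line.
sample_file=$(mktemp)
printf '%s\n' '<?xml version="1.0" encoding="utf-8"?> <job xmlns="http://www.sample.com/">programming</job>' > "$sample_file"

job_text=$(extract_job_text "$sample_file")
echo "$job_text"   # prints: programming

rm -f "$sample_file"
```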

Vijay answered Sep 21 '22 03:09