I have an XML file with the following contents:
<?xml version="1.0" encoding="utf-8"?>
<job xmlns="http://www.sample.com/">programming</job>
I need a way to extract what is between the <job ...> and </job> tags, "programming" in this case. This should be done at the Linux command prompt, using grep/sed/awk.
Do you really have to use only those tools? They're not designed for XML processing, and although it's possible to get something that works OK most of the time, it will fail on edge cases, like encoding, line breaks, etc.
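To make the line-break edge case concrete, here is a small sketch. The file name multiline.xml and its layout are made up for illustration: it holds the same element as in the question, but with the text on its own line. A line-oriented pattern that expects the opening tag, the text and the closing tag together no longer matches anything.

```shell
# Hypothetical variant of the question's file: same element, text on its own line.
cat > multiline.xml <<'EOF'
<?xml version="1.0" encoding="utf-8"?>
<job xmlns="http://www.sample.com/">
programming
</job>
EOF

# grep matches one line at a time, so no single line contains
# both <job ...> and </job>. -c prints the number of matching lines;
# "|| true" keeps the non-zero exit status of "no match" from aborting a script.
matches=$(grep -c '<job[^>]*>.*</job>' multiline.xml || true)
echo "$matches"   # prints 0
```

An XML-aware tool does not care how the element is laid out across lines, which is the main argument for using one.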
I recommend xml_grep:
xml_grep 'job' jobs.xml --text_only
Which gives the output:
programming
On Ubuntu/Debian, xml_grep is in the xml-twig-tools package.
With grep and cut, assuming the <job ...> opening tag is the first tag on its line:
grep '<job' file_name | cut -f2 -d">" | cut -f1 -d"<"
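A sed-only variant is sketched below for the case where the element fits on one line but may share that line with other tags, such as the XML declaration. This is a sketch, not a general XML parser: it assumes the opening tag, the text and the closing tag all sit on the same line, and the pattern would also match any other tag whose name starts with "job". The file name jobs.xml is taken from the xml_grep example.

```shell
# Recreate the sample file from the question.
cat > jobs.xml <<'EOF'
<?xml version="1.0" encoding="utf-8"?>
<job xmlns="http://www.sample.com/">programming</job>
EOF

# -n suppresses sed's default output; the trailing p prints only lines
# where the substitution succeeded.
# [^>]* skips any attributes inside the opening tag, and \(...\) captures
# the text between the tags. ":" is used as the delimiter so that the "/"
# in </job> needs no escaping.
result=$(sed -n 's:.*<job[^>]*>\(.*\)</job>.*:\1:p' jobs.xml)
echo "$result"   # prints programming
```

Because the pattern anchors on both the opening and the closing tag, it gives the same result when the declaration and the element are on a single line.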