Bash script: Get XML elements into array using xmllint

Tags: bash, shell, xml

This is a follow-up question to this post.

I want to end up with an array containing all the <description> elements of the XML file.

array[0] = "<![CDATA[A title for .... <br />]]>"
array[1] = "<![CDATA[A title for .... <br />]]>"

...

file.xml:

<item>
    <description><![CDATA[A title for the URLs<br /><br />

    http://www.foobar.com/foo/bar
    <br />http://bar.com/foo
    <br />http://myurl.com/foo
    <br />http://desiredURL.com/files/ddd
    <br />http://asdasd.com/onefile/g.html
    <br />http://second.com/link
    <br />]]></description>
</item>
<item>
    <description> ...</description>
</item>
tzippy asked Dec 10 '13 13:12

1 Answer

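A Bash solution could be:

# Count the <description> elements so we know how many times to loop.
itemsCount=$(xmllint --xpath 'count(//item/description)' /tmp/so.xml)
declare -a description=()

# XPath positions are 1-based, Bash array indices start at 0.
# (//item/description)[$i] picks the i-th <description> in document order,
# which matches what count() counted above.
for (( i = 1; i <= itemsCount; i++ )); do
    description[i-1]="$(xmllint --xpath "(//item/description)[$i]" /tmp/so.xml)"
done

# Print one element per array entry; quoting keeps embedded whitespace intact.
printf '%s\n' "${description[@]}"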

Disclaimer

Consider that Bash may not be the right tool for this. XSLT/XPath can give you direct access to the content of the description element, as described in the previous answer. For instance:

xmllint --xpath '//item/description/text()' /tmp/so.xml

This returns the content of every <description> element.
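If you want the text itself in the array rather than the serialized elements, one option is to ask XPath for the string value of each element. The following is only a sketch under a few assumptions: `read_descriptions` and the global array `descriptions` are names made up for this example, the file is small enough that calling xmllint once per element is acceptable, and your xmllint build supports `--xpath`.

```shell
#!/usr/bin/env bash
# Sketch: collect the text of every <description> into a Bash array.
# read_descriptions and descriptions are hypothetical names, not part of xmllint.
read_descriptions() {
    local file=$1 count i
    # count() evaluates to a number, which xmllint prints as plain text.
    count=$(xmllint --xpath 'count(//item/description)' "$file") || return 1
    descriptions=()
    for (( i = 1; i <= count; i++ )); do
        # string() yields the element's text value, without the CDATA markers.
        descriptions+=("$(xmllint --xpath "string((//item/description)[$i])" "$file")")
    done
}
```

After calling `read_descriptions /tmp/so.xml`, `"${#descriptions[@]}"` holds the number of entries and `"${descriptions[0]}"` the first description. Command substitution strips trailing newlines, so each entry ends at its last non-newline character.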

Édouard Lopez answered Oct 21 '22 19:10
