Today I get to learn how to use xmllint properly. It does not seem to be well covered or explained. I plan to use a single language resource file to run my entire system. I have a mixture of bash scripts and php pages that must read from this language file.
Currently I am using the following format in my xml file en.xml:
<?xml version="1.0" encoding="utf-8"?>
<resources>
<item id="index.php">
<label>LABEL</label>
<value>VALUE</value>
<description>DESCRIPTION</description>
</item>
<item id="config.php">
<label>LABEL</label>
<value>VALUE</value>
<description>DESCRIPTION</description>
</item>
</resources>
Now I need to start with a bash script line that pulls values from the XML file. For example, I want to get the value of DESCRIPTION from the index.php item.
I was using
xmllint --xpath 'string(//description)' /path/en.xml
for a different layout, which worked, but now that I am changing the layout of my XML file, I am lost as to how best to target a specific <item> and then drill down to its child element in the bash script.
Can someone help with an xmllint --xpath line to get this value, please?
The xmllint command ships with libxml2 (some distributions package it separately, for example as libxml2-utils on Debian and Ubuntu). It is typically used to validate, parse, or pretty-print XML files. It also supports an "--xpath" option to evaluate XPath expressions: xmllint --xpath "XPATH_EXPRESSION" INPUT.xml.
The xmllint program parses one or more XML files, specified on the command line as XML-FILE (or the standard input if the filename provided is - ). It prints various types of output, depending upon the options selected. It is useful for detecting errors both in XML code and in the XML parser itself.
XPath stands for XML Path Language. It uses a non-XML syntax to provide a flexible way of addressing (pointing to) different parts of an XML document. It can also be used to test addressed nodes within a document to determine whether they match a pattern or not.
how best to target a specific <item> and then drill down to its child element
The correct XPath expression to do this is:
/resources/item[@id="index.php"]/description/text()
In plain English: start from the document node, go to the document element resources, then on to its child item, but only if the value of that item's id attribute is "index.php", then on to its child description, and retrieve its text content.
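Since the question is about a bash script, here is a minimal sketch of how that expression could be parameterized. The helper name get_item_field, the temporary file, and the sample text values are my own assumptions for illustration; only the element layout comes from the question. It wraps the path in string() so that xmllint prints just the text.

```shell
#!/usr/bin/env bash
# Sample data mirroring the en.xml layout from the question.
# The text values are made up so the two items can be told apart.
xml_file=$(mktemp)
cat > "$xml_file" <<'EOF'
<?xml version="1.0" encoding="utf-8"?>
<resources>
  <item id="index.php">
    <label>Home</label>
    <value>home</value>
    <description>Index page description</description>
  </item>
  <item id="config.php">
    <label>Settings</label>
    <value>settings</value>
    <description>Config page description</description>
  </item>
</resources>
EOF

# get_item_field FILE ID FIELD   (hypothetical helper, not part of xmllint)
# Prints the text of the FIELD child of the <item> whose id attribute is ID.
# Limitation: ID is pasted into the XPath, so it must not contain a double quote.
get_item_field() {
  local file=$1 id=$2 field=$3
  xmllint --xpath "string(/resources/item[@id=\"$id\"]/$field)" "$file"
}

get_item_field "$xml_file" "index.php" "description"
get_item_field "$xml_file" "config.php" "label"
```

The same function can then serve every script that reads the language file, with only the id and the child element name changing per call.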
I use xmllint to validate XML documents, but never for path expressions. In a bash shell (at least with Mac OS) there is an even simpler tool for evaluating XPath expressions, called "xpath":
$ xpath en.xml '/resources/item[@id="index.php"]/description/text()'
Then, the following result is obtained:
Found 1 nodes:
-- NODE --
DESCRIPTION
If you still prefer xmllint, use it in the following way:
$ xmllint --xpath '/resources/item[@id="index.php"]/description/text()' en.xml > result.txt
The --xpath option implies --noout, which prevents xmllint from printing the input XML file. To make the output more readable, I redirect it to a file.
$ cat result.txt
DESCRIPTION
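In a script you will probably want the value in a variable rather than in a file. The following is a rough sketch under my own assumptions (the temporary file and variable names are hypothetical). It also shows a difference that matters for missing entries: in the libxml2 versions I am aware of, a text() query that matches nothing reports "XPath set is empty" and returns a non-zero exit status, while a string() query simply yields an empty string.

```shell
#!/usr/bin/env bash
# Minimal sample file using the layout and placeholder values from the question.
xml_file=$(mktemp)
cat > "$xml_file" <<'EOF'
<?xml version="1.0" encoding="utf-8"?>
<resources>
  <item id="index.php">
    <label>LABEL</label>
    <value>VALUE</value>
    <description>DESCRIPTION</description>
  </item>
</resources>
EOF

# Capture the node text with command substitution and branch on the exit status.
if description=$(xmllint --xpath '/resources/item[@id="index.php"]/description/text()' "$xml_file" 2>/dev/null); then
  echo "found: $description"
else
  echo "no description for index.php" >&2
fi

# string() returns a string even when no item matches, so the result is empty
# and the ${var:-default} expansion supplies a fallback for display.
missing=$(xmllint --xpath 'string(/resources/item[@id="nope.php"]/description)' "$xml_file")
echo "missing: ${missing:-<none>}"
```

Which form to prefer depends on whether a missing key should be treated as an error (text() with an exit-status check) or silently fall back to a default (string()).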
My favorite is xmlstarlet because it seems to be more powerful than xmllint:
xmlstarlet sel -t -v '/resources/item[@id="index.php"]/description/text()' en.xml