I am trying to parse an XML file using XPath in Java. I need to get the values of all elements under the text element whose attribute is xml:lang="en".
Here is my XML file:
<?xml version="1.0" encoding="UTF-8" ?>
<image id="10001" file="images/2/10001.png">
<name>Lake two mountains.png</name>
<text xml:lang="en">
<description />
<comment />
<caption article="text/en/4/335157">Location map of Lake of Two Mountains. </caption>
</text>
<text xml:lang="de">
<description/>
<comment />
<caption article="text/de/5/441485">Lage des Lac des Deux Montagnes (ganz rechts liegt Montréal)</caption>
</text>
<text xml:lang="fr">
<description />
<comment />
<caption />
</text>
<comment>({{Information |Description= Location map of Lake of Two Mountains in Quebec, Canada. |Source= based on Image:Oka map with roads.png. |Date= |Author= P199 |Permission= |other_versions= }})</comment>
<license>GFDL</license>
</image>
Here is my Java code snippet:
DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = null;
Document xmlDocument = null;
try {
builder = builderFactory.newDocumentBuilder();
}
catch (ParserConfigurationException e) {
e.printStackTrace();
}
try {
xmlDocument = builder.parse(new FileInputStream(fileEntry.getAbsolutePath()));
} catch (SAXException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
XPath xPath = XPathFactory.newInstance().newXPath();
//prepare node expressions
String nameExpr = "/image/name";
String descriptionExpr = "/image/text[@lang='en']/description";
String captionExpr = "/image/text[@lang='en']/caption";
String commentExpr = "/image/text[@lang='en']/comment";
//read a string value
String name = xPath.compile(nameExpr).evaluate(xmlDocument);
String description = xPath.compile(descriptionExpr).evaluate(xmlDocument);
String caption = xPath.compile(captionExpr).evaluate(xmlDocument);
String comment = xPath.compile(commentExpr).evaluate(xmlDocument);
I tried some XPath expressions to get the element values, e.g.:
(1) "/image/text[@xml:lang='en']/description", which doesn't work.
(2) "/image/text[@lang='en']/description", which works fine.
I am curious to know what the problem is with the first XPath expression.
Thanks in advance.
For some (presumably historical) reason, DocumentBuilderFactory is not namespace-aware by default. You must call setNamespaceAware(true) on the factory before you call newDocumentBuilder(), as XPath only works properly on XML that has been parsed as namespace-aware.
I would then recommend using the lang function, which checks the xml:lang value in scope without needing a prefix in the expression, to do the actual test:
/image/text[lang('en')]/description
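Putting the two suggestions together, a minimal sketch might look like the following. The class name LangXPathSketch, the evaluate helper, and the trimmed-down inline XML are made up for illustration and are not from the question; the question's own code reads from a file instead.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

public class LangXPathSketch {

    // Trimmed-down inline copy of the XML from the question (assumption: illustration only).
    static final String XML =
        "<image id=\"10001\">"
      + "<name>Lake two mountains.png</name>"
      + "<text xml:lang=\"en\"><caption>Location map of Lake of Two Mountains. </caption></text>"
      + "<text xml:lang=\"de\"><caption>Lage des Lac des Deux Montagnes</caption></text>"
      + "</image>";

    // Hypothetical helper: parses the XML namespace-aware and returns the
    // string value of the first node matched by the XPath expression.
    static String evaluate(String xml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Must be set before newDocumentBuilder() is called.
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            XPath xPath = XPathFactory.newInstance().newXPath();
            return xPath.evaluate(expression, doc);
        } catch (ParserConfigurationException | SAXException
                 | IOException | XPathExpressionException e) {
            throw new IllegalStateException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        // lang('en') tests the xml:lang in scope on each text element.
        System.out.println(evaluate(XML, "/image/text[lang('en')]/caption"));
    }
}
```

With the sample XML above, this should print the English caption only. Note that lang('en') also matches sublanguages such as en-US, which may or may not be what you want.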