 

How to get xml attribute value containing colon?

Tags:

java

xml

xpath

I am trying to parse an XML file using XPath in Java. I need to get the values of all child elements of the text element whose attribute is xml:lang="en".

Here is my XML file:

<?xml version="1.0" encoding="UTF-8" ?>
<image id="10001" file="images/2/10001.png">
   <name>Lake two mountains.png</name>
   <text xml:lang="en">
      <description />
      <comment />
      <caption article="text/en/4/335157">Location map of Lake of Two Mountains.  </caption>
   </text>
   <text xml:lang="de">
      <description/>
      <comment />
      <caption article="text/de/5/441485">Lage des Lac des Deux Montagnes (ganz rechts liegt Montréal)</caption>
   </text>
   <text xml:lang="fr">
      <description />
      <comment />
      <caption />
   </text>
   <comment>({{Information |Description= Location map of Lake of Two Mountains in Quebec, Canada. |Source= based on Image:Oka map with roads.png. |Date= |Author= P199 |Permission= |other_versions= }})</comment>
   <license>GFDL</license>
</image>

Here is my Java code snippet:

DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = null;
Document xmlDocument = null;
try {
    builder = builderFactory.newDocumentBuilder();
} catch (ParserConfigurationException e) {
    e.printStackTrace();
}

try {
    xmlDocument = builder.parse(new FileInputStream(fileEntry.getAbsolutePath()));
} catch (SAXException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
}

XPath xPath = XPathFactory.newInstance().newXPath();

// prepare node expressions
String nameExpr = "/image/name";
String descriptionExpr = "/image/text[@lang='en']/description";
String captionExpr = "/image/text[@lang='en']/caption";
String commentExpr = "/image/text[@lang='en']/comment";

// read a string value
String name = xPath.compile(nameExpr).evaluate(xmlDocument);
String description = xPath.compile(descriptionExpr).evaluate(xmlDocument);
String caption = xPath.compile(captionExpr).evaluate(xmlDocument);
String comment = xPath.compile(commentExpr).evaluate(xmlDocument);

I tried two XPath expressions to get the element values:

(1) /image/text[@xml:lang='en']/description which doesn't work.

(2) /image/text[@lang='en']/description which works fine.

I am curious to know what the problem is with the first XPath expression.

Thanks in advance.

amit asked Nov 02 '22 05:11



1 Answer

For some (presumably historical) reason, DocumentBuilderFactory is not namespace-aware by default. You must call setNamespaceAware(true) on the factory before you call newDocumentBuilder() as XPath only works properly on XML that has been parsed as namespace-aware.
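
To see what the flag changes, here is a small sketch (the class and method names are made up for illustration, and the XML is a cut-down stand-in for the file in the question). It parses the same string with and without namespace awareness and asks the DOM which namespace it recorded for the xml:lang attribute:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the question's code.
class NamespaceAwareDemo {
    // Cut-down stand-in for the question's XML (no whitespace between elements).
    static final String XML = "<image><text xml:lang=\"en\"/></image>";

    // Returns the namespace URI the DOM reports for the xml:lang attribute,
    // or null if the parser did not record one.
    static String langAttributeNamespace(boolean namespaceAware) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(namespaceAware); // the default is false
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            Element text = (Element) doc.getDocumentElement().getFirstChild();
            // Lookup by qualified name works in both modes.
            Attr attr = text.getAttributeNode("xml:lang");
            return attr.getNamespaceURI();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("default:         " + langAttributeNamespace(false));
        System.out.println("namespace-aware: " + langAttributeNamespace(true));
    }
}
```

Without the flag the attribute carries no namespace information at all, so an XPath engine has nothing reliable to match the xml: prefix against.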

I would then recommend using the lang function to do the actual test:

/image/text[lang('en')]/description
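
Putting the two pieces together, a minimal sketch might look like the following. The class name, the helper method, and the inline XML (a shortened copy of the file in the question) are assumptions for illustration; in the original code the document would come from the FileInputStream instead.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

// Hypothetical demo class, not part of the question's code.
class LangXPathDemo {
    // Shortened copy of the XML from the question.
    static final String SAMPLE_XML =
          "<image id=\"10001\" file=\"images/2/10001.png\">"
        + "<text xml:lang=\"en\"><caption>Location map of Lake of Two Mountains.</caption></text>"
        + "<text xml:lang=\"de\"><caption>Lage des Lac des Deux Montagnes</caption></text>"
        + "<text xml:lang=\"fr\"><caption/></text>"
        + "</image>";

    // Returns the caption text for the given language, or "" if nothing matches.
    // The language code is concatenated into the expression, which is fine for a
    // demo with fixed inputs but should not be done with untrusted input.
    static String captionFor(String lang) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // must be set before newDocumentBuilder()
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(SAMPLE_XML.getBytes(StandardCharsets.UTF_8)));

            XPath xPath = XPathFactory.newInstance().newXPath();
            // lang() is a core XPath 1.0 function that tests the xml:lang in scope.
            String expr = "/image/text[lang('" + lang + "')]/caption";
            return xPath.evaluate(expr, doc);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(captionFor("en"));
        System.out.println(captionFor("de"));
    }
}
```

A side benefit of lang() is that it also matches sublanguages, so lang('en') would select an element marked xml:lang="en-US" as well.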
Ian Roberts answered Nov 15 '22 05:11
