An exception is thrown when an XML tag contains a colon.
Exception:
org.jsoup.select.Selector$SelectorParseException: Could not parse query 'w:r': unexpected token at ':r'
XML:
<w:r>
<w:rPr>
<w:rStyle w:val="jid"/>
</w:rPr>
<w:t>AN</w:t>
</w:r>
Java code:
org.jsoup.nodes.Document doc = Jsoup.parse(documentXmlString);
Here, documentXmlString contains the XML shown above.
Just replace ":" with "|" in the selector query:
doc.select("w|r");
I'm using Jsoup 1.5.2.
Your workaround has worked for you, but I would like to add some background on namespaces.
The w: in your XML is called a namespace prefix. To use a namespace prefix, it has to be declared on the root node (see note 1+ below).
Because that declaration was missing from your source XML, the parser was throwing an error.
Below is how to declare a namespace in XML. I have corrected your own XML, and it should no longer error out:
<w:r xmlns:w="http://www.w3.org/SomeNamespace">
<w:rPr>
<w:rStyle w:val="jid"/>
</w:rPr>
<w:t>AN</w:t>
</w:r>
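To check the effect of the declaration yourself, here is a minimal sketch. It assumes the JDK's built-in namespace-aware DOM parser rather than jsoup, so it shows general XML namespace behaviour and not necessarily what jsoup does with the same input. The class and method names are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper: reports whether a namespace-aware parser accepts the XML.
class NamespaceDeclarationDemo {

    static boolean parses(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // prefixes must now be bound to a URI
            factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Thrown for malformed XML, including an unbound namespace prefix.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String undeclared = "<w:r><w:t>AN</w:t></w:r>";
        String declared =
            "<w:r xmlns:w=\"http://www.w3.org/SomeNamespace\"><w:t>AN</w:t></w:r>";
        System.out.println("undeclared prefix parses: " + parses(undeclared));
        System.out.println("declared prefix parses:   " + parses(declared));
    }
}
```

The JDK parser may also print its own "[Fatal Error]" line to stderr for the undeclared case; that is the default error handler and can be ignored here.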
Additional information:
A namespace declaration has its own scope. Consider the example below:
<root>
<w:r xmlns:w="http://www.w3.org/SomeNamespace">
<w:rPr>
<w:rStyle w:val="jid"/>
</w:rPr>
<w:t>AN</w:t>
</w:r>
<someotherElement>
<dummychild/>
</someotherElement>
</root>
In the above example, you cannot use the namespace prefix w on <someotherElement> or <dummychild/>, because the scope of the prefix w is limited to the element <w:r> and its descendants (children and grandchildren).
1+: A namespace is valid for the element on which it is declared and for that element's child nodes. Declaring the namespace on the root element makes it available to all elements in the XML document.
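The scoping rule can be checked with a short sketch. It again assumes the JDK's namespace-aware DOM parser (not jsoup) and uses a trimmed version of the example document. The class and method names are hypothetical. The standard DOM method lookupNamespaceURI returns the URI bound to a prefix at a given node, or null when the prefix is not in scope there.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper: asks which URI a prefix is bound to at a given element.
class NamespaceScopeDemo {

    static final String XML =
        "<root>"
        + "<w:r xmlns:w=\"http://www.w3.org/SomeNamespace\"><w:t>AN</w:t></w:r>"
        + "<someotherElement><dummychild/></someotherElement>"
        + "</root>";

    // Returns the URI bound to 'prefix' at the first element named 'tagName',
    // or null if the prefix is not in scope at that element.
    static String prefixUriAt(String xml, String tagName, String prefix) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            Node node = doc.getElementsByTagName(tagName).item(0);
            if (node == null) {
                throw new IllegalArgumentException("no element named " + tagName);
            }
            return node.lookupNamespaceURI(prefix);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (String tag : new String[] {"w:r", "w:t", "someotherElement", "dummychild"}) {
            System.out.println(tag + " sees w as: " + prefixUriAt(XML, tag, "w"));
        }
    }
}
```

Moving the xmlns:w declaration up to <root> would put every element in the document inside its scope.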
I used:
documentXmlString = documentXmlString.replaceAll("w:","w");
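One thing to watch with this approach: replaceAll rewrites every "w:" in the string, including attribute names and ordinary text. Below is a small sketch using only java.lang.String. The class name, both method names and the narrower tag-only pattern are my own illustration, not part of jsoup.

```java
// Hypothetical helper comparing the blanket replacement with a tag-only one.
class PrefixStripDemo {

    // The replacement from above: every "w:" becomes "w", wherever it occurs.
    static String stripAll(String xml) {
        return xml.replaceAll("w:", "w");
    }

    // Narrower variant (an assumption, not from the original answer): only
    // rewrite "w:" directly after "<" or "</", i.e. in element names.
    static String stripTagsOnly(String xml) {
        return xml.replaceAll("(</?)w:", "$1w");
    }

    public static void main(String[] args) {
        String xml = "<w:r><w:rStyle w:val=\"jid\"/><w:t>now: later</w:t></w:r>";
        System.out.println(stripAll(xml));      // also changes w:val and the text
        System.out.println(stripTagsOnly(xml)); // leaves w:val and the text alone
    }
}
```

After either replacement the elements are named wr, wt and so on, so the selector query has to use those names instead of w:r.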