 

Programmatically determining which node in an XML document caused validation against its XML Schema to fail

Tags: java, xml, xsd

My input is a well-formed XML document and a corresponding XML Schema document. What I would like to do is determine the location within the XML document that causes it to fail validation against the XML Schema document. I could not figure out how to do this using the standard validation approach in Java:

SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema schema = schemaFactory.newSchema(... /* the .xsd source */);
Validator validator = schema.newValidator();
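DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
Document document = documentBuilder.parse(... /* the .xml source */);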
try {
    validator.validate(new DOMSource(document));
    ...
} catch (SAXParseException e) {
    ...
}

I have toyed with the idea of getting at least the line and column number from the SAXParseException, but both are always -1 when validation fails.

user360603 asked Nov 15 '22 10:11



1 Answer

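A DOM does not retain information about its source. In most cases that information is irrelevant, and because a DOM is meant to be manipulated, any stored location information would soon be incorrect. That is why a SAXParseException raised while validating a DOMSource reports -1 for both line and column.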

The solution is to validate at the time you parse: call DocumentBuilderFactory.setSchema() before creating the DocumentBuilder.
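As a rough illustration of validating at parse time, here is a minimal sketch. The class name ParseTimeValidation, the validate helper, and the inline XSD and XML strings are all hypothetical and exist only for this example. It also assumes you register an ErrorHandler on the DocumentBuilder: validation errors are recoverable in SAX terms, so without a handler they may never surface as exceptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class ParseTimeValidation {

    // Hypothetical schema: <root> must contain exactly one <count> of type xs:int.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='root'><xs:complexType><xs:sequence>"
        + "<xs:element name='count' type='xs:int'/>"
        + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    // Hypothetical document: well-formed, but <count> on line 2 is not an integer.
    static final String XML = "<root>\n  <count>not-a-number</count>\n</root>";

    /** Parses xml with schema validation enabled; returns "line:column: message" per problem. */
    static List<String> validate(String xsd, String xml) throws Exception {
        SchemaFactory schemaFactory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = schemaFactory.newSchema(new StreamSource(new StringReader(xsd)));

        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // schema validation needs a namespace-aware parser
        factory.setSchema(schema);       // validate while parsing, as the answer suggests

        DocumentBuilder builder = factory.newDocumentBuilder();
        final List<String> problems = new ArrayList<>();
        builder.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) { record(e); }
            @Override public void error(SAXParseException e) { record(e); }
            @Override public void fatalError(SAXParseException e) throws SAXParseException {
                record(e);
                throw e; // the document is not well-formed; stop parsing
            }
            private void record(SAXParseException e) {
                problems.add(e.getLineNumber() + ":" + e.getColumnNumber()
                    + ": " + e.getMessage());
            }
        });

        // The parser still knows its position in the source, so each
        // SAXParseException can carry a line and column.
        builder.parse(new InputSource(new StringReader(xml)));
        return problems;
    }

    public static void main(String[] args) throws Exception {
        for (String problem : validate(XSD, XML)) {
            System.out.println(problem);
        }
    }
}
```

Collecting the problems in a list, rather than throwing on the first one, lets you report every validation failure in a single pass. Because the parser is still reading the source when each error is raised, the recorded positions should be real line and column numbers rather than -1.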

Anon answered Jan 09 '23 14:01
