After searching for the best approach to validating my XML against an XSD, I came across javax.xml.validation.
I started off by using the example code from the API and adding my own ErrorHandler:
    // parse an XML document into a DOM tree
    DocumentBuilder parser = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    Document document = parser.parse(new File("instance.xml"));

    // create a SchemaFactory capable of understanding WXS schemas
    SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);

    // load a WXS schema, represented by a Schema instance
    Source schemaFile = new StreamSource(new File("mySchema.xsd"));
    Schema schema = factory.newSchema(schemaFile);

    // create a Validator instance, which can be used to validate an instance document
    Validator validator = schema.newValidator();

    // add a custom ErrorHandler
    validator.setErrorHandler(new XsdValidationErrorHandler());

    // validate the DOM tree
    try {
        validator.validate(new DOMSource(document));
    } catch (SAXException e) {
        // instance document is invalid!
    }
...
    private class XsdValidationErrorHandler implements ErrorHandler {

        @Override
        public void warning(SAXParseException exception) throws SAXException {
            throw new SAXException(exception.getMessage());
        }

        @Override
        public void error(SAXParseException exception) throws SAXException {
            throw new SAXException(exception.getMessage());
        }

        @Override
        public void fatalError(SAXParseException exception) throws SAXException {
            throw new SAXException(exception.getMessage());
        }
    }
This works fine; however, the message passed to my XsdValidationErrorHandler doesn't give me any indication of exactly where in the document the offending XML is:
"org.xml.sax.SAXParseException: cvc-complex-type.2.4.a: Invalid content was found starting with element 'X'. One of '{Y}' is expected."
Is there a way for me to override or plug in another part of Validator, so that I can define my own error messages sent to the ErrorHandler without having to rewrite all of the code?
Should I be using a different library?
Try catching SAXParseException; it is a subclass of SAXException. If you get one, it has methods such as getLineNumber() and getColumnNumber().
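As a rough illustration of that suggestion, here is a minimal sketch of an ErrorHandler that keeps the SAXParseException's location instead of reducing it to getMessage(). The class name LocationAwareValidation, the helper describe() and the inline schema are hypothetical and exist only for this example. It validates a StreamSource, because location data is available when the validator reads the text itself; with a DOMSource the line and column may come back as -1, since a DOM tree does not retain source positions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

public class LocationAwareValidation {

    // Hypothetical schema for illustration: <root> must contain exactly one <Y>.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "  <xs:element name='root'>"
        + "    <xs:complexType>"
        + "      <xs:sequence>"
        + "        <xs:element name='Y' type='xs:string'/>"
        + "      </xs:sequence>"
        + "    </xs:complexType>"
        + "  </xs:element>"
        + "</xs:schema>";

    /** Validates the given XML text and returns one description per reported problem. */
    static List<String> validate(String xml) throws Exception {
        final List<String> problems = new ArrayList<>();

        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
        Validator validator = schema.newValidator();

        validator.setErrorHandler(new ErrorHandler() {
            @Override
            public void warning(SAXParseException e) {
                problems.add(describe(e));
            }

            @Override
            public void error(SAXParseException e) {
                // Record the problem and let validation continue.
                problems.add(describe(e));
            }

            @Override
            public void fatalError(SAXParseException e) throws SAXParseException {
                // Rethrow the original exception so its location is not lost.
                throw e;
            }
        });

        // Validate the raw text so the validator can track line and column numbers.
        validator.validate(new StreamSource(new StringReader(xml)));
        return problems;
    }

    /** Formats the location carried by the exception together with its message. */
    static String describe(SAXParseException e) {
        return "line " + e.getLineNumber()
                + ", column " + e.getColumnNumber()
                + ": " + e.getMessage();
    }

    public static void main(String[] args) throws Exception {
        for (String problem : validate("<root>\n  <X/>\n</root>")) {
            System.out.println(problem);
        }
    }
}
```

The key difference from the handler in the question is that the SAXParseException is used directly (or rethrown as-is) rather than wrapped in a new SAXException built from its message, which discards the location.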
Validate at the time you parse. This will make the location information available, and the ErrorHandler will report it.
Simply create the Schema before you create the DocumentBuilderFactory, and apply it like this:
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
dbf.setValidating(false);
dbf.setSchema(schema);
Note: the setValidating() method tells the DBF whether or not to use DTD validation. Setting the schema tells it to use schema validation.
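Putting those steps together, a self-contained sketch of parse-time validation might look like the following. The class name ParseTimeValidation, the method firstProblem() and the inline schema are assumptions made for this example and are not part of the answer. The error handler is set on the DocumentBuilder, and it rethrows the SAXParseException unchanged so the caller can read its line and column.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

public class ParseTimeValidation {

    // Hypothetical schema for illustration: <root> must contain exactly one <Y>.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "  <xs:element name='root'>"
        + "    <xs:complexType>"
        + "      <xs:sequence>"
        + "        <xs:element name='Y' type='xs:string'/>"
        + "      </xs:sequence>"
        + "    </xs:complexType>"
        + "  </xs:element>"
        + "</xs:schema>";

    /** Returns null if the document is valid, otherwise a description of the first problem. */
    static String firstProblem(String xml) throws Exception {
        // Create the Schema first, then hand it to the DocumentBuilderFactory.
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));

        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        dbf.setValidating(false); // no DTD validation
        dbf.setSchema(schema);    // schema validation while parsing

        DocumentBuilder builder = dbf.newDocumentBuilder();
        builder.setErrorHandler(new ErrorHandler() {
            @Override
            public void warning(SAXParseException e) {
                // Warnings are ignored in this sketch.
            }

            @Override
            public void error(SAXParseException e) throws SAXParseException {
                throw e; // rethrow unchanged so the location survives
            }

            @Override
            public void fatalError(SAXParseException e) throws SAXParseException {
                throw e;
            }
        });

        try {
            builder.parse(new InputSource(new StringReader(xml)));
            return null;
        } catch (SAXParseException e) {
            return "line " + e.getLineNumber()
                    + ", column " + e.getColumnNumber()
                    + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstProblem("<root>\n  <X/>\n</root>"));
    }
}
```

Stopping at the first error keeps the sketch short; a handler that collects every reported problem instead of throwing would work just as well if a full report is needed.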
You can call exception.getLineNumber() and exception.getColumnNumber() to get the coordinates of the error. This is a similar question.
Did you look at SAXParseException? I assume you are looking for information like the line number? As @parcifel mentioned, you should validate when you parse (it is more efficient and gives better error information). You would do this by specifying the schema on the DocumentBuilderFactory.