
How to get more specific errors when validating XML against an XSD using javax.xml.validation

After searching for the best approach to validating my XML against an XSD, I came across the javax.xml.validation package.

I started off by using the example code from the API documentation and adding my own ErrorHandler:

// parse an XML document into a DOM tree
DocumentBuilder parser = DocumentBuilderFactory.newInstance().newDocumentBuilder();
Document document = parser.parse(new File("instance.xml"));

// create a SchemaFactory capable of understanding WXS schemas
SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);

// load a WXS schema, represented by a Schema instance
Source schemaFile = new StreamSource(new File("mySchema.xsd"));
Schema schema = factory.newSchema(schemaFile);

// create a Validator instance, which can be used to validate an instance document
Validator validator = schema.newValidator();

// Add a custom ErrorHandler
validator.setErrorHandler(new XsdValidationErrorHandler());

// validate the DOM tree
try {
    validator.validate(new DOMSource(document));
} catch (SAXException e) {
    // instance document is invalid!
}

...

private class XsdValidationErrorHandler implements ErrorHandler {
    @Override
    public void warning(SAXParseException exception) throws SAXException {
        throw new SAXException(exception.getMessage());
    }

    @Override
    public void error(SAXParseException exception) throws SAXException {
        throw new SAXException(exception.getMessage());
    }

    @Override
    public void fatalError(SAXParseException exception) throws SAXException {
        throw new SAXException(exception.getMessage());
    }
}

This works fine; however, the message passed to my XsdValidationErrorHandler doesn't give me any indication of exactly where in the document the offending XML is:

"org.xml.sax.SAXParseException: cvc-complex-type.2.4.a: Invalid content was found starting with element 'X'. One of '{Y}' is expected."

Is there a way for me to override or plug in another part of Validator, so that I can define my own error messages sent to the ErrorHandler without having to rewrite all of the code?

Should I be using a different library?

Levity asked Sep 04 '12 17:09


4 Answers

Try catching SAXParseException; it's a subclass of SAXException. If you get one, it has methods such as getLineNumber() and getColumnNumber().
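As a rough sketch of that idea (not the asker's actual code), the handler below keeps the SAXParseException it is given instead of wrapping only its message in a new SAXException, which is where the question's handler loses the position. The class name LocatingErrorHandler, the validate helper, and the message format are made up for illustration. It also assumes the document is validated from a stream rather than a DOMSource, since a DOM tree built beforehand typically carries no line information and the reported line would be -1.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical handler: records each problem together with its position.
class LocatingErrorHandler implements ErrorHandler {
    final List<String> messages = new ArrayList<>();

    private void record(String level, SAXParseException e) {
        messages.add(level + " at line " + e.getLineNumber()
                + ", column " + e.getColumnNumber() + ": " + e.getMessage());
    }

    @Override public void warning(SAXParseException e) { record("warning", e); }

    @Override public void error(SAXParseException e) { record("error", e); }

    @Override public void fatalError(SAXParseException e) throws SAXException {
        record("fatal", e);
        throw e; // rethrow the original exception so the location is not lost
    }

    // Hypothetical helper: validates XML text against XSD text and
    // returns the messages recorded by the handler.
    static List<String> validate(String xsd, String xml) {
        LocatingErrorHandler handler = new LocatingErrorHandler();
        try {
            Schema schema = SchemaFactory
                    .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            validator.setErrorHandler(handler);
            try {
                // A stream source (not a DOMSource) lets positions be tracked.
                validator.validate(new StreamSource(new StringReader(xml)));
            } catch (SAXParseException fatal) {
                // already recorded by fatalError()
            }
        } catch (SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return handler.messages;
    }
}
```

Because error() does not throw here, validation continues after the first problem and every recoverable error is collected; throw from error() instead if you want to stop at the first one.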

Tony Hopkinson answered Nov 15 '22 22:11


Validate at the time you parse. This will make the location information available, and the ErrorHandler will report it.

Simply create the Schema before you create the DocumentBuilderFactory, and apply it like this:

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
dbf.setValidating(false);
dbf.setSchema(schema);

Note: the setValidating() method tells the DBF whether or not to use DTD validation. Setting the schema tells it to use schema validation.
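Put together, the parse-time approach might look roughly like the sketch below. The class name ParseTimeValidation, the parseAndValidate and describe helpers, and the use of in-memory strings instead of files are assumptions made to keep the example self-contained; only the DocumentBuilderFactory calls come from the snippet above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical wrapper class for illustration only.
class ParseTimeValidation {

    // Formats the position carried by the exception plus its message.
    static String describe(SAXParseException e) {
        return "line " + e.getLineNumber() + ", column " + e.getColumnNumber()
                + ": " + e.getMessage();
    }

    // Parses the XML with schema validation enabled and returns every
    // warning or error reported while parsing.
    static List<String> parseAndValidate(String xsd, String xml) {
        List<String> problems = new ArrayList<>();
        try {
            // Create the Schema first, then hand it to the factory.
            Schema schema = SchemaFactory
                    .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(xsd)));

            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            dbf.setValidating(false); // no DTD validation
            dbf.setSchema(schema);    // schema validation during parsing

            DocumentBuilder builder = dbf.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) {
                    problems.add(describe(e));
                }
                @Override public void error(SAXParseException e) {
                    problems.add(describe(e));
                }
                @Override public void fatalError(SAXParseException e)
                        throws SAXException {
                    throw e; // ill-formed XML: give up on this document
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException | IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        return problems;
    }
}
```

With files, you would pass a File to builder.parse(...) and new StreamSource(new File(...)) to newSchema(...), as in the question.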

parsifal answered Nov 15 '22 23:11


You can call exception.getLineNumber() and exception.getColumnNumber() to get the coordinates of the error. This is a similar question.

dimitrisli answered Nov 15 '22 21:11


Did you look at SAXParseException? I assume you are looking for information like the line number. As @parsifal mentioned, you should validate when you parse (it is more efficient and gives better error information). You would do this by specifying the schema on the DocumentBuilderFactory.

jtahlborn answered Nov 15 '22 21:11