
Validate an XML against an XSD in Java / Getting a hold of the schemaLocation

How can one validate an XML file using an XSD in Java? We don't know the schema in advance. I would like to be able to get the schemaLocation, download the XSD, cache it and then perform the actual validation.

The problem is that with the javax.xml.parsers.DocumentBuilder/DocumentBuilderFactory classes, I can't seem to get hold of the schemaLocation in advance. What's the trick for this? Which classes should I look into?

Perhaps there's a more suitable API I can use? The whole problem is that we need to validate dynamically, without (necessarily) having the XSDs locally.

How could one get hold of the schemaLocation URL declared in the XML file?

I know you can set features/attributes, but that's a different thing. I need to get the schemaLocation from the XML first, so that the XSD can be fetched.

Please advise!

asked Feb 01 '12 10:02 by carlspring



1 Answer

Assuming you are using Xerces (or the JDK default parser), have you tried setting the feature http://apache.org/xml/features/validation/schema to true on the factory? There are other schema-related features you can experiment with: http://xerces.apache.org/xerces2-j/features.html
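As a rough illustration of the feature mentioned above, here is a minimal sketch, assuming the Xerces-based parser that ships with the JDK. The class name HintBasedValidation and its validate method are made up for this example. The factory is also made namespace-aware and validating, since the schema validator is only active when general validation is on. The schema is located through the xsi:schemaLocation or xsi:noNamespaceSchemaLocation hint in the document itself.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper class; not part of the original answer.
public class HintBasedValidation {

    static final String SCHEMA_FEATURE =
            "http://apache.org/xml/features/validation/schema";

    /** Parses the document and returns the validation problems that were reported. */
    public static List<String> validate(InputSource in)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);          // XML Schema needs namespaces
        factory.setValidating(true);              // turn validation on in general
        factory.setFeature(SCHEMA_FEATURE, true); // validate against XSD hints

        DocumentBuilder builder = factory.newDocumentBuilder();
        final List<String> problems = new ArrayList<>();
        builder.setErrorHandler(new ErrorHandler() {
            @Override
            public void warning(SAXParseException e) {
                problems.add("warning: " + e.getMessage());
            }

            @Override
            public void error(SAXParseException e) {
                // Validity errors are recoverable, so collect them and go on.
                problems.add("error: " + e.getMessage());
            }

            @Override
            public void fatalError(SAXParseException e) throws SAXException {
                throw e; // not well-formed: give up
            }
        });
        builder.parse(in);
        return problems;
    }
}
```

An empty list means the document was valid against whatever schema its location hint pointed to. Note that this approach does not hand you the location itself; the parser resolves it internally.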

UPDATE 2 (for caching):

Implement an org.w3c.dom.ls.LSResourceResolver and set it on the SchemaFactory using the setResourceResolver method. The resolver would then either return the schema from a cache or fetch it from wherever the location refers to.
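The caching idea could look roughly like the sketch below. It is only an illustration: CachingResourceResolver, the fetcher function and the download helper are names invented here, and the cache is a plain in-memory map with no expiry. The LSInput is obtained from the standard DOM Load/Save implementation rather than a hand-written class.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSResourceResolver;

// Hypothetical resolver that keeps fetched schemas in memory.
public class CachingResourceResolver implements LSResourceResolver {

    private final Map<String, byte[]> cache = new ConcurrentHashMap<>();
    private final Function<String, byte[]> fetcher;
    private final DOMImplementationLS ls;

    /** @param fetcher loads the bytes for an absolute location, e.g. CachingResourceResolver::download */
    public CachingResourceResolver(Function<String, byte[]> fetcher) {
        this.fetcher = fetcher;
        try {
            this.ls = (DOMImplementationLS) DOMImplementationRegistry
                    .newInstance().getDOMImplementation("LS");
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("No DOM Load/Save implementation", e);
        }
    }

    @Override
    public LSInput resolveResource(String type, String namespaceURI,
            String publicId, String systemId, String baseURI) {
        if (systemId == null) {
            return null; // no location given: let the parser use its default
        }
        // Turn a relative location into an absolute one so it can be a cache key.
        String location = baseURI == null
                ? systemId
                : URI.create(baseURI).resolve(systemId).toString();
        byte[] bytes = cache.computeIfAbsent(location, fetcher);
        if (bytes == null) {
            return null; // fetcher found nothing: fall back to the default
        }
        LSInput input = ls.createLSInput();
        input.setSystemId(location);
        input.setPublicId(publicId);
        input.setByteStream(new ByteArrayInputStream(bytes));
        return input;
    }

    /** A possible fetcher: reads the whole resource behind the URL (Java 9+). */
    public static byte[] download(String location) {
        try (InputStream in = URI.create(location).toURL().openStream()) {
            return in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Passing the fetcher in as a function keeps the cache independent of how schemas are actually retrieved, so a disk-backed or HTTP-client-based loader can be swapped in later.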

UPDATE 3:

LSResourceResolver example (which I think will be a good starting point for you):

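import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.util.Map;

import javax.xml.XMLConstants;

// The logging calls below match the SLF4J API, which is assumed here.
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSResourceResolver;

/**
 * Resolves resources from a base URL
 */
public class URLBasedResourceResolver implements LSResourceResolver {

    private static final Logger log = LoggerFactory
            .getLogger(URLBasedResourceResolver.class);

    /** Base against which the schema paths in nsmap are resolved. */
    private final URI base;

    /** Maps a namespace URI to the (relative) location of its schema. */
    private final Map<URI, String> nsmap;

    public URLBasedResourceResolver(URL base, Map<URI, String> nsmap)
            throws URISyntaxException {
        this.base = base.toURI();
        this.nsmap = nsmap;
    }

    @Override
    public LSInput resolveResource(String type, String namespaceURI,
            String publicId, String systemId, String baseURI) {
        if (log.isDebugEnabled()) {
            String msg = String
                    .format("Resolve: type=%s, ns=%s, publicId=%s, systemId=%s, baseUri=%s.",
                            type, namespaceURI, publicId, systemId, baseURI);
            log.debug(msg);
        }
        // Only XML Schema resources are handled; constant-first equals() is null-safe.
        if (XMLConstants.W3C_XML_SCHEMA_NS_URI.equals(type)) {
            if (namespaceURI != null) {
                try {
                    URI ns = new URI(namespaceURI);
                    if (nsmap.containsKey(ns)) {
                        return new MyLSInput(base.resolve(nsmap.get(ns)));
                    }
                } catch (URISyntaxException e) {
                    // The namespace is not a valid URI, so it cannot be in
                    // the map. Fall through to the default resolution.
                }
            }
        }
        // Returning null tells the parser to resolve the resource itself.
        return null;
    }

}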

The implementation of MyLSInput is really boring:

import java.io.InputStream;
import java.io.Reader;
import java.net.URI;

import org.w3c.dom.ls.LSInput;

/**
 * Minimal LSInput that carries only a system ID. The stream and string
 * getters return null, so the parser opens the URL itself. All setters
 * are no-ops.
 */
class MyLSInput implements LSInput {

    private final URI url;

    public MyLSInput(URI url) {
        this.url = url;
    }

    @Override
    public Reader getCharacterStream() {
        return null;
    }

    @Override
    public void setCharacterStream(Reader characterStream) {
    }

    @Override
    public InputStream getByteStream() {
        return null;
    }

    @Override
    public void setByteStream(InputStream byteStream) {
    }

    @Override
    public String getStringData() {
        return null;
    }

    @Override
    public void setStringData(String stringData) {
    }

    @Override
    public String getSystemId() {
        return url.toASCIIString();
    }

    @Override
    public void setSystemId(String systemId) {
    }

    @Override
    public String getPublicId() {
        return null;
    }

    @Override
    public void setPublicId(String publicId) {
    }

    @Override
    public String getBaseURI() {
        return null;
    }

    @Override
    public void setBaseURI(String baseURI) {
    }

    @Override
    public String getEncoding() {
        return null;
    }

    @Override
    public void setEncoding(String encoding) {
    }

    @Override
    public boolean getCertifiedText() {
        return false;
    }

    @Override
    public void setCertifiedText(boolean certifiedText) {
    }

}
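
If the location itself is needed before any validation starts (for example to download the XSD up front), one option that the code above does not cover is to read the hint attributes directly from the root element. The following is a sketch using StAX; SchemaLocationReader is a made-up name, it only inspects the root element, and it stores a noNamespaceSchemaLocation under the empty string as a convention chosen for this example.

```java
import java.io.Reader;
import java.util.LinkedHashMap;
import java.util.Map;

import javax.xml.XMLConstants;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: pulls the schema location hints off the root element.
public class SchemaLocationReader {

    private static final String XSI = XMLConstants.W3C_XML_SCHEMA_INSTANCE_NS_URI;

    /** Returns namespace URI -> schema location, in document order. */
    public static Map<String, String> read(Reader xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false); // no DTD processing
        XMLStreamReader reader = factory.createXMLStreamReader(xml);
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    // First start tag is the root element; stop reading here.
                    return parse(
                            reader.getAttributeValue(XSI, "schemaLocation"),
                            reader.getAttributeValue(XSI, "noNamespaceSchemaLocation"));
                }
            }
            return new LinkedHashMap<>();
        } finally {
            reader.close();
        }
    }

    private static Map<String, String> parse(String pairs, String noNamespace) {
        Map<String, String> result = new LinkedHashMap<>();
        if (pairs != null && !pairs.trim().isEmpty()) {
            // The value is a whitespace-separated list of "namespace location" pairs.
            String[] tokens = pairs.trim().split("\\s+");
            for (int i = 0; i + 1 < tokens.length; i += 2) {
                result.put(tokens[i], tokens[i + 1]);
            }
        }
        if (noNamespace != null && !noNamespace.trim().isEmpty()) {
            result.put("", noNamespace.trim());
        }
        return result;
    }
}
```

Relative locations in the result still have to be resolved against the URL of the XML document before they can be downloaded.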
answered Sep 23 '22 00:09 by forty-two