Disable XML validation based on external DTD/XSD

Is there a way to disable XML validation based on external DTD/XSD without modifications to the source code (of the libraries that construct DocumentBuilder)? Something like setting JVM-wide defaults for DocumentBuilderFactory features, and the same for SAX?

Validation is great when editing files in an IDE, but I don't want my webapp failing to start just because somelib.net went down.

I know I can specify local DTD/XSD locations, but that's an inconvenient workaround.

What are the options? I can think of two:

  • Implement my own DocumentBuilderFactory.
  • Intercept construction of Xerces's DocumentBuilderImpl and modify the features Hashtable (add http://apache.org/xml/features/nonvalidating/load-external-dtd).
Yuri Geinish asked May 04 '11 12:05


1 Answer

Disabling validation may not prevent a processor from fetching a DTD, as it still may do so in order to use attribute defaults etc. present in the DTD (which it will place in the tree), even if it does no actual validation against the DTD's grammar.
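To see this for yourself, here is a minimal sketch that parses with validation switched off and records every system ID the parser asks the resolver for. The class name DtdFetchDemo and the somelib.example URL are made up for illustration, and the behaviour shown is what I would expect from the parser bundled with the JDK; other parsers may differ.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.xml.sax.InputSource;

// Hypothetical demo class, not part of any library.
class DtdFetchDemo {

    // Parses the XML with validation off and returns every system ID
    // the parser tried to resolve along the way.
    static List<String> requestedSystemIds(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(false); // this is already the default

        List<String> requested = new ArrayList<>();
        DocumentBuilder builder = factory.newDocumentBuilder();
        // Record the request, then answer with empty content so that
        // nothing is actually fetched over the network.
        builder.setEntityResolver((publicId, systemId) -> {
            requested.add(systemId);
            return new InputSource(new ByteArrayInputStream(new byte[0]));
        });
        builder.parse(new InputSource(new StringReader(xml)));
        return requested;
    }

    public static void main(String[] args) throws Exception {
        // The URL below is an assumption used only for this example.
        String xml = "<!DOCTYPE root SYSTEM \"http://somelib.example/root.dtd\"><root/>";
        System.out.println(requestedSystemIds(xml));
    }
}
```

If the returned list contains the DTD's URL, the parser went looking for the external DTD even though validation was off.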

One technique to prevent network activity when processing an XML document is to use a "blanking resolver" like this:

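import java.io.ByteArrayInputStream;
import java.io.IOException;

import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Resolves every external entity (including an external DTD) to empty content,
// so the parser never opens a network connection to fetch it.
public class BlankingResolver implements EntityResolver
{

    @Override
    public InputSource resolveEntity( String publicId, String systemId ) throws SAXException,
            IOException
    {
        // Both identifiers are ignored: whatever was requested, hand back zero bytes.
        return new InputSource( new ByteArrayInputStream( new byte[0] ) );
    }

}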

and then set it on the DocumentBuilder before parsing, like this:

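import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware( true );
DocumentBuilder builder = factory.newDocumentBuilder();
// Install the resolver before parsing so that no external entity is fetched.
builder.setEntityResolver( new BlankingResolver() );
Document myDoc = builder.parse( myDocUri ); // myDocUri: URI string of the document to parse
// etc.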

You can then also be sure that the document being processed has not been altered by any information from the DTD.
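As a rough illustration of that last point, the sketch below parses the same document twice: once with a resolver that supplies a small DTD declaring an attribute default, and once with a blank one. The class name AttributeDefaultDemo, the somelib.example URL and the lang attribute are all invented for this example, and I am assuming the parser bundled with the JDK.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of any library.
class AttributeDefaultDemo {

    // The document only references the DTD; the URL is an assumption for this example.
    static final String XML =
            "<!DOCTYPE root SYSTEM \"http://somelib.example/root.dtd\"><root/>";

    // A DTD that declares a default value for the (made-up) lang attribute.
    static final String DTD =
            "<!ELEMENT root EMPTY><!ATTLIST root lang CDATA \"en\">";

    // Parses XML, answering every entity request with dtdText, and returns
    // the value of lang on the root element ("" if the attribute is absent).
    static String langAttribute(String dtdText) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        builder.setEntityResolver((publicId, systemId) ->
                new InputSource(new StringReader(dtdText)));
        Document doc = builder.parse(new InputSource(new StringReader(XML)));
        return doc.getDocumentElement().getAttribute("lang");
    }

    public static void main(String[] args) throws Exception {
        System.out.println("with DTD:  '" + langAttribute(DTD) + "'");
        System.out.println("blanked:   '" + langAttribute("") + "'");
    }
}
```

With the DTD supplied, the default should show up in the tree even though no validation takes place; with blank content the attribute should simply be absent.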

alexbrn answered Sep 22 '22 03:09