
Prevent DTD download when using XSLT, i.e. the XML Transformer

Tags: java, xml, xslt

In Java, I have to process XML files that reference a DTD, using an XSLT stylesheet. The DTD is really needed because it contains the definitions of entities I use. (Aside: yes, using entities for characters that could be written directly in Unicode is a bad idea ;-)

When I run the transformation, it downloads the DTD from the external source every time. I want it to use an XML catalog to cache the DTDs, so I gave the TransformerFactory a CatalogResolver as its URIResolver:

URIResolver cr = new CatalogResolver();
TransformerFactory tf = TransformerFactory.newInstance();
tf.setURIResolver(cr);
Transformer t = tf.newTransformer(xsltSrc);
t.setURIResolver(cr);
Result res = new SAXResult(myDefaultHandler());
t.transform(xmlSrc, res);

But when I run the transformation, it still downloads the DTDs over the network. (This happens with Xalan and Xerces, either as bundled with Java 5 or standalone, and also with Saxon and Xerces.)

What does it take to force the transformation to use only the local copy of the DTDs?

robcast asked Jul 07 '09 09:07


2 Answers

(I'm answering my own question here to save myself next time, and anyone else, the days of tinkering I needed to find the answer.)

Changing the way DTDs are resolved requires an EntityResolver, not a URIResolver: the DTD is loaded by the XML parser, whereas the Transformer's URIResolver is only consulted for xsl:import, xsl:include and document(). Unfortunately, it is not possible to set the EntityResolver used by the Transformer directly. So you have to create an XMLReader first, with the CatalogResolver as its EntityResolver:

SAXParserFactory spf = SAXParserFactory.newInstance();
spf.setNamespaceAware(true);
XMLReader r = spf.newSAXParser().getXMLReader();
EntityResolver er = new CatalogResolver();
r.setEntityResolver(er);

and then use it for the Transformer:

// xmlSrc must be an org.xml.sax.InputSource here, not a javax.xml.transform.Source
SAXSource s = new SAXSource(r, xmlSrc);
Result res = new SAXResult(myDefaultHandler());
transformer.transform(s, res);
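
For anyone who wants to try the approach without setting up a catalog, here is a minimal, self-contained sketch. It is not the code from my project: the class name, the DTD URL and the DTD content are made up for illustration, and a small lambda stands in for CatalogResolver (which needs the separate resolver library). It runs an identity transformation, so no stylesheet is involved.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

final class LocalDtdTransform {
    // Hypothetical system ID and DTD content, for illustration only.
    static final String DTD_URL = "http://example.invalid/demo.dtd";
    static final String LOCAL_DTD = "<!ENTITY greeting \"hello\">";

    /** Runs an identity transformation over xml, serving DTD_URL from memory. */
    static String transform(String xml) {
        try {
            SAXParserFactory spf = SAXParserFactory.newInstance();
            spf.setNamespaceAware(true);
            XMLReader reader = spf.newSAXParser().getXMLReader();

            // Stand-in for CatalogResolver: map the known system ID to a local copy.
            // Returning null lets the parser fall back to its default lookup,
            // which may go to the network for any other system ID.
            EntityResolver resolver = (publicId, systemId) ->
                    DTD_URL.equals(systemId)
                            ? new InputSource(new StringReader(LOCAL_DTD))
                            : null;
            reader.setEntityResolver(resolver);

            // newTransformer() without a stylesheet gives an identity transformation.
            Transformer t = TransformerFactory.newInstance().newTransformer();
            StringWriter out = new StringWriter();
            // The SAXSource carries our XMLReader, so the transformer parses with it.
            t.transform(new SAXSource(reader, new InputSource(new StringReader(xml))),
                    new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<!DOCTYPE doc SYSTEM \"" + DTD_URL + "\"><doc>&greeting;</doc>";
        System.out.println(transform(xml));
    }
}
```

The entity reference in the input is expanded from the in-memory DTD, so the parser never needs to fetch the URL. In a real setup you would replace the lambda with a CatalogResolver, as in the snippets above.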

robcast answered Nov 06 '22 04:11


You can use this code to stop Xerces from loading the external DTD at all (note that entities defined in that DTD are then not available):

org.dom4j.io.SAXReader reader = new org.dom4j.io.SAXReader();
reader.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);

This code sample uses dom4j, but similar setFeature functionality exists in other Java XML libraries such as JDOM.
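
The same feature can also be set through plain JAXP, without dom4j. The following is only a sketch: the class name and the DTD path are invented for the example, and it assumes the JAXP parser is Xerces-based (as the JDK's built-in one is); other parsers may reject the feature with a SAXNotRecognizedException.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class NoExternalDtd {
    // Xerces-specific feature; only has an effect on non-validating parsers.
    static final String LOAD_EXTERNAL_DTD =
            "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    /** Parses xml without loading its external DTD and returns the root element name. */
    static String rootElementName(String xml) {
        try {
            SAXParserFactory spf = SAXParserFactory.newInstance();
            spf.setNamespaceAware(true);
            spf.setValidating(false);
            spf.setFeature(LOAD_EXTERNAL_DTD, false);

            String[] root = new String[1];
            spf.newSAXParser().parse(new InputSource(new StringReader(xml)),
                    new DefaultHandler() {
                        @Override
                        public void startElement(String uri, String localName,
                                                 String qName, Attributes atts) {
                            // Remember only the first element seen, i.e. the root.
                            if (root[0] == null) {
                                root[0] = qName;
                            }
                        }
                    });
            return root[0];
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical DTD location that does not exist; it is never opened.
        String xml = "<!DOCTYPE doc SYSTEM \"file:///no-such-dir-for-demo/missing.dtd\"><doc/>";
        System.out.println(rootElementName(xml));
    }
}
```

Because the DTD is skipped entirely, this only suits documents that do not depend on entities or default attributes declared in it.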

piepera answered Nov 06 '22 06:11