
Validating a HUGE XML file

I'm trying to find a way to validate a large XML file against an XSD. I saw the question "...best way to validate an XML...", but the answers all pointed to using the Xerces library for validation. The only problem is that when I use that library to validate a 180 MB file, I get an OutOfMemoryException.

Are there any other tools, libraries, or strategies for validating a larger-than-normal XML file?

EDIT: The SAX solution worked for validation in Java, but the other two suggestions for the libxml tool were also very helpful for validation outside of Java.

Dan Cramer asked Sep 02 '08 21:09




2 Answers

Instead of using a DOMParser, use a SAXParser. This reads from an input stream or reader so you can keep the XML on disk instead of loading it all into memory.

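    // Note: setValidating(true) on its own enables DTD validation.
    // SimpleErrorHandler is a user-supplied org.xml.sax.ErrorHandler (not shown).
    SAXParserFactory factory = SAXParserFactory.newInstance();
    factory.setValidating(true);
    factory.setNamespaceAware(true);

    SAXParser parser = factory.newSAXParser();

    XMLReader reader = parser.getXMLReader();
    reader.setErrorHandler(new SimpleErrorHandler());
    // The document is read as a stream from disk, not loaded as a whole tree.
    reader.parse(new InputSource(new FileReader("document.xml")));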
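Since the question asks about validating against an XSD, here is a minimal sketch of one way to do that with the JDK's javax.xml.validation API. It is not the answerer's code. The class name StreamingXsdValidator, the validate method, and the file names schema.xsd and document.xml are placeholders chosen for illustration. Passing a StreamSource lets the validator read the document from disk, which in typical JDK implementations avoids building a DOM tree. Actual memory use still depends on the parser and the schema.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class StreamingXsdValidator {

    /**
     * Validates xmlFile against xsdFile and returns the warnings and
     * recoverable errors that were reported. Fatal errors (for example,
     * malformed XML) are rethrown as SAXException.
     */
    public static List<String> validate(File xsdFile, File xmlFile)
            throws SAXException, IOException {
        SchemaFactory schemaFactory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = schemaFactory.newSchema(xsdFile);
        Validator validator = schema.newValidator();

        final List<String> problems = new ArrayList<>();
        validator.setErrorHandler(new ErrorHandler() {
            @Override
            public void warning(SAXParseException e) {
                problems.add(describe("warning", e));
            }

            @Override
            public void error(SAXParseException e) {
                // Record the problem and let validation continue.
                problems.add(describe("error", e));
            }

            @Override
            public void fatalError(SAXParseException e) throws SAXException {
                throw e;
            }
        });

        // StreamSource hands the validator a file to read, not an in-memory tree.
        validator.validate(new StreamSource(xmlFile));
        return problems;
    }

    private static String describe(String level, SAXParseException e) {
        return level + " at line " + e.getLineNumber()
                + ", column " + e.getColumnNumber() + ": " + e.getMessage();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical default paths; pass real ones as arguments.
        File xsd = new File(args.length > 0 ? args[0] : "schema.xsd");
        File xml = new File(args.length > 1 ? args[1] : "document.xml");

        List<String> problems = validate(xsd, xml);
        problems.forEach(System.out::println);
        System.out.println(problems.isEmpty()
                ? "valid"
                : problems.size() + " problem(s) found");
    }
}
```

Collecting errors in a list, instead of throwing on the first one, makes it possible to report every schema violation in a single pass over a large file.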
jodonnell answered Sep 22 '22 00:09



Use libxml, which performs validation and has a streaming mode (for example, its xmllint command-line tool accepts --stream together with --schema).

John Millikin answered Sep 20 '22 00:09
