I have an XML file with an unclosed Price tag. Is there a way to parse the file despite the error? How can I skip the product with the error and continue parsing?
<Products>
    <Product Name="Gummi bears">
        <Price Currency="GBP">4.07</Price>
        <BestBefore Date="19-02-2014"/>
    </Product>
    <Product Name="Mounds">
        <Price Currency="AUD">5.64
        <BestBefore Date="08-04-2014"/>
    </Product>
    <Product Name="Vodka">
        <Price Currency="RUB">70</Price>
        <BestBefore Date="11-10-2014"/>
    </Product>
</Products>
Here's the code. It implements what BrandonArp has already mentioned.
There's a feature that needs to be set so the parser ignores fatal errors: continue-after-fatal-error
http://apache.org/xml/features/continue-after-fatal-error
true: Attempt to continue parsing after a fatal error.
false: Stops parse on first fatal error.
default: false
XMLUni Predefined Constant: fgXercesContinueAfterFatalError
note: The behavior of the parser when this feature is set to true is undetermined! Therefore use this feature with extreme caution because the parser may get stuck in an infinite loop or worse.
More detail can be found in the Xerces documentation on parser features.
PriceReader class
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

public class PriceReader {
    public static void main(String[] argv) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser saxParser = factory.newSAXParser();
            XMLReader xmlReader = saxParser.getXMLReader();
            try {
                // Ask the parser to keep going after a fatal (well-formedness) error.
                xmlReader.setFeature(
                        "http://apache.org/xml/features/continue-after-fatal-error",
                        true);
            } catch (SAXException e) {
                // Thrown when the parser does not recognize or support the feature.
                System.out.println("error in setting up parser feature");
            }
            xmlReader.setContentHandler(new PriceHandler());
            xmlReader.setErrorHandler(new MyErrorHandler());
            xmlReader.parse("bin\\com\\test\\stack\\overflow\\sax\\prices.xml");
        } catch (Throwable e) {
            // With the feature enabled the parser can still fail in unexpected ways.
            System.out.println("Error -- " + e.getMessage());
        }
    }
}
PriceHandler class
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class PriceHandler extends DefaultHandler {
    // Print the Name attribute of every Product element the parser reports.
    @Override
    public void startElement(String uri, String localName,
            String qName, Attributes attributes)
            throws SAXException {
        if (qName.equalsIgnoreCase("Product")) {
            System.out.println("Product ::: " + attributes.getValue("Name"));
        }
    }
}
MyErrorHandler class
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class MyErrorHandler implements ErrorHandler {
    // Build a one-line description: source URI, line number and parser message.
    private String getParseExceptionInfo(SAXParseException spe) {
        String systemId = spe.getSystemId();
        if (systemId == null) {
            systemId = "null";
        }
        return "URI=" + systemId + " Line="
                + spe.getLineNumber() + ": " + spe.getMessage();
    }

    public void warning(SAXParseException spe) throws SAXException {
        System.out.println("Warning: " + getParseExceptionInfo(spe));
    }

    public void error(SAXParseException spe) throws SAXException {
        System.out.println("Error: " + getParseExceptionInfo(spe));
    }

    // Log the fatal error instead of rethrowing it, so this handler does not stop the parse.
    public void fatalError(SAXParseException spe) throws SAXException {
        System.out.println("Fatal Error: " + getParseExceptionInfo(spe));
    }
}
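Output (judging by the line numbers and the extra products, this run used a longer file than the sample in the question)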
Product ::: Gummi bears
Product ::: Mounds
Fatal Error: URI=file:///C:/Developer/pachat/workspaces/eclipse-default/stack-overflow/bin/com/test/stack/overflow/sax/prices.xml Line=9: The element type "Price" must be terminated by the matching end-tag "</Price>".
Fatal Error: URI=file:///C:/Developer/pachat/workspaces/eclipse-default/stack-overflow/bin/com/test/stack/overflow/sax/prices.xml Line=9: The end-tag for element type "Price" must end with a '>' delimiter.
Product ::: Vodka
Product ::: Rum
Product ::: Brezzer
Fatal Error: URI=file:///C:/Developer/pachat/workspaces/eclipse-default/stack-overflow/bin/com/test/stack/overflow/sax/prices.xml Line=21: The element type "Price" must be terminated by the matching end-tag "</Price>".
Fatal Error: URI=file:///C:/Developer/pachat/workspaces/eclipse-default/stack-overflow/bin/com/test/stack/overflow/sax/prices.xml Line=21: The end-tag for element type "Price" must end with a '>' delimiter.
Product ::: Water
Fatal Error: URI=file:///C:/Developer/pachat/workspaces/eclipse-default/stack-overflow/bin/com/test/stack/overflow/sax/prices.xml Line=26: The end-tag for element type "Product" must end with a '>' delimiter.
Fatal Error: URI=file:///C:/Developer/pachat/workspaces/eclipse-default/stack-overflow/bin/com/test/stack/overflow/sax/prices.xml Line=26: XML document structures must start and end within the same entity.
Fatal Error: URI=file:///C:/Developer/pachat/workspaces/eclipse-default/stack-overflow/bin/com/test/stack/overflow/sax/prices.xml Line=26: Premature end of file.
Error -- processing event: -1
The general way to deal with errors like this is to use a streaming parser. The one that comes to mind for Java is SAX.
When you create a handler, you can override or implement the error and fatalError methods. These let you attempt to continue parsing, but handling the actual errors is still up to you.
Obviously there are many possible errors in an XML document and it'll only make sense to handle some of them. Hopefully this will give you a place to start with a parser, though.
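As one possible starting point, here is a minimal sketch of such a handler. It does not turn on continue-after-fatal-error. Instead it keeps only the products whose closing tag was actually reached, and remembers which product was open when the fatal error was reported. The names ProductCollector, completed, failedProduct and SAMPLE are made up for this illustration, and the sample string simply mirrors the XML from the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the answers above.
public class ProductCollector extends DefaultHandler {
    // The malformed sample from the question (the Price of "Mounds" is never closed).
    public static final String SAMPLE =
            "<Products>\n"
            + "<Product Name=\"Gummi bears\">\n"
            + "<Price Currency=\"GBP\">4.07</Price>\n"
            + "<BestBefore Date=\"19-02-2014\"/>\n"
            + "</Product>\n"
            + "<Product Name=\"Mounds\">\n"
            + "<Price Currency=\"AUD\">5.64\n"
            + "<BestBefore Date=\"08-04-2014\"/>\n"
            + "</Product>\n"
            + "<Product Name=\"Vodka\">\n"
            + "<Price Currency=\"RUB\">70</Price>\n"
            + "<BestBefore Date=\"11-10-2014\"/>\n"
            + "</Product>\n"
            + "</Products>\n";

    public final List<String> completed = new ArrayList<>(); // products that were fully closed
    public String failedProduct; // product open when the fatal error hit, or null
    private String current;      // product currently being read, or null

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        if ("Product".equals(qName)) {
            current = attributes.getValue("Name");
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Accept a product only once its end tag has been seen.
        if ("Product".equals(qName)) {
            completed.add(current);
            current = null;
        }
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXParseException {
        // Remember where parsing broke, then rethrow so the parser stops.
        failedProduct = current;
        throw e;
    }

    public static ProductCollector parse(String xml) throws Exception {
        ProductCollector handler = new ProductCollector();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        reader.setErrorHandler(handler);
        try {
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXParseException e) {
            // Expected for malformed input; the handler already recorded the details.
        }
        return handler;
    }

    public static void main(String[] args) throws Exception {
        ProductCollector result = parse(SAMPLE);
        System.out.println("Complete products: " + result.completed);
        System.out.println("Stopped inside: " + result.failedProduct);
    }
}
```

With the JDK's default parser this should report "Gummi bears" as complete and "Mounds" as the product where parsing stopped. Anything after the broken product ("Vodka" here) is never reached, so this trades completeness for predictable behavior.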