
Java SAXParser False Positives

I am trying to build my first XML schema validator as a reusable component for use throughout my codebase and across many projects. I have spent all day following examples and coding them up, and I now have a proof of concept up and running.

The only problem is, it's giving me false positives: it's validating XML instances that should absolutely be failing. I've tested it on 3 schemas: it worked beautifully with the first one, and now it's misbehaving with the other two (false positives). I believe that's because the first schema/instance pair I tried was extremely simple. I'm now trying to use it on more complex examples and it is choking.

Here is the body of the validate method where the SAX validation is done:

schema = getSchemaAsString();
targetXml = "ijeioj489fu4u8";

SAXParserFactory oSAXParserFactory = SAXParserFactory.newInstance();
SAXParser oSAXParser = null;
oSAXParserFactory.setNamespaceAware(true);

try 
{
    SchemaFactory oSchemaFactory =
        SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
    oSAXParserFactory.setSchema(oSchemaFactory.newSchema(new SAXSource(new InputSource(new StringReader(schema)))));

    oSAXParser = oSAXParserFactory.newSAXParser();

    DefaultHandler handler = new DefaultHandler(); 

    oSAXParser.parse(new InputSource(new StringReader(targetXml)), handler);
}
catch(Exception oException) 
{
    throw oException;
}  

Where schema and targetXml are in-memory XML strings (not file URIs) that are given the following values:

schema String:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" attributeFormDefault="unqualified">
<xs:element name="PayloadMessage">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="MessageID" type="xs:long"/>
            <xs:element name="Timestamp" type="xs:long"/>
            <xs:element name="MessageAction" type="xs:string"/>
            <xs:element name="ContentType" type="xs:string"/>
            <xs:element name="ContentID" type="xs:string"/>
            <xs:element name="Payload" type="xs:string"/>
        </xs:sequence>
    </xs:complexType>
</xs:element>
</xs:schema>

Obviously, the given targetXml should fail against its given schema. Nope. No exceptions get thrown anywhere inside the SAX stuff.

I have a feeling I need to do something with the DefaultHandler but not sure... I went to http://www.w3.org/2001/03/webdata/xsv and confirmed that my schema is valid.

Does anything jump out at anyone? Thanks in advance!

IAmYourFaja asked Jun 04 '26 10:06


1 Answer

You must set an error handler that throws a SAXException. By default, the parser attempts to parse the document even if it isn't valid. DefaultHandler implements ErrorHandler, but its error() and warning() implementations do nothing, so validity errors are silently ignored.

From the ErrorHandler Javadoc: "WARNING: If an application does not register an ErrorHandler, XML parsing errors will go unreported, except that SAXParseExceptions will be thrown for fatal errors. In order to detect validity errors, an ErrorHandler that does something with error() calls must be registered."
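As a rough sketch of what that could look like with the code from the question: subclass DefaultHandler and override error() so that it rethrows the SAXParseException it receives. The class names StrictSaxValidator and ThrowingHandler, and the trimmed-down SAMPLE_SCHEMA, are made up for illustration and are not from the question; only error() is overridden here, since fatalError() in DefaultHandler already throws.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name, used only for this sketch.
class StrictSaxValidator {

    // Trimmed-down stand-in for the schema in the question (assumption: one xs:long child).
    static final String SAMPLE_SCHEMA =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
      + "<xs:element name=\"PayloadMessage\"><xs:complexType><xs:sequence>"
      + "<xs:element name=\"MessageID\" type=\"xs:long\"/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    // DefaultHandler.error() does nothing, so validity errors vanish unless we rethrow them.
    static class ThrowingHandler extends DefaultHandler {
        @Override
        public void error(SAXParseException e) throws SAXException {
            throw e; // turn a schema validity error into a failure
        }
    }

    // Parses targetXml against schema; throws SAXException if it is invalid or malformed.
    static void validate(String schema, String targetXml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);

        SchemaFactory schemaFactory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        factory.setSchema(schemaFactory.newSchema(
            new SAXSource(new InputSource(new StringReader(schema)))));

        SAXParser parser = factory.newSAXParser();
        parser.parse(new InputSource(new StringReader(targetXml)), new ThrowingHandler());
    }

    public static void main(String[] args) throws Exception {
        // Valid instance: MessageID is a long, so no exception is expected.
        validate(SAMPLE_SCHEMA, "<PayloadMessage><MessageID>42</MessageID></PayloadMessage>");
        System.out.println("valid document accepted");

        // Invalid instance: MessageID is not a long, so error() is called and rethrows.
        try {
            validate(SAMPLE_SCHEMA, "<PayloadMessage><MessageID>abc</MessageID></PayloadMessage>");
            System.out.println("invalid document accepted (unexpected)");
        } catch (SAXException e) {
            System.out.println("invalid document rejected: " + e.getMessage());
        }
    }
}
```

If you would rather collect every validity error instead of stopping at the first one, the same override could append the exceptions to a list and let the caller inspect it after parse() returns.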

I recommend this excellent tutorial with examples on XML validation. It was most helpful for me.

viktor answered Jun 05 '26 23:06