Capture Schema Information when validating XDocument

This is similar to the question "C# Get schema information when validating xml".

However, I am working with an XDocument for LINQ purposes.

I am reading/parsing a set of CSV files and converting to XML, then validating the XML against an XSD schema.

I would like to capture specific errors related to the element values, generate a more user friendly message, and give them back to the user so the input data can be corrected. One of the items I would like to include in the output data is some schema information (such as the range of acceptable values for a numeric type).

In my current approach (which I am open to changing), I am able to capture everything I need except for the schema information.

I've tried accessing the SourceSchemaObject in the ValidationEventArgs argument of the Validation event handler, but that is always null. I've also tried the GetSchemaInfo of the XElement and that appears to be null also.

I am using RegEx to identify the specific validation errors I want to capture, and grabbing data from the XElement via the sender argument of the validation event handler. I've thought of converting the schema to an XDocument and grabbing what I need via LINQ, but it seems to me that there should be a better option.
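For reference, the LINQ fallback mentioned above could look roughly like the following. This is only a minimal sketch, assuming each element declares its restriction inline as a simpleType; the Quantity element, the inline XSD, and the GetFacets helper are hypothetical and not part of my actual code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class SchemaLinqSketch
{
    static readonly XNamespace Xs = "http://www.w3.org/2001/XMLSchema";

    // Hypothetical helper: lists the facets (name/value pairs) of an element
    // that is declared with an inline xs:simpleType/xs:restriction.
    static List<KeyValuePair<string, string>> GetFacets(XDocument xsd, string elementName)
    {
        return xsd.Descendants(Xs + "element")
            .Where(e => (string)e.Attribute("name") == elementName)
            .Elements(Xs + "simpleType")
            .Elements(Xs + "restriction")
            .Elements()
            .Where(f => f.Attribute("value") != null)
            .Select(f => new KeyValuePair<string, string>(
                f.Name.LocalName, (string)f.Attribute("value")))
            .ToList();
    }

    static void Main()
    {
        // Hypothetical schema, used only for illustration.
        var xsd = XDocument.Parse(
            @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
                <xs:element name='Quantity'>
                  <xs:simpleType>
                    <xs:restriction base='xs:int'>
                      <xs:minInclusive value='1'/>
                      <xs:maxInclusive value='100'/>
                    </xs:restriction>
                  </xs:simpleType>
                </xs:element>
              </xs:schema>");

        foreach (var facet in GetFacets(xsd, "Quantity"))
            Console.WriteLine(facet.Key + " = " + facet.Value);
    }
}
```

This only works for inline restrictions; named types, references, and derived types would each need their own lookup, which is why I would prefer to get the information from the validator itself.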

Here's my current Validate Method:

private List<String> validationWarnings;
private XDocument xDoc;
private XmlSchemaSet schemas = new XmlSchemaSet();

public List<String> Validate()
{
    this.validationWarnings = new List<String>();

    // the schema is read elsewhere and added to the schema set
    this.xDoc.Validate(this.schemas, new ValidationEventHandler(ValidationCallBack), true);

    return this.validationWarnings;
}

And here's my callback method:

private void ValidationCallBack(object sender, ValidationEventArgs args)
{           
    var element = sender as XElement;

    if (element != null)
    {

        // this is just a placeholder method where I will be able to extract the 
        //  schema information and put together a user friendly message for specific 
        //  validation errors
        var message = FieldValidationMessage(element, args);

        // if message is null, then the error is not one I wish to capture for 
        //  the user and is related to an invalid XML structure (such as missing 
        //  elements or incorrect order).  Therefore throw an exception
        if (message == null)
            throw new InvalidXmlFileStructureException(args.Message, args.Exception);
        else
            validationWarnings.Add(message);

    }
}

The var message = FieldValidationMessage(element, args); line in my callback method is just a placeholder; the method does not exist yet. The intention of this method is to do 3 things:

  1. Identify specific validation errors by using RegEx on args.Message (this already works, I have tested patterns that I plan on using)

  2. Grab attribute values from the XDocument related to the specific XElement that is causing the error (such as the row and column number in the original CSV)

  3. Grab the schema information if it is available so field types and restrictions can be added to the output message.
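To make the intent concrete, the first two steps could be sketched as below. This is an illustration under assumptions, not working code from my project: the regex assumes the validator's message contains text such as "The MinInclusive constraint failed", and the row and column attribute names are hypothetical. Step 3 is left out because the schema information is exactly the part I cannot reach from inside the callback.

```csharp
using System.Text.RegularExpressions;
using System.Xml.Linq;
using System.Xml.Schema;

static class FieldMessageSketch
{
    // Assumed message wording: captures the facet name from text such as
    // "The MinInclusive constraint failed." Adjust to the messages you actually see.
    static readonly Regex FacetFailure =
        new Regex(@"The (?<facet>\w+) constraint failed", RegexOptions.Compiled);

    // Returns a user-friendly message for field-level errors, or null when the
    // error does not match (the caller then treats it as a structural error).
    public static string FieldValidationMessage(XElement element, ValidationEventArgs args)
    {
        var match = FacetFailure.Match(args.Message);
        if (!match.Success)
            return null;

        // Hypothetical attributes recording the position in the original CSV.
        var row = (string)element.Attribute("row") ?? "?";
        var column = (string)element.Attribute("column") ?? "?";

        return string.Format(
            "Row {0}, column {1}: value '{2}' for '{3}' violates the {4} restriction.",
            row, column, element.Value, element.Name.LocalName,
            match.Groups["facet"].Value);
    }
}
```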

asked Oct 22 '11 by psubsee2003


1 Answer

For anyone who reads this question in the future, I managed to solve my problem, albeit in a slightly different way than I originally proposed.

The first problem I was having was that the schema info was null both in the ValidationEventArgs and from the GetSchemaInfo extension method of XElement. I resolved that in the same manner as in the question I linked originally:

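List<XElement> errorElements = new List<XElement>();

// schemas is the XmlSchemaSet from the question; passing true for
// addSchemaInfo is what makes GetSchemaInfo() usable after validation
serializedObject.Validate(schemas, (sender, args) =>
{
    var exception = (args.Exception as XmlSchemaValidationException);

    if (exception != null)
    {
        // remember the offending element; its schema info is not set yet
        var element = (exception.SourceObject as XElement);

        if (element != null)
            errorElements.Add(element);
    }

}, true);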

foreach (var element in errorElements)
{
    var si = element.GetSchemaInfo(); 

    // do something with SchemaInfo
}

It would appear that the schema info is not added to the XObject until AFTER the validation callback. If you try to access it in the middle of the validation callback, it will be null, but if you capture the element and then access it after the Validate method has completed, it will not be null.

However, this opened up another problem. The SchemaInfo object model is not well documented and I had trouble parsing it out to find what I needed.

I found this question after I asked my original question. The accepted answer links to a really great blog post that breaks down the SchemaInfo object model. It took me a bit of work to refine the code to suit my purposes, but it does a good job of illustrating how to get the SchemaInfo for any XmlReader element (which I was able to change to work with an XObject).
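As a rough idea of what walking the SchemaInfo looks like, here is a minimal sketch. It only handles the simplest case, an element whose type is a simple type with a direct restriction, and the inline XSD, the Quantity element, and the DescribeFacets helper are hypothetical names made up for the illustration. It is not the code from the blog post.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;
using System.Xml.Schema;

static class SchemaInfoSketch
{
    // Hypothetical helper: lists the facets of a simple type restriction.
    // Returns nothing for complex types, lists, unions, or built-in types.
    static IEnumerable<string> DescribeFacets(IXmlSchemaInfo info)
    {
        var simpleType = info == null ? null : info.SchemaType as XmlSchemaSimpleType;
        var restriction = simpleType == null
            ? null
            : simpleType.Content as XmlSchemaSimpleTypeRestriction;
        if (restriction == null)
            yield break;

        foreach (XmlSchemaObject item in restriction.Facets)
        {
            var facet = item as XmlSchemaFacet;
            if (facet != null)
                yield return facet.GetType().Name + " = " + facet.Value;
        }
    }

    static void Main()
    {
        // Hypothetical schema and document, used only for illustration.
        const string xsd =
            @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
                <xs:element name='Quantity'>
                  <xs:simpleType>
                    <xs:restriction base='xs:int'>
                      <xs:minInclusive value='1'/>
                      <xs:maxInclusive value='100'/>
                    </xs:restriction>
                  </xs:simpleType>
                </xs:element>
              </xs:schema>";

        var schemas = new XmlSchemaSet();
        schemas.Add("", XmlReader.Create(new StringReader(xsd)));

        var doc = XDocument.Parse("<Quantity>500</Quantity>");
        var errorElements = new List<XElement>();

        // Collect the failing elements during validation...
        doc.Validate(schemas, (sender, args) =>
        {
            var element = sender as XElement;
            if (element != null)
                errorElements.Add(element);
        }, true);

        // ...and read their schema info only after Validate has returned.
        foreach (var element in errorElements)
            foreach (var line in DescribeFacets(element.GetSchemaInfo()))
                Console.WriteLine(element.Name + ": " + line);
    }
}
```

Facets inherited from a base type are not reported by this sketch; covering them would mean following the type's base type chain as well.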

answered Sep 21 '22 by psubsee2003