
Unknown elements are just ignored during deserialization

Tags: c#, xml

When I deserialize an XML document with XmlSerializer (reading through an XmlTextReader), an element for which there is no corresponding class or property is simply ignored.

Note: this is not about required elements that are missing from the XML, but about elements that are present in the XML text and have no equivalent property in code.

I would have expected an exception: if the element is missing from the runtime data and I serialize the object later, the resulting XML document will differ from the original one. So it is not safe to ignore it (in my real-world case I simply forgot to define one of the 99+ classes the document contains, and I didn't notice at first).

So is this normal and if yes, why? Can I somehow request exceptions when elements cannot be deserialized?

In the following example XML I have purposely misspelled "MyCommandElement" as "MyComandElement" to illustrate the core problem:

<MyRootElement>
    <MyComandElement/>
</MyRootElement>

MyRootElement.cs:

public class CommandElement {};

public class MyRootElement
{
    public CommandElement MyCommandElement {get; set;}
}

Deserialization:

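// Requires: using System.Xml; using System.Xml.Serialization;
XmlSerializer xmlSerializer = new XmlSerializer(typeof(MyRootElement));
MyRootElement mbs2;
// The using block closes the reader even if Deserialize throws.
using (XmlTextReader xmlReader = new XmlTextReader(@"pgtest.xml"))
{
    mbs2 = (MyRootElement)xmlSerializer.Deserialize(xmlReader);
}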
oliver asked Dec 14 '22 17:12


2 Answers

As I have found out by accident during further research, this problem is actually ridiculously easy to solve because...

...XmlSerializer supports events! All one has to do is define an event handler for unknown elements:

// Turns the normally silent skip of an unknown element into a hard failure.
void Serializer_UnknownElement(object sender, XmlElementEventArgs e)
{
    throw new Exception("Unknown element " + e.Element.Name + " found in "
        + e.ObjectBeingDeserialized + " in line "
        + e.LineNumber + " at position " + e.LinePosition);
}

and register the handler with the XmlSerializer before calling Deserialize:

xmlSerializer.UnknownElement += Serializer_UnknownElement;
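
Putting the handler and the registration together, a minimal self-contained sketch might look like the following. The `StrictXml` helper and its `Deserialize<T>` method are hypothetical names introduced here for illustration, and the XML is read from a string rather than from pgtest.xml so the sketch does not depend on a file.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

public class CommandElement { }

public class MyRootElement
{
    public CommandElement MyCommandElement { get; set; }
}

// Hypothetical helper, not part of the framework.
public static class StrictXml
{
    // Deserializes T and fails on the first element that has no matching member.
    public static T Deserialize<T>(string xml)
    {
        var serializer = new XmlSerializer(typeof(T));

        // Without this subscription the unknown element would be skipped silently.
        serializer.UnknownElement += (sender, e) =>
        {
            throw new InvalidOperationException(
                "Unknown element <" + e.Element.Name + "> in line "
                + e.LineNumber + " at position " + e.LinePosition);
        };

        using (var reader = XmlReader.Create(new StringReader(xml)))
        {
            return (T)serializer.Deserialize(reader);
        }
    }
}
```

Calling `StrictXml.Deserialize<MyRootElement>(...)` with the misspelled `<MyComandElement/>` document should then fail instead of returning an object whose `MyCommandElement` is null. Since Deserialize may wrap exceptions thrown by the handler, `ex.GetBaseException().Message` is a reasonable way to get at the original message when catching.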

The topic is covered on MSDN, which also notes:

By default, after calling the Deserialize method, the XmlSerializer ignores XML attributes of unknown types.

Not surprisingly, there are also events for unknown attributes (UnknownAttribute), unknown nodes (UnknownNode) and unreferenced objects (UnreferencedObject).
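
If throwing on the first problem is too abrupt, the same events can be used to collect everything that was skipped. The sketch below is one possible arrangement, with `XmlAudit` and its `problems` list being hypothetical names. It subscribes only to UnknownElement and UnknownAttribute, on the assumption that also subscribing to UnknownNode would report the same nodes a second time.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

public class CommandElement { }

public class MyRootElement
{
    public CommandElement MyCommandElement { get; set; }
}

// Hypothetical helper, not part of the framework.
public static class XmlAudit
{
    // Deserializes T and records every element or attribute the serializer skipped.
    public static T Deserialize<T>(string xml, List<string> problems)
    {
        var serializer = new XmlSerializer(typeof(T));

        serializer.UnknownElement += (sender, e) =>
            problems.Add("element <" + e.Element.Name + "> at "
                + e.LineNumber + ":" + e.LinePosition);

        serializer.UnknownAttribute += (sender, e) =>
            problems.Add("attribute '" + e.Attr.Name + "' at "
                + e.LineNumber + ":" + e.LinePosition);

        using (var reader = XmlReader.Create(new StringReader(xml)))
        {
            return (T)serializer.Deserialize(reader);
        }
    }
}
```

The caller can inspect `problems` after the call and decide whether a non-empty list is an error or merely worth logging.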

oliver answered Dec 31 '22 16:12


So is this normal and if yes, why?

Because maybe you're consuming someone else's XML document and whilst they define 300 different elements within their XML, you only care about two. Should you be forced to create classes for all of their elements and deserialize all of them just to be able to access the two you care about?

Or perhaps you're working with a system that is going to be in flux over time. You're writing code that consumes today's XML, and if new elements or attributes are introduced later, they shouldn't stop your tested and deployed code from continuing to consume those parts of the XML that it does understand. (Insert the caveat here that, hopefully, neither you nor the XML author later introduces elements that are critical to processing the document correctly.)

These are two sides of the same coin of why it can be desirable for the system not to blow up if it encounters unexpected parts within the XML document it's being asked to deserialize.
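
As a rough illustration of the first scenario, the sketch below models only two elements of a larger document and relies on the default behaviour to skip the rest. The `Order` document, the `OrderSummary` class and the `LenientDemo` helper are all invented for this example.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical consumer type: models only the two elements it cares about.
[XmlRoot("Order")]
public class OrderSummary
{
    public string Id { get; set; }
    public decimal Total { get; set; }
}

// Hypothetical helper, not part of the framework.
public static class LenientDemo
{
    // No event handlers are registered, so unmodelled elements are skipped.
    public static OrderSummary Read(string xml)
    {
        var serializer = new XmlSerializer(typeof(OrderSummary));
        using (var reader = new StringReader(xml))
        {
            return (OrderSummary)serializer.Deserialize(reader);
        }
    }
}
```

Reading `<Order><Id>A-17</Id><Customer><Name>x</Name></Customer><Total>12.50</Total><Notes/></Order>` this way should fill in `Id` and `Total` while `Customer` and `Notes` are passed over without complaint.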

Damien_The_Unbeliever answered Dec 31 '22 18:12