
Deserializing a single element in a large XML document: xmlSerializer.Deserialize(xmlReader.ReadSubtree()) fails due to namespace issues

I am attempting to process a large XML document (using an XmlReader) in a single pass, and to deserialize only certain elements in it using an XmlSerializer.

Below is some code and a tiny mock XML document showing how I have attempted to do this.

Rationale for using XmlReader:

1. I am dealing with very large XML documents (10–250 MB), which I therefore do not want to load into memory. So XmlDocument is out of the question.
2. I want to extract only certain elements. Typically I will be able to ignore most other content. XmlReader appears to give me an efficient means of skipping irrelevant content.
3. I do not know in advance whether any or all of the elements that I can deal with will be present; therefore I am not using a bunch of XPath/XQuery or LINQ to XML-based queries, because I want to make only a single pass over the XML files (due to their size).

public class ElementOfInterest { }
…

var xml = @"<?xml version='1.0' encoding='utf-8' ?>
            <Root xmlns:ex='urn:stakx:example'
                  xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'>
              <ElementOfInterest xsi:type='ex:ElementOfInterest' />
            </Root>";

var reader = System.Xml.XmlReader.Create(new System.IO.StringReader(xml));
reader.ReadToFollowing("ElementOfInterest");

var serializer = new System.Xml.Serialization.XmlSerializer(typeof(ElementOfInterest));
serializer.Deserialize(reader.ReadSubtree());

The last line of code throws the following inner exception:

InvalidOperationException: "Namespace prefix ex is not defined."

Obviously, the XmlSerializer doesn't recognise the ex namespace prefix inside the xsi:type attribute's value.

This is just one error I am having, but frankly, the larger problem is that I have no idea how to go about the whole namespace issue. I am simply looking for a convenient way to deserialize just a single node out of the XML document, but that seems to entail having to manually register/manage namespaces, and to somehow forward them from the XmlReader to the XmlSerializer.

Can someone demonstrate how to deserialize a single node from an XML document read with an XmlReader, either by pointing out the error in my code, or by showing an alternative approach?

asked Jan 27 '15 23:01 by stakx - no longer contributing



1 Answer

The following works:

using System.IO;
using System.Xml;
using System.Xml.Serialization;

public static class Program
{
    public static void Main()
    {
        var xml = @"<?xml version='1.0' encoding='utf-8' ?>
                    <Root
                      xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'
                      xmlns:ex='urn:stakx:example'
                    >
                      <ex:ElementOfInterest xsi:type='ex:ElementOfInterest' />
                    </Root>";

        // Register the ex prefix up front and hand it to the reader
        // through a parser context.
        var nt = new NameTable();
        var mgr = new XmlNamespaceManager(nt);
        mgr.AddNamespace("ex", "urn:stakx:example");
        var ctxt = new XmlParserContext(nt, mgr, "", XmlSpace.Default);
        var serializer = new XmlSerializer(typeof(ElementOfInterest));

        using (var reader = XmlReader.Create(new StringReader(xml), null, ctxt))
        {
            // Look the element up by local name and namespace URI,
            // not by its prefixed name.
            reader.ReadToFollowing("ElementOfInterest", "urn:stakx:example");
            var eoi = (ElementOfInterest)serializer.Deserialize(reader.ReadSubtree());
        }
    }
}

// The namespace declared here must match the namespace of the element in the input.
[XmlRoot(Namespace = "urn:stakx:example")]
public class ElementOfInterest { }

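Note the namespace in the input: the element is now written as <ex:ElementOfInterest>, so it is in the same namespace that the [XmlRoot] attribute declares for the class.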
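Since the question is about walking a large document once and picking out several elements, here is a minimal sketch of how the same approach might be extended to a loop. The names SinglePassSketch and ReadAll are hypothetical and only serve the illustration; the sketch also assumes that elements of interest are not nested inside one another.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

[XmlRoot(Namespace = "urn:stakx:example")]
public class ElementOfInterest { }

// Hypothetical helper, not part of any library.
public static class SinglePassSketch
{
    private const string Ns = "urn:stakx:example";

    // Streams through the input once, deserializing every
    // {urn:stakx:example}ElementOfInterest and ignoring everything else.
    public static List<ElementOfInterest> ReadAll(TextReader input)
    {
        // Same namespace setup as in the code above.
        var nt = new NameTable();
        var mgr = new XmlNamespaceManager(nt);
        mgr.AddNamespace("ex", Ns);
        var ctxt = new XmlParserContext(nt, mgr, "", XmlSpace.Default);

        var serializer = new XmlSerializer(typeof(ElementOfInterest));
        var results = new List<ElementOfInterest>();

        using (var reader = XmlReader.Create(input, null, ctxt))
        {
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element
                    && reader.LocalName == "ElementOfInterest"
                    && reader.NamespaceURI == Ns)
                {
                    // The subtree reader exposes only the current element.
                    // Disposing it leaves the outer reader on that element's
                    // end tag (or on the element itself if it is empty).
                    using (var subtree = reader.ReadSubtree())
                    {
                        results.Add((ElementOfInterest)serializer.Deserialize(subtree));
                    }
                }

                // Advance to the next node in either case.
                reader.Read();
            }
        }

        return results;
    }
}
```

For a file on disk, the helper could be called with something like File.OpenText(path) wrapped in a using block, so that the document is streamed rather than loaded into memory as a whole.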

answered Oct 19 '22 21:10 by Tomalak