 

How to deserialize only part of an XML document in C#

Here's a fictitious example of the problem I'm trying to solve. If I'm working in C#, and have XML like this:

<?xml version="1.0" encoding="utf-8"?>
<Cars>
  <Car>
    <StockNumber>1020</StockNumber>
    <Make>Nissan</Make>
    <Model>Sentra</Model>
  </Car>
  <Car>
    <StockNumber>1010</StockNumber>
    <Make>Toyota</Make>
    <Model>Corolla</Model>
  </Car>
  <SalesPerson>
    <Company>Acme Sales</Company>
    <Position>
       <Salary>
          <Amount>1000</Amount>
          <Unit>Dollars</Unit>
    ... and on... and on....
  </SalesPerson>
</Cars>

The XML inside SalesPerson can be very long, megabytes in size. I want to deserialize the Car elements but not the SalesPerson element, keeping the latter in raw form "for later on".

Essentially, I would like to be able to use something like this as the object representation of the XML:

[System.Xml.Serialization.XmlRootAttribute("Cars", Namespace = "", IsNullable = false)]
public class Cars
{
    [XmlArrayItem(typeof(Car))]
    public Car[] Car { get; set; }

    public Stream SalesPerson { get; set; }
}

public class Car
{
    [System.Xml.Serialization.XmlElementAttribute("StockNumber")]
    public string StockNumber { get; set; }

    [System.Xml.Serialization.XmlElementAttribute("Make")]
    public string Make { get; set; }

    [System.Xml.Serialization.XmlElementAttribute("Model")]
    public string Model { get; set; }
}

where the SalesPerson property on the Cars object would contain a stream with the raw xml that is within the <SalesPerson> xml element after being run through an XmlSerializer.

Can this be done? Can I choose to only deserialize "part of" an xml document?

Thanks! -Mike

p.s. example xml stolen from How to Deserialize XML document

Mike asked Dec 15 '08 21:12




2 Answers

This is a rather old thread, but I will post anyway. I had the same problem (I needed to deserialize about 10 KB of data from a file that was more than 1 MB). In the main object (which has an InnerObject that needs to be deserialized) I implemented the IXmlSerializable interface and then changed the ReadXml method.

ReadXml receives an XmlTextReader as input. The first line reads forward until it reaches the XML tag we care about:

reader.ReadToDescendant("InnerObjectTag"); //tag which matches the InnerObject 

Then create an XmlSerializer for the type of the object we want to deserialize, and deserialize it:

XmlSerializer serializer = new XmlSerializer(typeof(InnerObject));
// ReadSubtree() hands the serializer only the part of the XML that holds the innerObject data.
// Deserialize() returns object, so the result has to be cast.
this.innerObject = (InnerObject)serializer.Deserialize(reader.ReadSubtree());
reader.Close(); // now skip the rest

This saved me a lot of deserialization time and lets me read just a part of the XML (only some details that describe the file, which can help the user decide whether it is the file they want to load).
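Applied to the Cars document from the question, the same reader-based idea could look roughly like the sketch below. It is only a sketch under assumptions: the names CarsDocument and CarsReader are made up for illustration, and it keeps the raw SalesPerson element as a string via XmlReader.ReadOuterXml (my substitution for the Stream the question asks for), which still pulls the whole fragment into memory.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

public class Car
{
    public string StockNumber { get; set; }
    public string Make { get; set; }
    public string Model { get; set; }
}

// Hypothetical holder type, not part of the question's code.
public class CarsDocument
{
    public List<Car> Cars { get; } = new List<Car>();
    public string SalesPersonXml { get; set; } // raw, undeserialized XML
}

// Hypothetical helper, named here only for illustration.
public static class CarsReader
{
    // XmlSerializer(Type) instances are safe to create once and reuse.
    private static readonly XmlSerializer CarSerializer = new XmlSerializer(typeof(Car));

    public static CarsDocument Read(TextReader input)
    {
        var result = new CarsDocument();
        using (var reader = XmlReader.Create(input))
        {
            reader.ReadStartElement("Cars"); // step inside the root element

            // Walk the direct children of <Cars> until its end tag is reached.
            while (reader.MoveToContent() == XmlNodeType.Element)
            {
                if (reader.LocalName == "Car")
                {
                    // Give the serializer only this <Car> element.
                    using (var subtree = reader.ReadSubtree())
                    {
                        result.Cars.Add((Car)CarSerializer.Deserialize(subtree));
                    }
                    reader.Read(); // move past the end of this <Car>
                }
                else if (reader.LocalName == "SalesPerson")
                {
                    // Keep the markup as-is; this also advances past </SalesPerson>.
                    result.SalesPersonXml = reader.ReadOuterXml();
                }
                else
                {
                    reader.Skip(); // ignore anything unexpected
                }
            }
        }
        return result;
    }
}
```

Calling CarsReader.Read(new StringReader(xml)) would then return the Car objects plus the untouched SalesPerson markup, which can be parsed later if it is ever needed.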

user271807 answered Sep 22 '22 09:09


The accepted answer from user271807 is a great solution, but I found that I also needed to set the XML root of the fragment to avoid an exception with an inner exception saying something like this:

...xmlns=''> was not expected 

This exception was thrown when I tried to deserialize only the inner Authentication element of this XML document:

<?xml version="1.0" encoding="UTF-8"?>
<Api>
  <Authentication>
      <sessionid>xxx</sessionid>
      <errormessage>xxx</errormessage>
  </Authentication>
</Api>

So I ended up creating this extension method as a reusable solution. Warning: it contains a memory leak, see below:

public static T DeserializeXml<T>(this string @this, string innerStartTag = null)
{
    using (var stringReader = new StringReader(@this))
    using (var xmlReader = XmlReader.Create(stringReader)) {
        if (innerStartTag != null) {
            xmlReader.ReadToDescendant(innerStartTag);
            var xmlSerializer = new XmlSerializer(typeof(T), new XmlRootAttribute(innerStartTag));
            return (T)xmlSerializer.Deserialize(xmlReader.ReadSubtree());
        }
        return (T)new XmlSerializer(typeof(T)).Deserialize(xmlReader);
    }
}
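To make the root-name issue concrete, here is a small self-contained sketch. The AuthenticationInfo class and the RootNameDemo helper are hypothetical names chosen for illustration; the class name deliberately differs from the element name, which is one way to run into the "was not expected" error.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical DTO: its name does not match the <Authentication> element.
public class AuthenticationInfo
{
    [XmlElement("sessionid")]
    public string SessionId { get; set; }

    [XmlElement("errormessage")]
    public string ErrorMessage { get; set; }
}

// Hypothetical helper, named here only for illustration.
public static class RootNameDemo
{
    public const string Xml =
        "<Api><Authentication><sessionid>abc</sessionid>" +
        "<errormessage>none</errormessage></Authentication></Api>";

    public static AuthenticationInfo ReadFragment(string xml, XmlSerializer serializer)
    {
        using (var stringReader = new StringReader(xml))
        using (var xmlReader = XmlReader.Create(stringReader))
        {
            if (!xmlReader.ReadToDescendant("Authentication"))
            {
                throw new ArgumentException("No <Authentication> element found.", nameof(xml));
            }
            using (var subtree = xmlReader.ReadSubtree())
            {
                return (AuthenticationInfo)serializer.Deserialize(subtree);
            }
        }
    }
}
```

With new XmlSerializer(typeof(AuthenticationInfo)) the serializer expects a root element called AuthenticationInfo, so ReadFragment throws an InvalidOperationException whose inner exception reports that the Authentication element was not expected. Passing new XmlSerializer(typeof(AuthenticationInfo), new XmlRootAttribute("Authentication")) tells it which element name to accept, and the fragment deserializes.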

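Update 20th March 2017: As a comment pointed out, there is a memory leak when using most of the XmlSerializer constructors. Only XmlSerializer(Type) and XmlSerializer(Type, String) reuse their generated serialization assembly; the other overloads, including the one taking an XmlRootAttribute, generate a new dynamic assembly on every call, and it is never unloaded. So I ended up using a caching solution as shown below: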

using System;
using System.Collections.Concurrent;
using System.Globalization;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

/// <summary>
///     Deserialize XML string, optionally only an inner fragment of the XML, as specified by the innerStartTag parameter.
/// </summary>
// Extension method: place it inside a static, non-generic class.
public static T DeserializeXml<T>(this string @this, string innerStartTag = null) {
    using (var stringReader = new StringReader(@this)) {
        using (var xmlReader = XmlReader.Create(stringReader)) {
            if (innerStartTag != null) {
                xmlReader.ReadToDescendant(innerStartTag);
                var xmlSerializer = CachingXmlSerializerFactory.Create(typeof (T), new XmlRootAttribute(innerStartTag));
                return (T) xmlSerializer.Deserialize(xmlReader.ReadSubtree());
            }
            // "AutochartistAPI" is the root element name of my own documents; use your own root name here.
            return (T) CachingXmlSerializerFactory.Create(typeof (T), new XmlRootAttribute("AutochartistAPI")).Deserialize(xmlReader);
        }
    }
}

/// <summary>
///     A caching factory to avoid memory leaks in the XmlSerializer class.
/// See http://dotnetcodebox.blogspot.dk/2013/01/xmlserializer-class-may-result-in.html
/// </summary>
public static class CachingXmlSerializerFactory {
    private static readonly ConcurrentDictionary<string, XmlSerializer> Cache = new ConcurrentDictionary<string, XmlSerializer>();

    public static XmlSerializer Create(Type type, XmlRootAttribute root) {
        if (type == null) {
            throw new ArgumentNullException(nameof(type));
        }
        if (root == null) {
            throw new ArgumentNullException(nameof(root));
        }
        // The key includes the root namespace so that roots sharing an element name do not collide.
        var key = string.Format(CultureInfo.InvariantCulture, "{0}:{1}:{2}", type, root.Namespace, root.ElementName);
        return Cache.GetOrAdd(key, _ => new XmlSerializer(type, root));
    }

    public static XmlSerializer Create<T>(XmlRootAttribute root) {
        return Create(typeof (T), root);
    }

    public static XmlSerializer Create<T>() {
        return Create(typeof (T));
    }

    public static XmlSerializer Create<T>(string defaultNamespace) {
        return Create(typeof (T), defaultNamespace);
    }

    // These two constructors are cached by the framework itself, so no extra caching is needed.
    public static XmlSerializer Create(Type type) {
        return new XmlSerializer(type);
    }

    public static XmlSerializer Create(Type type, string defaultNamespace) {
        return new XmlSerializer(type, defaultNamespace);
    }
}
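The effect of the cache can be checked with a stripped-down sketch like the one below. SerializerCache and the Authentication class are hypothetical names for illustration, and the tuple key is my own choice rather than the string key used above; the point is only that repeated requests for the same type and root return the same XmlSerializer instance instead of building a new one each time.

```csharp
using System;
using System.Collections.Concurrent;
using System.Xml.Serialization;

// Hypothetical DTO used only to have something to serialize.
public class Authentication
{
    public string sessionid { get; set; }
    public string errormessage { get; set; }
}

// Hypothetical, minimal variant of the caching idea.
public static class SerializerCache
{
    private static readonly ConcurrentDictionary<Tuple<Type, string, string>, XmlSerializer> Cache =
        new ConcurrentDictionary<Tuple<Type, string, string>, XmlSerializer>();

    public static XmlSerializer For(Type type, string rootName, string rootNamespace = null)
    {
        if (type == null) throw new ArgumentNullException(nameof(type));
        if (rootName == null) throw new ArgumentNullException(nameof(rootName));

        // Tuples compare structurally, so an equal type, name and namespace hit the same entry.
        var key = Tuple.Create(type, rootName, rootNamespace);
        return Cache.GetOrAdd(key, k =>
            new XmlSerializer(k.Item1, new XmlRootAttribute(k.Item2) { Namespace = k.Item3 }));
    }
}
```

Under concurrent first use, GetOrAdd may run the factory more than once for the same key, but only one result is stored, so any extra generated assemblies are bounded by the number of racing threads rather than growing with every call.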
Stig Schmidt Nielsson answered Sep 20 '22 09:09