How to make XML deserialization faster?

I have the following piece of code:

public static object XmlDeserialize(string xml, Type objType)
{
    StringReader stream = null;
    XmlTextReader reader = null;
    try
    {
        XmlSerializer serializer = new XmlSerializer(objType);
        stream = new StringReader(xml); // Read xml data
        reader = new XmlTextReader(stream);  // Create reader
        return serializer.Deserialize(reader);
    }
    finally
    {
        if(stream != null) stream.Close();
        if(reader != null) reader.Close();
    }
}

The object itself has been generated via xsd.exe and looks kind of like this:

/// <remarks/>
[System.CodeDom.Compiler.GeneratedCodeAttribute("xsd", "2.0.50727.3038")]
[System.SerializableAttribute()]
[System.Diagnostics.DebuggerStepThroughAttribute()]
[System.ComponentModel.DesignerCategoryAttribute("code")]
[System.Xml.Serialization.XmlTypeAttribute(AnonymousType=true)]
[System.Xml.Serialization.XmlRootAttribute(Namespace="", IsNullable=false)]
public partial class MyObject {

    private DemographicsCriteriaStateStartAge[] startAgesField;

    private DemographicsCriteriaStateEndAge[] endAgesField;

    private DemographicsCriteriaStateFilter[] selectedFiltersField;

    /// <remarks/>
    [System.Xml.Serialization.XmlArrayItemAttribute("StartAge", IsNullable=false)]
    public DemographicsCriteriaStateStartAge[] StartAges {
        get {
            return this.startAgesField;
        }
        set {
            this.startAgesField = value;
        }
    }
    ...

The method is typically called like this:

var obj = (MyObject) XmlDeserialize(someXmlString, typeof(MyObject));

The following line of code always takes a pretty large chunk of time (compared to everything else):

XmlSerializer serializer = new XmlSerializer(objType);

What is going on here? Is it compiling a deserialization assembly in the background? Why is it so slow?

What can I do to ameliorate this performance problem?

asked Nov 16 '11 00:11 by AngryHacker



2 Answers

Try caching the instance of the XmlSerializer for each type at the class level so you don't have to recreate it each time if the same type is used:

class Foo
{
    private static Dictionary<Type, XmlSerializer> xmls = new Dictionary<Type, XmlSerializer>();

    // ...

    public static object XmlDeserialize(string xml, Type objType)
    {
        StringReader stream = null;
        XmlTextReader reader = null;
        try
        {
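            XmlSerializer serializer;
            // Dictionary<TKey, TValue> has no Contains(key) method; TryGetValue
            // checks for the key and fetches the value in a single lookup.
            // Note: Dictionary is not thread-safe, so guard this with a lock
            // if the method can be called from several threads.
            if (!xmls.TryGetValue(objType, out serializer)) {
                serializer = new XmlSerializer(objType);
                xmls[objType] = serializer;
            }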

            stream = new StringReader(xml); // Read xml data
            reader = new XmlTextReader(stream);  // Create reader
            return serializer.Deserialize(reader);
        }
        finally
        {
            if(stream != null) stream.Close();
            if(reader != null) reader.Close();
        }
    }
}
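
If the method may be called from several threads, the same caching idea can be sketched with a ConcurrentDictionary instead. This is only a minimal sketch, assuming .NET 4.0 or later; the names XmlSerializerCache, Deserialize<T> and Point are made up for illustration and are not part of the original code.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Xml.Serialization;

// Hypothetical type used only to demonstrate the cache.
public class Point
{
    public int X { get; set; }
    public int Y { get; set; }
}

// Hypothetical helper: keeps one XmlSerializer per type for the process lifetime.
public static class XmlSerializerCache
{
    private static readonly ConcurrentDictionary<Type, XmlSerializer> Serializers =
        new ConcurrentDictionary<Type, XmlSerializer>();

    public static T Deserialize<T>(string xml)
    {
        // GetOrAdd may invoke the factory more than once under contention,
        // but only one serializer per type ends up stored and reused.
        XmlSerializer serializer =
            Serializers.GetOrAdd(typeof(T), t => new XmlSerializer(t));

        // using disposes the reader even if Deserialize throws.
        using (var reader = new StringReader(xml))
        {
            return (T)serializer.Deserialize(reader);
        }
    }
}
```

A call would then look like `var p = XmlSerializerCache.Deserialize<Point>("<Point><X>1</X><Y>2</Y></Point>");`. The generic signature also removes the cast at the call site.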
answered Oct 20 '22 09:10 by Callum Rogers


Yes, it is dynamically generating a serialisation assembly at run time. You can change this behaviour in Visual Studio: open the project properties, go to the Build section, and set "Generate serialization assembly" to On. This generates a file like YourProject.XmlSerializers.dll when you compile and removes this bottleneck at run time.

One exception to note, however, is that by default this setting applies only to proxy types (for example, web service proxies and the like). To force Visual Studio 2010 to generate serialization assemblies for regular types, either edit the project file (.csproj) so that /proxytypes is no longer passed to the Sgen call (for example, by setting the SGenUseProxyTypes property to false), or add a post-build step that calls sgen.exe on the assembly manually.
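
To check whether the build change actually takes effect, one option is to time the constructor before and after. The following is a rough sketch only; SampleObject is a hypothetical stand-in for the xsd.exe-generated class, and SerializerTiming is a made-up helper name.

```csharp
using System;
using System.Diagnostics;
using System.Xml.Serialization;

// Hypothetical stand-in for the xsd.exe-generated class.
public class SampleObject
{
    public int[] StartAges { get; set; }
}

public static class SerializerTiming
{
    // Measures how long constructing an XmlSerializer for the given type takes.
    public static TimeSpan TimeConstruction(Type type)
    {
        Stopwatch watch = Stopwatch.StartNew();
        var serializer = new XmlSerializer(type);
        watch.Stop();

        // Keep the instance referenced until timing has finished.
        GC.KeepAlive(serializer);
        return watch.Elapsed;
    }
}
```

Calling `SerializerTiming.TimeConstruction(typeof(SampleObject))` once at startup and logging the result should show whether the pre-generated assembly is being picked up; the numbers will depend on the machine and the size of the type.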

answered Oct 20 '22 11:10 by Ben Robinson