Need a Better Way Than Reflection

Tags: c#, reflection

I'm reading a CSV file, and each record comes back as a string[]. I want to take each record and convert it into a custom object.

T GetMyObject<T>();

Currently I'm doing this through reflection, which is really slow. I'm testing with a 515 MB file with several million records. It takes under 10 seconds to parse. It takes under 20 seconds to create the custom objects using manual conversions with Convert.ToSomeType, but around 4 minutes to do the conversion to the objects through reflection.

What is a good way to handle this automatically?

It seems a lot of time is spent in the PropertyInfo.SetValue method. I tried caching each property's setter MethodInfo and invoking that instead, but it was actually slower.

I have also tried converting the setter into a delegate, as the great Jon Skeet suggested in "Improving performance reflection, what alternatives should I consider", but the problem is that I don't know what the property type is ahead of time. I'm able to get the delegate:

var myObject = Activator.CreateInstance<T>();
foreach( var property in typeof( T ).GetProperties() )
{
    // At runtime this is an Action<T, TProperty>, but its static type is only Delegate.
    var d = Delegate.CreateDelegate(
        typeof( Action<,> ).MakeGenericType( typeof( T ), property.PropertyType ),
        property.GetSetMethod() );
}

The problem here is I can't cast the delegate into a concrete type like Action<T, int>, because the property type of int isn't known ahead of time.

Josh Close asked Jan 11 '10 19:01

2 Answers

The first thing I'd say is write some sample code manually that tells you what the absolute best case you can expect is - see if your current code is worth fixing.
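
As a rough sketch of that kind of baseline check, the snippet below times hand-written assignment against PropertyInfo.SetValue using Stopwatch. The `Sample` class, the `Baseline` helper and the iteration count are made up for illustration; a real test would use your own entity type and realistic data, and the numbers will vary by machine and runtime.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;

// Hypothetical type used only for this measurement.
class Sample
{
    public int Id { get; set; }
}

static class Baseline
{
    // Times `iterations` direct property assignments (the best case).
    public static TimeSpan TimeDirect(int iterations)
    {
        var obj = new Sample();
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            obj.Id = i;
        }
        watch.Stop();
        return watch.Elapsed;
    }

    // Times the same assignments made through PropertyInfo.SetValue.
    public static TimeSpan TimeReflection(int iterations)
    {
        var obj = new Sample();
        PropertyInfo property = typeof(Sample).GetProperty("Id");
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            property.SetValue(obj, i, null); // boxes i on every call
        }
        watch.Stop();
        return watch.Elapsed;
    }
}
```

Calling `Baseline.TimeDirect(1000000)` and `Baseline.TimeReflection(1000000)` and printing both results gives a first idea of how much room there is between the two extremes.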

If you are using PropertyInfo.SetValue etc., then absolutely you can make it quicker, even with just object - HyperDescriptor might be a good start (it is significantly faster than raw reflection, but without making the code any more complicated).

For optimal performance, dynamic IL methods (compiled once and then reused) are the way to go; on .NET 2.0/3.0 that probably means DynamicMethod, but on 3.5 I'd favor Expression (with Compile()). Let me know if you want more detail.
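
To illustrate the Expression route on its own, here is a minimal sketch that compiles a property setter into an Action<T, object>, which sidesteps the need to know the property type at compile time. `SetterFactory` and `Person` are hypothetical names for this example only; the idea is to build each delegate once per property and cache it.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical entity used only to demonstrate the factory.
class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

static class SetterFactory
{
    // Builds (target, value) => target.Property = (TProperty)value
    // and compiles it into a reusable delegate.
    public static Action<T, object> Create<T>(PropertyInfo property)
        where T : class
    {
        ParameterExpression target = Expression.Parameter(typeof(T), "target");
        ParameterExpression value = Expression.Parameter(typeof(object), "value");

        // Calling the setter method directly also works on .NET 3.5,
        // where Expression.Assign is not available.
        MethodCallExpression call = Expression.Call(
            target,
            property.GetSetMethod(),
            Expression.Convert(value, property.PropertyType)); // unbox or cast

        return Expression.Lambda<Action<T, object>>(call, target, value).Compile();
    }
}
```

For example, `SetterFactory.Create<Person>(typeof(Person).GetProperty("Age"))` returns a delegate that can be invoked as `setter(person, 42)`. The value is still boxed on the way in, so this is a middle ground; building the whole object initializer in one expression avoids the boxing as well.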


Here is an implementation using Expression and CsvReader that uses the column headers to provide the mapping (it invents some data along the same lines). It returns IEnumerable<T> to avoid having to buffer the data (since you seem to have quite a lot of it):

using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;
using LumenWorks.Framework.IO.Csv;
class Entity
{
    public string Name { get; set; }
    public DateTime DateOfBirth { get; set; }
    public int Id { get; set; }

}
static class Program {

    static void Main()
    {
        string path = "data.csv";
        InventData(path);

        int count = 0;
        foreach (Entity obj in Read<Entity>(path))
        {
            count++;
        }
        Console.WriteLine(count);
    }
    static IEnumerable<T> Read<T>(string path)
        where T : class, new()
    {
        using (TextReader source = File.OpenText(path))
        // true: the first row contains the column headers
        using (CsvReader reader = new CsvReader(source, true, delimiter)) {

            string[] headers = reader.GetFieldHeaders();
            Type type = typeof(T);
            List<MemberBinding> bindings = new List<MemberBinding>();
            ParameterExpression param = Expression.Parameter(typeof(CsvReader), "row");
            MethodInfo method = typeof(CsvReader).GetProperty("Item",new [] {typeof(int)}).GetGetMethod();
            Expression invariantCulture = Expression.Constant(
                CultureInfo.InvariantCulture, typeof(IFormatProvider));
            for(int i = 0 ; i < headers.Length ; i++) {
                MemberInfo member = type.GetMember(headers[i]).Single();
                Type finalType;
                switch (member.MemberType)
                {
                    case MemberTypes.Field: finalType = ((FieldInfo)member).FieldType; break;
                    case MemberTypes.Property: finalType = ((PropertyInfo)member).PropertyType; break;
                    default: throw new NotSupportedException();
                }
                // row[i]: read column i as a string
                Expression val = Expression.Call(
                    param, method, Expression.Constant(i, typeof(int)));
                if (finalType != typeof(string))
                {
                    // finalType.Parse(row[i], CultureInfo.InvariantCulture)
                    val = Expression.Call(
                        finalType, "Parse", null, val, invariantCulture);
                }
                bindings.Add(Expression.Bind(member, val));
            }

            // new T { Member1 = ..., Member2 = ... }, compiled once and reused for every row
            Expression body = Expression.MemberInit(
                Expression.New(type), bindings);

            Func<CsvReader, T> func = Expression.Lambda<Func<CsvReader, T>>(body, param).Compile();
            while (reader.ReadNextRecord()) {
                yield return func(reader);
            }
        }
    }
    const char delimiter = '\t';
    static void InventData(string path)
    {
        Random rand = new Random(123456);
        using (TextWriter dest = File.CreateText(path))
        {
            dest.WriteLine("Id" + delimiter + "DateOfBirth" + delimiter + "Name");
            for (int i = 0; i < 10000; i++)
            {
                dest.Write(rand.Next(5000000));
                dest.Write(delimiter);
                dest.Write(new DateTime(
                    rand.Next(1960, 2010),
                    rand.Next(1, 13),
                    rand.Next(1, 28)).ToString(CultureInfo.InvariantCulture));
                dest.Write(delimiter);
                dest.Write("Fred");
                dest.WriteLine();
            }
            dest.Close();
        }
    }
}

Second version (see comments) that uses TypeConverter rather than Parse:

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Globalization;
using System.IO;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;
using LumenWorks.Framework.IO.Csv;
class Entity
{
    public string Name { get; set; }
    public DateTime DateOfBirth { get; set; }
    public int Id { get; set; }

}
static class Program
{

    static void Main()
    {
        string path = "data.csv";
        InventData(path);

        int count = 0;
        foreach (Entity obj in Read<Entity>(path))
        {
            count++;
        }
        Console.WriteLine(count);
    }
    static IEnumerable<T> Read<T>(string path)
        where T : class, new()
    {
        using (TextReader source = File.OpenText(path))
        using (CsvReader reader = new CsvReader(source, true, delimiter))
        {

            string[] headers = reader.GetFieldHeaders();
            Type type = typeof(T);
            List<MemberBinding> bindings = new List<MemberBinding>();
            ParameterExpression param = Expression.Parameter(typeof(CsvReader), "row");
            MethodInfo method = typeof(CsvReader).GetProperty("Item", new[] { typeof(int) }).GetGetMethod();

            var converters = new Dictionary<Type, ConstantExpression>();
            for (int i = 0; i < headers.Length; i++)
            {
                MemberInfo member = type.GetMember(headers[i]).Single();
                Type finalType;
                switch (member.MemberType)
                {
                    case MemberTypes.Field: finalType = ((FieldInfo)member).FieldType; break;
                    case MemberTypes.Property: finalType = ((PropertyInfo)member).PropertyType; break;
                    default: throw new NotSupportedException();
                }
                Expression val = Expression.Call(
                    param, method, Expression.Constant(i, typeof(int)));
                if (finalType != typeof(string))
                {
                    ConstantExpression converter;
                    if (!converters.TryGetValue(finalType, out converter))
                    {
                        converter = Expression.Constant(TypeDescriptor.GetConverter(finalType));
                        converters.Add(finalType, converter);
                    }
                    // (finalType)converter.ConvertFromInvariantString(row[i])
                    val = Expression.Convert(Expression.Call(converter, "ConvertFromInvariantString", null, val),
                        finalType);
                }
                bindings.Add(Expression.Bind(member, val));
            }

            Expression body = Expression.MemberInit(
                Expression.New(type), bindings);

            Func<CsvReader, T> func = Expression.Lambda<Func<CsvReader, T>>(body, param).Compile();
            while (reader.ReadNextRecord())
            {
                yield return func(reader);
            }
        }
    }
    const char delimiter = '\t';
    static void InventData(string path)
    {
        Random rand = new Random(123456);
        using (TextWriter dest = File.CreateText(path))
        {
            dest.WriteLine("Id" + delimiter + "DateOfBirth" + delimiter + "Name");
            for (int i = 0; i < 10000; i++)
            {
                dest.Write(rand.Next(5000000));
                dest.Write(delimiter);
                dest.Write(new DateTime(
                    rand.Next(1960, 2010),
                    rand.Next(1, 13),
                    rand.Next(1, 28)).ToString(CultureInfo.InvariantCulture));
                dest.Write(delimiter);
                dest.Write("Fred");
                dest.WriteLine();
            }
            dest.Close();
        }
    }
}
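
For reference, the conversion step used in the second version can be tried in isolation. The sketch below is only an illustration (`ConverterDemo` is a made-up helper name), showing that TypeDescriptor.GetConverter also covers types that have no static Parse(string, IFormatProvider) method, such as enums and nullable types.

```csharp
using System;
using System.ComponentModel;

static class ConverterDemo
{
    // Converts text to targetType through its TypeConverter,
    // always using the invariant culture.
    public static object FromInvariantString(Type targetType, string text)
    {
        TypeConverter converter = TypeDescriptor.GetConverter(targetType);
        return converter.ConvertFromInvariantString(text);
    }
}
```

For example, `(int)ConverterDemo.FromInvariantString(typeof(int), "42")` yields 42 and `(DayOfWeek)ConverterDemo.FromInvariantString(typeof(DayOfWeek), "Monday")` yields DayOfWeek.Monday. The result comes back as object, which is why the expression tree wraps the call in Expression.Convert.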
Marc Gravell answered Sep 18 '22 04:09

You should make a DynamicMethod or an expression tree and build statically typed code at runtime.

This will incur a rather large setup cost, but no per-object overhead at all.
However, it's somewhat difficult to do, and will result in complicated code that is difficult to debug.
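
As a hedged sketch of the DynamicMethod variant, the following emits IL for a single property setter and wraps it in an Action<T, object>. `EmitSetterFactory` and `Item` are hypothetical names for this example, and it assumes a reference type with an accessible instance setter; real code would cache one delegate per property rather than rebuild it.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

// Hypothetical entity used only to demonstrate the factory.
class Item
{
    public int Quantity { get; set; }
    public string Label { get; set; }
}

static class EmitSetterFactory
{
    // Emits the equivalent of: void Set(T target, object value) { target.Property = (TProperty)value; }
    public static Action<T, object> Create<T>(PropertyInfo property)
        where T : class
    {
        var method = new DynamicMethod(
            "Set_" + property.Name,
            typeof(void),
            new[] { typeof(T), typeof(object) },
            typeof(T).Module,
            true); // skip visibility checks so non-public types and setters can be used

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);                                 // push target
        il.Emit(OpCodes.Ldarg_1);                                 // push the boxed value
        il.Emit(OpCodes.Unbox_Any, property.PropertyType);        // unbox (or cast) to the property type
        il.Emit(OpCodes.Callvirt, property.GetSetMethod(true));   // call the setter
        il.Emit(OpCodes.Ret);

        return (Action<T, object>)method.CreateDelegate(typeof(Action<T, object>));
    }
}
```

Usage mirrors any other delegate: `EmitSetterFactory.Create<Item>(typeof(Item).GetProperty("Quantity"))` returns a setter that can be called as `setter(item, 3)`. The setup cost is paid once when the delegate is created, and mistakes in the emitted IL only surface at runtime, which is the debugging difficulty mentioned above.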

SLaks answered Sep 19 '22 04:09