Is it possible to speed this method up?

I have a method that loops through 7,753+ objects and gets the value of each property of each object. Each object has 14 properties.

private void InitializeData(IList objects, PropertyInfo[] props, List<DPV> dataPs, List<Dictionary<string, object>> tod)
{
    foreach (var item in objects)
    {
        var kvp = new Dictionary<string, object>();
        foreach (var p in props)
        {
            var dataP = dataPs.FirstOrDefault(x => x.Name == p.Name);
            object returnData;
            if (dataP != null)
            {
                int maxLength = (dataP.MaxLength == null) ? 0 : (int) dataP.MaxLength;
                returnData = p.GetValue(item, null);
                if (!string.IsNullOrEmpty(dataP.FormatString) && !string.IsNullOrEmpty(returnData.ToString()))
                {
                    returnData = FormatDataForDisplay(returnData, dataP, maxLength, "", 8);
                }
            }
            else
            {
                returnData = p.GetValue(item, null);
            }
            kvp.Add(p.Name, returnData);
        }
        tod.Add(kvp);
    }
}

I believe GetValue takes the majority of the time in this method. The method took around 900ms to run, and GetValue, which is called 800,000+ times, accounts for around 750ms (total, not per call).

public List<Dictionary<string, object>> GetColumnOptions<T>(List<T> list)
    {

        var tod= new List<Dictionary<string, object>>();



        var objects = (IList)list[0];
        Type objType = objects[0].GetType();

        var props = objType.GetProperties(BindingFlags.DeclaredOnly |
                                                         BindingFlags.Public |
                                                         BindingFlags.Instance);


        var dPs= GetDPs();



        //Initialize aaData
        //I don't believe this is correct
        InitializeData2<T>(new List<T> { (T) objects}, props, dPs, tod);

        return tod;
    }
Xaisoft asked Jul 15 '13 17:07


3 Answers

For your value class you can create direct setter and getter lambdas.
Their performance is nearly as fast as accessing the properties directly.

Get Setter from PropertyInfo

var propertyInfo = typeof(MyType).GetProperty("MyPropertyValue");
var propertySetter = FastInvoke.BuildUntypedSetter<MyType>(propertyInfo);
var fieldInfo = typeof(MyType).GetField("MyFieldValue");
var fieldSetter = FastInvoke.BuildUntypedSetter<MyType>(fieldInfo);

Usage in a loop

var myTarget = new MyType();
propertySetter(myTarget, aNewValue);

Helper for building fast setters and getters

public static class FastInvoke {

    public static Func<T, object> BuildUntypedGetter<T>(MemberInfo memberInfo)
    {
        var targetType = memberInfo.DeclaringType;
        var exInstance = Expression.Parameter(targetType, "t");

        var exMemberAccess = Expression.MakeMemberAccess(exInstance, memberInfo);       // t.PropertyName
        var exConvertToObject = Expression.Convert(exMemberAccess, typeof(object));     // Convert(t.PropertyName, typeof(object))
        var lambda = Expression.Lambda<Func<T, object>>(exConvertToObject, exInstance);

        var action = lambda.Compile();
        return action;
    }

    public static Action<T, object> BuildUntypedSetter<T>(MemberInfo memberInfo)
    {
        var targetType = memberInfo.DeclaringType;
        var exInstance = Expression.Parameter(targetType, "t");

        var exMemberAccess = Expression.MakeMemberAccess(exInstance, memberInfo);

        // t.PropertyValue = Convert(p)
        var exValue = Expression.Parameter(typeof(object), "p");
        var exConvertedValue = Expression.Convert(exValue, GetUnderlyingType(memberInfo));
        var exBody = Expression.Assign(exMemberAccess, exConvertedValue);

        var lambda = Expression.Lambda<Action<T, object>>(exBody, exInstance, exValue);
        var action = lambda.Compile();
        return action;
    }

    private static Type GetUnderlyingType(this MemberInfo member)
    {
        switch (member.MemberType)
        {
            case MemberTypes.Event:
                return ((EventInfo)member).EventHandlerType;
            case MemberTypes.Field:
                return ((FieldInfo)member).FieldType;
            case MemberTypes.Method:
                return ((MethodInfo)member).ReturnType;
            case MemberTypes.Property:
                return ((PropertyInfo)member).PropertyType;
            default:
                throw new ArgumentException
                (
                 "Input MemberInfo must be of type EventInfo, FieldInfo, MethodInfo, or PropertyInfo"
                );
        }
    }
}

============= Performance Analysis Added ===================

5 million objects, 20 properties

  • 3.4s direct Property access
  • 130.0s via PropertyInfo.SetValue
  • 4.0s via TypedSetter (code shown in article)
  • 9.8s via UnTypedSetter (code above)

The trick is to generate the property setter and getter once for each class and reuse them.

// Create and fill objects fast from a DataReader
// http://flurfunk.sdx-ag.de/2012/05/c-performance-bei-der-befullungmapping.html 
static List<T> CreateObjectFromReader<T>(IDataReader reader)
    where T : new()
{
  // Prepare (GetFieldNames is a helper that is not shown here)
  List<string> fieldNames = GetFieldNames(reader);
  List<Action<T, object>> setterList = new List<Action<T, object>>();
  List<T> result = new List<T>();
 
  // Create the property setters once and store them in an array
  foreach (var field in fieldNames)
  {
    var propertyInfo = typeof(T).GetProperty(field);
    setterList.Add(FastInvoke.BuildUntypedSetter<T>(propertyInfo));
  }
  Action<T, object>[] setterArray = setterList.ToArray();
 
  // Generate and fill objects
  while (reader.Read())
  {
    T xclass = new T();
 
    for (int i = 0; i < setterArray.Length; i++) 
    {
        // call the cached setter for column i
        setterArray[i](xclass, reader.GetValue(i));
    } 
    result.Add(xclass);
  }
  return result;
}

My original article (German text and older code) is archived at https://web.archive.org/web/20141020092917/http://flurfunk.sdx-ag.de/2012/05/c-performance-bei-der-befullungmapping.html
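Since the question is about reading values rather than writing them, the same "build once, reuse" idea can be sketched for getters. This is only an illustration under assumptions: GetterCache<T> and ToDictionary are hypothetical names that do not appear in the article, and only public, non-indexed properties with a public getter are considered.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical helper (not part of the original answer): compiles one getter
// per public property of T and keeps them for the lifetime of the process.
public static class GetterCache<T>
{
    // A static field in a generic class is initialized once per closed type T,
    // so the expression trees are compiled only once for each class.
    private static readonly KeyValuePair<string, Func<T, object>>[] Getters =
        typeof(T)
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => p.GetGetMethod() != null && p.GetIndexParameters().Length == 0)
            .Select(p => new KeyValuePair<string, Func<T, object>>(p.Name, Build(p)))
            .ToArray();

    private static Func<T, object> Build(PropertyInfo property)
    {
        var instance = Expression.Parameter(typeof(T), "t");
        var access = Expression.Property(instance, property);    // t.Property
        var boxed = Expression.Convert(access, typeof(object));  // (object)t.Property
        return Expression.Lambda<Func<T, object>>(boxed, instance).Compile();
    }

    // Reads every cached property of one object into a name/value dictionary.
    public static Dictionary<string, object> ToDictionary(T item)
    {
        var result = new Dictionary<string, object>(Getters.Length);
        foreach (var pair in Getters)
            result.Add(pair.Key, pair.Value(item));
        return result;
    }
}
```

Calling GetterCache<MyType>.ToDictionary(item) inside the loop over the objects would then replace the per-property GetValue calls; whether that pays off for a given workload is something to measure.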

Fried answered Nov 17 '22 06:11


If the problem really is the PropertyInfo.GetValue call, you can build a cache of property getters (for example, via compiled expressions). Here is a sample demonstrating that this approach is up to 30-40% faster than the original method on 8000 objects with 14 properties (with a hot cache):

static void Main(string[] args) {
    IList objects = new List<Obj>();
    for(int i = 0; i < 8000; i++)
        objects.Add(new Obj());
    var properties = typeof(Obj).GetProperties();


    var sw1 = System.Diagnostics.Stopwatch.StartNew();
    InitializeData1(objects, properties, new List<Dictionary<string, object>>());
    sw1.Stop();
    Console.WriteLine("Reflection PropertyInfo.GetValue: " + sw1.ElapsedTicks.ToString());

    // cold cache testing
    var sw2_coldCache = System.Diagnostics.Stopwatch.StartNew();
    InitializeData2<Obj>(objects, properties, new List<Dictionary<string, object>>(), new Dictionary<string, Func<Obj, object>>());
    sw2_coldCache.Stop();
    Console.WriteLine("Cached Getters (Cold cache): " + sw2_coldCache.ElapsedTicks.ToString());

    // cache initialization
    InitializeData2<Obj>(new List<Obj> { new Obj() }, properties, new List<Dictionary<string, object>>(), gettersCache);
    // hot cache testing
    var sw2_hotCache = System.Diagnostics.Stopwatch.StartNew();
    InitializeData2<Obj>(objects, properties, new List<Dictionary<string, object>>(), gettersCache);
    sw2_hotCache.Stop();
    Console.WriteLine("Cached Getters (Hot cache): " + sw2_hotCache.ElapsedTicks.ToString());

    var sw3 = System.Diagnostics.Stopwatch.StartNew();
    InitializeData3(objects, properties, new List<Dictionary<string, object>>());
    sw3.Stop();
    Console.WriteLine("returnProps special method: " + sw3.ElapsedTicks.ToString());

    var sw4 = System.Diagnostics.Stopwatch.StartNew();
    InitializeData2_NonGeneric(objects, properties, new List<Dictionary<string, object>>());
    sw4.Stop();
    Console.WriteLine("Cached Getters (runtime types resolving): " + sw4.ElapsedTicks.ToString());
}

Here is the original implementation (reduced for test purposes):

static void InitializeData1(IList objects, PropertyInfo[] props, List<Dictionary<string, object>> tod) {
    foreach(var item in objects) {
        var kvp = new Dictionary<string, object>();
        foreach(var p in props) {
            kvp.Add(p.Name, p.GetValue(item, null));
        }
        tod.Add(kvp);
    }
}

Here is the optimized implementation:

static IDictionary<string, Func<Obj, object>> gettersCache = new Dictionary<string, Func<Obj, object>>();
static void InitializeData2<T>(IList objects, PropertyInfo[] props, List<Dictionary<string, object>> tod, IDictionary<string, Func<T, object>> getters) {
    Func<T, object> getter;
    foreach(T item in objects) {
        var kvp = new Dictionary<string, object>();
        foreach(var p in props) {
            if(!getters.TryGetValue(p.Name, out getter)) {
                getter = GetValueGetter<T>(p);
                getters.Add(p.Name, getter);
            }
            kvp.Add(p.Name, getter(item));
        }
        tod.Add(kvp);
    }
}

static Func<T, object> GetValueGetter<T>(PropertyInfo propertyInfo) {
    var instance = System.Linq.Expressions.Expression.Parameter(propertyInfo.DeclaringType, "i");
    var property = System.Linq.Expressions.Expression.Property(instance, propertyInfo);
    var convert = System.Linq.Expressions.Expression.TypeAs(property, typeof(object));
    return (Func<T, object>)System.Linq.Expressions.Expression.Lambda(convert, instance).Compile();
}

Test class:

class Obj {
    public int p00 { set; get; }
    public string p01 { set; get; }
    public float p02 { set; get; }
    public double p03 { set; get; }
    public char p04 { set; get; }
    public byte p05 { set; get; }
    public long p06 { set; get; }
    public int p07 { set; get; }
    public string p08 { set; get; }
    public float p09 { set; get; }
    public double p10 { set; get; }
    public char p11 { set; get; }
    public byte p12 { set; get; }
    public long p13 { set; get; }
}

Update: Added solution from varocarbas into tests

static void InitializeData3(IList objects, PropertyInfo[] props, List<Dictionary<string, object>> tod) {
    foreach(Obj item in objects) {
        var kvp = new Dictionary<string, object>();
        foreach(var p in props) {
            kvp.Add(p.Name, returnProps(p.Name, item));
        }
        tod.Add(kvp);
    }
}
static object returnProps(string propName, Obj curObject) {
    if(propName == "p00") {
        return curObject.p00;
    }
    else if(propName == "p01") {
        return curObject.p01;
    }
    else if(propName == "p02") {
        return curObject.p02;
    }
    else if(propName == "p03") {
        return curObject.p03;
    }
    else if(propName == "p04") {
        return curObject.p04;
    }
    else if(propName == "p05") {
        return curObject.p05;
    }
    else if(propName == "p06") {
        return curObject.p06;
    }
    else if(propName == "p07") {
        return curObject.p07;
    }
    else if(propName == "p08") {
        return curObject.p08;
    }
    else if(propName == "p09") {
        return curObject.p09;
    }
    else if(propName == "p10") {
        return curObject.p10;
    }
    else if(propName == "p11") {
        return curObject.p11;
    }
    else if(propName == "p12") {
        return curObject.p12;
    }
    else if(propName == "p13") {
        return curObject.p13;
    }
    return new object();
}

Console Results: (Release, x64) (Core i5 M560 @2.67 GHz, 8GB RAM, Win7x64)

Reflection PropertyInfo.GetValue: 161288
Cached Getters (Cold cache): 153808
Cached Getters (Hot cache): 110837
returnProps special method: 128905

Thus, the caching approach is the best.

UPDATE2
The methods demonstrated in the sample are intended for cases where the element type of objects is known at compile time (the generic way):

InitializeData2<Obj>(...)

If the element type of the objects list is unknown at compile time, you can use the following approach to invoke the generic InitializeData2<> method at run time:

InitializeData2_NonGeneric(objects, properties, new List<Dictionary<string, object>>());
//...
static void InitializeData2_NonGeneric(IList objects, PropertyInfo[] props, List<Dictionary<string, object>> tod) {
    Type elementType = objects[0].GetType();
    var genericMethodInfo = typeof(Program).GetMethod("InitializeData2", BindingFlags.Static | BindingFlags.NonPublic);
    var genericMethod = genericMethodInfo.MakeGenericMethod(new Type[] { elementType });

    var genericGetterType = typeof(Func<,>).MakeGenericType(elementType,typeof(object));
    var genericCacheType = typeof(Dictionary<,>).MakeGenericType(typeof(string), genericGetterType);
    var genericCacheConstructor = genericCacheType.GetConstructor(new Type[] { });
    genericMethod.Invoke(null, new object[] { objects, props, tod, genericCacheConstructor.Invoke(new object[] { }) });
}
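A possible alternative for the run-time case, sketched here under assumptions rather than taken from the tests above: compile each getter as Func<object, object>, with the cast to the declaring type placed inside the expression, so no generic method has to be invoked through reflection. ObjectGetter, Build and ReadAll are hypothetical names, and this variant was not part of the measurements.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical helper: getters that take and return object, for element
// types that are only known at run time.
public static class ObjectGetter
{
    // Builds obj => (object)((DeclaringType)obj).Property
    public static Func<object, object> Build(PropertyInfo property)
    {
        var obj = Expression.Parameter(typeof(object), "obj");
        var typed = Expression.Convert(obj, property.DeclaringType);  // cast to the owner type
        var access = Expression.Property(typed, property);            // read the property
        var boxed = Expression.Convert(access, typeof(object));       // box value types
        return Expression.Lambda<Func<object, object>>(boxed, obj).Compile();
    }

    // Same shape as InitializeData1, but the getters are compiled once up front.
    public static List<Dictionary<string, object>> ReadAll(IList objects, PropertyInfo[] props)
    {
        var getters = new Func<object, object>[props.Length];
        for (int i = 0; i < props.Length; i++)
            getters[i] = Build(props[i]);

        var rows = new List<Dictionary<string, object>>(objects.Count);
        foreach (var item in objects)
        {
            var row = new Dictionary<string, object>(props.Length);
            for (int i = 0; i < props.Length; i++)
                row.Add(props[i].Name, getters[i](item));
            rows.Add(row);
        }
        return rows;
    }
}
```

The extra cast costs a little per call compared with the typed Func<T, object> getters, so it is worth timing both variants before choosing one.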
DmitryG answered Nov 17 '22 06:11


I did a simple test in which I replaced the problematic .GetValue with a function performing a simplistic assignment ("if the name of the property is blabla, the value is Object.blabla"). The test consists of a simplified version of your function/variables/properties and a loop that gives full control over the number of iterations. The results were certainly surprising: the new approach is 10 times faster! In my original tests (50000 iterations) the times were 2276 ms (old) vs. 234 ms (new). This difference remains constant across scenarios; for example, 8000 iterations take 358 ms vs. 36 ms. I ran these tests on a pretty powerful computer and in a C# WinForms application; @Xaisoft can take the code below, run it under his specific conditions and report the results.

The code:

 private void Form1_Load(object sender, EventArgs e)
 {
     List<List> var = new List<List>();

     List var1 = new List();
     var1.var = 1;
     var1.var2 = 1;
     var1.var3 = 1;
     var1.var4 = 1;
     var1.var5 = 1;

     List var2 = new List();
     var2.var = 1;
     var2.var2 = 1;
     var2.var3 = 1;
     var2.var4 = 1;
     var2.var5 = 1;

     List var3 = new List();
     var3.var = 1;
     var3.var2 = 1;
     var3.var3 = 1;
     var3.var4 = 1;
     var3.var5 = 1;

     List var4 = new List();
     var4.var = 1;
     var4.var2 = 1;
     var4.var3 = 1;
     var4.var4 = 1;
     var4.var5 = 1;

     var.Add(var1);
     var.Add(var2);
     var.Add(var3);
     var.Add(var4);

     InitializeData(var, typeof(List).GetProperties());
 }

 private static void InitializeData(List<List> objects, PropertyInfo[] props)
 {
     DateTime start = DateTime.Now;

     int count = 0;
     do
     {
         count = count + 1;
         foreach (var item in objects)
         {

             foreach (var p in props)
             {
                 object returnData = p.GetValue(item, null); //returnProps(p.Name, item);
             }
         }

     } while (count < 50000);


     TimeSpan timer = new TimeSpan();
     timer = DateTime.Now.Subtract(start);
 }

 private class List
 {
     public int var { set; get; }
     public int var2 { set; get; }
     public int var3 { set; get; }
     public int var4 { set; get; }
     public int var5 { set; get; }
     public int var6 { set; get; }
     public int var7 { set; get; }
     public int var8 { set; get; }
     public int var9 { set; get; }
     public int var10 { set; get; }
     public int var11 { set; get; }
     public int var12 { set; get; }
     public int var13 { set; get; }
     public int var14 { set; get; }
 }
 private static object returnProps(string propName, List curObject)
 {
     if (propName == "var")
     {
         return curObject.var;
     }
     else if (propName == "var2")
     {
         return curObject.var2;
     }
     else if (propName == "var3")
     {
         return curObject.var3;
     }
     else if (propName == "var4")
     {
         return curObject.var4;
     }
     else if (propName == "var5")
     {
         return curObject.var5;
     }
     else if (propName == "var6")
     {
         return curObject.var6;
     }
     else if (propName == "var7")
     {
         return curObject.var7;
     }
     else if (propName == "var8")
     {
         return curObject.var8;
     }
     else if (propName == "var9")
     {
         return curObject.var9;
     }
     else if (propName == "var10")
     {
         return curObject.var10;
     }
     else if (propName == "var11")
     {
         return curObject.var11;
     }
     else if (propName == "var12")
     {
         return curObject.var12;
     }
     else if (propName == "var13")
     {
         return curObject.var13;
     }
     else if (propName == "var14")
     {
         return curObject.var14;
     }

     return new object();
 }

FINAL NOTE: I would like people to read these impressive results more generally than as something that applies only to .GetValue. It is true that today's computers can handle a lot and that you don't really need to maximise the performance of every single bit. On the other hand, if you do have performance problems and need to "save resources" in a more significant way, you should focus your improvements on the idea "the simpler, the quicker". I have made performance improvements myself in code using a considerable number of Lists and Dictionaries, and the results are noticeable even after a single change (List into conventional Array). There is no need to be alarmist on this front but, if it is required, remember that the memory consumption and time requirements of a List are higher than those of an Array (and both do basically the same thing). The same holds for multi-dimensional arrays, long arrays, etc.

------ MORE DETAILED PERFORMANCE ANALYSIS

Even though I have made my point very clear from the start (it is just an idea that has to be adapted to each situation), I do understand that my claim (10 times faster) requires a proper definition. I have been running tests under different conditions, and here are the results:

NOTE: the aforementioned results were output by a 32-bit executable; all the ones below come from a 64-bit one. I observed an improvement in .GetValue performance when moving from 32-bit to 64-bit. The updated 64-bit versions of the results above are (ms):

                      GetValue       Direct Assignation     
50000 iterations ->    1197                 157
80000 iterations ->    1922                 253
100000 iterations ->   2354                 310

Thus, the ratio changes from 10 times to 7.5 times.

I started increasing the number of properties (always on 64-bit) and GetValue became relatively better and better. Results:

28 Properties
                          GetValue       Direct Assignation     
    50000 iterations ->    2386                552
    80000 iterations ->    3857                872

Aver. ratio = 4.37

50 Properties
                          GetValue       Direct Assignation     
    50000 iterations ->    4292                1707
    80000 iterations ->    6772                2711

Aver. ratio = 2.475

I am not sure whether GetValue will keep improving until it beats the simplistic approach, but who cares? At this point it is clear that an increasing number of properties works against the simplistic approach, so it is time to try a different (again pretty simplistic) alternative: a global array storing all the properties.

  private static int[,] List0;

The array is populated in parallel with the properties (i.e., whenever object.propX = some value is executed, the corresponding position in the array is also set) and is addressed by object/property position (first object, third property, etc.). Logically, this limits the number of objects (growing the first dimension above 1000 does not seem advisable), but you can rely on several arrays (one storing the first object to the 1000th, another the 1001st to the 2000th, etc.) and write a function that takes the object name as argument and returns the corresponding array.
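To make the "populated in parallel" part concrete, here is a rough sketch of how a property setter could mirror its value into the shared array. MirroredStore, Row, Var1 and Var2 are hypothetical names chosen for illustration; this is not the code that produced the timings in this answer.

```csharp
using System;

// Hypothetical container for the shared 2D array:
// first index = object position, second index = property position.
public sealed class MirroredStore
{
    public readonly int[,] Values;

    public MirroredStore(int objectCount, int propertyCount)
    {
        Values = new int[objectCount, propertyCount];
    }
}

// Hypothetical object whose setters also write into the shared array.
public sealed class Row
{
    private readonly MirroredStore _store;
    private readonly int _index;   // position of this object in the first dimension
    private int _var1;
    private int _var2;

    public Row(MirroredStore store, int index)
    {
        _store = store;
        _index = index;
    }

    public int Var1
    {
        get { return _var1; }
        set { _var1 = value; _store.Values[_index, 0] = value; }  // property 0
    }

    public int Var2
    {
        get { return _var2; }
        set { _var2 = value; _store.Values[_index, 1] = value; }  // property 1
    }
}
```

The price of this design is that every setter has to know its own column index, which is the same kind of hardcoding effort as the if/else chain.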

Modifications in the main loop:

int countObject = -1;
foreach (var item in objects)
{
    countObject = countObject + 1;
    int countProp = -1;
    foreach (var p in props)
    {
        countProp = countProp + 1;
        object returnData = List0[countObject, countProp];
    }
}

By running this new approach in the case above, I get:

50 Properties
                         GetValue           2D Array    
   80000 iterations ->    6772                155

Aver. ratio = 45.146

One more:

70 Properties
                          GetValue          2D Array     
    80000 iterations ->    10444               213

Aver. ratio = 49.06

And I stopped my tests here. I guess that it is more than enough to prove my point.

Different approaches deliver different performance under different conditions, so the best way to find the ideal configuration for a situation is to actually test it. Relying on an ultimate truth is rarely the best solution to a problem (although I might be wrong... I am still waiting for the reply from DmitryG to test his solution under different conditions). Thus, UNDER THE TESTED CONDITIONS, the original simplistic approach seems acceptable when the number of properties is relatively low (i.e., below 20); above this, the required hardcoding effort does not seem worthwhile, and a different alternative (like the 2D array I proposed) is better. In any case, GetValue clearly performs badly, and this can be improved in many different ways.
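For anyone repeating such tests under their own conditions, a small timing helper along these lines might be handy. It is a sketch with hypothetical names (Bench, Measure); it uses Stopwatch instead of the DateTime.Now subtraction in the code above and adds one untimed warm-up call so that JIT compilation is not counted.

```csharp
using System;
using System.Diagnostics;

// Hypothetical timing helper, not the harness used for the numbers above.
public static class Bench
{
    // Runs the action once untimed (warm-up), then `iterations` times
    // under a Stopwatch, and returns the total elapsed time.
    public static TimeSpan Measure(Action action, int iterations)
    {
        action();  // warm-up call, excluded from the measurement

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            action();
        sw.Stop();

        return sw.Elapsed;
    }
}
```

Something like Bench.Measure(() => p.GetValue(item, null), 50000) could then be compared against the same call with returnProps(p.Name, item), ideally in a Release build.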

I hope that I will not need to update this answer again :)

varocarbas answered Nov 17 '22 06:11