Creating DistinctBy using Expression trees

I want to create an extension method on IQueryable that lets the caller name a property in a string and get back the collection made distinct by that property. I want to use HashSet-based logic. I basically want to emulate this code:

HashSet<TResult> set = new HashSet<TResult>();

foreach(var item in source)
{
    var selectedValue = selector(item);

    if (set.Add(selectedValue))
        yield return item;
}

using expression trees.
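For reference, a complete in-memory version of that loop might look like the sketch below. The class name `DistinctSketch` and the method name `DistinctByKey` are made up for illustration (the name is chosen so it does not clash with the `Enumerable.DistinctBy` that newer .NET versions ship). It works on `IEnumerable<T>` with a delegate selector, so it shows the target behaviour only, not the expression-tree version.

```csharp
using System;
using System.Collections.Generic;

public static class DistinctSketch
{
    // Hypothetical helper: yields the first item seen for each distinct key.
    public static IEnumerable<TSource> DistinctByKey<TSource, TResult>(
        this IEnumerable<TSource> source, Func<TSource, TResult> selector)
    {
        var set = new HashSet<TResult>();

        foreach (var item in source)
        {
            // HashSet<T>.Add returns false when the value is already present,
            // so only the first item per key is yielded.
            if (set.Add(selector(item)))
                yield return item;
        }
    }
}
```

Calling `DistinctByKey` on a list of words with `s => s[0]` as the selector would keep one word per starting letter, in the order the words first appear.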

This is what I have so far:

    private Expression AssembleDistinctBlockExpression(IQueryable queryable, string propertyName)
    {
        var propInfo = queryable.ElementType.GetProperty(propertyName);
        if ( propInfo == null )
            throw new ArgumentException();

        var loopVar = Expression.Parameter(queryable.ElementType, "");
        var selectedValue = Expression.Variable(propInfo.PropertyType, "selectedValue");

        var returnListType = typeof(List<>).MakeGenericType(queryable.ElementType);
        var returnListVar = Expression.Variable(returnListType, "return");
        var returnListAssign = Expression.Assign(returnListVar, Expression.Constant(Activator.CreateInstance(typeof(List<>).MakeGenericType(queryable.ElementType))));
        var hashSetType = typeof(HashSet<>).MakeGenericType(propInfo.PropertyType);
        var hashSetVar = Expression.Variable(hashSetType, "set");
        var hashSetAssign = Expression.Assign(hashSetVar, Expression.Constant(Activator.CreateInstance(typeof(HashSet<>).MakeGenericType(propInfo.PropertyType))));

        var enumeratorVar = Expression.Variable(typeof(IEnumerator<>).MakeGenericType(queryable.ElementType), "enumerator");
        var getEnumeratorCall = Expression.Call(queryable.Expression, queryable.GetType().GetTypeInfo().GetDeclaredMethod("GetEnumerator"));
        var enumeratorAssign = Expression.Assign(enumeratorVar, getEnumeratorCall);

        var moveNextCall = Expression.Call(enumeratorVar, typeof(IEnumerator).GetMethod("MoveNext"));

        var breakLabel = Expression.Label("loopBreak");

        var loopBlock = Expression.Block(
            new [] { enumeratorVar, hashSetVar, returnListVar },
            enumeratorAssign,
            returnListAssign,
            hashSetAssign,
            Expression.TryFinally(
                Expression.Block(
                    Expression.Loop(
                        Expression.IfThenElse(
                        Expression.Equal(moveNextCall, Expression.Constant(true)),
                        Expression.Block(
                            new[] { loopVar },
                            Expression.Assign(loopVar, Expression.Property(enumeratorVar, "Current")),
                            Expression.Assign(selectedValue, Expression.MakeMemberAccess(loopVar, propInfo)),
                            Expression.IfThen(
                                Expression.Call(typeof(HashSet<>), "Add", new Type[] { propInfo.PropertyType }, hashSetVar, selectedValue),
                                Expression.Call(typeof(List<>), "Add", new Type[] { queryable.ElementType }, returnListVar, loopVar)
                                )
                            ),
                        Expression.Break(breakLabel)
                        ),
                    breakLabel
                    ),
                    Expression.Return(breakLabel, returnListVar)
                ),
                Expression.Block(
                    Expression.Call(enumeratorVar, typeof(IDisposable).GetMethod("Dispose"))
                )
            )
        );
        return loopBlock;
    }

While the expression assigned to loopBlock is being built, I get the following exception:

No method 'Add' exists on type 'System.Collections.Generic.HashSet`1[T]'.

Asked by Nikola.Lukovic on May 28 '16 at 22:05



1 Answer

The Expression.Call method overload that you are using is for static methods.

Quoting the documentation for that overload:

Creates a MethodCallExpression that represents a call to a static (Shared in Visual Basic) method by calling the appropriate factory method.
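The failure can be reproduced in isolation. The sketch below is illustrative only (the class and method names are made up): it passes a `Type` as the first argument, as the question's code does, so the lookup searches for a static method named `Add` and finds none. On the runtimes I am aware of this surfaces as an `InvalidOperationException`; the exact message text may differ between .NET versions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

public static class StaticOverloadRepro
{
    // Hypothetical helper: returns true if building the call expression
    // fails with InvalidOperationException.
    public static bool StaticLookupFails()
    {
        var set = Expression.Variable(typeof(HashSet<int>), "set");
        var value = Expression.Constant(42);

        try
        {
            // Type-based overload: looks for a *static* method named "Add"
            // on typeof(HashSet<>), which does not exist.
            Expression.Call(typeof(HashSet<>), "Add", new[] { typeof(int) }, set, value);
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }
}
```

Note that the exception is thrown while the tree is being built, before anything is compiled or executed.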

You need to use the overload of Expression.Call that is meant for calling instance methods instead.

Here is what the relevant part of your code would look like:

Expression.IfThen(
    Expression.Call(hashSetVar, "Add", new Type[] { }, selectedValue),
    Expression.Call(returnListVar, "Add", new Type[] { }, loopVar))

Notice that we now pass the instance expression on which the method is invoked as the first argument of Expression.Call.
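To show the instance overload inside a complete loop, here is a minimal, self-contained sketch. It is not the question's method: it assumes an `IEnumerable<T>` parameter instead of an `IQueryable`, creates the collections with `Expression.New`, declares the key variable in the inner block, and returns the list as the last expression of the block rather than through `Expression.Return`. The names `DistinctByTreeSketch` and `Build` are hypothetical.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq.Expressions;

public static class DistinctByTreeSketch
{
    // Hypothetical helper: compiles a delegate that keeps the first item per property value.
    public static Func<IEnumerable<T>, List<T>> Build<T>(string propertyName)
    {
        var propInfo = typeof(T).GetProperty(propertyName);
        if (propInfo == null)
            throw new ArgumentException("Unknown property: " + propertyName, nameof(propertyName));

        var setType = typeof(HashSet<>).MakeGenericType(propInfo.PropertyType);
        var source = Expression.Parameter(typeof(IEnumerable<T>), "source");
        var enumerator = Expression.Variable(typeof(IEnumerator<T>), "enumerator");
        var set = Expression.Variable(setType, "set");
        var result = Expression.Variable(typeof(List<T>), "result");
        var item = Expression.Variable(typeof(T), "item");
        var key = Expression.Variable(propInfo.PropertyType, "key");
        var breakLabel = Expression.Label("loopBreak");

        var body = Expression.Block(
            new[] { enumerator, set, result },
            Expression.Assign(enumerator,
                Expression.Call(source, typeof(IEnumerable<T>).GetMethod("GetEnumerator"))),
            Expression.Assign(set, Expression.New(setType)),
            Expression.Assign(result, Expression.New(typeof(List<T>))),
            Expression.TryFinally(
                Expression.Loop(
                    Expression.IfThenElse(
                        Expression.Call(enumerator, typeof(IEnumerator).GetMethod("MoveNext")),
                        Expression.Block(
                            new[] { item, key },
                            Expression.Assign(item, Expression.Property(enumerator, "Current")),
                            Expression.Assign(key, Expression.Property(item, propInfo)),
                            // Instance overload: the target expression comes first,
                            // and the type-argument list is empty.
                            Expression.IfThen(
                                Expression.Call(set, "Add", Type.EmptyTypes, key),
                                Expression.Call(result, "Add", Type.EmptyTypes, item))),
                        Expression.Break(breakLabel)),
                    breakLabel),
                // Dispose the enumerator even if the loop throws.
                Expression.Call(enumerator, typeof(IDisposable).GetMethod("Dispose"))),
            result); // the last expression is the value of the block

        return Expression.Lambda<Func<IEnumerable<T>, List<T>>>(body, source).Compile();
    }
}
```

For example, `DistinctByTreeSketch.Build<string>("Length")` produces a delegate that keeps the first string of each length from its input.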

Note also that we pass an empty type parameter list. The reason is that the Add methods of these classes do not have any type parameters of their own. The type parameter T in HashSet<T> and List<T> is defined at the class level, not at the method level.

You would need to specify the type parameters only if they are defined on the method itself like this:

void SomeMethod<T1>(...
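The difference between the two cases can be sketched side by side. The example below is illustrative (the class and method names are made up): the first call targets `HashSet<int>.Add`, which has no type parameters of its own, while the second targets `Enumerable.Any<TSource>`, a static method that declares `TSource` on the method and therefore needs the type argument.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class TypeArgumentsSketch
{
    // Instance call: Add is not generic itself, so the type-argument list is empty.
    public static bool AddToSet(HashSet<int> set, int value)
    {
        var call = Expression.Call(
            Expression.Constant(set), "Add", Type.EmptyTypes, Expression.Constant(value));
        return Expression.Lambda<Func<bool>>(call).Compile()();
    }

    // Static generic call: TSource is declared on the method,
    // so it has to be supplied as a type argument.
    public static bool AnyItems(int[] items)
    {
        var call = Expression.Call(
            typeof(Enumerable), "Any", new[] { typeof(int) }, Expression.Constant(items));
        return Expression.Lambda<Func<bool>>(call).Compile()();
    }
}
```

Here the `Type`-based overload is the right choice for `Any` because that method really is static.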

Answered by Yacoub Massad on Sep 29 '22 at 12:09