
Slow LINQ query for .ToArray()

I am using the following query:

foreach (var callDetailsForNode_ReArrange in callDetailsForNodes_ReArrange)
{
    var test = from r1 in dtRowForNode.AsEnumerable()
               join r2 in dtFileRowForNode.AsEnumerable()
               on r1.Field<int>("Lng_Upload_Id") equals r2.Field<int>("Lng_Upload_Id")
               where ((r1.Field<string>("Txt_Called_Number") == callDetailsForNode_ReArrange.caller2.ToString()) || r1.Field<string>("Txt_Calling_Number") == callDetailsForNode_ReArrange.caller2.ToString())
               select r2.Field<string>("Txt_File_Name");

    var d = test.Distinct();
}

Up to here, this query runs in no time. But as soon as I added

string[] str = d.ToArray();
strFileName = string.Join(",", str);

it takes almost 4-5 seconds to run. What makes it so slow after adding .ToArray()?

Rajeev Kumar asked May 10 '13 11:05



1 Answer

Upto here this query run in no time.

Up to here, it hasn't actually done anything except build a deferred-execution model that represents the pending query. It doesn't start iterating until something calls MoveNext() on the iterator, for example via foreach or, in your case, via .ToArray().

So: it takes time because it is doing work.

Consider:

static IEnumerable<int> GetData()
{
    Console.WriteLine("a");
    yield return 0;
    Console.WriteLine("b");
    yield return 1;
    Console.WriteLine("c");
    yield return 2;
    Console.WriteLine("d");
}
static void Main()
{
    Console.WriteLine("start");
    var data = GetData();
    Console.WriteLine("got data");
    foreach (var item in data)
        Console.WriteLine(item);
    Console.WriteLine("end");
}

This outputs:

start
got data
a
0
b
1
c
2
d
end

Note how the work doesn't all happen at once - it is both deferred (a comes after got data) and spooling (we don't get a,...,d,0,...2).
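
The same effect can be observed on a LINQ pipeline shaped like the one in the question. This is only a minimal sketch: the sample strings, the IsMatch helper and the PredicateCalls counter are made up for illustration and stand in for the DataTable join and the number comparison.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredQueryDemo
{
    // Counts how many times the Where predicate has actually run.
    public static int PredicateCalls;

    // Hypothetical stand-in for the per-row comparison in the question.
    static bool IsMatch(string value)
    {
        PredicateCalls++;
        return value.StartsWith("a", StringComparison.Ordinal);
    }

    public static string[] Run()
    {
        PredicateCalls = 0;
        var source = new[] { "apple", "avocado", "banana", "apple" };

        // Building the query only composes iterators; IsMatch has not run yet.
        IEnumerable<string> query = source.Where(IsMatch).Distinct();
        Console.WriteLine("after building query: " + PredicateCalls); // 0

        // ToArray() enumerates the whole pipeline, so the work happens here.
        string[] result = query.ToArray();
        Console.WriteLine("after ToArray: " + PredicateCalls); // 4

        return result;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", Run()));
    }
}
```

The counter stays at 0 until ToArray() is called, which is why the loop in the question appears to finish instantly without it. Each further enumeration of query would run the predicate again, so it is usually worth materializing the result once and reusing it.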


Related: this is roughly how Distinct() works, from comments:

public static IEnumerable<T> Distinct<T>(this IEnumerable<T> source) {
    var seen = new HashSet<T>();
    foreach(var item in source) {
        if(seen.Add(item)) yield return item;
    }
}

...

and a new Join operation:

public static string Join(this IEnumerable<string> source, string separator) {
    using(var iter = source.GetEnumerator()) {
        if(!iter.MoveNext()) return "";
        var sb = new StringBuilder(iter.Current);
        while(iter.MoveNext())
            sb.Append(separator).Append(iter.Current);
        return sb.ToString();
    }
}

and use:

string s = d.Join(",");
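
For the Join extension above to compile, it has to be declared inside a static class. The sketch below shows one way to wire it up, assuming a hypothetical class name StringSequenceExtensions and made-up sample file names in place of the query results. It also calls the built-in string.Join overload that accepts IEnumerable<string> (available since .NET Framework 4), which likewise avoids the intermediate ToArray().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical container class; extension methods must live in a static class.
public static class StringSequenceExtensions
{
    // Same single-pass Join as in the answer above.
    public static string Join(this IEnumerable<string> source, string separator)
    {
        using (var iter = source.GetEnumerator())
        {
            if (!iter.MoveNext()) return "";
            var sb = new StringBuilder(iter.Current);
            while (iter.MoveNext())
                sb.Append(separator).Append(iter.Current);
            return sb.ToString();
        }
    }
}

public static class JoinDemo
{
    // Sample file names standing in for the distinct query results.
    static IEnumerable<string> Sample()
    {
        return new[] { "a.wav", "b.wav", "a.wav" }.Distinct();
    }

    // Uses the custom extension method; no array is allocated.
    public static string ViaExtension()
    {
        return Sample().Join(",");
    }

    // Uses the framework overload string.Join(string, IEnumerable<string>).
    public static string ViaBuiltIn()
    {
        return string.Join(",", Sample());
    }

    public static void Main()
    {
        Console.WriteLine(ViaExtension());
        Console.WriteLine(ViaBuiltIn());
    }
}
```

Both calls enumerate the sequence exactly once. Neither makes the underlying query cheaper; they only skip the extra array copy.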

Marc Gravell answered Oct 22 '22 13:10