I am using following query
foreach (var callDetailsForNode_ReArrange in callDetailsForNodes_ReArrange)
{
var test = from r1 in dtRowForNode.AsEnumerable()
join r2 in dtFileRowForNode.AsEnumerable()
on r1.Field<int>("Lng_Upload_Id") equals r2.Field<int>("Lng_Upload_Id")
where ((r1.Field<string>("Txt_Called_Number") == callDetailsForNode_ReArrange.caller2.ToString()) || r1.Field<string>("Txt_Calling_Number") == callDetailsForNode_ReArrange.caller2.ToString())
select r2.Field<string>("Txt_File_Name");
var d = test.Distinct();
}
Upto here this query run in no time. But as I added
string[] str =d.ToArray();
strFileName = string.Join(",", str);
It takes almost 4-5 seconds to run. What makes it so slow on adding .ToArray()
?
ToArray might do an additional allocation and copy operation such that the buffer will be sized exactly to the number of elements. In order to confirm this the following benchmark is used. The results confirm that ToList is in most cases 10% - 15% faster.
It is slightly slower LINQ syntax is typically less efficient than a foreach loop. It's good to be aware of any performance tradeoff that might occur when you use LINQ to improve the readability of your code. And if you'd like to measure the performance difference, you can use a tool like BenchmarkDotNet to do so.
AsEnumerable preserves deferred execution and does not build an often useless intermediate list. On the other hand, when forced execution of a LINQ query is desired, ToList can be a way to do that. AsQueryable can be used to make an enumerable collection accept expressions in LINQ statements.
LINQ ToList() Method In LINQ, the ToList operator takes the element from the given source, and it returns a new List. So, in this case, input would be converted to type List.
Upto here this query run in no time.
Up to here, it hasn't actually done anything, except build a deferred-execution model that represents the pending query. It doesn't start iterating until you call MoveNext()
on the iterator, i.e. via foreach
, in your case via .ToArray()
.
So: it takes time because it is doing work.
Consider:
static IEnumerable<int> GetData()
{
Console.WriteLine("a");
yield return 0;
Console.WriteLine("b");
yield return 1;
Console.WriteLine("c");
yield return 2;
Console.WriteLine("d");
}
static void Main()
{
Console.WriteLine("start");
var data = GetData();
Console.WriteLine("got data");
foreach (var item in data)
Console.WriteLine(item);
Console.WriteLine("end");
}
This outputs:
start
got data
a
0
b
1
c
2
d
end
Note how the work doesn't all happen at once - it is both deferred (a
comes after got data
) and spooling (we don't get a
,...,d
,0
,...2
).
Related: this is roughly how Distinct()
works, from comments:
public static IEnumerable<T> Distinct<T>(this IEnumerable<T> source) {
var seen = new HashSet<T>();
foreach(var item in source) {
if(seen.Add(item)) yield return item;
}
}
...
and a new Join
operation:
public static string Join(this IEnumerable<string> source, string separator) {
using(var iter = source.GetEnumerator()) {
if(!iter.MoveNext()) return "";
var sb = new StringBuilder(iter.Current);
while(iter.MoveNext())
sb.Append(separator).Append(iter.Current);
return sb.ToString();
}
}
and use:
string s = d.Join(",");
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With