I have a List of objects of my own class, which looks like this:
public class IFFundTypeFilter_ib
{
    public string FundKey { get; set; }
    public string FundValue { get; set; }
    public bool IsDisabled { get; set; }
}
The IsDisabled property is set by running a query, collection.Where(some condition), and counting the number of matching objects. The result is an IEnumerable<IFFundTypeFilter_ib>, which does not have a Count property. I wonder which would be faster.
This one:
collection.Where(somecondition).Count();
or this one:
collection.Where(somecondition).ToList().Count;
The collection could contain just a few objects, but it could also contain, for example, 700. I am going to make the counting call twice, with different conditions: in the first I check whether FundKey equals some key, and in the second I do the same comparison against a different key value.
You asked:
I wonder, what would be faster.
Whenever you ask that you should actually time it and find out.
I set out to test all of these variants of obtaining a count:
var enumerable = Enumerable.Range(0, 1000000);
var list = enumerable.ToList();
var methods = new Func<int>[]
{
    () => list.Count,
    () => enumerable.Count(),
    () => list.Count(),
    () => enumerable.ToList().Count(),
    () => list.ToList().Count(),
    () => enumerable.Select(x => x).Count(),
    () => list.Select(x => x).Count(),
    () => enumerable.Select(x => x).ToList().Count(),
    () => list.Select(x => x).ToList().Count(),
    () => enumerable.Where(x => x % 2 == 0).Count(),
    () => list.Where(x => x % 2 == 0).Count(),
    () => enumerable.Where(x => x % 2 == 0).ToList().Count(),
    () => list.Where(x => x % 2 == 0).ToList().Count(),
};
My testing code runs each method 1,000 times, measures each execution time with a Stopwatch, and ignores all results where a generation 0 garbage collection occurred during the call. It then computes an average execution time per method.
var measurements =
    methods
        .Select((m, i) => i)
        .ToDictionary(i => i, i => new List<double>());
for (var run = 0; run < 1000; run++)
{
    for (var i = 0; i < methods.Length; i++)
    {
        var sw = Stopwatch.StartNew();
        var gccc0 = GC.CollectionCount(0);
        var r = methods[i]();
        var gccc1 = GC.CollectionCount(0);
        sw.Stop();
        if (gccc1 == gccc0)
        {
            measurements[i].Add(sw.Elapsed.TotalMilliseconds);
        }
    }
}
var results =
    measurements
        .Select(x => new
        {
            index = x.Key,
            count = x.Value.Count(),
            average = x.Value.Average().ToString("0.000")
        });
Here are the results, as average times in milliseconds (ordered from slowest to fastest):
+---------+-----------------------------------------------------------+
| average | method |
+---------+-----------------------------------------------------------+
| 14.879 | () => enumerable.Select(x => x).ToList().Count(), |
| 14.188 | () => list.Select(x => x).ToList().Count(), |
| 10.849 | () => enumerable.Where(x => x % 2 == 0).ToList().Count(), |
| 10.080 | () => enumerable.ToList().Count(), |
| 9.562 | () => enumerable.Select(x => x).Count(), |
| 8.799 | () => list.Where(x => x % 2 == 0).ToList().Count(), |
| 8.350 | () => enumerable.Where(x => x % 2 == 0).Count(), |
| 8.046 | () => list.Select(x => x).Count(), |
| 5.910 | () => list.Where(x => x % 2 == 0).Count(), |
| 4.085 | () => enumerable.Count(), |
| 1.133 | () => list.ToList().Count(), |
| 0.000 | () => list.Count, |
| 0.000 | () => list.Count(), |
+---------+-----------------------------------------------------------+
Two significant things come out of this.
One, any method with an inline .ToList() is significantly slower than the equivalent without it.
Two, LINQ operators take advantage of the underlying type of the enumerable, where possible, to short-cut computations. The enumerable.Count() and list.Count() results show this: the call on the list is effectively free, while the call on the plain enumerable took about 4 milliseconds.
There is no measurable difference between the list.Count and list.Count() calls. So the key comparison is between enumerable.Where(x => x % 2 == 0).Count() and enumerable.Where(x => x % 2 == 0).ToList().Count(). Since the latter contains an extra operation, we would expect it to take longer. It is almost 2.5 milliseconds longer.
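The short-cut behaviour can be observed directly with a small sketch. NoEnumerationCollection, CountingSequence and CountShortcutDemo are hypothetical helpers written for this illustration, not part of the benchmark above. The first reports a Count but throws if anything enumerates it; the second records how many items are pulled through it.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical collection: exposes a Count but throws if it is ever enumerated.
public sealed class NoEnumerationCollection : ICollection<int>
{
    public int Count => 700;
    public bool IsReadOnly => true;
    public IEnumerator<int> GetEnumerator() => throw new InvalidOperationException("was enumerated");
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
    public void Add(int item) => throw new NotSupportedException();
    public void Clear() => throw new NotSupportedException();
    public bool Contains(int item) => false;
    public void CopyTo(int[] array, int arrayIndex) => throw new NotSupportedException();
    public bool Remove(int item) => throw new NotSupportedException();
}

// Hypothetical wrapper: a plain IEnumerable<T> that counts the items it hands out.
public sealed class CountingSequence<T> : IEnumerable<T>
{
    private readonly IEnumerable<T> _source;
    public int ItemsYielded { get; private set; }

    public CountingSequence(IEnumerable<T> source) => _source = source;

    public IEnumerator<T> GetEnumerator()
    {
        foreach (var item in _source)
        {
            ItemsYielded++;
            yield return item;
        }
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class CountShortcutDemo
{
    // Count() sees an ICollection<int> and reads its Count property,
    // so the throwing enumerator is never touched.
    public static int CountWithoutEnumerating()
    {
        IEnumerable<int> source = new NoEnumerationCollection();
        return source.Count();
    }

    // A Where(...) result has no stored count, so Count() has to pull
    // every source item through the filter. Returns how many were pulled.
    public static int ItemsWalkedByWhereCount(int length)
    {
        var source = new CountingSequence<int>(Enumerable.Range(0, length));
        _ = source.Where(x => x % 2 == 0).Count();
        return source.ItemsYielded;
    }
}
```

Calling CountShortcutDemo.CountWithoutEnumerating() completes without the exception, and CountShortcutDemo.ItemsWalkedByWhereCount(n) reports that all n source items were visited.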
I don't know why you say that you're going to call the counting code twice, but if you do, it is better to build the list. If not, just do the plain .Count() call after your query.
Generally, materializing to a list will be less efficient.
Additionally, if you are using two conditions, there is no point in caching the result or materializing the query to a List.
You should just use the overload of Count that accepts a predicate:
collection.Count(somecondition);
As @CodeCaster mentions in the comments, it is equivalent to collection.Where(condition).Count(), but is more readable and concise.
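Applied to the class from the question, the predicate overload might look like the sketch below. FundCounter and CountByKey are hypothetical names introduced for illustration, and the class definition is repeated only so the example compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class IFFundTypeFilter_ib
{
    public string FundKey { get; set; }
    public string FundValue { get; set; }
    public bool IsDisabled { get; set; }
}

// Hypothetical helper, not from the original post.
public static class FundCounter
{
    // Counts the items whose FundKey equals the given key in a single pass,
    // without building an intermediate list.
    public static int CountByKey(IEnumerable<IFFundTypeFilter_ib> collection, string key)
    {
        return collection.Count(f => f.FundKey == key);
    }
}
```

The two counting calls from the question would then be two calls to CountByKey with different key values, each making one pass over the collection.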
Using it exactly this way
var count = collection.Where(somecondition).ToList().Count;
doesn't make sense: it populates a list just to get the count, so the Count() extension method on IEnumerable<T> is the appropriate choice here.
Using ToList would make sense if you do something like this:
var list = collection.Where(somecondition).ToList();
var count = list.Count;
// do something else with the list
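As a slightly fuller sketch of that second pattern, the example below filters once, then uses both the count and the items. MaterializeOnceDemo, Summarize and the prefix filter are hypothetical and chosen only to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not from the original post.
public static class MaterializeOnceDemo
{
    // Filters the keys once, then reuses the materialized list for both
    // the count and a joined summary, so the filter is not re-run.
    public static (int Count, string Summary) Summarize(IEnumerable<string> keys, string prefix)
    {
        var matches = keys
            .Where(k => k.StartsWith(prefix, StringComparison.Ordinal))
            .ToList();

        var count = matches.Count;                 // property read on List<T>
        var summary = string.Join(", ", matches);  // second use of the same list
        return (count, summary);
    }
}
```

Without the ToList() call, the Where query would be evaluated once for the count and again for the summary.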