IEnumerable.Count() or ToList().Count

I have a List of objects of my own class, which looks like this:

public class IFFundTypeFilter_ib
{
    public string FundKey { get; set; }
    public string FundValue { get; set; }
    public bool IsDisabled { get; set; }
}

The IsDisabled property is set by running a query, collection.Where(some condition), and counting the number of matching objects. The result is an IEnumerable<IFFundTypeFilter_ib>, which has no Count property. I wonder which would be faster.

This one:

collection.Where(somecondition).Count();

or this one:

collection.Where(somecondition).ToList().Count;

The collection could contain only a few objects, but it could also contain, for example, 700. I am going to make the counting call twice, with different conditions. In the first condition I check whether FundKey equals some key; in the second I do the same, but compare it with another key value.

Paweł Mikołajczyk asked Oct 14 '15 06:10


3 Answers

You asked:

I wonder, what would be faster.

Whenever you ask that, you should actually time it and find out.

I set out to test all of these variants of obtaining a count:

var enumerable = Enumerable.Range(0, 1000000);
var list = enumerable.ToList();

var methods = new Func<int>[]
{
    () => list.Count,
    () => enumerable.Count(),
    () => list.Count(),
    () => enumerable.ToList().Count(),
    () => list.ToList().Count(),
    () => enumerable.Select(x => x).Count(),
    () => list.Select(x => x).Count(),
    () => enumerable.Select(x => x).ToList().Count(),
    () => list.Select(x => x).ToList().Count(),
    () => enumerable.Where(x => x % 2 == 0).Count(),
    () => list.Where(x => x % 2 == 0).Count(),
    () => enumerable.Where(x => x % 2 == 0).ToList().Count(),
    () => list.Where(x => x % 2 == 0).ToList().Count(),
};

My testing code explicitly runs each method 1,000 times, measures each execution time with a Stopwatch, and ignores all results where garbage collection occurred. It then gets an average execution time per method.

var measurements =
    methods
        .Select((m, i) => i)
        .ToDictionary(i => i, i => new List<double>());

for (var run = 0; run < 1000; run++)
{
    for (var i = 0; i < methods.Length; i++)
    {
        var sw = Stopwatch.StartNew();
        var gccc0 = GC.CollectionCount(0);
        var r = methods[i]();
        var gccc1 = GC.CollectionCount(0);
        sw.Stop();
        if (gccc1 == gccc0)
        {
            measurements[i].Add(sw.Elapsed.TotalMilliseconds);
        }
    }
}

var results =
    measurements
        .Select(x => new
        {
            index = x.Key,
            count = x.Value.Count(),
            average = x.Value.Average().ToString("0.000")
        });

Here are the results (ordered from slowest to fastest):

+--------------+-----------------------------------------------------------+
| average (ms) |                          method                           |
+--------------+-----------------------------------------------------------+
| 14.879       | () => enumerable.Select(x => x).ToList().Count()          |
| 14.188       | () => list.Select(x => x).ToList().Count()                |
| 10.849       | () => enumerable.Where(x => x % 2 == 0).ToList().Count()  |
| 10.080       | () => enumerable.ToList().Count()                         |
| 9.562        | () => enumerable.Select(x => x).Count()                   |
| 8.799        | () => list.Where(x => x % 2 == 0).ToList().Count()        |
| 8.350        | () => enumerable.Where(x => x % 2 == 0).Count()           |
| 8.046        | () => list.Select(x => x).Count()                         |
| 5.910        | () => list.Where(x => x % 2 == 0).Count()                 |
| 4.085        | () => enumerable.Count()                                  |
| 1.133        | () => list.ToList().Count()                               |
| 0.000        | () => list.Count                                          |
| 0.000        | () => list.Count()                                        |
+--------------+-----------------------------------------------------------+

Two things come out that are significant here.

One, any method with a .ToList() inline is significantly slower than the equivalent without it.

Two, LINQ operators take advantage of the underlying type of the enumerable, where possible, to short-cut computations. The enumerable.Count() and list.Count() methods show this.
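As a rough illustration of that short-cut: Count() uses the collection's own count when the source implements ICollection<T>, and otherwise has to walk the sequence. The sketch below is not part of the benchmark; the Lazy() iterator and the yielded counter are made-up helpers, and it assumes a compiler that supports top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical counter: how many items the iterator below has produced.
var yielded = 0;

// A lazy sequence. It is not an ICollection<T>, so Count() must enumerate it.
IEnumerable<int> Lazy()
{
    for (var i = 0; i < 1000; i++)
    {
        yielded++;
        yield return i;
    }
}

var lazyIsCollection = Lazy() is ICollection<int>;   // false
var lazyCount = Lazy().Count();                      // walks all 1000 items
var yieldedByCount = yielded;

// List<T> implements ICollection<T>, so Count() can use its stored count.
var list = Enumerable.Range(0, 1000).ToList();
var listIsCollection = list is ICollection<int>;     // true
var listCount = list.Count();

Console.WriteLine($"lazy: ICollection={lazyIsCollection}, count={lazyCount}, items yielded={yieldedByCount}");
Console.WriteLine($"list: ICollection={listIsCollection}, count={listCount}");
```

This is consistent with list.Count() landing next to list.Count at the bottom of the table, while any query that goes through Where or Select has to be enumerated before it can be counted.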

There is no measurable difference between the list.Count and list.Count() calls. So the key comparison is between the enumerable.Where(x => x % 2 == 0).Count() and enumerable.Where(x => x % 2 == 0).ToList().Count() calls. Since the latter contains an extra operation, we would expect it to take longer, and it does: 10.849 ms versus 8.350 ms, almost 2.5 milliseconds longer.

I don't know why you say that you're going to call the counting code twice, but if you do, it is better to build the list once and reuse it. If not, just do the plain .Count() call after your query.
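To make the "twice" case concrete, here is a minimal sketch, assuming the same query really is counted two times. A Where query is deferred, so each Count() call runs the predicate over the whole source again, whereas a materialized list is filtered only once. The predicateCalls counter and the 700-item source are illustrative assumptions, and the snippet again uses top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var source = Enumerable.Range(0, 700).ToList();

// Hypothetical instrumentation: count how often the predicate runs.
var predicateCalls = 0;
Func<int, bool> isEven = x => { predicateCalls++; return x % 2 == 0; };

// Deferred query: nothing is evaluated until it is enumerated.
var query = source.Where(isEven);

var first = query.Count();    // evaluates the predicate for all 700 items
var second = query.Count();   // evaluates it for all 700 items again
var callsWithoutList = predicateCalls;

// Materialize once, then read Count as often as needed.
predicateCalls = 0;
var materialized = source.Where(isEven).ToList();
var third = materialized.Count;
var fourth = materialized.Count;
var callsWithList = predicateCalls;

Console.WriteLine($"without list: {callsWithoutList} predicate calls");
Console.WriteLine($"with list:    {callsWithList} predicate calls");
```

If the two counts use different conditions, as in the question, there is no shared result to reuse, and this argument for building a list does not apply.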

Enigmativity answered Nov 07 '22 11:11


Generally, materializing to a list will be less efficient.

Additionally, if you are using two conditions, there is no point in caching the result or materializing the query to a List.

You should just use the overload of Count which accepts a predicate:

collection.Count(somecondition);

As @CodeCaster mentions in the comments, it is equivalent to collection.Where(somecondition).Count(), but is more readable and concise.
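Applied to the two-key scenario from the question, this might look like the sketch below. The sample data and the keys "A" and "B" are invented for illustration, and value tuples stand in for IFFundTypeFilter_ib to keep the snippet short (top-level statements, C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data; the tuples stand in for IFFundTypeFilter_ib.
var collection = new List<(string FundKey, string FundValue)>
{
    ("A", "Alpha"),
    ("B", "Beta"),
    ("A", "Gamma"),
};

// One pass per key, with no intermediate list.
var countA = collection.Count(f => f.FundKey == "A");
var countB = collection.Count(f => f.FundKey == "B");

// The Where(...).Count() form gives the same result.
var countAViaWhere = collection.Where(f => f.FundKey == "A").Count();

Console.WriteLine($"A: {countA}, B: {countB}, A via Where: {countAViaWhere}");
```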

Rotem answered Nov 07 '22 10:11


Using it exactly this way

var count = collection.Where(somecondition).ToList().Count;

doesn't make sense, because it populates a list just to get the count. Using IEnumerable<T>.Count() is the appropriate way in this case.

Using ToList would make sense in a case where you do something like this:

var list = collection.Where(somecondition).ToList();
var count = list.Count;
// do something else with the list

Ivan Stoev answered Nov 07 '22 11:11