
Why is OfType<> faster than Cast<>?

Tags: c#, linq

In answer to the following question: How to convert MatchCollection to string array

Given the two LINQ expressions:

    var arr = Regex.Matches(strText, @"\b[A-Za-z-']+\b")
        .OfType<Match>() //OfType
        .Select(m => m.Groups[0].Value)
        .ToArray();

and

    var arr = Regex.Matches(strText, @"\b[A-Za-z-']+\b")
        .Cast<Match>() //Cast
        .Select(m => m.Groups[0].Value)
        .ToArray();

OfType<> was benchmarked by user Alex to be slightly faster (and confirmed by myself).

This seems counterintuitive to me, as I'd have thought OfType<> would have to do both an 'is' comparison, and a cast (T).

Any enlightenment would be appreciated as to why this is the case :)

Dave Bish asked Jul 11 '12 10:07


1 Answer

My benchmarking does not agree with your benchmarking.

I ran an identical benchmark to Alex's and got the opposite result. I then tweaked the benchmark somewhat and again observed Cast being faster than OfType.

There's not much in it, but I believe that Cast does have the edge, as it should because its iterator is simpler. (No is check.)
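To illustrate the "simpler iterator" point, the two operators behave roughly like the loops below. This is a simplified sketch, not the actual framework source; the names IteratorSketch, OfTypeSketch and CastSketch are made up for the example.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

static class IteratorSketch
{
    // Hypothetical stand-in for OfType<T>: a type test, then a cast, per element.
    public static IEnumerable<T> OfTypeSketch<T>(IEnumerable source)
    {
        foreach (object item in source)
        {
            if (item is T)
                yield return (T)item;
        }
    }

    // Hypothetical stand-in for Cast<T>: just the cast, no type test.
    public static IEnumerable<T> CastSketch<T>(IEnumerable source)
    {
        foreach (object item in source)
            yield return (T)item;
    }

    static void Main()
    {
        var mixed = new ArrayList { "a", 1, "b" };

        // The filtering loop silently skips the int.
        Console.WriteLine(string.Join(",", OfTypeSketch<string>(mixed))); // a,b

        // The casting loop throws when it reaches the int.
        try
        {
            foreach (var s in CastSketch<string>(mixed)) { }
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("Cast threw InvalidCastException");
        }
    }
}
```

When every element really is of the target type, the only per-element difference between the two loops is the extra is test, which is why any gap between them should be small.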

Edit: Actually after some further tweaking I managed to get Cast to be 50x faster than OfType.

Below is the code of the benchmark that gives the biggest discrepancy I've found so far:

    Stopwatch sw1 = new Stopwatch();
    Stopwatch sw2 = new Stopwatch();

    var ma = Enumerable.Range(1, 100000).Select(i => i.ToString()).ToArray();

    var x = ma.OfType<string>().ToArray();
    var y = ma.Cast<string>().ToArray();

    for (int i = 0; i < 1000; i++)
    {
        if (i%2 == 0)
        {
            sw1.Start();
            var arr = ma.OfType<string>().ToArray();
            sw1.Stop();
            sw2.Start();
            var arr2 = ma.Cast<string>().ToArray();
            sw2.Stop();
        }
        else
        {
            sw2.Start();
            var arr2 = ma.Cast<string>().ToArray();
            sw2.Stop();
            sw1.Start();
            var arr = ma.OfType<string>().ToArray();
            sw1.Stop();
        }
    }
    Console.WriteLine("OfType: " + sw1.ElapsedMilliseconds.ToString());
    Console.WriteLine("Cast: " + sw2.ElapsedMilliseconds.ToString());
    Console.ReadLine();

Tweaks I've made:

  • Perform the "generate a list of strings" work once, at the start, and "crystallize" it.
  • Perform one of each operation before starting timing - I'm not sure if this is necessary but I think it means the JITter generates code beforehand rather than while we're timing?
  • Perform each operation multiple times, not just once.
  • Alternate the order in case this makes a difference.

On my machine this results in ~350ms for Cast and ~18000ms for OfType.

I think the biggest difference is that we're no longer timing how long MatchCollection takes to find the next match. (Or, in my code, how long int.ToString() takes.) Removing that work from the timed region drastically improves the signal-to-noise ratio.

Edit: As sixlettervariables pointed out, the reason for this massive difference is that Cast will short-circuit and not bother casting individual items if it can cast the whole IEnumerable. When I switched from using Regex.Matches to an array to avoid measuring the regex processing time, I also switched to using something castable to IEnumerable<string> and thus activated this short-circuiting. When I altered my benchmark to disable this short-circuiting, I get a slight advantage to Cast rather than a massive one.
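The short-circuit can be observed directly with a reference comparison. The sketch below is only an illustration: the class name CastShortCircuitDemo is made up, and wrapping the strings in a non-generic ArrayList is an assumed way of defeating the short-circuit, not necessarily the change made to the benchmark above.

```csharp
using System;
using System.Collections;
using System.Linq;

class CastShortCircuitDemo
{
    static void Main()
    {
        string[] ma = Enumerable.Range(1, 5).Select(i => i.ToString()).ToArray();

        // string[] already is an IEnumerable<string>, so Cast<string>()
        // hands back the array itself instead of wrapping it in an iterator.
        Console.WriteLine(ReferenceEquals(ma, ma.Cast<string>()));   // True

        // OfType<string>() always builds a filtering iterator.
        Console.WriteLine(ReferenceEquals(ma, ma.OfType<string>())); // False

        // Assumption: a non-generic source is one way to disable the short-circuit.
        // ArrayList is not an IEnumerable<string>, so Cast must go item by item.
        ArrayList untyped = new ArrayList(ma);
        Console.WriteLine(ReferenceEquals(untyped, untyped.Cast<string>())); // False
    }
}
```

With the array source, ToArray() after Cast<string>() is effectively copying an array, while ToArray() after OfType<string>() has to pull every element through an iterator, which fits the size of the gap measured above.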

Rawling answered Sep 22 '22 19:09