Is 'yield return' slower than "old school" return?

I'm running some tests on yield return performance, and I found that it is slower than a normal return.

I tested value types (int, double, etc.) and some reference types (string, etc.), and yield return was slower in both cases. Why use it then?

Check out my example:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public class YieldReturnTeste
{
    private static IEnumerable<string> YieldReturnTest(int limite)
    {
        for (int i = 0; i < limite; i++)
        {
            yield return i.ToString();
        }
    }

    private static IEnumerable<string> NormalReturnTest(int limite)
    {
        List<string> listaInteiros = new List<string>();

        for (int i = 0; i < limite; i++)
        {
            listaInteiros.Add(i.ToString());
        }
        return listaInteiros;
    }

    public static void executaTeste()
    {
        // Time the iterator version: ToList() drives the enumeration.
        Stopwatch stopWatch = new Stopwatch();
        stopWatch.Start();
        List<string> minhaListaYield = YieldReturnTest(2000000).ToList();
        stopWatch.Stop();

        TimeSpan ts = stopWatch.Elapsed;
        string elapsedTime = String.Format("{0:00}:{1:00}:{2:00}.{3:00}",
            ts.Hours, ts.Minutes, ts.Seconds,
            ts.Milliseconds / 10);
        Console.WriteLine("Yield return: {0}", elapsedTime);

        // Time the list version: ToList() copies the already-built list.
        stopWatch = new Stopwatch();
        stopWatch.Start();
        List<string> minhaListaNormal = NormalReturnTest(2000000).ToList();
        stopWatch.Stop();

        ts = stopWatch.Elapsed;
        elapsedTime = String.Format("{0:00}:{1:00}:{2:00}.{3:00}",
            ts.Hours, ts.Minutes, ts.Seconds,
            ts.Milliseconds / 10);
        Console.WriteLine("Normal return: {0}", elapsedTime);
    }
}
Marcel James asked Aug 09 '13 11:08


3 Answers

Consider the difference between File.ReadAllLines and File.ReadLines.

ReadAllLines loads all of the lines into memory and returns a string[]. All well and good if the file is small. If the file is larger than will fit in memory, you'll run out of memory.

ReadLines, on the other hand, returns a lazily evaluated sequence that hands back one line at a time, the way a yield return iterator does. With it, you can read a file of any size. It doesn't load the whole file into memory.

Say you wanted to find the first line that contains the word "foo", and then exit. Using ReadAllLines, you'd have to read the entire file into memory, even if "foo" occurs on the first line. With ReadLines, you only read one line. Which one would be faster?
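As a rough sketch of that search (the class name FirstMatchDemo and both method names are made up for illustration, not taken from the answer), the two approaches could look like this:

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper class, for illustration only.
public static class FirstMatchDemo
{
    // Returns the first line containing `word`, or null if there is none.
    // File.ReadLines is lazy, so enumeration stops at the first match.
    public static string FindFirstLazy(string path, string word)
    {
        return File.ReadLines(path).FirstOrDefault(line => line.Contains(word));
    }

    // Same result, but File.ReadAllLines loads the whole file into a
    // string[] before the search even begins.
    public static string FindFirstEager(string path, string word)
    {
        return File.ReadAllLines(path).FirstOrDefault(line => line.Contains(word));
    }
}
```

Both methods return the same line; the difference is only in how much of the file has been read by the time the answer is known.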

That's not the only reason. Consider a program that reads a file and processes each line. Using File.ReadAllLines, you end up with:

string[] lines = File.ReadAllLines(filename);
for (int i = 0; i < lines.Length; ++i)
{
    // process line
}

The time it takes that program to execute is equal to the time it takes to read the file, plus time to process the lines. Imagine that the processing takes so long that you want to speed it up with multiple threads. So you do something like:

lines = File.ReadAllLines(filename);
Parallel.ForEach(...);

But the reading is single-threaded. Your multiple threads can't start until the main thread has loaded the entire file.

With ReadLines, though, you can do something like:

Parallel.ForEach(File.ReadLines(filename), line => { ProcessLine(line); });

That starts up multiple threads immediately, which are processing at the same time that other lines are being read. So the reading time is overlapped with the processing time, meaning that your program will execute faster.

I show my examples using files because it's easier to demonstrate the concepts that way, but the same holds true for in-memory collections. Using yield return will use less memory and is potentially faster, especially when calling methods that only need to look at part of the collection (Enumerable.Any, Enumerable.First, etc.).
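For the in-memory case, a small sketch along the lines of the question's two methods can make the "only part of the collection" point concrete. The class EarlyExitDemo, its Produced counter, and CountProduced are hypothetical names added here to count how many strings each version actually builds:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class, for illustration only.
public static class EarlyExitDemo
{
    // Counts how many elements have actually been produced.
    public static int Produced;

    // Lazy: each string is created only when the consumer asks for it.
    public static IEnumerable<string> LazyNumbers(int limit)
    {
        for (int i = 0; i < limit; i++)
        {
            Produced++;
            yield return i.ToString();
        }
    }

    // Eager: every string is created before the method returns.
    public static IEnumerable<string> EagerNumbers(int limit)
    {
        var list = new List<string>(limit);
        for (int i = 0; i < limit; i++)
        {
            Produced++;
            list.Add(i.ToString());
        }
        return list;
    }

    // Looks for the element "3" and reports how many elements the
    // source had to produce before First() could return.
    public static int CountProduced(Func<int, IEnumerable<string>> source, int limit)
    {
        Produced = 0;
        source(limit).First(s => s == "3");
        return Produced;
    }
}
```

With a limit of 2000000, EarlyExitDemo.CountProduced(EarlyExitDemo.LazyNumbers, 2000000) returns 4, while the same call with EarlyExitDemo.EagerNumbers returns 2000000, because the list is fully built before First ever runs.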

Jim Mischel answered Oct 07 '22 14:10


For one, it's a convenience feature. Two, it lets you return values lazily, which means each value is only evaluated when it's fetched. That can be invaluable for something like a DB query, or for a collection you don't want to iterate over completely. Three, it can be faster in some scenarios. Four, how big was the difference? Probably tiny, in which case worrying about it is micro-optimization.
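To illustrate the "only evaluated when fetched" part, here is a minimal sketch. DeferredDemo, Squares, and the Log list are invented for this example; the log simply records when the iterator body really runs:

```csharp
using System.Collections.Generic;

// Hypothetical demo class, for illustration only.
public static class DeferredDemo
{
    // Records what the iterator body has done so far.
    public static readonly List<string> Log = new List<string>();

    public static IEnumerable<int> Squares(int count)
    {
        // Nothing in this method runs until the first MoveNext() call.
        Log.Add("body started");
        for (int i = 1; i <= count; i++)
        {
            Log.Add("computing " + i);
            yield return i * i;
        }
    }
}
```

Calling DeferredDemo.Squares(3) on its own leaves Log empty. The first entry appears only once a foreach loop (or any other consumer) asks for the first element, and values the consumer never asks for are never computed.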

It'sNotALie. answered Oct 07 '22 13:10


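Because the C# compiler converts iterator blocks (yield return) into a state machine. In this case the state machine is expensive: every element costs a MoveNext call and a read of Current, where the list version does a single Add.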

You can read more here: http://csharpindepth.com/articles/chapter6/iteratorblockimplementation.aspx
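As a hand-written approximation of what such a state machine does for the question's loop (this is a simplified sketch, not the compiler's actual output; the real generated class has different names, also implements IEnumerable<string>, and carries extra bookkeeping), consider:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Simplified, hand-written stand-in for the iterator behind:
//   for (int i = 0; i < limit; i++) yield return i.ToString();
// The class name and field names are assumptions made for illustration.
public sealed class NumberStringsEnumerator : IEnumerator<string>
{
    private readonly int _limit;
    private int _state;      // 0 = not started, 1 = running, -1 = finished
    private int _i;
    private string _current;

    public NumberStringsEnumerator(int limit) { _limit = limit; }

    public string Current { get { return _current; } }
    object IEnumerator.Current { get { return _current; } }

    public bool MoveNext()
    {
        switch (_state)
        {
            case 0:          // first call: run the loop initializer
                _i = 0;
                _state = 1;
                break;
            case 1:          // resume after the previous yield: run i++
                _i++;
                break;
            default:         // already finished
                return false;
        }

        if (_i < _limit)
        {
            _current = _i.ToString();
            return true;     // corresponds to "yield return"
        }

        _state = -1;
        return false;
    }

    public void Reset() { throw new NotSupportedException(); }
    public void Dispose() { }
}
```

Every element goes through the switch and a state update in MoveNext plus a separate read of Current, which is the per-item overhead this answer is pointing at.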

oakio answered Oct 07 '22 14:10