Algorithm for implementing C# yield statement

Tags: iterator, c#, algorithm, compiler-construction, state-machine

I'd love to figure it out myself but I was wondering roughly what's the algorithm for converting a function with yield statements into a state machine for an enumerator? For example how does C# turn this:

    IEnumerator<string> strings(IEnumerable<string> args)
    {
        IEnumerator<string> enumerator2 = getAnotherEnumerator();
        foreach (var arg in args)
        {
            enumerator2.MoveNext();
            yield return arg + enumerator2.Current;
        }
    }

into this:

    bool MoveNext()
    {
        switch (this.state)
        {
            case 0:
                this.state = -1;
                this.enumerator2 = getAnotherEnumerator();
                this.argsEnumerator = this.args.GetEnumerator();
                this.state = 1;
                while (this.argsEnumerator.MoveNext())
                {
                    this.arg = this.argsEnumerator.Current;
                    this.enumerator2.MoveNext();
                    this.current = this.arg + this.enumerator2.Current;
                    this.state = 2;
                    return true;
                state1:
                    this.state = 1;
                }
                this.state = -1;
                if (this.argsEnumerator != null) this.argsEnumerator.Dispose();
                break;

            case 2:
                goto state1;
        }
        return false;
    }

Of course the result can be completely different depending on the original code.

497

asked Sep 25 '08 07:09

Mark Cidade

1 Answer

The particular code sample you are looking at involves a series of transformations. Please note that this is an approximate description of the algorithm. The actual names used by the compiler and the exact code it generates may be different. The idea is the same, however.

The first transformation is the "foreach" transformation, which transforms this code:

    foreach (var x in y)
    {
        //body
    }

into this code:

    var enumerator = y.GetEnumerator();
    while (enumerator.MoveNext())
    {
        var x = enumerator.Current;
        //body
    }

    // Dispose the enumerator (not y) once the loop is finished.
    if (enumerator != null)
    {
        enumerator.Dispose();
    }
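To make the "foreach" step concrete, here is a small sketch that puts a loop next to a hand-expanded version of it. The class and method names (ForeachLowering, WithForeach, Lowered) are made up for illustration. The expansion wraps the loop in try/finally, which is how the C# language specification describes it, so the enumerator is disposed even if the body throws; the exact code the compiler emits may differ.

```csharp
using System;
using System.Collections.Generic;

public static class ForeachLowering
{
    // What the programmer writes.
    public static List<string> WithForeach(IEnumerable<string> items)
    {
        var result = new List<string>();
        foreach (var item in items)
        {
            result.Add(item.ToUpperInvariant());
        }
        return result;
    }

    // A hand-written approximation of what the loop above expands to.
    public static List<string> Lowered(IEnumerable<string> items)
    {
        var result = new List<string>();
        IEnumerator<string> enumerator = items.GetEnumerator();
        try
        {
            while (enumerator.MoveNext())
            {
                string item = enumerator.Current;
                result.Add(item.ToUpperInvariant());
            }
        }
        finally
        {
            // Runs whether the loop finishes normally or the body throws.
            if (enumerator != null) enumerator.Dispose();
        }
        return result;
    }
}
```

Both methods should produce the same list for the same input; only the second one makes the enumerator visible, which is what the later steps need in order to lift it into a field.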

The second transformation finds all the yield return statements in the function body, assigns a number to each (a state value), and creates a "goto label" right after the yield.

The third transformation lifts all the local variables and function arguments in the method body into an object called a closure.

Given the code in your example, that would look similar to this:

    class ClosureEnumerable : IEnumerable<string>
    {
        private IEnumerable<string> args;
        private ClassType originalThis;

        public ClosureEnumerable(ClassType originalThis, IEnumerable<string> args)
        {
            this.args = args;
            this.originalThis = originalThis;
        }

        public IEnumerator<string> GetEnumerator()
        {
            return new Closure(originalThis, args);
        }
    }

    class Closure : IEnumerator<string>
    {
        public Closure(ClassType originalThis, IEnumerable<string> args)
        {
            state = 0;
            this.args = args;
            this.originalThis = originalThis;
        }

        private IEnumerable<string> args;
        private IEnumerator<string> enumerator2;
        private IEnumerator<string> argEnumerator;

        //- Here ClassType is the type of the object that contained the method
        //  This may be optimized away if the method does not access any
        //  class members
        private ClassType originalThis;

        //This holds the state value.
        private int state;
        //The current value to return
        private string currentValue;

        public string Current
        {
            get
            {
                return currentValue;
            }
        }
    }

The method body is then moved from the original method to a method inside "Closure" called MoveNext, which returns a bool and implements IEnumerator.MoveNext. Any access to a local is routed through "this", and any access to a class member is routed through this.originalThis.

Any "yield return expr" is translated into:

    currentValue = expr;
    state = N; // N is the state number of this yield statement
    return true;

Any yield break statement is translated into:

    state = -1;
    return false;

There is an "implicit" yield break statement at the end of the function. A switch statement is then introduced at the beginning of the procedure that looks at the state number and jumps to the associated label.
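Putting these rules together, the following is a minimal hand-written sketch of a state machine for a tiny two-yield iterator. The iterator it imitates and all the names (GreetingsStateMachine, prefix, afterFirstYield, and so on) are invented for this example. It deliberately uses straight-line code: C# source cannot goto into the body of a loop, so the loop case from the question is only expressible in the code the compiler generates, not in a hand-written class like this one.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hand-written equivalent of this (hypothetical) iterator:
//
//     IEnumerator<string> Greetings(string name)
//     {
//         string prefix = "Hello, ";
//         yield return prefix + name;
//         prefix = "Goodbye, ";
//         yield return prefix + name;
//     }
public sealed class GreetingsStateMachine : IEnumerator<string>
{
    private readonly string name;   // lifted parameter
    private string prefix;          // lifted local
    private int state;              // 0 = not started, -1 = finished
    private string currentValue;    // value exposed through Current

    public GreetingsStateMachine(string name)
    {
        this.name = name;
        state = 0;
    }

    public string Current => currentValue;
    object IEnumerator.Current => currentValue;

    public bool MoveNext()
    {
        // The switch at the top jumps to the label after the last yield.
        switch (state)
        {
            case 0: goto start;
            case 1: goto afterFirstYield;
            case 2: goto afterSecondYield;
            default: return false;
        }

    start:
        prefix = "Hello, ";
        currentValue = prefix + name;   // yield return prefix + name;
        state = 1;
        return true;

    afterFirstYield:
        prefix = "Goodbye, ";
        currentValue = prefix + name;   // yield return prefix + name;
        state = 2;
        return true;

    afterSecondYield:
        state = -1;                     // implicit yield break at the end
        return false;
    }

    public void Reset() => throw new NotSupportedException();
    public void Dispose() { }
}
```

Calling MoveNext three times on new GreetingsStateMachine("Ann") gives true with Current "Hello, Ann", then true with "Goodbye, Ann", then false; any further call hits the default case and keeps returning false.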

The original method is then translated into something like this:

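    IEnumerator<string> strings(IEnumerable<string> args)
    {
        // Declared as IEnumerator<string>, so the enumerator is returned directly.
        // An IEnumerable<string> method would return new ClosureEnumerable(this, args).
        return new Closure(this, args);
    }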

The fact that the state of the method is all pushed into an object and that the MoveNext method uses a switch statement / state variable is what allows the iterator to behave as if control is being passed back to the point immediately after the last "yield return" statement the next time "MoveNext" is called.
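The resume behaviour is easy to observe from the outside with an ordinary iterator. The sketch below uses hypothetical names (ResumeDemo, Numbers, Log) and records a message each time a piece of the body runs, so you can see that nothing executes until MoveNext is called and that each call picks up right after the previous yield.

```csharp
using System;
using System.Collections.Generic;

public static class ResumeDemo
{
    // Records which parts of the iterator body have run so far.
    public static readonly List<string> Log = new List<string>();

    public static IEnumerator<int> Numbers()
    {
        Log.Add("before first yield");
        yield return 1;
        Log.Add("between yields");
        yield return 2;
        Log.Add("after last yield");
    }
}
```

Calling ResumeDemo.Numbers() only creates the state object, so Log stays empty. Each MoveNext then appends exactly one entry: the first two calls return true with Current equal to 1 and 2, and the third runs the tail of the body and returns false.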

It is important to point out, however, that the transformation used by the C# compiler is not the best way to do this. It suffers from poor performance when trying to use "yield" with recursive algorithms. There is a good paper that outlines a better way to do this here:

http://research.microsoft.com/en-us/projects/specsharp/iterators.pdf

It's worth a read if you haven't read it yet.
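To illustrate the recursion cost, here is a sketch of an in-order tree walk written two ways. The names (Node, TreeWalk, InOrderRecursive, InOrderIterative) are invented for the example. In the recursive version every nested call creates its own iterator object, and a value found at depth d is passed up through d enclosing foreach loops before the caller sees it. The second version is not the technique from the paper; it is just the common workaround of keeping an explicit stack so that a single iterator does all the work.

```csharp
using System;
using System.Collections.Generic;

public sealed class Node
{
    public int Value;
    public Node Left;
    public Node Right;
}

public static class TreeWalk
{
    // Recursive: one iterator object per visited node, and each value
    // is re-yielded once for every level between it and the root.
    public static IEnumerable<int> InOrderRecursive(Node node)
    {
        if (node == null) yield break;
        foreach (var v in InOrderRecursive(node.Left)) yield return v;
        yield return node.Value;
        foreach (var v in InOrderRecursive(node.Right)) yield return v;
    }

    // Iterative: a single iterator object plus an explicit stack.
    public static IEnumerable<int> InOrderIterative(Node root)
    {
        var stack = new Stack<Node>();
        var current = root;
        while (current != null || stack.Count > 0)
        {
            // Walk down to the leftmost unvisited node.
            while (current != null)
            {
                stack.Push(current);
                current = current.Left;
            }
            current = stack.Pop();
            yield return current.Value;
            current = current.Right;
        }
    }
}
```

Both methods return the same sequence for the same tree. The difference is the amount of work per element, which grows with depth in the recursive version and matters mostly for deep or unbalanced trees.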

192

answered Oct 05 '22 12:10

Scott Wisniewski
