Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to "unroll" a "recursive" structure

Not sure how to call it, but say you have a class that looks like this:

class Person
{
    public string Name;
    public IEnumerable<Person> Friends;
}

You then have a person and you want to "unroll" this structure recursively so you end up with a single list of all people without duplicates.

How would you do this? I have already made something that seems to be working, but I am curious to see how others would do it and especially if there is something built-in to Linq you can use in a clever way to solve this little problem :)


Here is my solution:

public static IEnumerable<T> SelectRecursive<T>(this IEnumerable<T> subjects, Func<T, IEnumerable<T>> selector)
{
    // Stop if subjects are null or empty
    if(subjects == null)
        yield break;

    // For each subject
    foreach(var subject in subjects)
    {
        // Yield it
        yield return subject;

        // Then yield all its decendants
        foreach (var decendant in SelectRecursive(selector(subject), selector))
            yield return decendant;
    }
}

Would be used something like this:

var people = somePerson.SelectRecursive(x => x.Friends);
like image 931
Svish Avatar asked Jan 06 '10 10:01

Svish


3 Answers

I don't believe there's anything built into LINQ to do this.

There's a problem with doing it recursively like this - you end up creating a large number of iterators. This can be quite inefficient if the tree is deep. Wes Dyer and Eric Lippert have both blogged about this.

You can remove this inefficiency by removing the direct recursion. For example:

public static IEnumerable<T> SelectRecursive<T>(this IEnumerable<T> subjects,
    Func<T, IEnumerable<T>> selector)
{
    if (subjects == null)
    {
        yield break;
    }

    Queue<T> stillToProcess = new Queue<T>(subjects);

    while (stillToProcess.Count > 0)
    {
        T item = stillToProcess.Dequeue();
        yield return item;
        foreach (T child in selector(item))
        {
            stillToProcess.Enqueue(child);
        }
    }
}

This will also change the iteration order - it becomes breadth-first instead of depth-first; rewriting it to still be depth-first is tricky. I've also changed it to not use Any() - this revised version won't evaluate any sequence more than once, which can be handy in some scenarios. This does have one problem, mind you - it will take more memory, due to the queuing. We could probably alleviate this by storing a queue of iterators instead of items, but I'm not sure offhand... it would certainly be more complicated.

One point to note (also noted by ChrisW while I was looking up the blog posts :) - if you have any cycles in your friends list (i.e. if A has B, and B has A) then you'll recurse forever.

like image 199
Jon Skeet Avatar answered Oct 22 '22 15:10

Jon Skeet


I found this question as I was looking for and thinking about a similar solution - in my case creating an efficient IEnumerable<Control> for ASP.NET UI controls. The recursive yield I had is fast but I knew that could have extra cost, since the deeper the control structure the longer it could take. Now I know this is O(n log n).

The solution given here provides some answer but, as discussed in the comments, it does change the order (which the OP did not care about). I realized that to preserve the order as given by the OP and as I needed, neither a simple Queue (as Jon used) nor Stack would work since all the parent objects would be yielded first and then any children after them (or vice-versa).

To resolve this and preserve the order I realized the solution would simply be to put the Enumerator itself on a Stack. To use the OPs original question it would look like this:

public static IEnumerable<T> SelectRecursive<T>(this IEnumerable<T> subjects, Func<T, IEnumerable<T>> selector)
{
    if (subjects == null)
        yield break;

    var stack = new Stack<IEnumerator<T>>();

    stack.Push(subjects.GetEnumerator());

    while (stack.Count > 0)
    {
        var en = stack.Peek();
        if (en.MoveNext())
        {
            var subject = en.Current;
            yield return subject;

            stack.Push(selector(subject).GetEnumerator());
        }
        else 
        {
            stack.Pop().Dispose();
        }
    }
}

I use stack.Peek here to keep from having to push the same enumerator back on to the stack as this is likely to be the more frequent operation, expecting that enumerator to provide more than one item.

This creates the same number of enumerators as in the recursive version but will likely be fewer new objects than putting all the subjects in a queue or stack and continuing to add any descendant subjects. This is O(n) time as each enumerator stands on its own (in the recursive version an implicit call to one MoveNext executes MoveNext on the child enumerators to the current depth in the recursion stack).

like image 15
Kevin Brock Avatar answered Oct 22 '22 16:10

Kevin Brock


You could use a non-recursive method like this as well:

  HashSet<Person> GatherAll (Person p) {
     Stack<Person> todo = new Stack<Person> ();
     HashSet<Person> results = new HashSet<Person> ();
     todo.Add (p); results.Add (p);
     while (todo.Count > 0) {
        Person p = todo.Pop (); 
        foreach (Person f in p.Friends) 
           if (results.Add (f)) todo.Add (f);
     }
     return results;
  }

This should handle cycles properly as well. I am starting with a single person, but you could easily expand this to start with a list of persons.

like image 3
Tarydon Avatar answered Oct 22 '22 14:10

Tarydon