
Pattern to break up C# using blocks to enable functional programming

Our server app has several methods, called in sequence, that iterate through a 20M-row result set and transform it. Each method in this pipeline stores a 200+ megabyte copy of the data, with a predictably bad impact on RAM usage and GC performance.

Each method follows a similar pattern:

public HugeCollection1 Step1 (SomeType sourceData)
{ 
    var transformed = new List<RowType>();
    using (var foo = InitializeSomethingExpensive(sourceData))
    {
        foreach (var row in foo)
        {
            transformed.Add (TransformRow(row));
        }
    }
    return transformed;
}

Then these methods are called in a pipeline, e.g.

var results1 = Step1(sourceData);
var results2 = Step2(results1);
var results3 = Step3(results2);
...
var finalResults = StepN (resultsNMinus1);
return finalResults; // final results

I'd like to transform this into a more functional solution that iterates through the original source data without ever holding the entire dataset in RAM. I want to end up with a List of the final results without any intermediate collections.

If there were no setup required at each stage of the pipeline, then the solution would be simple: just run each transformation for each row and store only the final result.

var transformed = new List<SmallResult>();
// TODO: How to set up and ensure teardown of the *other* pipeline steps?
using (var foo = InitializeSomethingExpensive(sourceData))
{
    foreach (var row in foo)
    {
       object result = row;
       foreach (var step in Pipeline)
       {
           result = step.Transform (result);
       }
       transformed.Add (result as SmallResult);
    }
}
return transformed;

But today, each of those separate pipeline steps has its own expensive setup and tear-down process that's enforced via a using block.

What's a good pattern to refactor each of these pipeline methods so the setup/teardown code is guaranteed to happen? In pseudo-code, I'd like to end up with this:

  1. Setup all steps
  2. Loop through each row
  3. Transform row through each step
  4. End loop
  5. Cleanup all steps, guaranteeing that cleanup always happens
  6. Return (small) results

It's not practical to combine all the using blocks into a single method, because the code in each of these steps is long and shared with other callers, and I don't want to repeat that shared code in one method.

I know I could manually replace the using block with try/finally, but doing that manually for multiple resources seems harder than necessary.

Is there a simpler solution, e.g. combining using and yield in a smart way? Or is there a good "multi-using" class implementation available that makes this coordinated setup/teardown process easy (e.g. its constructor accepts a list of functions that return IDisposable, and its Dispose() implementation ensures that everything is cleaned up)?
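
For illustration, the "multi-using" idea could be sketched roughly as below. This is only a minimal sketch under stated assumptions: DisposableGroup, TrackedResource and DisposableGroupExample are hypothetical names invented here, not framework types, and TrackedResource merely stands in for whatever InitializeSomethingExpensive returns in each step.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not a framework type: owns several IDisposable
// resources and disposes them in reverse order of registration.
public sealed class DisposableGroup : IDisposable
{
    private readonly Stack<IDisposable> _resources = new Stack<IDisposable>();

    // Runs the factory and registers the result for cleanup. If a later
    // factory throws, the enclosing using block still disposes the earlier ones.
    public T Add<T>(Func<T> factory) where T : IDisposable
    {
        T resource = factory();
        _resources.Push(resource);
        return resource;
    }

    public void Dispose()
    {
        List<Exception> errors = null;
        while (_resources.Count > 0)
        {
            // Keep going if one Dispose throws so the rest are still released.
            try { _resources.Pop().Dispose(); }
            catch (Exception ex) { (errors ?? (errors = new List<Exception>())).Add(ex); }
        }
        if (errors != null) throw new AggregateException(errors);
    }
}

// Hypothetical stand-in for an expensive per-step resource; it only
// records its name when it is disposed.
public sealed class TrackedResource : IDisposable
{
    private readonly string _name;
    private readonly List<string> _log;
    public TrackedResource(string name, List<string> log) { _name = name; _log = log; }
    public void Dispose() { _log.Add(_name); }
}

public static class DisposableGroupExample
{
    // Returns the order in which the resources were disposed.
    public static List<string> Run()
    {
        var log = new List<string>();
        using (var group = new DisposableGroup())
        {
            group.Add(() => new TrackedResource("step1", log));
            group.Add(() => new TrackedResource("step2", log));
            // ... loop over the rows and run every step's transform here ...
        }
        return log; // "step2", then "step1"
    }
}
```

A single using block around the group then covers steps 1 and 5 of the pseudo-code above: every resource created through Add is released when the block exits, whether normally or through an exception.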

Seems like this is a pattern that someone smarter than I has already figured out, so asking here before re-inventing the wheel.

asked Jul 21 '17 19:07 by Justin Grant



1 Answer

I'm not sure why you are creating so many disposable objects (you can clean these up with iterator methods, i.e. methods that use yield return), but you can create an extension method to clean up this pattern for you:

using System;
using System.Collections.Generic;

public static class ToolsEx
{
    public static IEnumerable<T> EnumerateAndDispose<X, T>(this X input, 
                                      Func<X, IEnumerable<T>> func)
        where X : IDisposable
    {
        using (var mc = input)
            foreach (var i in func(mc))
                yield return i;
    }
}

You can use it like this ...

var query = from x in new MyClass(0, 0, 2).EnumerateAndDispose(i => i)
            from y in new MyClass(1, x, 3).EnumerateAndDispose(i => i)
            select new
            {
                x,
                y,
            };

foreach (var i in query)
    Console.WriteLine(i);

... output ...

{ x = 0, y = 0 }
{ x = 0, y = 1 }
{ x = 0, y = 2 }
Disposed: 1/0
{ x = 1, y = 0 }
{ x = 1, y = 1 }
{ x = 1, y = 2 }
Disposed: 1/1
Disposed: 0/0

Here is a pipeline example with Aggregate ...

var query = from x in new MyClass(0, 0, 2).EnumerateAndDispose(i => i)
            let r = new MyClass(1, x, 3).EnumerateAndDispose(i => i)
                                                .Aggregate(x, (a, i) => (a + i) * 2)
            select new
            {
                x,
                r,
            };

... and the results ...

Disposed: 1/0
{ x = 0, r = 8 }
Disposed: 1/1
{ x = 1, r = 16 }
Disposed: 0/0

... test class for the example ...

// Needs: using System; using System.Collections.Generic; using System.Linq;
public class MyClass : IEnumerable<int>, IDisposable
{

    public MyClass(int set, int set2, int size)
    {
        this.Size = size;
        this.Set = set;
        this.Set2 = set2;
    }

    public IEnumerator<int> GetEnumerator()
    {
        foreach (var i in Enumerable.Range(0, this.Size))
            yield return i;
    }

    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
    {
        return this.GetEnumerator();
    }

    public void Dispose()
    {
        Console.WriteLine("Disposed: {0}/{1}", this.Set, this.Set2);
    }

    public int Size { get; private set; }
    public int Set { get; private set; }
    public int Set2 { get; private set; }
}
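
Applied to the pipeline in the question, the same using-inside-an-iterator idea might look roughly like the sketch below. The names LazyPipeline.Step, StepResource and LazyPipelineExample are hypothetical, and the transforms are toy arithmetic rather than the real row logic. Each step creates its resource lazily when enumeration starts and releases it when enumeration finishes, so only the final ToList() holds a collection.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyPipeline
{
    // Hypothetical helper: one pipeline stage that owns its own resource.
    // Nothing runs until the result is enumerated. The resource is disposed
    // when enumeration completes or when the enumerator is disposed early
    // (foreach and ToList both dispose the enumerator, even on exceptions).
    public static IEnumerable<TOut> Step<TIn, TOut, TResource>(
        this IEnumerable<TIn> source,
        Func<TResource> setup,
        Func<TResource, TIn, TOut> transform)
        where TResource : IDisposable
    {
        using (TResource resource = setup())
        {
            foreach (TIn row in source)
                yield return transform(resource, row);
        }
    }
}

// Hypothetical stand-in for an expensive per-step resource; it only
// records its name when it is disposed.
public sealed class StepResource : IDisposable
{
    private readonly string _name;
    private readonly List<string> _disposed;
    public StepResource(string name, List<string> disposed) { _name = name; _disposed = disposed; }
    public void Dispose() { _disposed.Add(_name); }
}

public static class LazyPipelineExample
{
    // Streams three rows through two steps; only the final list is materialized.
    public static List<int> Run(List<string> disposed)
    {
        return Enumerable.Range(1, 3)
            .Step(() => new StepResource("step1", disposed), (res, row) => row * 10)
            .Step(() => new StepResource("step2", disposed), (res, row) => row + 1)
            .ToList(); // 11, 21, 31
    }
}
```

One caveat with this design: cleanup depends on the consumer either enumerating to the end or disposing the enumerator. Calling MoveNext by hand and abandoning the enumerator would leave the resources open.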
answered Sep 24 '22 17:09 by Matthew Whited