
How do I report progress while executing a LINQ expression on a large-ish data set

Tags:

linq

If I need to generate a fairly large dataset using LINQ, and it may take a while (say, a few seconds), is there an easy or preferred way to give the user feedback on the percentage done?

For example, say I have list A with 1000 cars and list B with 1000 trucks, and I want to select all possible ordered (car, truck) pairs where car.color == truck.color, like this:

var pairs = from car in A 
            from truck in B 
            where car.color==truck.color 
            select new {car, truck};

Now at some point this will be evaluated as a set of nested foreach loops. I would like to report the percentage complete as it iterates and, ideally, update a progress bar or something similar.
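To make the goal concrete, here is a rough sketch of equivalent hand-written loops with a progress callback. The Car and Truck classes, the PairFinder name, and the reportPercent callback are assumptions made for illustration; the original only shows the query itself. Progress is reported once per car rather than once per pair to keep the number of updates small.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical item types; the question only implies a "color" member.
public class Car { public string color; }
public class Truck { public string color; }

public static class PairFinder
{
    // Hand-written equivalent of the query: the outer loop walks the cars,
    // the inner loop walks the trucks, and matching colors form a pair.
    public static List<KeyValuePair<Car, Truck>> FindPairs(
        IList<Car> A, IList<Truck> B, Action<int> reportPercent)
    {
        var pairs = new List<KeyValuePair<Car, Truck>>();
        for (int i = 0; i < A.Count; i++)
        {
            foreach (var truck in B)
            {
                if (A[i].color == truck.color)
                    pairs.Add(new KeyValuePair<Car, Truck>(A[i], truck));
            }
            // One update per outer iteration: percentage of cars processed.
            reportPercent((i + 1) * 100 / A.Count);
        }
        return pairs;
    }
}
```

This works because the size of the outer list is known before the loops start, which is exactly the information a deferred LINQ query does not expose on its own.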

EDIT: Just after my query, I store the result in a member variable as a list like this (which forces the query to execute):

mPairs = pairs.ToList();

I do this because the query runs on a background worker thread: I do not want the UI thread to freeze while it evaluates the LINQ expression on demand (this is in Silverlight, BTW). That is why I would like to report progress. The UX is basically this:

  1. A user drags an item onto the workspace
  2. The engine then kicks off on a background thread to determine the (many) connection possibilities to all of the other items on the workspace.
  3. While the engine is calculating the UI does not allow new connections AND reports progress to indicate when the new item will be "connectable" to the other items (all the possible connection paths not already in use have been determined via LINQ).
  4. When the engine completes the calculation (query), the item is connectable in the UI and the possible connection paths are stored in a local variable for future use (e.g. when the user clicks to connect the item all the possible paths will be highlighted based upon what was calculated when it was added)

(a similar process must happen on deletion of an item)

caryden asked Mar 17 '09 20:03



1 Answer

Something I've used that worked well is an adapter for the DataContext that raises an event with the number of items it has yielded so far.

using System;
using System.Collections;
using System.Collections.Generic;

public class ProgressArgs : EventArgs
{
    public ProgressArgs(int count)
    {
        this.Count = count;
    }

    public int Count { get; private set; }
}

public class ProgressContext<T> : IEnumerable<T>
{
    private IEnumerable<T> source;

    public ProgressContext(IEnumerable<T> source)
    {
        this.source = source;
    }

    public event EventHandler<ProgressArgs> UpdateProgress;

    protected virtual void OnUpdateProgress(int count)
    {
        EventHandler<ProgressArgs> handler = this.UpdateProgress;
        if (handler != null)
            handler(this, new ProgressArgs(count));
    }

    public IEnumerator<T> GetEnumerator()
    {
        int count = 0;
        foreach (var item in source)
        {
            // The yield holds execution until the next iteration,
            // so trigger the update event first.
            OnUpdateProgress(++count);
            yield return item;
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

Usage

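// Constructors cannot infer T, and an anonymous type cannot be named, so a
// small factory method is needed. This helper is an addition for illustration;
// declare it at class level, next to ProgressContext<T>.
public static class ProgressContext
{
    public static ProgressContext<T> Create<T>(IEnumerable<T> source)
    {
        return new ProgressContext<T>(source);
    }
}

var context = ProgressContext.Create(
    from car in A
    from truck in B
    select new { car, truck }
);
context.UpdateProgress += (sender, e) =>
{
    // Do your update here; e.Count is the number of pairs yielded so far.
    // This handler runs on whichever thread enumerates the query, so marshal
    // to the UI thread (e.g. Dispatcher.BeginInvoke) before touching controls.
};

var query = from item in context
            where item.car.color == item.truck.color
            select item;

// Enumerating the query triggers the updates
var result = query.ToArray();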

The only issue is you can't easily do a percentage unless you know the total count. To do a total count often requires processing the entire list, which can be costly. If you do know the total count beforehand then you can work out a percentage in the UpdateProgress event handler.
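For the example in the question the total is known up front, since the unfiltered cross join yields A.Count * B.Count pairs. Below is a minimal sketch of turning the running count into a percentage. PercentTracker is a hypothetical helper, not part of the code above, and it reports only when the whole-number percentage changes so the UI is not flooded with a million updates.

```csharp
using System;

// Hypothetical helper: converts a running item count into a whole-number
// percentage and invokes the callback only when that value changes.
public class PercentTracker
{
    private readonly long total;
    private readonly Action<int> report;
    private int lastPercent = -1;

    public PercentTracker(long total, Action<int> report)
    {
        if (total <= 0) throw new ArgumentOutOfRangeException("total");
        if (report == null) throw new ArgumentNullException("report");
        this.total = total;
        this.report = report;
    }

    // Call with the running count, e.g. e.Count from UpdateProgress.
    public void Update(long count)
    {
        int percent = (int)(count * 100 / total);
        if (percent != lastPercent)
        {
            lastPercent = percent;
            report(percent);
        }
    }
}

// Possible wiring, assuming A and B are lists with a Count property:
//   var tracker = new PercentTracker((long)A.Count * B.Count, p => { /* update UI */ });
//   context.UpdateProgress += (sender, e) => tracker.Update(e.Count);
```

With 1000 cars and 1000 trucks, this reduces the 1,000,000 UpdateProgress events to at most 101 progress reports (0 through 100).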

Cameron MacFarland answered Oct 11 '22 18:10