
How can I merge two Linq IEnumerable<T> queries without running them?

How do I merge a List<T> of TPL-based tasks for later execution?

 public IEnumerable<Task<string>> CreateTasks(){ /* stuff */ }

My assumption is .Concat() ...

    void MainTestApp()  // Full sample available upon request.
    {
        List<string> nothingList = new List<string>();
        nothingList.Add("whatever");
        cts = new CancellationTokenSource();

        delayedExecution =
            from str in nothingList
            select AccessTheWebAsync("", cts.Token);
        delayedExecution2 =
            from str in nothingList
            select AccessTheWebAsync("1", cts.Token);

        delayedExecution = delayedExecution.Concat(delayedExecution2);
    }


    /// SNIP

    async Task AccessTheWebAsync(string nothing, CancellationToken ct)
    {
        // return a Task
    }

I want to make sure that this won't spawn any task or evaluate anything. In fact, I suppose I'm asking "what logically executes an IQueryable to something that returns data"?

Background

Since I'm doing recursion and I don't want to execute this until the correct time, what is the correct way to merge the results if called multiple times?

If it matters, I'm thinking of launching all the tasks with var AllRunningDataTasks = results.ToList(); and then running this code:

while (AllRunningDataTasks.Count > 0)
{
    // Identify the first task that completes.
    Task<TableResult> firstFinishedTask = await Task.WhenAny(AllRunningDataTasks);

    // ***Remove the selected task from the list so that you don't
    // process it more than once.
    AllRunningDataTasks.Remove(firstFinishedTask);

    // TODO: Await the completed task.
    var taskOfTableResult = await firstFinishedTask;

    // Todo: (doesn't work)
    TrustState thisState = (TrustState)firstFinishedTask.AsyncState;

    // TODO: Update the concurrent dictionary with data
    // thisState.QueryStartPoint + thisState.ThingToSearchFor 

    Interlocked.Decrement(ref thisState.RunningDirectQueries);
    Interlocked.Increment(ref thisState.CompletedDirectQueries);

    if (thisState.RunningDirectQueries == 0)
    {
        thisState.TimeCompleted = DateTime.UtcNow;
    }
}
asked Nov 04 '22 09:11 by makerofthings7



1 Answer

To answer the specific question, "what logically executes an IQueryable to something that returns data?": anything that either forces the production of at least one value, or forces the discovery of whether a value is available.

For example, ToList, ToArray, First, Single, SingleOrDefault, and Count will all force evaluation. (Although First will not evaluate the entire collection - it'll retrieve the first item and then stop.) These all have to at least start retrieving values, because there's no way for any of them to return what they return without doing so. In the case of ToList and ToArray, these return fully-populated non-lazy collections, which is why they have to evaluate everything. The methods that return a single item need to at least ask for the first item, and the Single ones will then go on to check that nothing else comes out if evaluation continues (and throw an exception if there turns out to be more).

Using foreach to iterate over the query will also force evaluation. (And again, it's for the same reason: you're asking it for actual values from the collection so it has to provide them.)

Concat does not evaluate immediately because it doesn't need to - it's only when you ask the concatenated sequence for a value that it needs to ask its inputs for values.
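To make the point about deferral concrete, here is a minimal sketch using LINQ to Objects. The Touch method and the Calls counter are hypothetical names, added only so that each evaluation of the projection becomes visible; they are not part of your code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredConcatDemo
{
    // Counts how often the projection runs (illustrative only).
    public static int Calls;

    // Hypothetical projection with a visible side effect.
    static int Touch(int x)
    {
        Calls++;
        return x;
    }

    // Returns the call count after building, after First, and after ToList.
    public static int[] Run()
    {
        Calls = 0;
        var source = new List<int> { 1, 2, 3 };

        // Building and concatenating the queries runs nothing.
        IEnumerable<int> first = source.Select(x => Touch(x));
        IEnumerable<int> second = source.Select(x => Touch(x));
        IEnumerable<int> merged = first.Concat(second);
        int afterBuild = Calls;

        // First asks for a single value and then stops.
        int head = merged.First();
        int afterFirst = Calls;

        // ToList has to walk both underlying sequences completely.
        List<int> all = merged.ToList();
        int afterToList = Calls;

        return new[] { afterBuild, afterFirst, afterToList };
    }

    public static void Main()
    {
        int[] counts = Run();
        Console.WriteLine(string.Join(", ", counts)); // 0, 1, 7
    }
}
```

The count stays at zero until something asks the merged sequence for a value, which is the behaviour you are relying on when you call Concat.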

BTW, although you asked about IQueryable, you're not using it in the examples here. What you're actually getting is the LINQ to Objects implementation, which is what you get for plain IEnumerable<T>. I don't think it makes a difference in this example, but it makes me wonder whether something changed between your original code and the version you posted for illustration here. It can matter because different LINQ providers can do things in different ways. The IEnumerable<T> flavour of Concat definitely uses deferred evaluation, and although I'd expect that to be true for most other LINQ implementations, it's not absolutely a given.

If you need to use the results multiple times, and you want to ensure that you only evaluate them once, but that you don't evaluate them until you actually need them, then the usual approach is to call ToList at the point where you definitely need to evaluate, and then hold onto the resulting List<T> so you can use it again. Once you've got the data in List<T> (or array) form you can use that list as many times as you like.
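As a rough illustration of that approach, the sketch below enumerates a query twice and then switches to a cached list. The calls counter and the query itself are made up for the example. If the projection created tasks instead of numbers, each re-enumeration would create (and so start) a fresh set of tasks, which is the situation that holding onto the list avoids.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MaterializeOnceDemo
{
    // Returns the projection call count after each stage.
    public static int[] Run()
    {
        int calls = 0;
        IEnumerable<int> query =
            Enumerable.Range(1, 3).Select(x => { calls++; return x * 10; });

        // Each enumeration of the query re-runs the projection.
        int sumA = query.Sum();
        int sumB = query.Sum();
        int afterTwoEnumerations = calls;

        // Evaluate once at the point where the values are really needed...
        List<int> cached = query.ToList();
        int afterToList = calls;

        // ...then reuse the list as often as you like without re-evaluating.
        int sumC = cached.Sum();
        int sumD = cached.Sum();
        int afterReuse = calls;

        return new[] { afterTwoEnumerations, afterToList, afterReuse };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run())); // 6, 9, 9
    }
}
```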

By the way, your first question has an issue:

"How do I merge a List of TPL-based tasks for later execution?"

In general, if you already have a TPL task then you can't stop it from executing. (There is an exception to this. If you construct a Task directly instead of using one of the more normal ways of creating it, it won't actually run until you tell it to. But in general, APIs that return tasks return live ones, i.e., they may well already be running, or even complete, by the time you get your hands on them.)
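Here is a small sketch of that exception, next to the more usual "live" case. The class and method names are illustrative; the Task constructor, Start, Status and Task.Run are the standard TPL members.

```csharp
using System;
using System.Threading.Tasks;

public static class ColdTaskDemo
{
    // Returns the status of a constructed task before and after it is started.
    public static TaskStatus[] RunCold()
    {
        // A task built with the constructor is "cold": nothing is scheduled yet.
        var cold = new Task<int>(() => 42);
        TaskStatus before = cold.Status;   // TaskStatus.Created

        cold.Start();                      // now it is handed to the scheduler
        int result = cold.Result;          // blocks until the task has finished
        TaskStatus after = cold.Status;    // TaskStatus.RanToCompletion

        return new[] { before, after };
    }

    // A task from Task.Run is already scheduled when you receive it.
    public static bool HotTaskIsAlreadyScheduled()
    {
        Task<int> hot = Task.Run(() => 42);
        bool scheduled = hot.Status != TaskStatus.Created;
        hot.Wait();
        return scheduled;
    }

    public static void Main()
    {
        TaskStatus[] statuses = RunCold();
        Console.WriteLine(statuses[0] + " -> " + statuses[1]);
        Console.WriteLine(HotTaskIsAlreadyScheduled());
    }
}
```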

The "later execution" in your example comes from the fact that you don't actually have a list of tasks at all to start with. (If you did in fact have a List<T> of tasks, "later execution" would not be an option.) What you have is a couple of enumerables which, if you were to evaluate them, would create tasks. The act of creating the task is indivisible from the act of starting it in any TAP-style API that returns a Task.

Based on the rest of what you wrote, I think what you are really asking is:

"How do I merge multiple IEnumerable<Task<T>> objects into a single IEnumerable<Task<T>> in a way that defers evaluation of the underlying enumerables until the combined enumerable itself is evaluated?"

Concat should work for that.
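Putting it together, a sketch along these lines shows the whole pattern under some assumptions: FetchAsync is a hypothetical stand-in for your AccessTheWebAsync (here returning Task<string>), and the started counter exists only to show when the tasks are actually created. The completion loop mirrors the WhenAny loop in your question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class MergeTaskQueriesDemo
{
    static int _started;

    // Number of tasks that had started before the merged query was enumerated.
    public static int StartedBeforeEnumeration;

    // Hypothetical stand-in for AccessTheWebAsync.
    static async Task<string> FetchAsync(string tag, CancellationToken ct)
    {
        Interlocked.Increment(ref _started);
        await Task.Delay(10, ct);
        return "result " + tag;
    }

    public static async Task<List<string>> RunAsync()
    {
        _started = 0;
        var cts = new CancellationTokenSource();
        var inputs = new List<string> { "whatever" };

        // Two deferred queries; no task exists yet.
        IEnumerable<Task<string>> q1 = inputs.Select(s => FetchAsync("a", cts.Token));
        IEnumerable<Task<string>> q2 = inputs.Select(s => FetchAsync("b", cts.Token));

        // Merging is deferred too.
        IEnumerable<Task<string>> merged = q1.Concat(q2);
        StartedBeforeEnumeration = _started;

        // ToList enumerates the merged query, which creates and starts every task.
        List<Task<string>> running = merged.ToList();

        var results = new List<string>();
        while (running.Count > 0)
        {
            Task<string> finished = await Task.WhenAny(running);
            running.Remove(finished);
            results.Add(await finished);
        }

        cts.Dispose();
        return results;
    }

    public static async Task Main()
    {
        List<string> results = await RunAsync();
        Console.WriteLine(StartedBeforeEnumeration);
        Console.WriteLine(results.Count);
    }
}
```

The order of the results depends on which task finishes first, so the sketch only relies on how many there are.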

answered Dec 18 '22 14:12 by Ian Griffiths
