Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Async recursion. Where is my memory actually going?

This is asked more out of curiosity than with regards to any real-world problem.

Consider the following code:

void Main()
{
    FAsync().Wait();
}

async Task FAsync()
{
    await Task.Yield();
    await FAsync();
}

In the synchronous world, this would ultimately cause a stackoverflow.

In the async world, this simply consumes a lot of memory (which I assume is related to something that I might loosely call the "asynchronous stack"?)

What precisely is this data, and how is it held?

like image 865
spender Avatar asked Feb 17 '16 18:02

spender


2 Answers

Good question.

The stack is the reification of continuation. Continuation is, simply, information about what the program is going to do next. In a traditional non-async environment, this is represented as a return address on the stack; when the method returns it looks at the stack and branches to the return address. The stack also has information on it about what the values of the local variables are at the point where the continuation picks up.

In an async situation all that information is stored on the heap. A task contains a delegate which is invoked when the task is completed. The delegate is bound to an instance of a "closure" class which contains fields for any local variables or other state. And of course the tasks are themselves heap objects.

You might wonder: if it is the case that the continuation is a delegate which is invoked when the task completes, how then is the code which completes the task not on the call stack at the point where the completion is executed? The task can choose to invoke the continuation delegate by posting a windows message, and when the message loop processes the message, it does the invocation. So the invocation is then at the "top" of the stack, where the message loop typically sits. (The precise details of the invocation strategy used for the continuation depend on the context in which the task is created; see a more advanced guide to the task parallel library for the details.)

A good introductory article on how this all works can be found here:

https://msdn.microsoft.com/en-us/magazine/hh456403.aspx

A few of the details have changed since Mads wrote that article but the ideas are sound. (i3arnon's answer illustrates how this evolved; in Mads's article everything goes on the heap, but this turns out to produce excess garbage in some scenarios. A more sophisticated codegen allows us to keep some of the information on the stack. Understanding that distinction is not necessary to see how the continuations are logically represented.)

It's an entertaining and enlightening exercise to take your program and actually draw out all the delegates and tasks that are created, and what the references are between them. Give it a shot!

like image 168
Eric Lippert Avatar answered Nov 12 '22 05:11

Eric Lippert


The compiler turns your async method into a state machine struct. The struct is created firstly on the stack. When you await an uncompleted task (otherwise it continues running synchronously and will cause an overflow of the stack) that state machine is boxed and moved to the heap.

For example this method:

public async Task M()
{
}

is turned into this state machine:

private struct <M>d__0 : IAsyncStateMachine
{
    public int <>1__state;
    public AsyncTaskMethodBuilder <>t__builder;
    void IAsyncStateMachine.MoveNext()
    {
        try
        {
        }
        catch (Exception exception)
        {
            this.<>1__state = -2;
            this.<>t__builder.SetException(exception);
            return;
        }
        this.<>1__state = -2;
        this.<>t__builder.SetResult();
    }
    [DebuggerHidden]
    void IAsyncStateMachine.SetStateMachine(IAsyncStateMachine stateMachine)
    {
        this.<>t__builder.SetStateMachine(stateMachine);
    }
}

So, in "traditional" recursion the state for each iteration is stored on the stack so too many iterations can overflow that memory. In an async method the state is stored on the heap and it may overflow as well (though it's usually much bigger).

like image 3
i3arnon Avatar answered Nov 12 '22 06:11

i3arnon