C# "async and await" feature and threads [duplicate]

Please see the following code.

            using System;
            using System.Threading;
            using System.Threading.Tasks;

            class AddParams
            {
                public int a, b;

                public AddParams(int numb1, int numb2)
                {
                    a = numb1;
                    b = numb2;
                }
            }

            class Program
            {
                static void Main(string[] args)
                {
                    Console.WriteLine("ID of thread in 1: {0}",
                      Thread.CurrentThread.ManagedThreadId);

                    AddAsync();
                    Console.ReadLine();
                }

                private static async Task AddAsync()
                {
                    Console.WriteLine("***** Adding with Thread objects *****");
                    Console.WriteLine("ID of thread in Main(): {0}",
                      Thread.CurrentThread.ManagedThreadId);

                    AddParams ap = new AddParams(10, 10);
                    await Sum(ap);

                    Console.WriteLine("Other thread is done!");
                }

                static async Task Sum(object data)
                {
                    await Task.Run(() =>
                    {
                        if (data is AddParams)
                        {
                            Console.WriteLine("ID of thread in Add(): {0}",
                              Thread.CurrentThread.ManagedThreadId);

                            AddParams ap = (AddParams)data;
                            Console.WriteLine("{0} + {1} is {2}",
                              ap.a, ap.b, ap.a + ap.b);
                        }
                    });
                }
            }

And the result is this:

***** Adding with Thread objects *****
    ID of thread in Main(): 1
    ID of thread in Add(): 3
    10 + 10 is 20
    Other thread is done!

The result is not what I expected. Please correct my assumptions and explain the right concepts.

1. I assume that Main() is invoked on the primary thread, say Thread1.

2. Main() invokes AddAsync(). Because this method is marked async, I expected it to run on a secondary thread, say Thread2. But the following code inside AddAsync() prints Thread1, not the Thread2 I expected:

Console.WriteLine("ID of thread in Main(): {0}",Thread.CurrentThread.ManagedThreadId);

3. AddAsync() invokes Sum(). Because of the await keyword, I assume Sum() is invoked on another secondary thread, say Thread3.

4. Inside Sum(), Run() is invoked on yet another secondary thread, say Thread4.

My current understanding is this: in asynchronous programming, the creation of new threads is up to the CLR. The CLR uses multiple threads where it can, but it can also process asynchronous tasks on a single thread by interleaving them. For simplicity, I just assume that every time I start an asynchronous task, the CLR creates a new thread.

I know this topic is difficult to depict so it would be nicer if it would be explained with drawings.

Additional questions.

1. Does the following code inside AddAsync() really print the ID of the thread in "Main()"? To me, "ID of thread in AddAsync()" would be a more appropriate label.

Console.WriteLine("ID of thread in Main(): {0}", Thread.CurrentThread.ManagedThreadId);
YoungMin Park asked Nov 26 '17 03:11


2 Answers


Original Answer

  1. I assume that Main() is invoked on the primary thread, say Thread1.

Right, Main will run on the main thread.

  2. Main() invokes AddAsync(). Because this method is marked async, I expected it to run on a secondary thread, say Thread2. But the following code inside AddAsync() prints Thread1, not the Thread2 I expected:
Console.WriteLine("ID of thread in Main(): {0}",Thread.CurrentThread.ManagedThreadId);

No. The code does not switch threads before it reaches an await.

static void Main(string[] args) // <- main thread
{
    Console.WriteLine("ID of thread in 1: {0}",
        Thread.CurrentThread.ManagedThreadId); // <- main thread

    AddAsync(); // <- main thread (note: you are not awaiting here)
    Console.ReadLine();
}

private static async Task AddAsync()
{
    Console.WriteLine("***** Adding with Thread objects *****"); // <- main thread
    Console.WriteLine("ID of thread in Main(): {0}", // <- main thread
        Thread.CurrentThread.ManagedThreadId);

    AddParams ap = new AddParams(10, 10); // <- main thread
    await Sum(ap); // <- Sum(ap) is called on the main thread and runs there
                   // until it reaches an await on a Task that is not complete.
                   // Sum then returns an incomplete Task, so we cannot continue here:
                   // AddAsync also returns to its caller, freeing the main thread.
                   // When Sum(ap) is done we resume here, potentially in another thread.

    Console.WriteLine("Other thread is done!");
}
  3. AddAsync() invokes Sum(). Because of the await keyword, I assume Sum() is invoked on another secondary thread, say Thread3.

Sum starts on the calling thread, here the main thread: an async method runs synchronously on its caller's thread until it reaches an await on a Task that is not yet complete. Only the code after that await may resume on a different thread.

If you add the following line at the start of Sum, it should print the same id as the main thread:

Console.WriteLine("ID of thread in Sum(): {0}", Thread.CurrentThread.ManagedThreadId);
  4. Inside Sum(), Run() is invoked on yet another secondary thread, say Thread4.

Right, Task.Run will use another thread, by default one from the ThreadPool. Note: I say by default, because this depends on the TaskScheduler, and the default one will use the ThreadPool.


My current understanding is this: in asynchronous programming, the creation of new threads is up to the CLR. The CLR uses multiple threads where it can, but it can also process asynchronous tasks on a single thread by interleaving them. For simplicity, I just assume that every time I start an asynchronous task, the CLR creates a new thread.

It will not start new threads every time. async/await is not syntactic sugar around Thread, but around Task and continuations. Task was already designed to avoid using new threads when they are not necessary; for example, a Task may run inline, or use the ThreadPool.
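To make the "may run inline" point concrete, here is a minimal sketch. The class and method names are made up for illustration. It awaits a Task that is already completed and checks whether the code after the await is still on the same thread.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class InlineAwaitDemo
{
    // Hypothetical helper: awaits an already completed Task and reports
    // whether the code after the await ran on the same thread.
    public static async Task<bool> StaysOnSameThreadAsync()
    {
        int before = Thread.CurrentThread.ManagedThreadId;
        await Task.CompletedTask; // already completed, so the method does not suspend
        int after = Thread.CurrentThread.ManagedThreadId;
        return before == after;
    }

    static void Main()
    {
        Task<bool> task = StaysOnSameThreadAsync();
        // The method never suspended, so its Task is already complete on return.
        Console.WriteLine("Completed synchronously: {0}", task.IsCompleted);
        Console.WriteLine("Same thread after await: {0}", task.Result);
    }
}
```

Since the awaited Task is already complete, no continuation has to be scheduled and no other thread gets involved.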


  1. Does the following code inside AddAsync() really print the ID of the thread in "Main()"? To me, "ID of thread in AddAsync()" would be a more appropriate label.
Console.WriteLine("ID of thread in Main(): {0}",  Thread.CurrentThread.ManagedThreadId);

As indicated in the comments in the code above, yes that is the main thread.


After you reach await Task.Run(...), Sum and then AddAsync each return an incomplete Task, and the main thread returns to Main, where it runs Console.ReadLine();. When the task completes, AddAsync resumes and runs Console.WriteLine("Other thread is done!");, in a console application on a thread pool thread rather than the main thread. If you add the following line in Main before the call to Console.ReadLine, you will see the id of the main thread:

Console.WriteLine("ID of thread before ReadLine: {0}",
    Thread.CurrentThread.ManagedThreadId);

As you can see, your code does not require parallelism. Aside from using Task.Run it could as well have run in a single thread. Errata: Upon further inspection there is parallelism, just not as evident... see the extended answer.


Extended Answer

After a second reading, I suspect you were expecting the call to AddAsync to run in parallel. As I said above, you are not awaiting it, so up to its first await it runs like a regular synchronous call.

If you want to run AddAsync in parallel, I suggest to use Task.Run, for example:

Task.Run((Func<Task>)AddAsync);

Doing that, AddAsync will no longer run in the main thread. The main thread will advance to Console.ReadLine and may even end before AddAsync does. Note that the execution will end as soon as the main thread ends.

Of course, AddAsync is fast, so I suggest awaiting a few Task.Delay calls to give you some time to hit that key.
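A small sketch of that suggestion follows. WorkAsync is a hypothetical stand-in for AddAsync, and the 200 ms delay is an assumption added only so that the overlap with Main becomes observable.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class RunInBackgroundDemo
{
    // Stand-in for the question's AddAsync. Returns the id of the thread
    // on which the method started running.
    public static async Task<int> WorkAsync()
    {
        int startThread = Thread.CurrentThread.ManagedThreadId;
        await Task.Delay(200); // artificial delay, not part of the original code
        return startThread;
    }

    static void Main()
    {
        int mainThread = Thread.CurrentThread.ManagedThreadId;
        Task<int> background = Task.Run(() => WorkAsync());
        Console.WriteLine("Main continues on thread {0}", mainThread);

        // Blocking here only so the process does not exit before the work is done.
        int workThread = background.Result;
        Console.WriteLine("WorkAsync started on thread {0}", workThread);
    }
}
```

With Task.Run the method starts on a thread pool thread, so the two ids printed should differ.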


Before you ask, let me pose the question: How does Task.Delay work? A simplified explanation of the internals (at least on Windows) is that it asks the operating system for a timeout. When the operating system sees that the time is over, it calls the program to notify it that the timeout is done. That way Task.Delay does not need to use a thread to run.
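One way to see this for yourself is to start many delays at once. The sketch below uses made-up names and arbitrary numbers (1000 delays of 500 ms each). If each delay needed its own blocked thread, this would be far more expensive than it turns out to be.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

static class DelayDemo
{
    // Starts `count` timers at once, waits for all of them,
    // and returns how many have completed.
    public static async Task<int> WaitManyAsync(int count, int milliseconds)
    {
        Task[] delays = Enumerable.Range(0, count)
            .Select(_ => Task.Delay(milliseconds))
            .ToArray();
        await Task.WhenAll(delays);
        return delays.Count(d => d.IsCompleted);
    }

    static void Main()
    {
        var watch = Stopwatch.StartNew();
        int completed = WaitManyAsync(1000, 500).GetAwaiter().GetResult();
        watch.Stop();
        Console.WriteLine("{0} delays finished in {1} ms",
            completed, watch.ElapsedMilliseconds);
    }
}
```

The elapsed time should be close to a single delay rather than the sum of all of them, because the timers run concurrently and no thread sits waiting on each one.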

That is a different type of Task, one that does not have to run code and thus does not need to take a thread. We can refer to that kind of Task as a promise. Another example would be reading from a file:

using (var reader = File.OpenText("example.txt"))
{
    var fileText = await reader.ReadToEndAsync();
    // ...
}

In this case, the act of reading the file will not require one of your threads. Internally the operating system will ask the driver to copy the data to a RAM buffer and notify when it is done (which in modern hardware will happen thanks to DMA, requiring minimal intervention of the CPU), so no thread is used there.

Meanwhile, the calling thread is free to do other stuff. If you have multiple operations like that (for example, you may be reading from a file, sending data to the network, etc.) they can happen in parallel, without using your threads, and when one of them is completed the execution of your code will resume in one of your available threads.


Another thing to consider is that things work slightly differently if you are working on a UI thread.

In a window, a lot of operations start from the message queue. No need to worry about how that works, but suffice to say that the main thread will spend a lot of time waiting for input events (click, key press, etc.).

As you will see at the end of the extended answer, there is a chance a method will continue in a different thread. But the UI thread is the only one that can interact with the UI. Therefore it is not good to get UI code running in a different thread.

To fix the problem, on a UI thread await lets the thread continue to work on the message queue. In addition, it posts the continuation as a message to the queue, allowing the UI thread to pick it up. This is achieved through the SynchronizationContext that UI frameworks install on the UI thread, which await captures (a TaskScheduler can be built on top of it).

That also means that if you are in a UI environment and you use await for promise tasks, the UI stays responsive to input events. That may save you the use of the BackgroundWorker... Well, unless you need something that requires a lot of CPU time; then you will need to use Task.Run, call the ThreadPool, use the BackgroundWorker or start a Thread.
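The queueing behavior can be imitated in a console program. The sketch below is not how any real UI framework is implemented. QueueSynchronizationContext is a hypothetical, minimal stand-in for a message loop, written only to show that the code after an await gets posted back to the thread that runs the loop.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a UI message loop: every posted callback
// runs on the thread that calls RunLoop.
sealed class QueueSynchronizationContext : SynchronizationContext
{
    private readonly BlockingCollection<KeyValuePair<SendOrPostCallback, object>> _queue =
        new BlockingCollection<KeyValuePair<SendOrPostCallback, object>>();

    public override void Post(SendOrPostCallback d, object state)
    {
        _queue.Add(new KeyValuePair<SendOrPostCallback, object>(d, state));
    }

    // Processes queued callbacks until Complete is called.
    public void RunLoop()
    {
        foreach (var item in _queue.GetConsumingEnumerable())
            item.Key(item.Value);
    }

    public void Complete() { _queue.CompleteAdding(); }
}

static class UiLoopDemo
{
    // Returns true if the code after the await ran on the loop thread.
    public static bool ResumesOnLoopThread()
    {
        SynchronizationContext previous = SynchronizationContext.Current;
        var context = new QueueSynchronizationContext();
        SynchronizationContext.SetSynchronizationContext(context);
        try
        {
            int loopThread = Thread.CurrentThread.ManagedThreadId;
            Task<int> task = WorkAsync();
            // Stop the loop once the async method has finished.
            task.ContinueWith(_ => context.Complete(), TaskScheduler.Default);
            context.RunLoop();
            return task.Result == loopThread;
        }
        finally
        {
            SynchronizationContext.SetSynchronizationContext(previous);
        }
    }

    private static async Task<int> WorkAsync()
    {
        await Task.Run(() => Thread.Sleep(100));     // simulated work on the thread pool
        return Thread.CurrentThread.ManagedThreadId; // runs after being posted to the loop
    }

    static void Main()
    {
        Console.WriteLine("Resumed on loop thread: {0}", ResumesOnLoopThread());
    }
}
```

In a real UI application you do not write any of this yourself. The framework installs its own context on the UI thread, and await picks it up.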


Your questions

So, can I say in this code, a new thread is created only by Task.Run()? and async and await keyword don't create a new thread?

No, Task.Run is using another thread, but is not creating it. By default it falls back to the ThreadPool.

What does the ThreadPool do? Well, it keeps a small set of threads that can be used to run operations on demand (for example to run a Task). Once the operation is done, the thread returns to the ThreadPool, where it remains idle until you need it again. In short: the ThreadPool recycles threads.
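If you want to check this, a sketch like the following (with made-up names) runs a few small operations through Task.Run and records which thread ran each one and whether it belongs to the pool.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class ThreadPoolDemo
{
    // Runs `count` small operations through Task.Run, one after another.
    // Each entry holds the managed thread id and whether it is a pool thread.
    public static async Task<Tuple<int, bool>[]> RunOnPoolAsync(int count)
    {
        var results = new Tuple<int, bool>[count];
        for (int i = 0; i < count; i++)
        {
            results[i] = await Task.Run(() => Tuple.Create(
                Thread.CurrentThread.ManagedThreadId,
                Thread.CurrentThread.IsThreadPoolThread));
        }
        return results;
    }

    static void Main()
    {
        foreach (var entry in RunOnPoolAsync(5).GetAwaiter().GetResult())
        {
            Console.WriteLine("thread {0}, pool thread: {1}", entry.Item1, entry.Item2);
        }
    }
}
```

You will probably see the same ids repeated across runs of the loop, which is the recycling at work, although which pool thread picks up each operation is not guaranteed.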


At this invoking point "await Sum(ap)" inside of AddAsync(), the main thread is still invoking Sum(ap), right?

Yes, it is still the main thread. I will go into more detail on this below.


And inside the Sum() method, the code is still being processed on the main thread when it encounters Task.Run(). At this point of "Task.Run()", is a new thread created and the lambda expression executed on the new thread?

As I said above, Task.Run does not create a new thread if it does not have to. It will ask the ThreadPool to run the operation on one of its threads. Those ThreadPool threads are there to run one-off operations, so you do not end up creating lots of threads but recycling just a few.

So, yes, the code in the lambda will run in a different Thread, but it is not one created just for that.


And when the "Task.Run()" is being processed, what's the state of the calling thread(main thread) awaiting the result of Task.Run() method? Is it being blocked or non-blocked?

First, notice that you have two options to wait on Task.Run. You can use await or you can use Task.Wait.

  • await will:

    • "pause" the execution of the method.
    • Get or create the incomplete Task of the method.
    • Add a continuation to Task to "resume" the method. Or if there is nothing else to run, the continuation will set the incomplete Task to complete.
    • Return an incomplete Task to the caller.
  • Task.Wait will:

    • Block the thread until the Task is completed.
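The difference between the two options can be observed from the caller's side. In this sketch (hypothetical names, arbitrary 200 ms delays) the awaiting method hands back an incomplete Task right away, while the blocking method does not return until the delay is over.

```csharp
using System;
using System.Threading.Tasks;

static class AwaitVersusWaitDemo
{
    // Uses await: returns an incomplete Task to the caller at the await
    // and resumes later through a continuation.
    public static async Task<string> AwaitingAsync()
    {
        await Task.Delay(200);
        return "resumed after await";
    }

    // Uses Wait: the calling thread is blocked until the delay has elapsed.
    public static string Blocking()
    {
        Task.Delay(200).Wait();
        return "returned after Wait";
    }

    static void Main()
    {
        Task<string> pending = AwaitingAsync();
        Console.WriteLine("Completed right after the call: {0}", pending.IsCompleted);

        Console.WriteLine(Blocking());
        Console.WriteLine(pending.GetAwaiter().GetResult());
    }
}
```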

Now, I will go over the code even more slowly...

First, the sync part again:

static void Main(string[] args) // <-- entry point, main thread
{
    Console.WriteLine("ID of thread in 1: {0}",
        Thread.CurrentThread.ManagedThreadId); // <-- main thread

    AddAsync(); // <-- main thread. You are not awaiting, this is a sync call.
    Console.ReadLine();
}

At this point, we will create an incomplete Task, which I will call AddAsync Task. This will be what AddAsync returns (not that you will use it, you just ignore it).

Then the main thread enters AddAsync:

private static async Task AddAsync() // <-- called from `AddAsync()`
{
    Console.WriteLine("***** Adding with Thread objects *****"); // <-- main thread
    Console.WriteLine("ID of thread in Main(): {0}",
        Thread.CurrentThread.ManagedThreadId); // <-- main thread

    AddParams ap = new AddParams(10, 10); // <-- main thread
    await Sum(ap); // <-- shenanigans!!!

    Console.WriteLine("Other thread is done!");
}

Let me refactor it a bit, real quick...

private static async Task AddAsync() // <-- called from `AddAsync()`
{
    Console.WriteLine("***** Adding with Thread objects *****"); // <-- main thread
    Console.WriteLine("ID of thread in Main(): {0}",
        Thread.CurrentThread.ManagedThreadId); // <-- main thread

    AddParams ap = new AddParams(10, 10); // <-- main thread
    var z = Sum(ap); // <-- shenanigans!!!
    await z;

    Console.WriteLine("Other thread is done!");
}

The next thing that happens is that invocation to Sum. At this point a new incomplete Task is created for Sum. I will refer to it as the Sum Task.

Next, the main thread enters Sum:

static async Task Sum(object data) // <-- called from `await Sum(ap)`
{
    await Task.Run(() =>
    {
        if (data is AddParams)
        {
            Console.WriteLine("ID of thread in Add(): {0}",
                Thread.CurrentThread.ManagedThreadId);

            AddParams ap = (AddParams)data;
            Console.WriteLine("{0} + {1} is {2}",
                ap.a, ap.b, ap.a + ap.b);
        }
    });
}

And more shenanigans... let me refactor that code...

static async Task Sum(object data) // <-- called from `await Sum(ap)`
{
    Action y = () =>
    {
        if (data is AddParams)
        {
            Console.WriteLine("ID of thread in Add(): {0}",
                Thread.CurrentThread.ManagedThreadId);

            AddParams ap = (AddParams)data;
            Console.WriteLine("{0} + {1} is {2}",
                ap.a, ap.b, ap.a + ap.b);
        }
    };
    var x = Task.Run(y);
    await x;
}

The code above is equivalent to what we have. Note here that you could use x.Wait() which would block the main thread. We are not doing that...

static async Task Sum(object data) // <-- called from `await Sum(ap)`
{
    Action y = () =>
    {
        if (data is AddParams)
        {
            Console.WriteLine("ID of thread in Add(): {0}",
                Thread.CurrentThread.ManagedThreadId);

            AddParams ap = (AddParams)data;
            Console.WriteLine("{0} + {1} is {2}",
                ap.a, ap.b, ap.a + ap.b);
        }
    }; // <-- Action created in main thread
    var x = Task.Run(y); // <-- main thread: create a new Task x with the action y
                         //    start the new Task in a thread from the thread pool
    await x;
}

Now, the interesting part...

static async Task Sum(object data)
{
    Action y = () =>
    {
        if (data is AddParams) // <-- second thread
        {
            Console.WriteLine("ID of thread in Add(): {0}",
                Thread.CurrentThread.ManagedThreadId);

            AddParams ap = (AddParams)data;
            Console.WriteLine("{0} + {1} is {2}",
                ap.a, ap.b, ap.a + ap.b);
        }
    };
    var x = Task.Run(y);
    await x; // <-- Add a continuation to x
             //    so that when it finishes, it will set the Sum Task to completed
}

And now the method Sum returns (the incomplete Sum Task)

private static async Task AddAsync()
{
    Console.WriteLine("***** Adding with Thread objects *****");
    Console.WriteLine("ID of thread in Main(): {0}",
        Thread.CurrentThread.ManagedThreadId);

    AddParams ap = new AddParams(10, 10);
    var z = Sum(ap); // <-- main thread, z is now the incomplete Sum Task
    await z; // <-- Add a continuation to z
             //    so that when it finishes, it will resume `AddAsync`
             //    `AddAsync` is "paused" now.
             //    main thread returns the incomplete AddAsync Task

    Console.WriteLine("Other thread is done!");
}

And now the method AddAsync returns (the incomplete AddAsync Task). I want to add emphasis here: the method AddAsync has not finished, but it is returning in an incomplete state.

static void Main(string[] args)
{
    Console.WriteLine("ID of thread in 1: {0}",
        Thread.CurrentThread.ManagedThreadId);

    AddAsync();
    Console.ReadLine(); // <-- main thread
}

Meanwhile, the second thread finishes...

static async Task Sum(object data)
{
    Action y = () =>
    {
        if (data is AddParams) // <-- second thread
        {
            Console.WriteLine("ID of thread in Add(): {0}",
                Thread.CurrentThread.ManagedThreadId); // <-- second thread

            AddParams ap = (AddParams)data; // <-- second thread
            Console.WriteLine("{0} + {1} is {2}",
                ap.a, ap.b, ap.a + ap.b); // <-- second thread
        }
    };
    var x = Task.Run(y);
    await x;
}

And triggers the continuation that we added to x.

That continuation sets the Sum Task (z) to completed. Which will resume AddAsync.

private static async Task AddAsync()
{
    Console.WriteLine("***** Adding with Thread objects *****");
    Console.WriteLine("ID of thread in Main(): {0}",
        Thread.CurrentThread.ManagedThreadId);

    AddParams ap = new AddParams(10, 10);
    var z = Sum(ap);
    await z;

    Console.WriteLine("Other thread is done!"); // <-- second thread
}

Now, AddAsync finishes. However, as I said above, you just ignore what AddAsync returns. You did not Wait, or await, or add continuations to it... There is nothing else for the second thread to do, so it returns to the ThreadPool.

Note: Just to be clear, the second thread was from the ThreadPool. You can check yourself by reading Thread.CurrentThread.IsThreadPoolThread.
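If you do want Main to observe the Task instead of ignoring it, one option is an async Main. This is a sketch that assumes C# 7.1 or later, and it simplifies the question's AddAsync to take two integers and return the sum.

```csharp
using System;
using System.Threading.Tasks;

static class ObservedTaskDemo
{
    // Simplified variant of the question's AddAsync (an assumption for
    // illustration): the sum is computed on a thread pool thread.
    public static async Task<int> AddAsync(int a, int b)
    {
        int sum = await Task.Run(() => a + b);
        return sum;
    }

    // async Main requires C# 7.1 or later.
    static async Task Main()
    {
        // Awaiting means Main does not move on until AddAsync has finished.
        int result = await AddAsync(10, 10);
        Console.WriteLine("10 + 10 is {0}", result);
    }
}
```

On older compilers, calling AddAsync(10, 10).GetAwaiter().GetResult() from a regular Main should have a similar effect in a console program, at the cost of blocking the main thread.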

Theraot answered Oct 03 '22 05:10


is marked with async, so this method is possibly invoked on a secondary thread

It's marked async so there's no reason to believe it's running on a thread at all. Whenever it's running there's obviously a thread it's running on, but there's no reason why it should be a single thread.

Up until the first await it's going to be running on whatever thread called into it, because it isn't awaiting.

If it awaits a task that is completed as soon as it gets it, then it'll stay on the thread it's on.

Consider:

public async Task<string> GetResult()
{
  if (_cachedResult != null) return _cachedResult;
  _cachedResult = await ReallyLongRunningThingAsync();
  return _cachedResult;
}

Something awaiting this method could get a response immediately if it was in the cache, or after a long time otherwise. You wouldn't want the expense of thread-switching for what turned out to be a simple field-access, would you? One of the advantages of tasks is making this decision to not actually do anything async easy.

If it awaits something that isn't completed then the thread it is on may be used for other things.

After the await, when that task has completed, it may be a completely different thread that resumes the rest of the method, unless the task scheduler has some reason to do otherwise.

One of the most important cases in tasks is async I/O, in which case there might not be any thread at all. When you're waiting for a network connection to return data there's no need for any thread to be wasted while waiting on the I/O layer to return results, so sometimes the "current" thread of an async method is no thread at all.

I'm understanding "async and await" feature is one of techniques for multi-threading, so I cannot but help to think in the point of view of threads.

This isn't always the best way to think about it. It's better to think of it as a way to do different things at once. One might object, "but isn't multi-threading how we do different things at once" but actually, no; multi-threading is one of the things we use in doing different things at once. Asynchronous I/O is another way of doing things at once, (though it does generally work in tandem with threads). Tasks are an abstraction above that.

Being an abstraction above threading makes some things that are difficult in directly dealing with threads easy with tasks.

With GetResult() for example, if we were using threads directly then we'd need to put what was going to be done with the result into a callback, then decide whether to call that callback directly (because we have the cached result) or pass that callback onto a call to ReallyLongRunningThingByThreads().

At a lower level we're getting this really, in that GetResult() is turned into a struct with a method that will either be called once (when it hits the cache) or twice. Compare with how a method with if (new Random().Next(0, 2) == 1) yield return 1; would create an object with a method that would be called either once or twice. Indeed the method in the struct created for GetResult() is named MoveNext() just like with enumerators.
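For comparison, here is a sketch of what the callback style described above might look like. It is not the code the compiler generates. ReallyLongRunningThingByThreads is given a made-up body (a thread that sleeps 200 ms and returns a fixed string) so the example can run.

```csharp
using System;
using System.Threading;

static class CallbackStyleDemo
{
    private static string _cachedResult;

    // Hypothetical slow operation that reports its result through a callback
    // on a thread of its own.
    private static void ReallyLongRunningThingByThreads(Action<string> callback)
    {
        var worker = new Thread(() =>
        {
            Thread.Sleep(200); // simulated slow work
            callback("expensive value");
        });
        worker.Start();
    }

    public static void GetResult(Action<string> callback)
    {
        if (_cachedResult != null)
        {
            callback(_cachedResult); // cache hit: call back directly, on the caller's thread
            return;
        }
        // Cache miss: hand a callback to the slow operation.
        ReallyLongRunningThingByThreads(result =>
        {
            _cachedResult = result;
            callback(result);
        });
    }

    static void Main()
    {
        var done = new ManualResetEventSlim();
        GetResult(first =>
        {
            Console.WriteLine("First call: {0}", first);
            done.Set();
        });
        done.Wait(); // wait for the worker thread to deliver the first result

        GetResult(second => Console.WriteLine("Second call (cached): {0}", second));
    }
}
```

Every caller has to be written around callbacks, and the hit-or-miss decision is spelled out by hand. The async version of GetResult() expresses the same two paths with a plain return and an await.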

It can also be very useful when one isn't so much "doing multithreading" as "dealing with multithreading". Consider a simple website MVC controller that grabs a list of names from a database and passes them to the view:

public IActionResult Names()
{
  var names = GetDataContext().People.Select(p => p.Name).ToList();
  return View(new NameListModel{Names = names});
}

When I'm doing this I don't care about multithreading as something that will give me anything—I only have one thing that I'm doing at a time and I can't do the next until I'm finished the first—but multithreading is something I need to deal with because a website is inherently multithreaded in dealing with lots of requests at a time. As such, multithreading concerns are something in the way; there's no "hurray I can use threads to help me here" opportunity, only a "boo, my holding up a thread waiting on a database is reducing my scalability, and if I want to improve that I have to start thinking about the complexities of threads more".

If I replace this with an async version:

public async Task<IActionResult> Names()
{
  var names = await GetDataContext().People.Select(p => p.Name).ToListAsync();
  return View(new NameListModel{Names = names});
}

Then I've stopped blocking the current thread while it's waiting on the database, allowing it to go back to the threadpool to do something else, but without having to think about callbacks or what thread is doing what. So here is somewhere where I didn't necessarily want to think about threads, and I managed to make it deal well with the inherently multithreaded situation without thinking much about them.

Of course, there's still threads doing something much of the time, but the most important thing here is that in the gap between somewhere into the start of ToListAsync() and somewhere shortly before it returning, there is no thread servicing this particular request. This was also the case if you dealt with async i/o the old-fashioned way (and of course, it happens behind the scenes still) but that was often either very tricky or else out of reach, because the library we were using to deal with the i/o (database, web, filesystem) kept us too far away from the available asynchronicity. With tasks it's a lot easier to have our methods have times when they don't have a current thread at all (again, behind the scenes they're actually structs with methods that get called several times, and they're not always being called).

As with all abstractions, there's times when we need to look at the level below the abstraction—or just want to because it's interesting stuff—but most of our thinking should be at the level of that abstraction when we're using it, and we shouldn't generally think about threads with tasks any more than we should think about 1s and 0s with strings.

Jon Hanna answered Oct 03 '22 03:10