Is an IOCP a thread that is running while the I/O is taking place or after?

I'm trying to understand I/O Completion Ports and specifically how they relate to using async-await for I/O.

The infamous article There Is No Thread talks about a thread-pool thread being borrowed only briefly, after the I/O is complete. The whole point of the article is to show that while the fancy hardware-level I/O stuff is in flight, there is no thread consumed by a loop like

Is the I/O done yet? No. Is the I/O done yet? No. Is the I/O done yet? No. ...

But then I'm looking at this article, which says that a

"component is in charge of checking the completion port for queued elements"

and gives an example like

public class IOCompletionWorker
{
    public unsafe void Start(IntPtr completionPort)
    {
        while (true)
        {
            uint bytesRead;
            uint completionKey;
            NativeOverlapped* nativeOverlapped;

            // Blocks until a completion packet can be dequeued from the port
            // (uint.MaxValue is an infinite timeout), or until the wait fails.
            var result = Interop.GetQueuedCompletionStatus(
                completionPort,
                out bytesRead,
                out completionKey,
                &nativeOverlapped,
                uint.MaxValue);

            var overlapped = Overlapped.Unpack(nativeOverlapped);

            if (result)
            {
                // Hand the completed read to the callback registered with it.
                var asyncResult = (FileReadAsyncResult)overlapped.AsyncResult;
                asyncResult.ReadCallback(bytesRead, asyncResult.Buffer);
            }
            else
            {
                ThreadLogger.Log(Interop.GetLastError().ToString());
            }

            Overlapped.Free(nativeOverlapped);
        }
    }
}

var completionPortThread = new Thread(() => new IOCompletionWorker().Start(completionPortHandle))
{
    IsBackground = true
};
completionPortThread.Start();

which to me looks like there is some polling going on.

I guess my questions boil down to

  • Is it true to say that a .NET application has 2 types of thread pools -- (1) "worker threads" and (2) "I/O threads"?
  • If it's true, is there a fixed number, specified in a configuration, like M worker threads and N I/O threads? And what is usually the ratio of M to N?
  • When exactly are I/O threads used?
asked Dec 10 '17 by user7127000


1 Answer

Both articles are correct in their own way.

IOCPs are not threads. They can be seen as a kind of queue into which the kernel (or regular user-mode code, through PostQueuedCompletionStatus) can post completion items. There is no inherent threading model or set of threads associated with IOCPs themselves; they are simply multiple-producer, multiple-consumer queues.

Let's take network sockets as an example, but the same is true for any kind of asynchronous work:

  • You call WSARecv on your overlapped-mode socket bound to an IOCP; it's up to the network driver to do whatever is necessary to set up the actual request for receiving data (a rough sketch of the binding step follows this list). There is no thread actively waiting for your data to arrive.
  • Data arrives. The operating system is woken up by the hardware. The operating system will give the network driver some CPU time in the kernel to process the incoming event. The network driver processes the interrupt, and then because your socket was bound to an IOCP, posts a completion item to your IOCP queue. The request is complete.
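
To make the binding step concrete, here is a rough P/Invoke sketch (the Iocp wrapper and its method names are purely illustrative, not from either article): CreateIoCompletionPort both creates a port and associates an overlapped-mode handle with one, and a dequeue loop like the one in the question then services completions for every handle bound this way.

using System;
using System.Runtime.InteropServices;

static class Iocp
{
    // Win32: HANDLE CreateIoCompletionPort(HANDLE FileHandle, HANDLE ExistingCompletionPort,
    //                                      ULONG_PTR CompletionKey, DWORD NumberOfConcurrentThreads);
    [DllImport("kernel32.dll", SetLastError = true)]
    static extern IntPtr CreateIoCompletionPort(
        IntPtr fileHandle,
        IntPtr existingCompletionPort,
        UIntPtr completionKey,
        uint numberOfConcurrentThreads);

    static readonly IntPtr InvalidHandleValue = new IntPtr(-1);

    // Create a new, empty completion port with no handle associated yet.
    public static IntPtr Create()
    {
        return CreateIoCompletionPort(InvalidHandleValue, IntPtr.Zero, UIntPtr.Zero, 0);
    }

    // Associate an overlapped-mode handle (socket, file, pipe, ...) with an existing port.
    // From then on, every overlapped operation that completes on that handle has a
    // completion packet queued to the port.
    public static bool Bind(IntPtr port, IntPtr handle, UIntPtr completionKey)
    {
        return CreateIoCompletionPort(handle, port, completionKey, 0) != IntPtr.Zero;
    }
}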

There is no user-mode thread from your process involved in any of this (beyond the initial asynchronous call). If you want to act on the fact that your data has arrived (which I assume you do when you're reading from a socket!), then you have to dequeue the completed items from your IOCP.

The point of IOCPs is that you can bind thousands of IO handles (sockets, files, ...) to a single IOCP. You can then use a single thread to drive those thousands of asynchronous processes in parallel.

Yes, that one thread doing the GetQueuedCompletionStatus is blocked while there is no completion pending on the IOCP, so that's probably where your confusion came from. But the point of IOCPs is that you block that one thread while you can have hundreds of thousands of network operations pending at any given time, all serviced by your one thread. You would never do a 1-to-1-to-1 mapping between IO handle/IOCP/Servicing Thread, because then you would lose any benefit from being asynchronous, and you might as well just use synchronous IO.

The main point of IOCPs is to achieve impressive parallelism of asynchronous operations under Windows.

I hope this clarifies the confusion.

As for the specific questions

  1. Yes, the .NET Framework has two pools. One is purely for user-mode, general-purpose work: the "Worker" threadpool. The other is the "IO" threadpool. The second one exists so that all the IOCP management can be hidden from you when writing high-level C# code, and so that your asynchronous socket just works like magic.
  2. This is all implementation detail that can change at any time, but the answer is that both pools are independent. If you have a massive amount of work happening on the worker threadpool and the framework decides that your overall throughput would increase by adding new threads, it will add threads to the worker pool alone and not touch the IO pool. The same goes for the IO pool: if you have misbehaving code that blocks IO threads in their callbacks, it will spawn new IO threads and not touch the worker pool. You can customize the numbers using ThreadPool.SetMinThreads/SetMaxThreads (a small sketch after this list shows how to inspect both pools), but needing to do so is usually a sign that your process misuses the threadpool.
  3. IO threads are used when items are dequeued from the threadpool's internal IOCP. In typical code, this will be when an asynchronous operation has completed on some IO handle. You can also queue items yourself through UnsafeQueueNativeOverlapped, but that's a lot less common.
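
As a small illustration of point 2 (plain public ThreadPool API, nothing specific to the articles), you can inspect the two pools and their independent sizing from code:

using System;
using System.Threading;

class ThreadPoolCounts
{
    static void Main()
    {
        // Each call reports one number for the worker pool and one for the IO
        // (completion-port) pool; the two are sized and grown independently.
        ThreadPool.GetMinThreads(out int minWorker, out int minIo);
        ThreadPool.GetMaxThreads(out int maxWorker, out int maxIo);
        ThreadPool.GetAvailableThreads(out int freeWorker, out int freeIo);

        Console.WriteLine($"worker threads: min={minWorker}, max={maxWorker}, available={freeWorker}");
        Console.WriteLine($"IO threads:     min={minIo}, max={maxIo}, available={freeIo}");

        // The knob mentioned in point 2; rarely a good idea outside of diagnosing misuse.
        // ThreadPool.SetMinThreads(minWorker, minIo);
    }
}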

Pure managed asynchronous operations (like doing async-await with Task.Delay, for example) do not involve any IO handle, and so they don't end up being posted to an IOCP by some driver, and so those would fall under the "Worker" category.
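
As a quick illustration (my own example, not from either article): in a console app with no SynchronizationContext, the continuation after a purely timer-based await resumes on an ordinary worker thread, because no IO handle or IOCP was ever involved.

using System;
using System.Threading;
using System.Threading.Tasks;

class DelayDemo
{
    static async Task Main()
    {
        Console.WriteLine(Thread.CurrentThread.IsThreadPoolThread);  // False: the main thread

        await Task.Delay(100);  // purely timer-based, no IO handle behind it

        // The continuation was queued to the regular (worker) thread pool.
        Console.WriteLine(Thread.CurrentThread.IsThreadPoolThread);  // True
    }
}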

As a side note, you can tell Worker threads from IO threads by their callstack. Worker threads will start their managed callstack with "ThreadPoolWorkQueue.Dispatch", whereas IO threads will start their managed callstack with "_IOCompletionCallback.PerformIOCompletionCallback". This is all implementation detail that can change at any time, but it can be helpful to know what you are dealing with when debugging your managed code.
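
If you want to see both stacks for yourself, here is a minimal sketch (it requires unsafe compilation, the UnsafeQueueNativeOverlapped path is Windows/IOCP specific, and I'd only expect those exact frame names on the .NET Framework): queue one callback to each pool and print the stack trace from inside it.

using System;
using System.Threading;

class PoolStacks
{
    static void Main()
    {
        // Worker pool: the managed stack should contain ThreadPoolWorkQueue.Dispatch.
        ThreadPool.QueueUserWorkItem(_ =>
            Console.WriteLine("worker callback:\n" + Environment.StackTrace));

        // IO pool: the managed stack should contain
        // _IOCompletionCallback.PerformIOCompletionCallback.
        QueueToIoPool();

        Console.ReadLine(); // keep the process alive until both callbacks have run
    }

    static unsafe void QueueToIoPool()
    {
        NativeOverlapped* overlapped = new Overlapped().Pack(IoCallback, null);
        ThreadPool.UnsafeQueueNativeOverlapped(overlapped);
    }

    static unsafe void IoCallback(uint errorCode, uint numBytes, NativeOverlapped* overlapped)
    {
        Console.WriteLine("IO callback:\n" + Environment.StackTrace);
        Overlapped.Free(overlapped);
    }
}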

answered Nov 15 '22 by fbrosseau