
Can PLINQ or TPL suffocate .net thread pool and IIS request processing queue?

Here is the scenario: a gateway hosted in IIS. Handling a request may require issuing several independent requests to another service, each of which may take up to several seconds. Naturally, I thought this was a good candidate for parallelization. However, I am unsure whether I should do it and, if so, what degree of parallelism to use. My concern is that all threads in the app are managed by the .NET thread pool, so if I tie up too many threads (even ones that are only waiting), could I end up starving other gateway functionality and APIs?

Schultz9999 asked Mar 19 '13 05:03


2 Answers

Your concern is valid to a degree. Indeed, the thread pool can be overwhelmed. This is why gateway services that call long-running, I/O-bound (non-compute) operations are generally better written asynchronously. Async does not mean "fire and forget": you can still "wait" for the result, but the waiting no longer blocks a thread.

Switching to async, however, requires extensive code changes.

A much simpler solution would be to point a load generator at your service and measure whether this is even a problem. If it is, a quick fix is to greatly increase the thread-pool size (I have had good experience with 500-1000 threads; the default maximum depends on the .NET version, for example 250 worker threads per core in .NET 3.5 and considerably more in .NET 4). This is not optimal for throughput, but it will work, be reliable and take little developer time. Just make sure to test under load so that you don't get any nasty pager calls at night.
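As a rough illustration of the quick fix, the limit can be inspected and raised in code with `ThreadPool.GetMaxThreads` and `ThreadPool.SetMaxThreads`. The class name, the helper `RaiseMaxWorkerThreads` and the target of 1000 are assumptions made for this sketch; under IIS the hosting configuration may impose its own limits as well, so treat this as a starting point for a load test rather than a recommended setting.

```csharp
using System;
using System.Threading;

static class ThreadPoolSizing
{
    // Hypothetical helper: raises the worker-thread ceiling to at least
    // desiredWorkers and leaves the I/O completion thread limit unchanged.
    // Returns false if the runtime rejects the requested values.
    public static bool RaiseMaxWorkerThreads(int desiredWorkers)
    {
        int maxWorkers, maxIo;
        ThreadPool.GetMaxThreads(out maxWorkers, out maxIo);
        if (maxWorkers >= desiredWorkers)
        {
            return true; // already large enough, nothing to change
        }
        return ThreadPool.SetMaxThreads(desiredWorkers, maxIo);
    }

    static void Main()
    {
        int workers, io;
        ThreadPool.GetMaxThreads(out workers, out io);
        Console.WriteLine("Before: workers={0}, io={1}", workers, io);

        bool ok = RaiseMaxWorkerThreads(1000); // 1000 is an arbitrary example value

        ThreadPool.GetMaxThreads(out workers, out io);
        Console.WriteLine("Accepted={0}; after: workers={1}, io={2}", ok, workers, io);
    }
}
```

On recent runtimes the default ceiling is often already above 1000, in which case the helper changes nothing and simply returns true.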

Note that although async is all the rage right now, it has its disadvantages when it comes to developer productivity. Sometimes the good old solution of spawning hundreds of threads is sufficient and is actually the best engineering trade-off. If you are on C# 5.0, you can take advantage of async/await and achieve non-blocking I/O much more easily.
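To give an idea of what the async/await route looks like, here is a minimal sketch of one incoming request fanning out to several backend calls. `GatewaySketch`, `CallBackendAsync` and `HandleRequestAsync` are hypothetical names, and the backend call is simulated with `Task.Delay` so that the example runs without a network; a real gateway would await an actual asynchronous HTTP call at that point.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

static class GatewaySketch
{
    // Hypothetical stand-in for one call to the backend service.
    // Task.Delay completes on a timer, so no thread is held while "waiting".
    static async Task<string> CallBackendAsync(string query)
    {
        await Task.Delay(TimeSpan.FromSeconds(1));
        return "reply to " + query;
    }

    // Starts every backend call first, then awaits them all together.
    // Task.WhenAll returns the results in the same order as the input tasks.
    public static async Task<string[]> HandleRequestAsync(string[] queries)
    {
        Task<string>[] calls = queries.Select(q => CallBackendAsync(q)).ToArray();
        return await Task.WhenAll(calls);
    }

    static void Main()
    {
        // Blocking on .Result is done only because this is a console demo;
        // inside a request handler the method would be awaited instead.
        string[] replies = HandleRequestAsync(new[] { "a", "b", "c" }).Result;
        foreach (string reply in replies)
        {
            Console.WriteLine(reply);
        }
    }
}
```

Because the calls are started before any of them is awaited, the three simulated requests overlap instead of running one after another.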

usr answered Sep 28 '22 16:09


PLINQ is useful for compute-bound operations, which is not your case. What makes sense for you is async I/O.

There are multiple ways to do async I/O in .NET: the plain old Asynchronous Programming Model (APM, Begin/End method pairs), TPL, Reactive Extensions, C# 5 async/await, and F# async workflows. The latter three techniques allow you to compose multiple I/O operations in various ways and reduce the amount of boilerplate code.

Async I/O is not about allocating new threads in any way. It uses I/O completion ports to avoid thread blocking while waiting for the reply.

Let me stress this: when you issue multiple long-running requests in an already-concurrent environment (IIS), async I/O will give you better throughput and thread pool utilization.

Consider the following code:

Task.Factory.StartNew( () => Parallel.ForEach<Item>(items, item => DoSomething(item)));

When this is used in a client-side program, it keeps the UI thread unblocked without requiring the I/O inside DoSomething to be asynchronous.

In an already-concurrent service where DoSomething performs I/O synchronously, each call ties up an additional blocked thread from your busy pool. If you synchronously perform 3 outgoing requests in parallel per incoming request, 10 simultaneous incoming requests will block 30 threads in your pool. This approach does not scale well.

For an I/O-bound task, it does not matter whether the task is scheduled on a new thread; what matters is whether it blocks a thread.

So, with TPL, I find the following pattern better than the code above:

var urls = new string[] { "https://stackoverflow.com/", "http://google.com/" };
var tasks = new Task<System.Net.WebResponse>[urls.Length];
for (int i = 0; i < tasks.Length; i++)
{
    var wr = (System.Net.HttpWebRequest)System.Net.WebRequest.Create(urls[i]);
    // Wrap the APM Begin/End pair in a Task; no thread is blocked while the request is in flight.
    tasks[i] = Task.Factory.FromAsync<System.Net.WebResponse>(
                   wr.BeginGetResponse,
                   wr.EndGetResponse,
                   null);
}

// The continuation runs once every request has completed.
var combined = Task.Factory.ContinueWhenAll(
    tasks,
    completed => {
        foreach (var task in completed)
        {
            // Dispose each response and its stream once they have been read.
            using (var response = task.Result)
            using (var stream = response.GetResponseStream())
            {
                // Process responses and combine them into single result
            }
        }
    });

Again, 90% of the code above is boilerplate. The key part is WebRequest's asynchronous BeginGetResponse operation, which should be used regardless of the higher-level technique you end up with (TPL, Rx, etc.).
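For comparison, the same fan-out can be sketched with C# 5 async/await. `WebRequest.GetResponseAsync` (available from .NET 4.5) is the Task-returning counterpart of the BeginGetResponse/EndGetResponse pair, so the non-blocking behaviour is the same and only the plumbing shrinks. `FanOutSketch` and `FetchAllAsync` are names invented for this sketch, and newer runtimes flag `WebRequest.Create` as obsolete in favour of `HttpClient`.

```csharp
using System;
using System.Linq;
using System.Net;
using System.Threading.Tasks;

static class FanOutSketch
{
    // Starts one asynchronous request per URL and awaits all of them together.
    // No thread is blocked while the requests are in flight.
    public static async Task<WebResponse[]> FetchAllAsync(string[] urls)
    {
        Task<WebResponse>[] pending = urls
            .Select(url => WebRequest.Create(url).GetResponseAsync())
            .ToArray();
        return await Task.WhenAll(pending);
    }

    static void Main()
    {
        var urls = new[] { "https://stackoverflow.com/", "http://google.com/" };

        // Blocking on .Result is acceptable only in a console demo like this one.
        WebResponse[] responses = FetchAllAsync(urls).Result;
        foreach (WebResponse response in responses)
        {
            using (response)
            {
                // Process responses and combine them into a single result
                Console.WriteLine("{0}: {1}", response.ResponseUri, response.ContentType);
            }
        }
    }
}
```

If any request fails, awaiting `Task.WhenAll` rethrows the failure, so a real gateway would need to decide how to handle partial results.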

Artem Koshelev answered Sep 28 '22 14:09