
How to keep a certain number of threads running at all times

OK, here is my question. I want to start threads up to a certain number, let's say 100. The program should keep starting threads while continuously checking the number of running threads, and stop starting new ones once the maximum is reached. Then, either at a suitable checking interval or when a completed thread signals, it should start a new thread.

This way, I will always have a certain number of running threads.

I managed this using sleep and a permanent while loop: I keep checking the total running thread count at a given interval, and if a thread has completed, I dispose of it and start a new one.

But my solution does not seem like the proper way to me. I suppose it would be better if the completed thread signaled, and the checker then started a new one if we are below the maximum thread count.

I have seen many thread pool examples, but most of them do not include any queued pooling with a maximum number of running threads. What I mean is, they just keep starting threads until they are done. But let's say I have 500k URLs to harvest. I cannot just start all of them in a for loop with the thread pool.

The platform is a C# WPF application targeting .NET 4.5.

Below is my solution. I am actually looking for a better one, not for improvements to this one.

private void Button_Click_4(object sender, RoutedEventArgs e)
{
    Task.Factory.StartNew(() =>
    {
        startCrawler();
    });
}

void startCrawler()
{
    int irMaximumThreadcount = 100;
    List<Task> lstStartedThreads = new List<Task>();
    while (true)
    {
        // Iterate backwards so RemoveAt does not skip the element after a removed one.
        for (int i = lstStartedThreads.Count - 1; i >= 0; i--)
        {
            if (lstStartedThreads[i].IsCompleted)
            {
                lstStartedThreads[i].Dispose();
                lstStartedThreads.RemoveAt(i);
            }
        }

        if (lstStartedThreads.Count < irMaximumThreadcount)
        {
            var vrTask = Task.Factory.StartNew(() =>
            {
                func_myTask();
            });
            lstStartedThreads.Add(vrTask);
        }

        System.Threading.Thread.Sleep(50);
    }
}

void func_myTask()
{

}
asked Mar 03 '13 02:03 by MonsterMMORPG


2 Answers

Personally, I would use PLINQ for this, specifically the WithDegreeOfParallelism method, which limits the number of concurrent executions to the passed-in value.

private IEnumerable<Action> InfiniteFunctions()
{
    while(true)
    {
        yield return func_myTask;
    }
}

private void Button_Click_4(object sender, RoutedEventArgs e)
{
    int irMaximumThreadcount = 100;
    InfiniteFunctions()
        .AsParallel()
        .WithDegreeOfParallelism(irMaximumThreadcount)
        .ForAll(f => f());
}

EDIT: Actually, reading the documentation, it seems that irMaximumThreadcount can only be a maximum of 64, so watch out for that.

EDIT 2: OK, I had a better look, and it seems Parallel.ForEach takes a ParallelOptions parameter whose MaxDegreeOfParallelism property isn't limited in that way. So your code might look like this:

private void CrawlWebsite(string url)
{
    //Implementation here
}

private void Button_Click_4(object sender, RoutedEventArgs e)
{
    var options = new ParallelOptions() 
    { 
        MaxDegreeOfParallelism = 2000 
    };

    Parallel.ForEach(massiveListOfUrls, options, CrawlWebsite);
}
answered Oct 12 '22 23:10 by Felix


You are mixing up tasks with threads. A task is not a thread. There is no guarantee that each task will have its own thread.

The TPL (Task Parallel Library) works like a queue. This means you can simply create and start a task for each Func or Action object you have. There is no easy way to control the number of threads that are actually created.

However, you can create many tasks with little overhead, because the TPL will enqueue them and apply further logic to balance the work across the threads of the thread pool.
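
To see the difference between tasks and threads for yourself, a small sketch like the following may help. It starts a number of short tasks and counts how many distinct thread IDs actually ran them. The class and method names (TasksVersusThreads, CountDistinctThreads) are made up for this illustration, and the count you get will vary by machine and thread pool state.

```csharp
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, for illustration only.
static class TasksVersusThreads
{
    // Starts taskCount short tasks and returns how many distinct
    // managed threads ended up executing them.
    public static int CountDistinctThreads(int taskCount)
    {
        var threadIds = new ConcurrentDictionary<int, bool>();

        var tasks = Enumerable.Range(0, taskCount)
            .Select(_ => Task.Factory.StartNew(() =>
            {
                // Record the thread this task happens to run on.
                threadIds.TryAdd(Thread.CurrentThread.ManagedThreadId, true);
            }))
            .ToArray();

        Task.WaitAll(tasks);
        return threadIds.Count;
    }
}
```

Calling TasksVersusThreads.CountDistinctThreads(1000) will typically return a number far below 1000, since the thread pool reuses its threads for queued tasks.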

If some tasks need to be executed one after the other, you can use Task.ContinueWith to enqueue them. It is also possible to start new tasks with Task.Factory.ContinueWhenAny or Task.Factory.ContinueWhenAll.

This is also the key to controlling the number of parallel tasks you want to create: just create the desired number of tasks and enqueue the remaining ones with ContinueWhenAny. Each time a task ends, the next one is started.

Again: the TPL will balance the work among the threads in the thread pool. What you still need to consider is the use of other resources, such as disk I/O or the internet connection. Having many tasks that try to use the same resources concurrently can drastically slow down your program.
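
A minimal sketch of the "start N, then start the next whenever one ends" idea is shown here. Note that it uses the blocking Task.WaitAny rather than Task.Factory.ContinueWhenAny, because that keeps the loop short; it is one possible way to do it, not the only one. The names ThrottledRunner, Run, and work are invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper, for illustration only.
static class ThrottledRunner
{
    // Starts one task per item but never keeps more than maxConcurrent
    // unfinished tasks at a time. Blocks the caller until all items are done.
    public static void Run(IEnumerable<string> items, int maxConcurrent, Action<string> work)
    {
        if (maxConcurrent < 1) throw new ArgumentOutOfRangeException("maxConcurrent");

        var running = new List<Task>();
        foreach (var item in items)
        {
            if (running.Count >= maxConcurrent)
            {
                // Wait for any one task to finish, then free its slot.
                int finished = Task.WaitAny(running.ToArray());
                running.RemoveAt(finished);
            }

            var current = item; // local copy so the lambda captures this item
            running.Add(Task.Factory.StartNew(() => work(current)));
        }

        // Throws AggregateException if any of the remaining tasks failed.
        Task.WaitAll(running.ToArray());
    }
}
```

Because Run blocks until everything has finished, in a WPF application it should be called from a background task rather than directly from a button click handler. Items are pulled from the enumerable one at a time, so a list of 500k URLs never results in 500k tasks being created at once.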
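
If a particular resource is the bottleneck, one option is to cap access to just that resource, independent of how many tasks exist. The sketch below does this with a SemaphoreSlim; the ResourceGate class and its members are hypothetical names for this example, and a suitable limit depends on your resource.

```csharp
using System;
using System.Threading;

// Hypothetical helper, for illustration only.
sealed class ResourceGate : IDisposable
{
    private readonly SemaphoreSlim _slots;

    // maxConcurrent: how many callers may use the resource at the same time.
    public ResourceGate(int maxConcurrent)
    {
        _slots = new SemaphoreSlim(maxConcurrent, maxConcurrent);
    }

    // Runs the action once a slot is free; blocks the calling thread until then.
    public void Run(Action action)
    {
        _slots.Wait();
        try
        {
            action();
        }
        finally
        {
            // Releasing the slot signals one waiting caller, if any.
            _slots.Release();
        }
    }

    public void Dispose()
    {
        _slots.Dispose();
    }
}
```

Each task would wrap only its download or disk access in gate.Run(...), so the rest of its work is not throttled. Keep in mind that Wait blocks a thread pool thread while it waits, so this suits moderate limits better than thousands of waiting tasks.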

answered Oct 12 '22 23:10 by pescolino