Thread queues for dummies

I have what I assume is a pretty common threading scenario:

  • I have 100 identical jobs to complete
  • All jobs are independent of each other
  • I want to process a maximum of 15 jobs at a time
  • As each job completes, a new job will be started until all jobs have been completed

If you assume that each job will fire an event when it completes (I'm using the BackgroundWorker class), I can think of a couple of ways to pull this off, but I'm not sure what the "right" solution is. I was hoping some of you gurus out there could point me in the right direction.

SOLUTION 1: Have a while (keepRunning) { Thread.Sleep(1000); } loop in my Main() function. The code in the Job_Completed event handler would set keepRunning = false when A) no jobs remain to be queued and B) all queued jobs have completed. I have used this solution before, and while it seems to work fine, it seems a little "odd" to me.

SOLUTION 2: Use Application.Run() in my Main() function. Similarly, the code in the Job_Completed event handler would call Application.Exit() when A) no jobs remain to be queued and B) all queued jobs have completed.

SOLUTION 3: Use a ThreadPool, queue up all 100 jobs, let them run 15 at a time (SetMaxThreads) and somehow wait for them all to complete.

In all of these solutions, the basic idea is that a new job would be started every time another job is completed, until there are no jobs left. So, the problem is not only waiting for existing jobs to complete, but also waiting until there are no longer any pending jobs to start. If ThreadPool is the right solution, what is the correct way to wait on the ThreadPool to complete all queued items?

I think my overriding confusion here is that I don't understand exactly HOW events are able to fire from within my Main() function. Apparently they do; I just don't understand the mechanics of it from a Windows message loop point of view. What is the correct way to solve this problem, and why?

Casey Gay asked Apr 28 '09 00:04


4 Answers

The other answers are nice, but if you want another option (you can never have enough options), how about this idea?

Just put the data for each job into a structure, and put those structures in a FIFO queue.

Create 15 threads.

Each thread takes the next job from the queue, removing it.

When a thread finishes processing a job, it gets the next one; if the queue is empty, the thread dies or just sleeps, waiting.

The only complexity, which is pretty simple to resolve, is that the dequeue has to happen inside a critical section (synchronize the read/pop).
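
A minimal sketch of the worker-thread idea above, assuming the job data can be modeled as a plain int. The names JobQueueSketch, RunAll and process are hypothetical, chosen for illustration only. A Queue<T> hands jobs out in FIFO order, and a lock makes the check-and-dequeue step a critical section.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class JobQueueSketch
{
    // Runs every job on a fixed number of worker threads and blocks until all are done.
    // Job data is a plain int here purely for illustration (an assumption).
    public static void RunAll(IEnumerable<int> jobs, int workerCount, Action<int> process)
    {
        var queue = new Queue<int>(jobs);
        var gate = new object();              // guards every access to the queue
        var workers = new List<Thread>();

        for (int i = 0; i < workerCount; i++)
        {
            var thread = new Thread(() =>
            {
                while (true)
                {
                    int job;
                    lock (gate)               // critical section: check and dequeue together
                    {
                        if (queue.Count == 0) return;   // nothing left, so this thread ends
                        job = queue.Dequeue();
                    }
                    process(job);             // the actual work runs outside the lock
                }
            });
            thread.Start();
            workers.Add(thread);
        }

        // Wait for every worker; they only exit once the queue is empty.
        foreach (var worker in workers) worker.Join();
    }
}
```

Calling JobQueueSketch.RunAll(jobs, 15, DoJob) with 100 entries in jobs would keep at most 15 jobs in flight, and each thread picks up a new job as soon as it finishes the previous one.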

James Black answered Nov 05 '22 22:11


Re: "somehow wait for them all to complete"

ManualResetEvent is your friend. Before you start your big batch, create one of these puppies; wait on it in your main thread, and set it at the end of the background operation, when the last job is done.
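
One way this could look, as a sketch only: a single ManualResetEvent plus a counter that each work item decrements with Interlocked.Decrement, so that whichever job finishes last sets the event. The names BatchWaitSketch, RunAndWait and job are hypothetical. This covers only the "wait for them all" part; it does not limit how many jobs run at once.

```csharp
using System;
using System.Threading;

public static class BatchWaitSketch
{
    // Queues jobCount work items on the ThreadPool and blocks until the last one finishes.
    public static void RunAndWait(int jobCount, Action<int> job)
    {
        if (jobCount <= 0) return;            // nothing to wait for

        int remaining = jobCount;             // jobs that have not finished yet
        using (var allDone = new ManualResetEvent(false))
        {
            for (int i = 0; i < jobCount; i++)
            {
                int jobIndex = i;             // per-iteration copy for the closure
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    try
                    {
                        job(jobIndex);
                    }
                    finally
                    {
                        // Whichever job finishes last signals the waiting thread.
                        if (Interlocked.Decrement(ref remaining) == 0)
                            allDone.Set();
                    }
                });
            }

            allDone.WaitOne();                // the calling thread blocks here
        }
    }
}
```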

Another option is to create the threads manually and call thread.Join() on each one in a foreach loop.

You could use this (I use it during testing):

    // Requires: using System; using System.Collections.Generic; using System.Threading;
    private void Repeat(int times, int asyncThreads, Action action, Action done) {
        if (asyncThreads > 0) {

            var threads = new List<Thread>();

            for (int i = 0; i < asyncThreads; i++) {

                // Split the work evenly; the first thread also takes the remainder.
                int iterations = times / asyncThreads;
                if (i == 0) {
                    iterations += times % asyncThreads;
                }

                // Each worker runs its share synchronously (asyncThreads == 0).
                Thread thread = new Thread(new ThreadStart(() => Repeat(iterations, 0, action, null)));
                thread.Start();
                threads.Add(thread);
            }

            // Block until every worker has finished.
            foreach (var thread in threads) {
                thread.Join();
            }

        } else {
            for (int i = 0; i < times; i++) {
                action();
            }
        }
        if (done != null) {
            done();
        }
    }

Usage:

    // Do something 100 times in 15 background threads, wait for them all to finish.
    Repeat(100, 15, DoSomething, null);

Sam Saffron answered Nov 05 '22 22:11


I would just use the Task Parallel Library.

You can do this as a single, simple Parallel.For loop over your jobs, and it will manage the scheduling and the wait for completion for you fairly cleanly. If you can't wait for C# 4 and Microsoft's implementation, a temporary workaround is to just compile and use the Mono implementation of TPL. (I personally prefer the MS implementation, especially the newer beta releases, but the Mono one is functional and redistributable today.)
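
As a rough sketch, assuming the System.Threading.Tasks API as it shipped in .NET 4: Parallel.For takes a ParallelOptions whose MaxDegreeOfParallelism caps how many iterations run at once, and the call returns only after every iteration has completed. ParallelForSketch, RunAll and job are hypothetical names.

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelForSketch
{
    // Runs job(0) .. job(jobCount - 1) with at most maxConcurrent running at a time.
    public static void RunAll(int jobCount, int maxConcurrent, Action<int> job)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxConcurrent };

        // Blocks the caller until all iterations are finished.
        Parallel.For(0, jobCount, options, job);
    }
}
```

With this shape, the scenario in the question would be ParallelForSketch.RunAll(100, 15, DoJob), where DoJob is whatever method processes job number i.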

Reed Copsey answered Nov 05 '22 21:11


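When you queue a work item in the thread pool, you don't get a wait handle back (ThreadPool.QueueUserWorkItem only returns a bool), so create one per work item, for example a ManualResetEvent that the item sets when it finishes. Put them all in an array and you can pass it as an argument to the WaitHandle.WaitAll() function. Note that WaitAll() accepts at most 64 handles, so 100 jobs have to be waited on in batches.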
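
A hedged sketch of the wait-handle approach for one batch of jobs. WaitAllSketch and RunBatch are hypothetical names, and the per-job ManualResetEvent is created by the caller side of the sketch rather than returned by the thread pool.

```csharp
using System;
using System.Threading;

public static class WaitAllSketch
{
    // Runs one batch of jobs on the ThreadPool and blocks until all of them have finished.
    // WaitHandle.WaitAll accepts at most 64 handles, so larger job counts need batching.
    public static void RunBatch(Action[] jobs)
    {
        if (jobs.Length == 0) return;
        if (jobs.Length > 64)
            throw new ArgumentException("WaitAll supports at most 64 handles.", "jobs");

        var handles = new WaitHandle[jobs.Length];
        for (int i = 0; i < jobs.Length; i++)
        {
            var done = new ManualResetEvent(false);   // one handle per work item
            var job = jobs[i];                        // per-iteration copy for the closure
            handles[i] = done;
            ThreadPool.QueueUserWorkItem(_ =>
            {
                try { job(); }
                finally { done.Set(); }               // signal even if the job throws
            });
        }

        WaitHandle.WaitAll(handles);                  // returns once every handle is set
        foreach (var handle in handles) handle.Close();
    }
}
```

Feeding the 100 jobs through RunBatch in groups of 15 would respect the handle limit, although each group then waits for its slowest job before the next group starts.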

Joel Coehoorn answered Nov 05 '22 21:11