
C# Downloader: should I use Threads, BackgroundWorker or ThreadPool?

I'm writing a downloader in C# and got stuck on the following problem: which approach should I use to parallelize my downloads and update my GUI?

In my first attempt, I used 4 Threads, and as each one completed I started another. The main problem was that my CPU went to 100% at each new thread start.

Googling around, I found BackgroundWorker and ThreadPool. Given that I want to update my GUI with the progress of each link I'm downloading, what is the best solution?

1) Creating 4 different BackgroundWorkers and attaching to each one's ProgressChanged event a delegate to a function in my GUI that updates the progress?

2) Using ThreadPool and setting the max and min number of threads to the same value?

If I choose #2, when there are no more threads in the queue, does it stop the 4 working threads? Does it suspend them? Since I have to download different lists of links (20 links each of them) and move from one to another when one is completed, does the ThreadPool start and stop threads between each list?

If I want to change the number of working threads at runtime and decide to use ThreadPool, does changing from 10 threads to 6 throw an exception and stop 4 random threads?

This is the only part that is giving me a headache. I thank each of you in advance for your answers.

DDB asked Aug 02 '11 14:08


2 Answers

I would suggest using WebClient.DownloadFileAsync for this. You can have multiple downloads going, each raising the DownloadProgressChanged event as it goes along, and DownloadFileCompleted when done.

You can control the concurrency by using a queue with a semaphore or, if you're using .NET 4.0, a BlockingCollection. For example:

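using System;                        // Uri
using System.Collections.Concurrent; // BlockingCollection<T> (.NET 4.0 and later)
using System.Collections.Generic;    // Queue<T>
using System.ComponentModel;         // AsyncCompletedEventArgs
using System.Net;                    // WebClient, DownloadProgressChangedEventArgs

// Information passed to the callbacks through UserState.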
class DownloadArgs
{
    public readonly string Url;
    public readonly string Filename;
    public readonly WebClient Client;
    public DownloadArgs(string u, string f, WebClient c)
    {
        Url = u;
        Filename = f;
        Client = c;
    }
}

const int MaxClients = 4;

// Pool of idle WebClient instances; its capacity caps the number of concurrent downloads
BlockingCollection<WebClient> ClientQueue = new BlockingCollection<WebClient>(MaxClients);

// queue of urls to be downloaded (unbounded)
Queue<string> UrlQueue = new Queue<string>();

// create four WebClient instances and put them into the queue
for (int i = 0; i < MaxClients; ++i)
{
    var cli = new WebClient();
    cli.DownloadProgressChanged += DownloadProgressChanged;
    cli.DownloadFileCompleted += DownloadFileCompleted;
    ClientQueue.Add(cli);
}

// Fill the UrlQueue here

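// Now go until the UrlQueue is empty.
// Take() blocks the calling thread, so run this loop on a worker thread rather than the UI thread.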
while (UrlQueue.Count > 0)
{
    WebClient cli = ClientQueue.Take(); // blocks if there is no client available
    string url = UrlQueue.Dequeue();
    string fname = CreateOutputFilename(url);  // or however you get the output file name
    cli.DownloadFileAsync(new Uri(url), fname, 
        new DownloadArgs(url, fname, cli));
}


void DownloadProgressChanged(object sender, DownloadProgressChangedEventArgs e)
{
    DownloadArgs args = (DownloadArgs)e.UserState;
    // Do status updates for this download
}

void DownloadFileCompleted(object sender, AsyncCompletedEventArgs e)
{
    DownloadArgs args = (DownloadArgs)e.UserState;
    // check e.Error and e.Cancelled, then do whatever UI updates are needed

    // now put this client back into the queue
    ClientQueue.Add(args.Client);
}

There's no need for explicitly managing threads or going to the TPL.
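The semaphore alternative mentioned above (for frameworks without BlockingCollection) could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name ThrottledDownloader, its members, and the way the output file name is derived from the URL are hypothetical and not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.IO;
using System.Net;
using System.Threading;

// Hypothetical helper: caps concurrent downloads with a counting semaphore.
class ThrottledDownloader
{
    private readonly Semaphore _slots;

    public ThrottledDownloader(int maxConcurrent)
    {
        // All slots start out free.
        _slots = new Semaphore(maxConcurrent, maxConcurrent);
    }

    // Starts one async download per URL. WaitOne() blocks while every slot
    // is busy, so call this from a worker thread, not the UI thread.
    public void DownloadAll(IEnumerable<string> urls, string outputDir)
    {
        foreach (string url in urls)
        {
            _slots.WaitOne();
            var client = new WebClient();
            try
            {
                var uri = new Uri(url);
                // Assumption: the last path segment is a usable file name.
                string fileName = Path.Combine(outputDir, Path.GetFileName(uri.LocalPath));
                client.DownloadFileCompleted += OnCompleted;
                client.DownloadFileAsync(uri, fileName);
            }
            catch
            {
                // The download never started, so give the slot back.
                client.Dispose();
                _slots.Release();
                throw;
            }
        }
    }

    private void OnCompleted(object sender, AsyncCompletedEventArgs e)
    {
        // e.Error and e.Cancelled say whether the download succeeded.
        ((WebClient)sender).Dispose();
        _slots.Release(); // free the slot for the next URL
    }
}
```

Note that DownloadAll returns once the last download has been started, not once it has finished. Unlike the BlockingCollection version, this sketch creates a fresh WebClient per URL instead of reusing four instances.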

Jim Mischel answered Sep 23 '22 10:09


I think you should look into using the Task Parallel Library, which is new in .NET 4 and is designed for solving these types of problems.
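As a rough illustration of what that could look like, here is a minimal sketch using Parallel.ForEach with a cap on parallelism. The helper name TplDownloader, its parameters, and the file-naming rule are assumptions made for the example, not something the answer specifies.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Threading.Tasks;

// Hypothetical helper: downloads every URL, at most maxParallel at a time.
static class TplDownloader
{
    // Blocks until all downloads have finished, so call it off the UI thread.
    public static void DownloadAll(IEnumerable<string> urls, string outputDir, int maxParallel)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };

        Parallel.ForEach(urls, options, url =>
        {
            var uri = new Uri(url);
            // Assumption: the last path segment is a usable file name.
            string fileName = Path.Combine(outputDir, Path.GetFileName(uri.LocalPath));

            using (var client = new WebClient())
            {
                client.DownloadFile(uri, fileName); // synchronous download
            }
        });
    }
}
```

MaxDegreeOfParallelism is an upper bound, so fewer downloads may run at once. Any failures inside the loop surface from Parallel.ForEach as an AggregateException, and GUI updates made from the loop body still have to be marshalled to the UI thread.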

Jason answered Sep 24 '22 10:09