
Better approach in management of multiple WebRequest

Tags: c#, .net

I have a component that processes multiple web requests, each on a separate thread. Each WebRequest is processed synchronously.

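public class WebRequestProcessor : System.ComponentModel.Component
{
    // Keep the threads themselves (not the workers) so they can be tracked later.
    List<Thread> tlist = new List<Thread>();

    public void Start()
    {
        // urlList is assumed to be a field declared elsewhere in the component.
        foreach (string url in urlList)
        {
            // Create the thread object. This does not start the thread.
            Worker workerObject = new Worker();
            string currentUrl = url; // local copy so each thread captures its own URL
            Thread workerThread = new Thread(() => workerObject.DoWork(currentUrl));

            // Start the worker thread.
            workerThread.Start();
            tlist.Add(workerThread);
        }
    }
}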

public class Worker
{
    // This method will be called when the thread is started.
    public void DoWork(string url)
    {
        // prepare the web page we will be asking for
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);

        // execute the request; the response and its stream are disposed when done
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (Stream resStream = response.GetResponseStream())
        {
            // process stream
        }
    }
}

Now I need to find the best way to cancel all requests.

One way is to convert each synchronous WebRequest into async and use WebRequest.Abort to cancel processing.

Another way is to drop the references to the threads and let them die and be collected by the GC.

walter asked Jul 16 '11 20:07


2 Answers

If you want to download 1000 files, starting 1000 threads at once is certainly not the best option. Not only will it probably not give you any speedup compared with downloading just a few files at a time, it will also reserve at least 1 GB of virtual memory, because each thread gets a 1 MB stack by default. Creating threads is expensive; try to avoid doing so in a loop.

What you should do instead is use Parallel.ForEach() along with the asynchronous versions of the request and response operations. For example (WPF code):

private void Start_Click(object sender, RoutedEventArgs e)
{
    m_tokenSource = new CancellationTokenSource();
    var urls = …;
    Task.Factory.StartNew(() => Start(urls, m_tokenSource.Token), m_tokenSource.Token);
}

private void Cancel_Click(object sender, RoutedEventArgs e)
{
    m_tokenSource.Cancel();
}

void Start(IEnumerable<string> urlList, CancellationToken token)
{
    Parallel.ForEach(urlList, new ParallelOptions { CancellationToken = token },
                     url => DownloadOne(url, token));

}

void DownloadOne(string url, CancellationToken token)
{
    ReportStart(url);

    try
    {
        var request = WebRequest.Create(url);

        var asyncResult = request.BeginGetResponse(null, null);

        WaitHandle.WaitAny(new[] { asyncResult.AsyncWaitHandle, token.WaitHandle });

        if (token.IsCancellationRequested)
        {
            request.Abort();
            return;
        }

        // Dispose the response even if reading throws.
        using (var response = request.EndGetResponse(asyncResult))
        using (var stream = response.GetResponseStream())
        {
            byte[] bytes = new byte[4096];

            while (true)
            {
                asyncResult = stream.BeginRead(bytes, 0, bytes.Length, null, null);

                WaitHandle.WaitAny(new[] { asyncResult.AsyncWaitHandle,
                                           token.WaitHandle });

                if (token.IsCancellationRequested)
                {
                    // Abort the request so the pending read does not linger.
                    request.Abort();
                    break;
                }

                var read = stream.EndRead(asyncResult);

                if (read == 0)
                    break;

                // do something with the downloaded bytes
            }
        }
    }
    finally
    {
        ReportFinish(url);
    }
}

This way, when you cancel the operation, all downloads are canceled and no new ones are started. Also, you probably want to set MaxDegreeOfParallelism of ParallelOptions, so that you aren't doing too many downloads at once.
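
As a rough sketch of the MaxDegreeOfParallelism suggestion, the same ParallelOptions object can carry both the token and a concurrency cap. The names ThrottledDownloads and Run, and the delegate standing in for DownloadOne, are made up for illustration; a sensible limit depends on your bandwidth and on the servers you are hitting.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

static class ThrottledDownloads
{
    // Hypothetical helper: runs downloadOne for every URL,
    // with at most maxParallel downloads active at the same time.
    public static void Run(IEnumerable<string> urls, int maxParallel,
                           CancellationToken token,
                           Action<string, CancellationToken> downloadOne)
    {
        var options = new ParallelOptions
        {
            CancellationToken = token,            // stops new iterations on cancel
            MaxDegreeOfParallelism = maxParallel  // caps concurrent iterations
        };

        // Throws OperationCanceledException if the token is cancelled.
        Parallel.ForEach(urls, options, url => downloadOne(url, token));
    }
}
```

Calling it as ThrottledDownloads.Run(urls, 4, m_tokenSource.Token, DownloadOne) would keep at most four downloads in flight (the 4 is an arbitrary example value).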

I'm not sure what you want to do with the files you are downloading; depending on that, using StreamReader might be a better option.
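
If the downloads are text, the StreamReader variant might look roughly like this. It is only a sketch: the names TextDownload and ReadAllText are hypothetical, UTF-8 is an assumption about the content, and the token is checked only between lines, so a read that is stalled on the network would still need request.Abort() to unblock it.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading;

static class TextDownload
{
    // Reads the stream as text, line by line, checking the token between lines.
    // Throws OperationCanceledException once cancellation has been requested.
    public static string ReadAllText(Stream stream, CancellationToken token)
    {
        var result = new StringBuilder();

        // UTF-8 is an assumption; use the encoding the server actually reports.
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                token.ThrowIfCancellationRequested();
                result.AppendLine(line);
            }
        }

        return result.ToString();
    }
}
```

In DownloadOne this would be called with the stream returned by response.GetResponseStream().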

svick answered Nov 05 '22 02:11


I think the best solution is "Parallel Foreach Cancellation". Please check the following code.

  1. To implement cancellation, first create a CancellationTokenSource and pass its token to Parallel.ForEach through ParallelOptions.
  2. To cancel, call CancellationTokenSource.Cancel().
  3. After cancellation, Parallel.ForEach throws an OperationCanceledException, which you need to handle.

There is a good article about parallel programming related to my answer: Task Parallel Library by Sacha Barber on CodeProject.

CancellationTokenSource tokenSource = new CancellationTokenSource();
ParallelOptions options = new ParallelOptions()
{
    CancellationToken = tokenSource.Token
};

// Fill this list with the URLs to download (Parallel.ForEach rejects a null source).
List<string> urlList = new List<string>();
//parallel foreach cancellation
try
{
    ParallelLoopResult result = Parallel.ForEach(urlList, options, (url) =>
    {
        // Run the synchronous download on the loop's worker thread.
        Worker workerObject = new Worker();
        workerObject.DoWork(url);
    });
}
catch (OperationCanceledException)
{
    Console.WriteLine("Operation Cancelled");
}
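
One caveat with the snippet above: the token only stops Parallel.ForEach from starting new iterations, so a synchronous GetResponse() call that is already in flight keeps running. A possible way around that, sketched here on the assumption that the worker can be changed to accept the token, is to register request.Abort() as a cancellation callback. The names CancellableWorker and TryDownload are hypothetical, and the behaviour described is that of the classic .NET Framework HttpWebRequest.

```csharp
using System;
using System.IO;
using System.Net;
using System.Threading;

static class CancellableWorker
{
    // Returns true if the response was read to the end,
    // false if the request was aborted because of cancellation.
    public static bool TryDownload(string url, CancellationToken token)
    {
        var request = WebRequest.Create(url);

        // When the token is cancelled, Abort() makes the blocked call fail.
        // The registration is disposed once the download is over.
        using (token.Register(() => request.Abort()))
        {
            try
            {
                using (var response = request.GetResponse())
                using (var stream = response.GetResponseStream())
                {
                    var buffer = new byte[4096];
                    while (stream.Read(buffer, 0, buffer.Length) > 0)
                    {
                        // process the downloaded bytes here
                    }
                }
                return true;
            }
            catch (WebException)
            {
                // An aborted request surfaces as a WebException.
                if (token.IsCancellationRequested) return false;
                throw;
            }
            catch (IOException)
            {
                // An abort that lands during a read may surface as an IOException.
                if (token.IsCancellationRequested) return false;
                throw;
            }
        }
    }
}
```

Inside the loop body you would then call CancellableWorker.TryDownload(url, tokenSource.Token) instead of workerObject.DoWork(url).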

UPDATED

The following code is "Parallel Foreach Cancellation Sample Code".

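using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    static void Main(string[] args)
    {
        // A plain sequential range is enough here; the loop below does the parallel work.
        List<int> data = Enumerable.Range(1, 10000).ToList();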

        CancellationTokenSource tokenSource = new CancellationTokenSource();

        Task cancelTask = Task.Factory.StartNew(() =>
            {
                Thread.Sleep(1000);
                tokenSource.Cancel();
            });


        ParallelOptions options = new ParallelOptions()
        {
            CancellationToken = tokenSource.Token
        };


        //parallel foreach cancellation
        try
        {
            Parallel.ForEach(data,options, (x, state) =>
            {
                Console.WriteLine(x);
                Thread.Sleep(100);
            });
        }
        catch (OperationCanceledException)
        {
            Console.WriteLine("Operation Cancelled");
        }


        Console.ReadLine();
    }
}
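
In the sample above each iteration is short, so the loop notices the cancellation quickly. Parallel.ForEach itself only checks the token before starting an iteration, so a long-running body should poll the token as well. A minimal sketch, with the hypothetical names PollingLoop and Run and a Thread.Sleep standing in for real work:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

static class PollingLoop
{
    // Returns true if every item completed, false if the loop was cancelled.
    public static bool Run(IEnumerable<int> items, CancellationToken token)
    {
        var options = new ParallelOptions { CancellationToken = token };

        try
        {
            Parallel.ForEach(items, options, item =>
            {
                for (int step = 0; step < 10; step++)
                {
                    // Poll between chunks of work so a long body stops promptly.
                    token.ThrowIfCancellationRequested();
                    Thread.Sleep(10); // stand-in for one chunk of real work
                }
            });
            return true;
        }
        catch (OperationCanceledException)
        {
            return false;
        }
    }
}
```

Because the body throws with the same token that was given to ParallelOptions, the loop reports the cancellation as a single OperationCanceledException, which the catch block turns into a false return value.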
Jin-Wook Chung answered Nov 05 '22 03:11