
Populating List from Multiple Threads

I have a program that screen-scrapes a handful of pages using HtmlAgilityPack, and I want it to run faster, since loading a page can take 1-2 seconds on occasion. Right now I have the code below, which loads the pages sequentially.

  List<Projections> projectionsList = new List<Projections>();
  for (int i = 532; i <= 548; i++)
  {
      doc = webGet.Load("http://myurl.com/projections.php?League=&Position=97&Segment=" + i + "&uid=4");
      GetProjection(doc, ref projectionsList, i);
  }

Basically, I want to move the code inside the loop onto multiple threads and wait until all of them complete before continuing. I would expect the List to be fully populated at that point. I realize Lists are not thread safe, but I am a little stuck on how to get around that.

Isaac Levin asked Feb 10 '26 18:02


1 Answer

I suggest using Parallel.For, as in @Tgys's example, but with a ConcurrentBag collection, which is thread safe, so you don't need to handle locks yourself.

ConcurrentBag<Projections> projectionsList = new ConcurrentBag<Projections>();
// The upper bound of Parallel.For is exclusive, hence 548 + 1.
Parallel.For(532, 548 + 1, i =>
{
    var doc = webGet.Load("http://myurl.com/projections.php?League=&Position=97&Segment=" + i + "&uid=4");
    GetProjection(doc, ref projectionsList, i);
});

You will probably need to change your GetProjection method so that it accepts a ConcurrentBag<Projections> instead of a List<Projections>, so check whether this solution fits your needs.
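As a rough idea of what that change to GetProjection might look like, here is a minimal, self-contained sketch. It is not the asker's actual code: the Projections properties (Segment, Html), the LoadPage helper (a stand-in for webGet.Load so the sample runs without HtmlAgilityPack or network access), and the ProjectionLoader/LoadAll names are all assumptions made up for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for the question's Projections type.
public class Projections
{
    public int Segment { get; set; }
    public string Html { get; set; }
}

public static class ProjectionLoader
{
    // Hypothetical stand-in for webGet.Load(...): returns fake page content
    // so the sketch runs without network access or HtmlAgilityPack.
    static string LoadPage(int segment)
    {
        return "<html>segment " + segment + "</html>";
    }

    // ConcurrentBag<T> is a reference type and its Add method is thread safe,
    // so the bag can be passed by value; neither ref nor a lock is needed.
    static void GetProjection(string page, ConcurrentBag<Projections> bag, int segment)
    {
        bag.Add(new Projections { Segment = segment, Html = page });
    }

    public static List<Projections> LoadAll(int first, int last)
    {
        var bag = new ConcurrentBag<Projections>();

        // The upper bound of Parallel.For is exclusive, hence last + 1.
        // Parallel.For returns only after every iteration has finished.
        Parallel.For(first, last + 1, i =>
        {
            string page = LoadPage(i);
            GetProjection(page, bag, i);
        });

        // ConcurrentBag<T> is unordered, so sort if the order matters.
        return bag.OrderBy(p => p.Segment).ToList();
    }

    public static void Main()
    {
        List<Projections> results = LoadAll(532, 548);
        Console.WriteLine(results.Count); // 17 segments: 532 through 548
    }
}
```

Because Parallel.For blocks until all iterations are done, the list returned by LoadAll is complete by the time the caller sees it. The final OrderBy is only there because a ConcurrentBag does not preserve insertion order; drop it if ordering is irrelevant.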

See the documentation for the ConcurrentBag<T> class for more information.

Hernan Guzman answered Feb 13 '26 09:02