Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to make nested for-loop more efficient?

I have the following code, which I would like to speed up:

foreach (var workflow in closedWorkflows)
{
    var w = relatedWorkflows.FirstOrDefault(r => r.Fingerprint != workflow.Fingerprint && r.TradeDate == workflow.TradeDate);
    if (w != null)
    {
        relatedWorkflows.Add(workflow);
    }
}

relatedWorkflows and closedWorkflows are list of same type of objects. I thought about creating lookup or dictionary with an anonymous class or tuple on Fingerprint and TradeDate, but one check is for equality and other one is for inequality. Would it be a good approach to create lookups on TradeDate, and then create lookups on Fingerprint for each lookup list of TradeDate ?

Updated:

A BIG thanks to @Dmitry Bychenko.

In my test the difference is 41486 ms vs 26ms !!!

like image 312
Anand Avatar asked Sep 18 '25 11:09

Anand


1 Answers

The main problem with performance is with

var w = relatedWorkflows.FirstOrDefault(...)

we scan relatedWorkflows again and again, that's why we have O(closedWorkflows.Count * relatedWorkflows.Count) time complexity. We can get rid of relatedWorkflows.Count with the help of hash-based collections (dictionary and hashset), e.g.

var dict = workflow
  .GroupBy(item => item.TradeDate, item => item.FingerPrint)
  .ToDictionary(group => group.Key, group => group.ToHashSet());

foreach (var workflow in closedWorkflows) {
  // If we have a date, but NOT print we add it into relatedWorkflows
  if (dict.TryGetValue(workflow.TradeDate, out var prints) && 
      prints.Add(workflow.Fingerprint)) {
    relatedWorkflows.Add(workflow);
  }
}
like image 174
Dmitry Bychenko Avatar answered Sep 21 '25 03:09

Dmitry Bychenko