Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

C# / LINQ fastest way of comparing two lists and assigning value

I have made a code which basically compares two lists in C#. First list contains properties like this:

  • ItemID
  • TotalViews

First list lacks values for TotalViews so I'm assigning them from 2nd list which has these props:

  • ItemID
  • HitCount // this is property for TotalViews that needs to be assigned

The code is as following:

foreach (var item in parsedMerchantData)
{
    var itemInB = HitCountItemIDS.FirstOrDefault(x => x.ItemID == item.ItemID);
    if (itemInB != null)
    {
        if (itemInB.HitCount != -1)
        {
            item.TotalViews = itemInB.HitCount;
        }
        else
        {
            item.TotalViews = 0;
        }
    }
}

Is there any more efficient way to write this using LINQ or implementing a custom comparer which would work faster on larger lists that contains sometimes 100000 items in themselves?

like image 400
User987 Avatar asked Dec 18 '22 09:12

User987


2 Answers

This is like jdweng's answer, but slightly simpler and it won't throw an exception for missing item IDs:

var hitCountsById = HitCountItemIDS.ToDictionary(x => x.ItemID, x => x.HitCount);
foreach (var item in parsedMerchantData)
{
    int hitCount;
    // We don't care about the return value of TryGetValue here...
    hitCountsById.TryGetValue(item.ItemID, out hitCount);
    item.HitCount = hitCount == -1 ? 0 : hitCount;
}

This should be O(N+M) where N is the size of HitCountItemIDs and M is the size of parsedMerchantData... so should as the data gets bigger, it should grow more slowly than the merge-sort approach, and is definitely simpler code. (It doesn't require comparing item ID for ordering, either - just equality.)

like image 183
Jon Skeet Avatar answered Dec 24 '22 02:12

Jon Skeet


Here is the pseudo-code:

var arr1 = parsedMerchantData.OrderBy(x => x.ItemID).ToArray();
var arr2 = HitCountItemID.OrderBy(x => x.ItemID).ToArray();

var i, j = 0;
while(i + j < arr1.Length() + arr2.Length()) // or similar condition
{
    if (arr1[i].ItemID < arr2[j].ItemID) {
        if (i < arr1.Length() - 1) {
            i++;
        }
        continue;
    }

    if (arr1[i].ItemID > arr2[j].ItemID) {
        if (j < arr2.Length() - 1) {
            j++;
        }
        continue;
    }

    if (arr1[i].ItemID == arr2[j].ItemID) {
        arr1[i].TotalViews = arr2[j].HitCount != -1 ? arr2[j].HitCount : 0;
    }

    // Make sure you do not let i and j grow higher then lengths of arrays
}

The idea is to apply MergeSort algorithms. As for complexity, you spend O(n * log(n)) sorting each list then O(n) going trough them. The total is O(n * log(n)) and it is the fastest way I see.

like image 32
Dmytro Bogatov Avatar answered Dec 24 '22 00:12

Dmytro Bogatov