Combine two lists into one based on property

I would like to ask whether there's an elegant and efficient way to merge two lists of MyClass into one?

MyClass looks like this:

  • ID: int
  • Name: string
  • ExtID: int?

The lists are populated from different sources, and objects in the two lists share IDs, so it looks like this:

MyClass instance from List1
ID = someInt
Name = someString
ExtID = null

And MyClass instance from List2

ID = someInt (same as List1)
Name = someString (same as List1)
ExtID = someInt

What I basically need is to combine these two lists, so the outcome is a list containing:

ID = someInt (from List1)
Name = someString (from List1)
ExtID = someInt (null if no corresponding item - based on ID - on List2)

I know I can do this with a simple foreach loop, but I'd love to know whether there's a more elegant and perhaps preferred (for performance or readability) method.

pzaj asked Oct 30 '22 16:10


1 Answer

There are many approaches, depending on what your priority is, e.g. Union + ToLookup:

// Creates key -> values groups: ID -> all instances with that ID.
// Note: unless MyClass overrides Equals/GetHashCode, Union only removes duplicate references.
var idMap = list1.Union(list2).ToLookup(myClass => myClass.ID);
// For each ID, select the instance you want, e.g. the one that has ExtID set.
var mergedInstances = idMap.Select(row =>
      row.FirstOrDefault(myClass => myClass.ExtID.HasValue) ?? row.First());

The benefit of the above is that it works with any number of lists of any size, even if they contain many duplicated instances, and you can easily modify the merge conditions.

A small improvement would be to extract a method to merge instances:

MyClass MergeInstances(IEnumerable<MyClass> instances){
     // Prefer an instance that has ExtID set; otherwise take the first one.
     return instances.FirstOrDefault(myClass => myClass.ExtID.HasValue)
          ?? instances.First(); // or whatever else you imagine
}

and now just use it in the code above

 var mergedInstances = idMap.Select(MergeInstances);

Clean, flexible, simple, no additional conditions. Performance-wise it's not perfect, but who cares.
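To try the lookup approach end to end, here is a minimal self-contained sketch. The MyClass definition is assumed from the question's description, and the names LookupMergeDemo and Merge plus the sample values are made up for illustration. It uses SelectMany to concatenate the lists rather than Union, since without an Equals override Union would only drop identical references anyway.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of MyClass, based on the question's description.
public class MyClass
{
    public int ID { get; set; }
    public string Name { get; set; }
    public int? ExtID { get; set; }
}

public static class LookupMergeDemo
{
    // Prefer an instance that has ExtID set; otherwise fall back to the first one.
    public static MyClass MergeInstances(IEnumerable<MyClass> instances)
    {
        var list = instances.ToList(); // materialize once so the group is not enumerated twice
        return list.FirstOrDefault(c => c.ExtID.HasValue) ?? list[0];
    }

    // Groups all instances from all lists by ID and merges each group into one instance.
    public static List<MyClass> Merge(params IEnumerable<MyClass>[] lists)
    {
        return lists.SelectMany(l => l)
                    .ToLookup(c => c.ID)
                    .Select(group => MergeInstances(group))
                    .ToList();
    }

    public static void Main()
    {
        // Hypothetical sample data.
        var list1 = new List<MyClass>
        {
            new MyClass { ID = 1, Name = "a" },
            new MyClass { ID = 2, Name = "b" }
        };
        var list2 = new List<MyClass>
        {
            new MyClass { ID = 1, Name = "a", ExtID = 10 }
        };

        foreach (var item in Merge(list1, list2))
            Console.WriteLine($"{item.ID} {item.Name} {item.ExtID?.ToString() ?? "null"}");
    }
}
```

Note that this variant also keeps items that appear only in list2; if the result should contain list1's items only, filter the groups or use one of the join-style options.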

Edit: since performance is the priority, some more options

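  1. Build a lookup like the one above, but only for the smaller list. Then iterate through the bigger list and make the needed changes. At most O(m log m) + O(n), and closer to O(m) + O(n) with a hash-based lookup, where m is the size of the smaller list and n the size of the bigger one. This should be the fastest.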

  2. Order both lists by element ID. Then loop over both at once, keeping one index per list. When both indexes point at the same ID, merge the two elements and advance both; otherwise advance only the index that points at the smaller ID. O(n log n) + O(m log m) for sorting, plus O(n + m) for the merge pass.
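Both options can be sketched roughly as follows. This is only an illustration under a few assumptions: MyClass has the shape described in the question, IDs are unique within list2, and the result should contain list1's items with ExtID filled in from list2 (as the question asks). The names FastMerge, MergeWithDictionary and MergeSorted are hypothetical, and option 1 indexes list2 with a Dictionary instead of a Lookup.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of MyClass, based on the question's description.
public class MyClass
{
    public int ID { get; set; }
    public string Name { get; set; }
    public int? ExtID { get; set; }
}

public static class FastMerge
{
    // Option 1: index one list by ID, then walk the other list once.
    // Assumes IDs are unique within list2 (ToDictionary throws on duplicate keys).
    public static List<MyClass> MergeWithDictionary(List<MyClass> list1, List<MyClass> list2)
    {
        Dictionary<int, int?> extById = list2.ToDictionary(c => c.ID, c => c.ExtID);
        return list1.Select(c => new MyClass
        {
            ID = c.ID,
            Name = c.Name,
            // null when list2 has no item with this ID
            ExtID = extById.TryGetValue(c.ID, out var ext) ? ext : null
        }).ToList();
    }

    // Option 2: sort both lists by ID and walk them in parallel.
    public static List<MyClass> MergeSorted(List<MyClass> list1, List<MyClass> list2)
    {
        var a = list1.OrderBy(c => c.ID).ToList();
        var b = list2.OrderBy(c => c.ID).ToList();
        var result = new List<MyClass>(a.Count);
        int j = 0; // current position in the sorted list2

        foreach (var item in a)
        {
            // Skip list2 entries whose ID is smaller than the current list1 ID.
            while (j < b.Count && b[j].ID < item.ID) j++;

            bool match = j < b.Count && b[j].ID == item.ID;
            result.Add(new MyClass
            {
                ID = item.ID,
                Name = item.Name,
                ExtID = match ? b[j].ExtID : null
            });
        }
        return result;
    }

    public static void Main()
    {
        // Hypothetical sample data.
        var list1 = new List<MyClass>
        {
            new MyClass { ID = 2, Name = "b" },
            new MyClass { ID = 1, Name = "a" }
        };
        var list2 = new List<MyClass>
        {
            new MyClass { ID = 1, Name = "a", ExtID = 10 }
        };

        foreach (var item in MergeWithDictionary(list1, list2))
            Console.WriteLine($"{item.ID} {item.Name} {item.ExtID?.ToString() ?? "null"}");
    }
}
```

MergeWithDictionary keeps list1's original order, while MergeSorted returns the items ordered by ID. Both create new instances rather than mutating the inputs; setting ExtID in place on list1's items would avoid the allocations if mutation is acceptable.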

mikus answered Nov 13 '22 07:11