Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Finding a good match of two lists of numbers

Tags:

c#

algorithm

I got two sets of numbers, with SET2 having more items in it usually. It's guaranteed that the count of SET2 is equal or greater than the count of SET1. Acutally, since the order matters the input are rather lists than sets.

My goal is to combine (sum up) / reorder the numbers from SET2 to make it as similar to SET1 as possible. I define the similarity as the sum of the deviations at every position. See this post for the way I calculate the similarity. The smaller the sum, the better.

My first approach was to try all combinations and pick the best one. This only works for quite small sets (esp. the second one). See this post and the answer from Rawling. Is there a smarter way to get a good combination? I definitely don't need the best one. A good one would be fine as a result. Obviously a sets with empty subsets are nonsense. Extremly unbalanced sets don't seem to be very promising to me. SET1 tends to have about 8 but can have up to 18 entries. SET2 has often a count of more than 10 (up to 35). The sum of the numbers in the two sets is equal (except for rounding errors).

Here is an example with good and bad results (not all possible ones):

SET1 = { 272370, 194560, 233430 }; SET2 = { 53407.13, 100000, 365634.03, 181319.07 }

      272370            |      194560          |        233430 
---------------------------------------------------------------------
     365634.03         |  100000 + 53407.13   |      181319.07       (best match)
     365634.03         |     181319.07        |  100000 + 53407.13   (good)
     365634.03         |      100000          |181319.07 + 53407.13  (ok)
      53407.13          |365634.03 + 100000    |      181319.07       (bad)
      53407.13          |365634.03 + 181319.07 |        100000        (bad)
.                 |365634.03 + 181319.07 | 53407.13 + 100000    (invalid)
53407.13 + 100000 |365634.03 + 181319.07 |                      (invalid)

Please let me know if I forgot to describe a premiss or my description is unclear or even faulty. I'm also happy to provide another example.

Thanks in advance!

like image 573
Toby Avatar asked Nov 13 '22 12:11

Toby


1 Answers

Heuristic, which should work quite good:

1. list<int> set1, set2;
2. sort(set2) // decreasing, set2[0] would be the greatest value in set2
3. struct set1item = {set1index, value, list<int> chosen}
4. prepare list<set1item> set1items from set1 //(index = index in set1 list, value = set1[index] and chosen = null)
5. put set1items to some priorityqueue pq // ordered by value
6. for each set2item in set2{
7.     item = pq.first()
8.     item.chosen.add(set2item);
9.     item.value -= set2item;
10.    pq.updateFirst(item)
11.}

It would work like: iterate through set2 from highest to lowest, get actual highest element from set1, decrease it by element got from set2 and add this element from set2 to element from set1 results.

You must remember to check if all elements from set1 has no empty result.

Example1: Set1 = {20, 9, 7, 3}, Set2 = {7, 6, 6, 4, 2, 2, 2, 2, 2, 2, 2, 2}.

iter1: fromSet2 = 7, Set1 = {20:{}, 9:{}, 7:{}, 3:{}}, fromSet1=20. Decreasing 20 by 7 and adding 7 to its result. Updated: Set1 = {13:{7}, 9:{}, 7:{}, 3:{}}.

iter2: fromSet2 = 6, Set1 = {13:{7}, 9:{}, 7:{}, 3:{}}, fromSet1=13. Decreasing 13 by 6 and adding 6 to its result. Updated: Set1 = {7:{7, 6}, 9:{}, 7:{}, 3:{}}.

iter3: fromSet2 = 6, Set1 = {7:{7, 6}, 9:{}, 7:{}, 3:{}}, fromSet1=9. Decreasing 9 by 6 and adding 6 to its result. Updated: Set1 = {7:{7, 6}, 3:{6}, 7:{}, 3:{}}.

iter4: fromSet2 = 4, Set1 = {7:{7, 6}, 3:{6}, 7:{}, 3:{}}, fromSet1=7. Decreasing 7 by 4 and adding 4 to its result. Updated: Set1 = {3:{7, 6, 4}, 3:{6}, 7:{}, 3:{}}.

iter5: fromSet2 = 2, Set1 = {3:{7, 6, 4}, 3:{6}, 7:{}, 3:{}}, fromSet1=7. Decreasing 7 by 2 and adding 2 to its result. Updated: Set1 = {3:{7, 6, 4}, 3:{6}, 5:{2}, 3:{}}.

...

like image 194
Ari Avatar answered Nov 15 '22 04:11

Ari