Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Finding the best possible subset combinations of numbers to reach a given sum or closest to it

So, I have this problem I need to solve, apparently this is called Subset Sum Problem, except I need to get the subset not only when is exact to the given number, but the closest in case there is no exact sum that reaches the given number, it shouldn’t go over the reference number, only below, also if there are more than two possible subsets with the same result, I'd like to get the subset with the better distribution, from the highest to lowest number in the array as preferred, and limiting each subset to not overpass 10 times the same number, repeating is allowed, for example:

Here is the array with the predefined values:

let num = [64.20, 107, 535, 1070];

and a given number:

let investment = 806.45

One possible solution would be:

[0, 2, 1, 0] // this sums up to 749 (since there is no way to get to 806.45 with the given array)

Please notice this result is referring to how many times each value in nums is allowed to reach the sum:

But a better solution would be:

[4, 5, 0, 0] // this sums up to 791.80 (since there is no way to get to 806.45 with the given array)

And an even better solution (because takes in consideration the higher values over the lower ones first)

[4, 0, 1, 0] // this sums up to 791.80 also but you can see it's taking a higher value when possible.

Another important restriction would be that should never give a negative result.

So far i have tried like this (in VueJS):

getPackages(){
      let investment = 806.45;
      const num = [64.20, 107, 535, 1070]
      let a, b, c, d;
      let results = [];

      a = investment / num[0] >= 0 ? (investment/num[0]) : 0;
      b = investment / num[1] >= 0 ? (investment/num[1]) : 0;
      c = investment / num[2] >= 0 ? (investment/num[2]) : 0;
      d = investment / num[3] >= 0 ? (investment/num[3]) : 0;

      let dResult = [], cResult = [], bResult = [], aResult = [];

      for (let i = 0; i <= d; i++){
        if (i>0){
          dResult.push((i * num[3]))
        }
      }

      for (let i = 0; i <= c; i++){
        if (i>0){
          cResult.push((i * num[2]))
        }
      }

      for (let i = 0; i <= b; i++){
        if (i>0){
          bResult.push((i * num[1]))
        }
      }

      for (let i = 0; i <= a; i++){
        if (i>0){
          aResult.push((i * num[0]))
        }
      }

      let aResultCoincidences = [];
      let bResultCoincidences = [];
      let cResultCoincidences = [];
      let dResultCoincidences = [];

      bResult.forEach(value => {
        aResult.findIndex(item => item === value) > 0 ? bResultCoincidences.push(aResult.findIndex(item => item === value)) : null
      })

      aResult.splice(0, Math.max(...bResultCoincidences) + 1)

      cResult.forEach(value => {
        bResult.findIndex(item => item === value) > 0 ? cResultCoincidences.push(bResult.findIndex(item => item === value)) : null
      })

      bResult.splice(0, Math.max(...cResultCoincidences) + 1)

      dResult.forEach(value => {
        cResult.findIndex(item => item === value) > 0 ? dResultCoincidences.push(cResult.findIndex(item => item === value)) : null
      })

      cResult.splice(0, Math.max(...dResultCoincidences) + 1)

      this.package1 = aResult.length
      this.package2 = bResult.length
      this.package3 = cResult.length
      this.package4 = dResult.length

    },

What happens in my approach is that I try to get all possible results from each multiplication, and then I remove the ones that matches between the arrays I made with this combination, to finally get the result, but this is not well optimized, and I'm sure there is probably a better solution to this problem.

Anyway ignore the vuejs implementation, that's only to set the values in the DOM.

***ES6 solution would be awesome.

CodeSandbox to play around: CODESANDBOX LINK

Thanks in advance.

like image 951
NaturalDevCR Avatar asked Mar 22 '21 21:03

NaturalDevCR


2 Answers

Here's an approach. I didn't look yours over carefully, and don't know if this offers any advantages over it.

It is a brute-force approach, simply taking the cross-product of the potential individual values for each category, summing the totals, and then reducing the list to find all the options closest to the target. I originally wrote this slightly differently, trying to capture the closest value whether over or under the target.

Here's an implementation with several helper functions:

const sum = (ns) =>
  ns .reduce ((a, b) => a + b, 0)

const crossproduct = (xss) => 
  xss.reduce((xs, ys) => xs.flatMap(x => ys.map(y => [...x, y])), [[]])

const range = (lo, hi) =>
  [...Array(hi - lo)] .map ((_, i) => lo + i)

const call = (fn, ...args) => 
  fn (...args)

const closestSums = (t, ns) => 
  call (
    (opts = crossproduct (ns .map (n => range (0, 1 + Math .ceil (t / n))))) => 
      opts .map (xs => [xs, sum (xs .map ((x, i) => x * ns [i]))])
      .reduce (
        ({best, opts}, [opt, tot]) => 
          call (
            (diff = t - tot) => 
              diff >= 0 && diff < best
                ? {best: diff, opts: [opt]}
              : diff >= 0 && diff == best
                ? {best, opts: [...opts, opt]}
              : {best, opts}
          ),
          {best: Infinity, opts: []}
        ) .opts
  )

const byHigher = (as, bs) =>
  as .reduceRight ((r, _, i) => r || (as[i] < bs[i] ? 1 : as[i] > bs[i] ? -1 : 0), 0)

const closestCounts = (t, ns) => 
  closestSums (t * 100, ns .map (n => 100 * n)) 
    .filter (cs => cs.every(c => c <= 10))
    .sort (byHigher)

console .log (
  closestCounts (806.45, [64.20, 107, 535, 1070]),
  closestCounts (791.8, [64.20, 107, 535, 1070])  // exact match
)
.as-console-wrapper {max-height: 100% !important; top: 0}

Note that the wrapper multiplies everything by 100 to rid us of decimals and the potential floating point rounding errors. If that won't do, it would be easy enough to add some tolerance for matching.

The final sorting is naïve, assuming that your values are in ascending numeric order. If not, that sorter would have to get more sophisticated.

As I said, this is not likely to be efficient, and it might not gain anything on your version, but it's at least a different approach.

like image 83
Scott Sauyet Avatar answered Sep 20 '22 10:09

Scott Sauyet


Lets first consider the 2-varible case. We want to find two numbers x1, x2 such that

f(x1,x2) = 64.20 * x1 + 107 * x2 - 806.45

enter image description here

If we allow x1, x2 to be real numbers then this is just the equation of a line. In the problem we only want positive integer solutions, i.e. the grid points. I've badly drawn the grid points with those above the line in red and those below the line in blue. The problem is then finding grid point closest to the line.

Note that there are a lot of points we never need to consider, only the red and blue grid point are possible candidates.

enter image description here

Generating the coloured grid points is quite simple. We could loop through the x1 values from 0 to 13, calculate the real x2 value by solving the equation

x2 =  (806.45 - 64.20 * x1)/107

and finding the floor(x2) and ceil(x2). Slightly better is looping through the x2 which run from 0 to 8 solving for x1

x1 =  (806.45 - 107 * x2)/64.20.

Another approach might be some kind of zero following routine. If you are at a given grid point (x1,x2) calculate f(x1,x2) if it is less than 0 we need to consider (x1+1,x2) or (x1,x2+1), if f(x1,x2) is greater than zero consider (x1-1,x2) or (x1,x2-x1). I don't think the complication of this approach really brings any benefit.

If we now move to 3D, we have an equation in three variables, x1, x2, x3

f(x1,x2,x3) = 64.20 * x1 + 107 * x2 + 535 * x3 - 806.45

This defines a plane in 3D, requiring all variables to be non-negative restricts it to a triangular part of the plane.

enter image description here

To find candidate points here we could loop through possible interger pairs (x1,x2), then find the exact x3 value

x3 = (806.45 - 64.20 * x1 - 107 * x2)/ 535

and its floor and ceiling. The candidate (x1,x2) only line in a candidate triangle so we can use the following procedure.

// First find max possible x1

x1_max = ceil(806.45/64.20)
for(x1 = 0; x1< x1_max; ++x1) {
   // for a given x1 find the maximum x2
   x2_max = ceil((806.45 - 64.20*x1)/107)
   for(x2=0;x2<x2_max;++x2) {
       // for a given (x1,x2) find the exact x3
       x3 = (806.45 - 64.20 * x1 - 107 * x2)/ 535
       // and its floor and ceiling
       x3l = floor(x3); x3h = ceil(x3);
       add_to_candidates(x1,x2,x3l);
       add_to_candidates(x1,x2,x3h);
   }
}

Once you can candidates simply select the one with the smallest absolute value.

A similar idea would extend to more variables.

like image 28
Salix alba Avatar answered Sep 22 '22 10:09

Salix alba