Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Finding all possible combinations whose sum is within certain range of target

So I talked with some colleagues and the problem I currently have is actually quite challenging. The context behind this problem has to do with mass spectrometry and assigning structure to different peaks that the software gives.

But to break it down into a optimization problem, I have a certain target value. I also have a list of various inputs whose sum I want to be as close as possible to the target as possible.

So as an example, here is what I have.

List of inputs: [18.01, 42.01, 132.04, 162.05, 203.08, 176.03]

Target value: 1800.71

I want to find all the possible combinations of the inputs listed whose sum is within 0.5 of 1800.71. So the sum can be anywhere between 1800.21 and 1801.21.

I already know two inputs could be:

[18.01, 162.05, 162.05, 162.05, 162.05, 162.05, 162.05, 162.05, 162.05, 162.05, 162.05, 162.05] **which gives a sum of 1800.59**

and

[18.01, 18.01, 203.08, 203.08, 203.08, 162.05, 203.08, 18.01, 18.01, 18.01, 18.01, 18.01, 18.01, 18.01, 18.01, 18.01, 18.01, 42.01, 162.05, 203.08, 203.08] **which gives a sum 1800.71**

I'm NOT looking to find the combination that gets me as close as possible to the target value; I am interested in ALL the possible combinations that are within 0.5 of the target value.

If anyone could help me with this problem I would greatly appreciate it!

like image 374
Alan Avatar asked Jun 27 '19 21:06

Alan


2 Answers

Instead of allowing multiple values, it would be a lot faster to just compute an integer factor for every value.

For your problem, I get 988 results.

import math
import time

def combinator(tolerance, target, inputs):

    # Special case for inputs with one element, speeds up computation a lot
    if len(inputs) == 1:
        number = inputs[0]
        result_min = int(math.ceil((target-tolerance)/number))
        result_max = int(math.floor((target+tolerance)/number))
        for factor in range(result_min, result_max+1):
            yield [factor]
        return

    # Special case for no inputs, just to prevent infinite recursion 
    if not inputs:
        return

    number = inputs[-1]
    max_value = int(math.floor((target + tolerance)/number))

    for i in range(max_value+1):
        for sub_factors in combinator(tolerance, target-i*number, inputs[:-1]):
            sub_factors.append(i)
            yield sub_factors

def main():
    inputs = [18.01, 42.01, 132.04, 162.05, 203.08, 176.03]
    target = 1800.71

    tolerance = 0.5

    t_start = time.perf_counter()
    results = list(combinator(tolerance, target, inputs))
    t_end = time.perf_counter()

    for result in results:
        result_str = ""
        result_value = 0
        for factor, value in zip(result, inputs):
            if not factor:
                continue
            if result_str != "":
                result_str += " + "
            result_str += "{}* {}".format(factor, value)
            result_value += factor*value
        print("{:.2f}".format(result_value) + " =\t[" + result_str + "]") 

    print("{} results found!".format(len(results)))
    print("Took {:.2f} milliseconds.".format((t_end-t_start)*1000))

if __name__ == "__main__":
    main()
1801.00 =   [100* 18.01]
1800.96 =   [93* 18.01 + 3* 42.01]
1800.92 =   [86* 18.01 + 6* 42.01]
...
1800.35 =   [5* 18.01 + 3* 42.01 + 9* 176.03]
1800.33 =   [2* 42.01 + 1* 132.04 + 9* 176.03]
1800.35 =   [3* 18.01 + 1* 162.05 + 9* 176.03]
988 results found!
Took 11.48 milliseconds.

I also reimplemented the same algorithm in Rust.

Performance on your problem:

  • Python: ~12 ms
  • Rust: ~0.7 ms

Here is the code:

use std::time::Instant;

fn combinator(tolerance : f32, target: f32, inputs: &[f32]) -> Vec<Vec<i32>>{

    let number = match inputs.last() {
        Some(i) => i,
        None => return vec![]
    };

    if inputs.len() == 1 {
        let result_min = ((target-tolerance)/number).ceil() as i32;
        let result_max = ((target+tolerance)/number).floor() as i32;
        return (result_min..=result_max).map(|x| vec![x]).collect();
    }

    let max_value = ((target + tolerance)/number).floor() as i32;

    let mut results = vec![];
    for i in 0..=max_value {
        for mut sub_factors in combinator(tolerance, target - i as f32 * number, &inputs[..inputs.len()-1]) {
            sub_factors.push(i);
            results.push(sub_factors);
        }
    }

    results
}

fn print_result(factors: &[i32], values: &[f32]){
    let sum : f32 = factors.iter()
        .zip(values.iter())
        .map(|(factor,value)| *factor as f32 * *value)
        .sum();
    println!("{:.2} =\t[{}]", sum,
             factors.iter()
                    .zip(values.iter())
                    .filter(|(factor, _value)| **factor > 0)
                    .map(|(factor, value)| format!("{}* {}", factor, value))
                    .collect::<Vec<String>>()
                    .join(", "));
}

fn main() {
    let inputs = vec![18.01, 42.01, 132.04, 162.05, 203.08, 176.03];
    let target = 1800.71;

    let tolerance = 0.5;

    let t_start = Instant::now();
    let results = combinator(tolerance, target, &inputs);
    let duration = t_start.elapsed().as_micros() as f64;

    for result in &results {
        print_result(&result, &inputs);
    }

    println!("{} results found!", results.len());
    println!("Took {} milliseconds", duration / 1000.0);
}
1801.00 =   [100* 18.01]
1800.96 =   [93* 18.01, 3* 42.01]
1800.92 =   [86* 18.01, 6* 42.01]
...
1800.35 =   [5* 18.01, 3* 42.01, 9* 176.03]
1800.33 =   [2* 42.01, 1* 132.04, 9* 176.03]
1800.35 =   [3* 18.01, 1* 162.05, 9* 176.03]
988 results found!
Took 0.656 milliseconds

Also, just for fun, those are the exact solutions to your problem. There are 5 of them.

1800.71 =   [12* 18.01, 1* 42.01, 2* 162.05, 6* 203.08]
1800.71 =   [13* 18.01, 2* 42.01, 2* 132.04, 6* 203.08]
1800.71 =   [16* 18.01, 7* 42.01, 6* 203.08]
1800.71 =   [52* 18.01, 1* 42.01, 1* 132.04, 1* 162.05, 3* 176.03]
1800.71 =   [54* 18.01, 4* 42.01, 1* 132.04, 3* 176.03]
like image 177
Finomnis Avatar answered Sep 28 '22 16:09

Finomnis


Another answer in the same vein as the existing fine answers. I found it simpler to use a range instead of a target + tolerance, and use a trivial (unoptimized) recursive solution, which seems fast enough to find the ~1000 answers to your use case.

Changing to use generators/yield or to optimize the single value case did not change the time taken for all results, though you may find it useful if you have a pipeline.

def fuzzy_coins(vals, lower, upper):
    '''
    vals: [Positive]
    lower: Positive
    upper: Positive
    return: [[Int]]
    Returns a list of coefficients for vals such that the dot
    product of vals and return falls between lower and upper.
    '''
    ret = []
    if not vals:
        if lower <= 0 <= upper:
            ret.append(())
    else:
        val = vals[-1]
        for i in xrange(int(upper / val) + 1):
            for sub in fuzzy_coins(vals[:-1], lower, upper):
                ret.append(sub + (i,))
            lower -= val
            upper -= val
    return ret

Even so, this takes ~100ms in python 2.7 and 3.6

[('1800.33', (0, 2, 1, 0, 0, 9)),
 ('1800.35', (3, 0, 0, 1, 0, 9)),
 ('1800.35', (5, 3, 0, 0, 0, 9)),
 ('1800.38', (0, 10, 0, 2, 0, 6)),
 ('1800.38', (1, 11, 2, 0, 0, 6)),
...
 ('1800.92', (86, 6, 0, 0, 0, 0)),
 ('1800.94', (88, 2, 1, 0, 0, 0)),
 ('1800.96', (91, 0, 0, 1, 0, 0)),
 ('1800.96', (93, 3, 0, 0, 0, 0)),
 ('1801.00', (100, 0, 0, 0, 0, 0))]
Took 0.10885s to get 988 results

e.g. usage:

from __future__ import print_function
import pprint
import time


def main():
    vals = [18.01, 42.01, 132.04, 162.05, 203.08, 176.03]
    target = 1800.71
    fuzz = .5

    lower = target - fuzz
    upper = target + fuzz
    start = time.time()
    coefs = fuzzy_coins(vals, lower, upper)
    end = time.time()
    pprint.pprint(sorted(
        ('%.2f' % sum(c * v for c, v in zip(coef, vals)), coef)
        for coef in coefs
    ))
    print('Took %.5fs to get %d results' % (end - start, len(coefs)))
like image 31
Cireo Avatar answered Sep 28 '22 15:09

Cireo