Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

PHP: Finding a set of numbers in a database that sums up to a particular number

Firstly, i am a php newbie... so i still code and understand php procedurally. that said,

i have a collection of numbers (amount) stored in a database.

Question: Using PHP and mySQL,

  1. Whats the best way to spool this info out of a database such that the amount will be associated with its transaction ID

  2. Most importantly, i need to find a matching set of numbers in the db that equals a sum of 29.

Below is the Transaction Table , Transaction_tlb, for my Database mydb

    Transaction_ID |     Name         |       Date      | Amount
    ---------------|------------------|-----------------|------------ 
    11012          | Jonathan May     |   6/12/2016     |     84
    21012          | John Pedesta     |   6/12/2016     |     38
    31012          | Mary Johnson     |   1/01/2017     |     12
    41012          | John Johnson     |   8/01/2017     |     13
    51012          | Keith Jayron     |   8/01/2017     |     17
    61012          | Brenda Goldson   |   8/01/2017     |     2
    71012          | Joshua Traveen   |   8/01/2017     |     78
    81012          | Remy ma Goldstein|   8/01/2017     |     1
    91012          | Barbie Traveen   |   8/01/2017     |     1

Now, i have an idea..but its not efficient. I am going to try every possible case. meaning if i have n values to check, the time complexity is going to be about 2^n. this is highly inefficient (plus, i dont even know if my code makes any sense. (see below)

I saw a similar example in this YouTube video: https://www.youtube.com/watch?v=XKu_SEDAykw&t

but, Im not sure exactly how to write the code in php.

The code:

<?php
  if (!mysql_connect("localhost", "mysql_user", "mysql_password") || !mysql_select_db("mydb")) {
      die("Could not connect: " . mysql_error()); } //End DB Connect

  $capacity = 29; //Knapsack Capacity or Sum

  //Select Transact ID and Value from the Database where Amount is <= Capacity
  $fetchQuery = "SELECT 'Transaction_ID', 'Amount' FROM 'Transaction_tlb' WHERE 'Amount' <= $capacity"; 

  $components = array(); //new array to hold components

  if ($queryResults = mysql_query($fetchQuery)) {

     //check if data was pulled
     if (mysql_num_row($queryResults) != NULL) {
        while ($row = mysqli_fetch_assoc($queryResults) {
           $components[$row['Transaction_ID']] = $row['Amount'];
        }
     }
  }

  /* Correct me if i am wrong, but, Components associative array Should be something like
  $components = array('11012'=> 84, '21012'=> 38, '31012'=> 12, '41012'=> 13, '51012'=> 17, 
                      '61012'=> 2, '71012'=> 78, '81012'=> 1, '91012'=> 1);
  */

  $components = asort($components) // sort array in ascending order
  $componentCount = count($component)


  function match ($componentCount, $capacity) {
              $temp = match (($componentCount - 1), $capacity);
              $temp1 = $component[$componentCount] + match (($componentCount - 1), ($capacity - $component[$componentCount]));
              $result = max($temp, $temp1);
         return $result;
         }
}?>

can anyone please point me in the right direction? this code doesn work... and even if it works... the method is not efficient at all. what happens when Ive got 3 million records to work with? i need help please.

like image 456
Tamara Avatar asked Mar 05 '17 15:03

Tamara


2 Answers

You can formulate your problem in terms of the 0/1 Knapsack problem. Ready-to-use implementation in PHP is available.

Using the function knapSolveFast2 defined in the linked page, one could proceed as in the example below. The idea here is that you set the "weights" entering the Knapsack algorithm equal to the values themselves.

$components = array(84, 38, 12, 13, 17, 2, 78, 1, 1);

$m = array();
list($m4, $pickedItems) = knapSolveFast2($components, $components, sizeof($components)-1, 29, $m);

echo "sum: $m4\n";
echo "selected components:\n";
foreach($pickedItems as $idx){
    echo "\t$idx --> $components[$idx]\n";
}

which yields:

sum: 29
selected components:
    2 --> 12
    4 --> 17 

Notes:

  • you could modify your SQL query in order to skip rows with amount larger than the required sum (29)
  • the function above will pick one solution (assuming that it exists), it won't provide all of them
  • one should check whether the return value $m4 is indeed equal to the specified sum (29) - as the algorithm works, the specified amount is only the upper limit which is not guaranteed to be attained (for example for 37 instead of 29, the return value is only 34 since there is no combination of the input numbers the sum of which would yield 37)
like image 182
ewcz Avatar answered Oct 30 '22 20:10

ewcz


This is really a knapsack problem, but I will try to give a full solution which isn't optimal, but illustrates a full strategy for solving your problem.

First of all, you can do this with just one iteration over the array of numbers with no recursion and no pre-sorting needed. Dynamic programming is all you need, keeping track of all previously possible partial-sum 'paths'. The idea is somewhat similar to your described recursive method, but we can do it iteratively and without presorting.

Assuming an input array of [84, 38, 12, 13, 17, 2, 78, 1, 1] and a target of 29, we loop over the numbers like so:

* 84 - too big, move on
* 38 - too big, move on
* 12 - gives us a subtarget of 29-12 = 17
            subtargets:
              17 (paths: 12)
* 13 - gives us a subtarget of 29-13=16
            subtargets:
              16 (paths: 13)
              17 (paths: 12)
* 17 - is a subtarget, fulfilling the '12' path;
   and gives us a subtarget of 29-17=12
            subtargets:
              12 (paths: 17)
              16 (paths: 13)
              17 (paths: 12)
            solutions:
              12+17
etc.

The trick here is that while looping over the numbers, we keep a lookup table of subTargets, which are the numbers which would give us a solution using one or more combinations ('paths') of previously seen numbers. If a new number is a subTarget, we add to our list of solutions; if not then we append to existing paths where num<subTarget and move on.

A quick and dirty PHP function to do this:

// Note: only positive non-zero integer values are supported
// Also, we may return duplicate addend sets where the only difference is the order
function findAddends($components, $target)
{
    // A structure to hold our partial result paths
    // The integer key is the sub-target and the value is an array of string representations
    // of the 'paths' to get to that sub-target. E.g. for target=29
    // subTargets = {
    //   26: { '=3':true },
    //   15: { '=12+2':true, '=13+1':true }
    // }
    // We are (mis)using associative arrays as HashSets
    $subTargets = array();

    // And our found solutions, stored as string keys to avoid duplicates (again using associative array as a HashSet)
    $solutions = array();

    // One loop to Rule Them All
    echo 'Looping over the array of values...' . PHP_EOL;
    foreach ($components as $num) {
        echo 'Processing number ' . $num . '...' . PHP_EOL;

        if ($num > $target) {
            echo $num . ' is too large, so we skip it' . PHP_EOL;
            continue;
        }

        if ($num == $target) {
            echo $num . ' is an exact match. Adding to solutions..' . PHP_EOL;
            $solutions['='.$num] = true;
            continue;
        }

        // For every subtarget that is larger than $num we get a new 'sub-subtarget' as well
        foreach ($subTargets as $subTarget => $paths) {
            if ($num > $subTarget) { continue; }

            if ($num == $subTarget) {
                echo 'Solution(s) found for ' . $num . ' with previous sub-target. Adding to solutions..' . PHP_EOL;
                foreach ($paths as $path => $bool) {
                    $solutions[$path . '+' . $num] = true;
                }
                continue;
            }

            // Our new 'sub-sub-target' is:
            $subRemainder = $subTarget-$num;
            // Add the new sub-sub-target including the 'path' of addends to get there
            if ( ! isset($subTargets[$subRemainder])) { $subTargets[$subRemainder] = array(); }

            // For each path to the original sub-target, we add the $num which creates a new path to the subRemainder
            foreach ($paths as $path => $bool) {
                $subTargets[$subRemainder][$path.'+'.$num] = true;
            }
        }

        // Subtracting the number from our original target gives us a new sub-target
        $remainder = $target - $num;

        // Add the new sub-target including the 'path' of addends to get there
        if ( ! isset($subTargets[$remainder])) { $subTargets[$remainder] = array(); }
        $subTargets[$remainder]['='.$num] = true;
    }
    return $solutions;
}

Run the code like so:

$componentArr = array(84, 38, 12, 13, 17, 2, 78, 1, 1);
$addends = findAddends($componentArr, 29);

echo 'Result:'.PHP_EOL;
foreach ($addends as $addendSet => $bool) {
    echo $addendSet . PHP_EOL;
}

which outputs:

Looping over the array of values...
Processing number 84...
84 is too large, so we skip it
Processing number 38...
38 is too large, so we skip it
Processing number 12...
Processing number 13...
Processing number 17...
Solution(s) found for 17 with previous sub-target. Adding to solutions..
Processing number 2...
Processing number 78...
78 is too large, so we skip it
Processing number 1...
Processing number 1...
Solution(s) found for 1 with previous sub-target. Adding to solutions..

Result:
=12+17
=12+13+2+1+1
like image 24
Jens Roland Avatar answered Oct 30 '22 21:10

Jens Roland