Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

C++ subset sum 2^n/recursion bug/clarification

This is not homework, I don't have money for school so I am teaching myself whilst working shifts at a tollbooth on the highway (long nights with few customers).

I am trying to implement a simple subset sum algorithm which, given an array of integers, returns a subset of it whose sum equals a desired sum, reporting how many invocations it took to find it.

I did an implementation in Java using Collections but that was very bloated code, even if I was able to return all sets adding up to the desired number as well as telling the function to stop at the first match or not.

The problem I have with this code is as follows: rather than running in 2^n time (that's correct for such an implementation when no results are found, is it not?) it runs in [2^(n+1)]-1 time; O(2^n) as pointed out by a comment. I can see why that is given that I check for (runningTotal == targetTotal) on a deeper level than I could, essentially adding the extra depth myself, don't I? I was trying to model the base case as cleanly as possible, let me know if you detect any "code smells". Should I be breaking as soon as I see that (runningTotal + consider) == targetTotal?

Note: I don't think this belongs to "Code Review" as I am asking about a particular code line, not the overall approach (if I need to change the approach, so be it, I am doing this to learn).

Here my attempt (is this "passable" C/C++ apart from the lack of optimisation mentioned above?):

#include <iostream>
using namespace std;
bool setTotalling(int chooseFrom[], int nChoices, int targetTotal,
    int chooseIndex, int runningTotal, int solutionSet[], int &solutionDigits,
    int &nIterations) {
  nIterations++;
  if (runningTotal == targetTotal) {
    return true;
  }
  if (chooseIndex >= nChoices) {
    return false;
  }
  int consider = chooseFrom[chooseIndex];
  if (setTotalling(chooseFrom, nChoices, targetTotal, chooseIndex + 1,
      runningTotal + consider, solutionSet, solutionDigits, nIterations)) {
    solutionSet[solutionDigits++] = consider;
    return true;
  }
  if (setTotalling(chooseFrom, nChoices, targetTotal, chooseIndex + 1,
      runningTotal, solutionSet, solutionDigits, nIterations)) {
    return true;
  }
  return false;
}
void testSetTotalling() {
  int chooseFrom[] = { 1, 2, 5, 9, 10 };
  int nChoices = 5;
  int targetTotal = 23;
  int chooseIndex = 0;
  int runningTotal = 0;
  int solutionSet[] = { 0, 0, 0, 0, 0 };
  int solutionDigits = 0;
  int nIterations = 0;
  cout << "Looking for a set of numbers totalling" << endl << "--> "
      << targetTotal << endl << "choosing from these:" << endl;
  for (int i = 0; i < nChoices; i++) {
    int n = chooseFrom[i];
    cout << n << ", ";
  }
  cout << endl << endl;
  bool setExists = setTotalling(chooseFrom, nChoices, targetTotal, chooseIndex,
      runningTotal, solutionSet, solutionDigits, nIterations);
  if (setExists) {
    cout << "Found:" << endl;
    for (int i = 0; i < solutionDigits; i++) {
      int n = solutionSet[i];
      cout << n << ", ";
    }
    cout << endl;
  } else {
    cout << "Not found." << endl;
  }
  cout << "Iterations: " << nIterations << endl;
}
int main() {
  testSetTotalling();
  return 0;
}
like image 489
Robottinosino Avatar asked Aug 20 '12 05:08

Robottinosino


1 Answers

The point is how to count an "iteration". Suppose you have the simple case with n=1 targeting a sum that is not zero and not the element you have.

You call the function and this immediately increments the counter, then you get to the bifurcation and the function calls itself two times (one considering the element and one without considering the element). Each of these call will count 1 so you will end up with a total counter of 3.

I don't see anything wrong with this...

You could add a special check to repeat the test and avoiding call if the numb of remaining choices is zero, but this would require repeating the check. Doing the end check only at recursive call place would not take in account that the function could be called with zero choiches directly. Basically you are "inlining" level 0... but then why stop at level zero and not inlining also level 1?

If you are looking for speedups note that (assuming all elements are non-negative) if you know that adding all the remaining available numbers still there's not enough to get to the target then you can avoid doing the check of all possible subset. By computing once the total of all remaining numbers from a given index to the end of the list of available elements (that's an O(n) computation) you can save (2^remaining) iterations. Also if the current sum is already too big there's not point in considering adding other elements either.

if (targetTotal > runningTotal)
    return false; // We already passed the limit
if (targetTotal - runningTotal > sumOfAllFrom[choseIndex])
    return false; // We're not going to make it

If you also sort the elements in decreasing order the above optimization can save a lot.

like image 167
6502 Avatar answered Oct 27 '22 11:10

6502