
Replace duplicate values in array with new randomly generated values

Below is a function (from a previous question that went unanswered) that creates an array of $n values. The sum of the array is equal to $max.

function randomDistinctPartition($n, $max) {
  $partition= array();
  for ($i = 1; $i < $n; $i++) {
    $maxSingleNumber = $max - $n;
    $partition[] = $number = rand(1, $maxSingleNumber);
    $max -= $number;
  }
  $partition[] = $max;
  return $partition;
}

For example, if I set $n = 4 and $max = 30, I should get something like the following:

array(5, 7, 10, 8);

However, this function does not account for duplicates and zeros. What I would like, and have been trying to accomplish, is to generate an array of unique numbers that add up to my predetermined variable $max: no duplicate numbers, and no zero or negative integers.

Russell Dias asked May 08 '10 15:05



1 Answer

OK, this problem actually revolves around linear (arithmetic) sequences. With a minimum value of 1, consider the sum:

f(n) = 1 + 2 + ... + (n - 1) + n

The sum of such a sequence is equal to:

f(n) = n * (n + 1) / 2

So for n = 4, as an example, the sum is 10. That means that if you're selecting 4 different numbers, the minimum total with no zeroes and no negatives is 10. Now go in reverse: if you have a total of 10 and 4 numbers, there is only one combination: (1, 2, 3, 4).

So first you need to check if your total is at least as high as this lower bound. If it is less there is no combination. If it is equal, there is precisely one combination. If it is higher it gets more complicated.
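The lower-bound check described above can be sketched as follows. This is only an illustration: the names min_distinct_sum and classify_total are made up here and are not part of the original answer, and intdiv assumes PHP 7 or later.

```php
<?php
// Hypothetical helper: smallest possible sum of $n distinct positive integers,
// i.e. 1 + 2 + ... + $n.
function min_distinct_sum($n) {
  return intdiv($n * ($n + 1), 2);
}

// Hypothetical helper: compare a total against that lower bound.
function classify_total($total, $n) {
  $min = min_distinct_sum($n);
  if ($total < $min) {
    return 'none';         // below the bound: no combination exists
  }
  if ($total == $min) {
    return 'one';          // exactly on the bound: only 1, 2, ..., $n
  }
  return 'at_least_one';   // above the bound: a search is needed
}

echo classify_total(9, 4), "\n";   // none
echo classify_total(10, 4), "\n";  // one
echo classify_total(12, 4), "\n";  // at_least_one
```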

Now imagine your constraints are a total of 12 with 4 numbers. We've established that f(4) = 10. But what if the first (lowest) number is 2?

2 + 3 + 4 + 5 = 14

That exceeds 12, so the first number can't be higher than 1. You now know your first number. Next, generate a sequence of 3 numbers with a total of 11 (being 12 - 1):

1 + 2 + 3 = 6
2 + 3 + 4 = 9
3 + 4 + 5 = 12

The second number has to be 2. It can't be 1, which is already used, and it can't be 3 because the minimum sum of three numbers starting with 3 is 12 and we need a total of 11.

Now we find two numbers that add up to 9 (12 - 1 - 2) with 3 being the lowest possible.

3 + 4 = 7
4 + 5 = 9

The third number can be 3 or 4. With the third number found the last is fixed. The two possible combinations are:

1, 2, 3, 6
1, 2, 4, 5
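Each step of the walkthrough above asks the same question: what is the largest value the next (lowest) number can take? A minimal sketch of that calculation, assuming PHP 7 or later for intdiv; the name max_lowest is hypothetical and does not appear in the original answer.

```php
<?php
// Hypothetical helper: largest value the lowest of $num distinct numbers can
// take when they must sum to $total and none may be smaller than $start.
// Returns null when no valid choice exists.
function max_lowest($total, $num, $start = 1) {
  // $num consecutive numbers starting at s sum to $num * s + $num * ($num - 1) / 2.
  $offset = intdiv($num * ($num - 1), 2);
  if ($total - $offset < $num * $start) {
    return null; // even the smallest run starting at $start overshoots $total
  }
  return intdiv($total - $offset, $num);
}

// The three steps from the walkthrough (total 12, 4 numbers):
var_dump(max_lowest(12, 4, 1)); // int(1): the first number must be 1
var_dump(max_lowest(11, 3, 2)); // int(2): the second number must be 2
var_dump(max_lowest(9, 2, 3));  // int(4): the third number can be 3 or 4
```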

You can turn this into a general algorithm. Consider this recursive implementation:

$all = all_sequences(14, 4);
echo "\nAll sequences:\n\n";
foreach ($all as $arr) {
  echo implode(', ', $arr) . "\n";
}

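// Returns every ascending sequence of $num distinct integers, all >= $start,
// that adds up to $total.
function all_sequences($total, $num, $start = 1) {
  // Smallest possible sum of $num distinct integers that are all >= $start.
  $max = lowest_maximum($start, $num);
  if ($num < 1 || $total < $max) {
    return array(); // below the lower bound: no combination exists
  }
  if ($num == 1) {
    return array(array($total)); // a list holding one single-element sequence
  }
  // Largest value the lowest number of the sequence can take.
  $limit = (int)(($total - $max) / $num) + $start;
  $ret = array();
  if ($num == 2) {
    // Two numbers left: the second is fixed once the first is chosen.
    for ($i = $start; $i <= $limit; $i++) {
      $ret[] = array($i, $total - $i);
    }
  } else {
    for ($i = $start; $i <= $limit; $i++) {
      // Fix $i as the lowest number and solve the smaller problem above it.
      $sub = all_sequences($total - $i, $num - 1, $i + 1);
      foreach ($sub as $arr) {
        array_unshift($arr, $i);
        $ret[] = $arr;
      }
    }
  }
  return $ret;
}

// Sum of $num consecutive integers beginning at $start.
function lowest_maximum($start, $num) {
  return sum_linear($num) + ($start - 1) * $num;
}

// 1 + 2 + ... + $num
function sum_linear($num) {
  return ($num + 1) * $num / 2;
}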

Output:

All sequences:

1, 2, 3, 8
1, 2, 4, 7
1, 2, 5, 6
1, 3, 4, 6
2, 3, 4, 5

One implementation of this would be to get all the sequences and select one at random. This has the advantage of equally weighting all possible combinations, which may or may not be useful or necessary to what you're doing.
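A minimal sketch of that selection step follows. The name pick_random_sequence is hypothetical, and the list is hard-coded from the output above so the snippet runs on its own; in practice it would come from all_sequences(). The shuffle is an assumption based on the asker's unsorted example array.

```php
<?php
// Hypothetical helper: pick one sequence uniformly at random from a list of
// sequences, then shuffle it so the numbers are not in ascending order.
function pick_random_sequence(array $all) {
  if (count($all) === 0) {
    return null; // nothing to choose from
  }
  $choice = $all[array_rand($all)]; // every sequence is equally likely
  shuffle($choice);
  return $choice;
}

// The five sequences for a total of 14 with 4 numbers.
$all = array(
  array(1, 2, 3, 8),
  array(1, 2, 4, 7),
  array(1, 2, 5, 6),
  array(1, 3, 4, 6),
  array(2, 3, 4, 5),
);

$picked = pick_random_sequence($all);
echo implode(', ', $picked), "\n";
```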

That becomes unwieldy with large totals or large numbers of elements. In that case the algorithm above can be modified to pick a single random value in the range from $start to $limit at each level, instead of looping over every value.
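One way that modification might look is sketched below. The function name random_sequence is hypothetical, mt_rand is just one possible random source, and intdiv assumes PHP 7 or later. Note that this variant no longer weights all combinations equally: for a total of 14 with 4 numbers, the first number is 1 or 2 with equal probability, even though four of the five sequences start with 1.

```php
<?php
// Hypothetical variant: build a single sequence by choosing each number at
// random from its valid range instead of enumerating every value.
function random_sequence($total, $num, $start = 1) {
  // Smallest sum of $num distinct integers that are all >= $start.
  $min = intdiv($num * ($num + 1), 2) + ($start - 1) * $num;
  if ($num < 1 || $total < $min) {
    return null; // no valid sequence exists
  }
  $seq = array();
  while ($num > 1) {
    $min = intdiv($num * ($num + 1), 2) + ($start - 1) * $num;
    // Largest value the next (lowest remaining) number can take.
    $limit = intdiv($total - $min, $num) + $start;
    $pick = mt_rand($start, $limit);
    $seq[] = $pick;
    $total -= $pick;
    $start = $pick + 1; // later numbers must be strictly larger
    $num--;
  }
  $seq[] = $total; // the last number is fixed by the remaining total
  return $seq;
}

echo implode(', ', random_sequence(30, 4)), "\n";
```

The result is in ascending order; call shuffle() on it if a random order is wanted.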

cletus answered Oct 18 '22 08:10
