Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Generating permutations with a sum constraint

I have n sets of variable length and would like to get all permutations of items from each set where the sum is within a certain range. For example in R we can do:

set1 <- c(10, 15, 20)
set2 <- c(8, 9)
set3 <- c(1, 2, 3, 4)

permutations <- expand.grid(set1, set2, set3)
permutations$sum <- rowSums(permutations)
final <- permutations[permutations$sum >= 25 & permutations$sum <= 29, ]

# final:
#    Var1 Var2 Var3 sum
# 3    20    8    1  29
# 5    15    9    1  25
# 8    15    8    2  25
# 11   15    9    2  26
# 14   15    8    3  26
# 17   15    9    3  27
# 20   15    8    4  27
# 23   15    9    4  28

This is fine for a small number of sets, however quickly (factorially) grows with larger or a greater number of sets.

Is it possible to generate permutations that fit the constraint, without having to calculate all possibilities?

In this example, there are no final combinations containing the 10 from set1, as the resulting sum would be too small no matter which other numbers are chosen. This could be useful for reducing the scope of the problem. For example, if I know that min(set1) + max(set2) + max(set3) < 25 == TRUE, then I can make sure to not include min(set1) in any permutations.

How can I generalize this, and use the constraints to prevent generating invalid permutations?

like image 749
C. Braun Avatar asked May 17 '18 21:05

C. Braun


1 Answers

I think what you're asking for is pretty shoe-horn specific and unlikely to be "easy to implement" (efficiently). A different way to look at it is to do the conditioning as you run the experiment (assuming this is a design for trials).

I wrote a lazyExpandGrid.R that is similar in concept to a lazy expand.grid, meaning it does not evaluate all possible combinations up front. The code can be inserted later in this answer if needed, but the github-gist is fairly solid (and not short).

Using it, you should be able to do:

set1 <- c(10, 15, 20)
set2 <- c(8, 9)
set3 <- c(1, 2, 3, 4)

iter <- lazyExpandGrid(set1, set2, set3)

while (is.data.frame(item <- iter$nextItem())) {
  p <- sum(item)
  if (p < 25 || 29 < p) next
  print(item) # but really, do something more interesting here
}
#   Var1 Var2 Var3
# 3   20    8    1
#   Var1 Var2 Var3
# 5   15    9    1
#   Var1 Var2 Var3
# 8   15    8    2
#    Var1 Var2 Var3
# 11   15    9    2
#    Var1 Var2 Var3
# 14   15    8    3
#    Var1 Var2 Var3
# 17   15    9    3
#    Var1 Var2 Var3
# 20   15    8    4
#    Var1 Var2 Var3
# 23   15    9    4

Caveat emptor: the function is mostly usable, but there are certainly ways it can be improved. For instance, the use of is.data.frame(item <- iter$nextItem()) is effectively an isTruthy test (name from shiny); currently it returns a 1-row data.frame until nothing remains, then returns FALSE. As I look at it now, that can most certainly be improved, I just haven't had the need. Feel free to comment on the github gist page if you have thoughts, bugs, etc.

like image 198
r2evans Avatar answered Oct 14 '22 04:10

r2evans