Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Number of combinations in configurator

I have been asked to program a routine to decide the number of possible combinations in a product configurator.

The configurator is really simple. Even though it has more features than this, it can be modeled as several "radio groups" (like the UI control) where one of n options has to be selected.

The only kind of constraints that can be used is rules that says if one option is selected another option can not be selected.

So what I want to do is to calculate the number of different products that can be configured, given a set of option groups and constraints.

I made a naive approach to solve this using the Inclusion-exclusion principle. However as far as I can see, any algorithm based on this method should run in O(2^n) which won't work. There are of course several possible optimizations that should give decent runtimes but there are still easily constructed worst case scenarios.

Thats pretty much where I am right now. Any suggestions?

Update

I realize I haven't explained how the rules applies well enough.

There are several groups with options. One and only one option must be selected in each group. There can be one or more of options in a group.

There is only one type of constraints. If option A in some group is selected, then option B in some other group can't be selected. There can be any number of constraints, there is no limit how many constraints/rules apply to a option group or an option itself.

So an example would be:

Group 1:
x1 x2 x3 x4 x5

Group 2:
y1 y2 y3

Group 3:
z1 z2 z3 z4

Constraints:
x1 <-> y2 *
x1 <-> z4
y2 <-> z2

* If option x1 is selected in group1, then option y2 in group 2 can not be selected.

Using inclusion-exclusion I would calculate the number of combinations as

Combinations = Cno rules - Cr[1] - Cr[2] - Cr[3] + Cr[1,2] + Cr[1,3] + Cr[2,3] - Cr[1,2,3]

Where

Cno rules = 5 * 3 * 4

Cr[a,b,c] = Number of combinations that violates rule a, b and c.

The method unfortunately requires 2^|rules| calculations.

like image 445
Daniel Avatar asked Jan 23 '23 09:01

Daniel


1 Answers

Ok, I can't get around 2 ^ N, but I can reduce the sample set. To do this, we'll compute "Composed Constraints". A Composed Constraint is a constraint where, if the all options on the left side is selected, then none of the options on the right side can be selected, but no other constraint based on the left side options may apply.

We need to compute a set of all possible Composed Constraints from a set of constraints. While not necessary, we'll "fix" the existing constraints by swapping left and right hands if the group of the right hand is smaller than the group of the left. This may reduce some composed constraints, though better heuristics are possible for swapping.

We also need to compute a "minimum set" of options that can be arbitrarily chosen for each group. This minimum set is computed by removing from the list of available options any options appearing on the left hand of a composed constraint.

An algorithm follows, but I'm not proving it computes CCs correctly. I will prove that, if it does, then they can be used to compute the number of possible combinations.

  1. Fix the constraints so that the group of the left hand is less than, or equal to the group of the right hand.
  2. Compose the constraints:

    1. Sort constraints by left hand
    2. Sequentially, for each constraint:

      1. Fold the constrain with all constraints that follow it with the same left hand, turning x1 <-> y1, x1 <-> y2 ... x1 <-> yN into Set(x1) <-> Set(y1 ... yN)
      2. Compose the folded constraint with each already folded constraint, if:
        • x1 is not in the right hand of the already folded constraint
        • x1 is not in the same group of any element in the left hand
      3. Add the folded constraint and all it's compositions to the set of folded constraints
  3. Compute the Minimum Set, by taking all options and removing the ones appearing in the left hand of the fixed constraints.

Now you can compute the number of combinations with the formula below. Let's call CC a composed constraint. Then the number of combinations is:

C(Mininum Set) + CCC1 + ... + CCCn

Where:

  • C(Minimum Set) is the number of combinations possible with the minimum set.
  • CCCx is the number of combinations possible by taking the minimum set, replacing any groups for which there's an option on the left hand of CCx with that option, and then removing any options on the right hand of CCx.

Note that the expression is purely additive. This means that, for that expression to yield the expected result, the following two conditions must be true:

  1. No two terms of it may contain the same combination.
  2. All combinations must be accounted for by these terms.
  3. No invalid combinations may be yielded by any term.

For the first proof, note that there are no two distinct CCs with the same left hand. If two CCs had the same left hand but different right hands, that would mean there was an addition constraint that must apply to one of the CCs, or an invalid constraint being applied to the other.

Since there are no two CCs with the same left hand, and the minimum set does not contain the left hand of any CC by definition, then any two CC can be distinguished by at least one option which is selected for one but not for the other. Therefore, no two CCs may yield the same combination.

For the second proof, note that the set of CCs contains, by definition, all valid combinations for options on the left hand.

Assume that there is one combination that does not appear in either the minimum set nor the set of CCs. If this combination does not contain any left-hand option, then it must be a combination from the minimum set, by definition. Therefore, it must contain options from the left hand.

Since the set of CCs contains all valid combinations for the left hand, then there is one CC with the same left-hand options. This combination must have, therefore, an option that is not included in any combination for that CC. But the only options not included in that CC are the ones appearing in the left hand for other CCs, and the ones that must be excluded from it by constraints. Since neither may be the case, then this combination cannot exist.

For the third proof, let's first consider the minimum set. The minimum set does not contain any option in the left hand of any group. Since all constraints are between one left and one right hand option, no constraint applies to the minimum set.

Now let's consider the CCs. A CC has a valid combination of left hand options by definition. Any option that is incompatible with that left hand must appear in the right hand, and any option from that right hand must be removed from the minimum set. Since no options on the minimum set where incompatible between themselves to begin with, there can be no unsatisfied constraint in any combination on a CC.

And that ends the proof.

Let's see how this applies with an example from the comments:

G1: x1, x2, x3 
G2: y1, y2 
G3: z1, z2, z3 

R1: x1 <-> y2 
R2: x3 <-> y2 
R3: y1 <-> z1 
R4: y2 <-> z2 
R5: y2 <-> z3

CC1: {x1} <-> {y2}
CC2: {x3} <-> {y2}
CC3: {y1} <-> {z1}
CC4: {x1, y1} <-> {y2, z1}
CC5: {x3, y1} <-> {y2, z1}
CC6: {y2} <-> {z2, z3}

Let's ponder briefly on composed groups not in the list:

R1&R2: {x1, x3} <-> {y2} -- not in the list because x1 and x3 belongs to the same
                            group
R1&R5: {x1, y2} <-> {y2} -- not in the list because the left hand of R2, y2
                            appears in the right hand of R1

Now, let's see what options are possible in each set:

Minimum Set: (x2), (), (z1, z2, z3)    
CC1: (x1), (), (z1, z2, z3) -- replace G1 with x1, remove y2 from G2
CC2: (x3), (), (z1, z2, z3) -- replace G1 with x3, remove y2 from G2
CC3: (x2), (y1), (z2, z3)   -- replace G2 with y1, remove z1 from G3
CC4: (x1), (y1), (z2, z3)   -- replace G1 with x1, G2 with y1, remove y2 and z1
CC5: (x3), (y1), (z2, z3)   -- replace G1 with x3, G2 with y1, remove y2 and z1
CC6: (x2), (y2), (z1)       -- replace G2 with y2, remove z2 and z3 from G3

Now, let's add things up:

C(Minimum Set) = 1 * 0 *3 = 0
CCC1 = 1 * 0 * 3 = 0
CCC2 = 1 * 0 * 3 = 0
CCC3 = 1 * 1 * 2 = 2
CCC4 = 1 * 1 * 2 = 2
CCC5 = 1 * 1 * 2 = 2
CCC6 = 1 * 1 * 1 = 1

C(Minimum Set) + CCC1 + CCC2  + CCC3 + CCC4 + CCC5 + CCC6
0              + 0    + 0     + 2    + 2    + 2    + 1    = 7

I'll add one further thought here. Even though there are only 6 CCCs for 5 rules, much less than the 32 terms expected otherwise, these CCCs are computed with 2^N worst time, since, for each rule, you must compare and combine it to all CCCs created so far. You may think of them as binary numbers, where a bit is set if the rule is being combined, and not set if not.

However, incompatible combinations are discard right away, so that for each new rule being combined, no time is lost on combinations already deemed invalid. Also, by sorting the rules before-hand, successive rules in the same group may be discarded without even testing for incompatibility, with the right data structures.

As this particular example shows, the average time can be much better than 2^N.

Alternative Algorithms and Considerations

There's some talk of 2-SAT and 3-SAT going around. It seems, to me, that this is a 2-SAT problem, in the sense that every constrain a <-> b is actually a clause "!a || !b". So all constrains together can just be written as "(!x1 || !y2) && (!x1 || !z4) && (!y2 && !z3)", etc. That means you can "solve" it in the sense that you can find out if there is a boolean assignment to each option that will turn this true. There is a linear algorithm for this by Aspall, Plass and Tarjan, with a slide presentation here.

But knowing whether the constrains can be solved or not is not what was asked. What was asked is the number of ways all options can be set while keeping the 2-SAT problem true.

Now, there are efficient algorithms for counting the number of of ways to satisfy a 2-SAT problem. For instance, this paper presents an algorithm that runs in 1.2561n. But even this won't help us, as we need to know what the solutions are to be able to compute the number of combinations that satisfy that solution.

According to Wikipedia, this paper has an algorithm which efficiently enumerate all solutions, which is what we want. But if counting is already exponential, so will be enumerating. Better than 2n, perhaps, but still exponential.

If we do enumerate all solutions to the 2-SAT problem, the number of combinations for each group is given by 1 multiplied by the number of free options, options that do not appear in any constraint, of each group for which no option was selected by the solution.

For example, taking the previous set of groups and constraints. The 2-SAT problem, including mutual exclusion, is:

(!x1 || !y2) && (!x3 || !y2) && (!y1 || !z1) && (!y2 || !z2) && (!y2 || !z3) && (!x1 || !x3) && (!y1 || !y2) && (!z1 || !z2) && (!z1 || !z3) && (!z2 || !z3)

The first line are the five rules. The second line is the mutual exclusion of all options in the same group that appear in the constraint rules.

The solutions to this 2-SAT problems are:

x1    x3    y1    y2    z1    z2    z3    Combinations
true  false true  false false true  false 1
true  false true  false false false true  1
true  false true  false false false false 0
true  false false false true  false false 0
true  false false false false true  false 0
true  false false false false false true  0
true  false false false false false false 0
false true  true  false false true  false 1
false true  true  false false false true  1
false true  true  false false false false 0
false true  false false true  false false 0
false true  false false false true  false 0
false true  false false false false true  0
false true  false false false false false 0
false false true  false false true  false 1
false false true  false false false true  1
false false true  false false false false 0
false false false true  true  false false 1
false false false true  false false false 0
false false false false true  false false 0
false false false false false true  false 0
false false false false false false true  0
false false false false false false false 0

In the first two solutions, there are no groups without a selected option, so the number of combinations is 1. The third solution has no option selected for the group G3, so we multiply 1 by 0. In the lines that start with "false false", there are no option selected for group G1, and one free option: x2. So we multiply 1 by 1 for them, and by 0 if there are no option selected for G2 or G3 (in which case the number of free options is 0).

Now, there is the question of how do I enforce one option being chosen in each group and still claim to be 2-SAT. The problem, as stated, has two implicit constraints: for each group, there must be one, and only one, option selected. These two constrains can be written as:

    x1 || x2 || x3 (for group x with options x1 .. x3)
    (!x1 || !x2) && (!x1 || !x3) && (!x2 || !x3)

The later constrain being 2-SAT, the former being 3-SAT for any non-trivial case. As it happens, I do not enforce the first constraint, but the count then becomes 0. The counting algorithm should go like this:

  • For the constraint-less combinations, multiply the number of constraint-less options in each group by each other.
  • For the constraint-full combinations, add the result of the following counts:
    • For each solution, multiply the number of constraint-less options in each group without an option evaluated to "true" by each other.

So, for every group in which there is at least one constraint-less option, the selection is implicit and anonymous. For every group in which all options are part of some constraint, if no option was selected then the count for that group becomes 0, and, therefore, the number of combinations for that solution becomes 0 as well.

This feels like cheating the problem out of a valid > 2-SAT constraint. After all, if this was possible, then the 3-SAT problem could be solved just by enumerating the solutions to the 2-SAT part of it, and then discarding every one which does not satisfy the 3-SAT part of it. Alas, there is one fundamental difference I can identify:

  • All predicates not solved by the 2-SAT part of the problem are free from any further constraints.

Given this rather restrictive constraint on the clauses, we can solve this problem by enumerating solutions to the 2-SAT explicit constraints.

If anyone wants to pursue this further, go ahead. I'm satisfied with the improvement on the 2n solution I suggested.

like image 189
Daniel C. Sobral Avatar answered Feb 03 '23 15:02

Daniel C. Sobral