Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fast way of generating combinations with constraints?

I have generator function that creates a Cartesian product of lists. The real application uses more complex objects, but they can be represented by strings:

import itertools

s1 = ['a', 'b']
s2 = ['c', 'd', 'e', 'f']
s3 = ['c', 'd', 'e', 'f']
s4 = ['g']

p = itertools.product(*[s1,s2,s3,s4])
names = [''.join(s) for s in p]

In this example, the result is 32 combinations of characters:

names
['accg', 'acdg', 'aceg', 'acfg', 'adcg', 'addg', 'adeg', 'adfg', 'aecg',
 'aedg', 'aeeg', 'aefg', 'afcg', 'afdg', 'afeg', 'affg', 'bccg', 'bcdg',
 'bceg', 'bcfg', 'bdcg', 'bddg', 'bdeg', 'bdfg', 'becg', 'bedg', 'beeg',
 'befg', 'bfcg', 'bfdg', 'bfeg', 'bffg']

Now, let's say I have some constraints such that certain character combinations are illegal. For example, let's say that only strings that contain the regex '[ab].c' are allowed. ('a' or 'b' followed by any letter followed by 'c')

After applying these constraints, we are left with a reduced set of just 8 strings:

import re
r = re.compile('[ab].c')
filter(r.match, names)
['accg', 'adcg', 'aecg', 'afcg', 'bccg', 'bdcg', 'becg', 'bfcg']

In the real application the chains are longer, there could be thousands of combinations and applying the hundreds of constraints is fairly computationally intensive so I'm concerned about scalability.

Right now I'm going through every single combination and checking its validity. Does an algorithm/data structure exist that could speed up this process?

EDIT: Maybe this will clarify a little: In the real application I am assembling random 2D pictures of buildings from simple basic blocks (like pillars, roof segments, windows, etc.). The constraints limit what kind of blocks (and their rotations) can be grouped together, so the resulting random image looks realistic, and not a random jumble.

A given constraint can contain many combinations of patterns. But of all those combinations, many are not valid because a different constraint would prohibit some portion of it. So in my example, one constraint would contain the full Cartesian product of characters above. And a second constraint is the '[ab].c'; this second constraint reduces the number of valid combinations of the first constraint that I need to consider.

Because these constraints are difficult to create; I looking to visualize what all the combinations of blocks in each constraint look like, but only the valid combinations that pass all constraints. Hence my question. Thanks!

like image 683
jasonm76 Avatar asked Jun 24 '15 16:06

jasonm76


2 Answers

Try just providing the iterator that generates the names directly to filter, like so:

import itertools
import re

s1 = ['a', 'b']
s2 = ['c', 'd', 'e', 'f']
s3 = ['c', 'd', 'e', 'f']
s4 = ['g']

p = itertools.product(*[s1,s2,s3,s4])
r = re.compile('[ab].c')
l = filter(r.search, (''.join(s) for s in p))
print(list(l))

That way, it shouldn't assemble the full set of combinations in memory, it will only keep the ones that fit the criteria. There is probably another way as well.

EDIT:

One of the primary differences from the original, is that instead of:

[''.join(s) for s in p]

Which is a list comprehension, we use:

(''.join(s) for s in p)

Which is a generator.

The important difference here is that a list comprehension creates a list using the designated criteria and generator, while only providing the generator allows the filter to generate values as needed. The important mechanism is lazy evaluation, which really just boils down to only evaluating expressions as their values become necessary.

like image 73
Rob Foley Avatar answered Nov 02 '22 04:11

Rob Foley


By switching from a list to a generator, Rob's answer saves space but not time (at least, not asymptotically). You've asked a very broad question about how to enumerate all solutions to what is essentially a constraint satisfaction problem. The big wins are going to come from enforcing local consistency, but it's difficult to give you advice without specific knowledge of your constraints.

like image 32
David Eisenstat Avatar answered Nov 02 '22 04:11

David Eisenstat