Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Split up a collection, for each subset respecting probabilities for properties of its items

Tags:

c++

algorithm

For a small game (for which I am a bit forced to use C++, so STL-based solutions can be interesting here), I encountered following neat problem. I was wondering if there is any literature on the subject that I could read, or clever implementations.

Collection S of unique items {E1, E2, E3}, each item E having a set of properties, {P1, P2, P3...}
This collection should be split up in S1, S2, S3, S4. It is defined how large S1..4 have to be exactly. We can assume the collection can be correctly split up in those sizes for the remainder of the problem.

Now, for S1, a number of constraints can appear, {C1, C2..}, which specify that for instance, no items with the property P1 may appear in it. Another constraint could be that it should favour the items with property P2 with a factor of 0.8 (we can assume these types of constraints are normalized for all of the subsets per property).

The "weighting" is not that hard to implement. I simply fill some array with candidate numbers, the ones with higher weight are represented more in this array. I then select a random element of the array. the size of the array determines accuracy/granularity (in my case, a small array suffices).

The problem is forbidding some items to appear. It can easily lead to a situation where one item in S needs to be placed in one of the subsets S1, S2, S3, or S4, but this can no longer happen because the subsets are either all full, or the ones that are not full have a specific constraint that this item cannot appear in the set. So you have to backtrack the placement. Doing so too often may violate the weighted probability too much.

How is this problem called, or does it easily map to another (probably NP) problem?

EDIT: Example:

S = {A, B, C, D, E, F, G, H, I, J, K, L, M }

S1 = [ 0.8 probability of having VOWEL, CANNOT HAVE I or K, SIZE = 6 ]
S2 = [ 0.2 probability of having VOWEL, CANNOT HAVE M, B, E, SIZE = 7 ]

Now, suppose we start filling by FOR(LETTER IN S):

LETTER A, create a fill array based on property constraints (0.8 vs 0.2):
[ 1, 1, 1, 1, 1, 1, 1, 2, 2].
Pick a random element from that array: 1.

Now, put A in S1.

For letter I, for instance, the only candidate would be 2, since S1 has a constraint that I cannot appear in it.

Keep doing this, eventually you might end up with:
C = { M } // one more letter to distribute

S1 = A, B, D, E, F, G
S2 = C, F, G, I, K, L

Now, where to place M? I tcannot be placed in S1, since that one is full, and it cannot be placed in S2 because it has a constraint that M cannot be placed in it.
The only way is to backtrack some placement, but then we might mess with the weighted distribution too much (f.i., giving S2 one vowel of S1, which flips around the natural distribution)

Note that this become slightly more complex (in the sense that more backtracks would be needed) when more subsets are in play, instead of just 2.

like image 662
buddhabrot Avatar asked Nov 06 '22 01:11

buddhabrot


1 Answers

This has resemblance to a constraint satisfaction problem (CSP) with hard and soft constraints. There are a couple of standard algorithms for that, but you have to check, if they apply to your particular problem instance.

Check wikipedia for starters.

like image 167
tokage Avatar answered Nov 12 '22 17:11

tokage