a's are objects with multiple "categories", b's, for instance a1 has three cateories b1,b2,b3. The problem is to, reduce the number of categories (which can grow rather large), into groups that always occurs together. A "largest common subset" thing.
So for instance, given the following data set:
a1{ b1,b2,b3 }
a2{ b2,b3 }
a3{ b1,b4 }
We can find that b2 and b3 always comes together..
b23 = {b2,b3}
..and we can reduce the category set to this:
a1{ b1, b23 }
a2{ b23 }
a3{ b1,b4 }
So, my issue is to find some algorithm to solve this problem.
I have started to look at the Longest Common Sequence problem, and it might be a solution. i.e. something like repeatedly grouping categories like this b' = LCS(set_of_As)
until all categories has been traversed. However, this is not complete. I have to limit the input domain in some way to make this possible.
Do I miss something obvious? Any hints of a problem domain you can point me to? Does anyone recognize any other approach to such a problem.
Transform your sets to have sets of b's that include a's:
b1 { a1, a3 }
b2 { a1, a2 }
b3 { a1, a2 }
b4 { a3 }
Make sure the contents of the new b sets are sorted.
Sort your b sets by their contents.
Any two adjacent sets with the same elements are b's that occur in the same a sets.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With