I have K sets of data points, I would like to make groups of size K which minimize the total sum of intra group distances. I'm familiar with matching algorithms with bipartite graphs, but I would like this for more than two sets.
Any ideas?
Edit :
Each group would be made of one element of each set, no repetitions allowed.
An example : you have {a1, a2, a3}, {b1, b2, b3}, {c1, c2, c3} You want to create groups e.g. {a1, b3, c3}, {a2, b1, c2}, {a3, b2, c1} minimizing the sum of intra group distances.
This problem can be reduced to another, similar problem that I have solved for another stackoverflow question before. The idea is to compute all combinations of n / k
sized groups, and weight these according to their intra group distances. Traverse the search space for valid combinations of combinations. Keep record of the minimal sum, and use this to prune dead-end branches. You can speedup the search using dynamic programming by producing optimal subsets of the solution, and building up to the final solution from that (as described in my other post), or you could use a greedy method and some hand wavey tricks to find a nearly optimal (or optimal) solution (also described in said post). Here is a link to the sub problem that you can reduce this to.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With