How to (efficiently) generate disjoint sets while using pairs of elements only once?

What I would like to do is split a group of (n) items into groups of equal size (groups of size m, and for simplicity assume that there are no leftovers, i.e. n is divisible by m). Doing this multiple times, I would like to ensure that no pair of items is in the same group together twice.

To make this slightly more concrete, for building groups of two out of the six items A..F, one could partition the set five times in different ways:

  • (A, B), (C, D), (E, F)
  • (A, C), (B, E), (D, F)
  • (A, D), (B, F), (C, E)
  • (A, E), (B, D), (C, F)
  • (A, F), (B, C), (D, E)
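For groups of two, no search is needed: the classic "circle method" used for round-robin tournaments produces all n - 1 rounds directly. The following is a minimal sketch, assuming an even number of items; the function name `round_robin_pairs` is made up for illustration, and the rounds it produces are a valid schedule but not necessarily the same five listed above.

```python
def round_robin_pairs(items):
    """Circle method: n - 1 rounds of disjoint pairs for an even number of items."""
    items = list(items)
    n = len(items)
    if n % 2 != 0:
        raise ValueError("need an even number of items")
    # One item stays fixed, the others rotate around it.
    fixed, rotating = items[0], items[1:]
    rounds = []
    for _ in range(n - 1):
        row = [fixed] + rotating
        # Pair position i with the mirrored position n - 1 - i.
        rounds.append([(row[i], row[n - 1 - i]) for i in range(n // 2)])
        # Rotate everything except the fixed item by one place.
        rotating = rotating[-1:] + rotating[:-1]
    return rounds


if __name__ == "__main__":
    for groups in round_robin_pairs("ABCDEF"):
        print(groups)
```

Each of the n(n - 1)/2 possible pairs appears in exactly one round, so the schedule cannot be extended any further.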

The same set of items can be partitioned only once into groups of three without overlapping pairs:

  • (A, B, C), (D, E, F)

(As @DavidHammen points out below, there are different ways of making the partition in this example. However, having made the partition once, there is never another, second split which keeps all pairs of items apart. That's fine -- my application doesn't need to generate all possible ways of partitioning the set globally, one solution meeting the constraints will do)
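The constraint itself is easy to check mechanically, and a simple counting argument gives an upper bound on the number of rounds: each item meets m - 1 new partners per round and has only n - 1 possible partners, so there can be at most floor((n - 1) / (m - 1)) rounds. The sketch below uses the made-up helper names `max_rounds_upper_bound` and `is_valid_schedule`. Note that the bound is not always attainable: for n = 6 and m = 3 it gives 2, while only one round is actually possible.

```python
from itertools import combinations


def max_rounds_upper_bound(n, m):
    """Counting bound: each item gains m - 1 partners per round, out of n - 1."""
    return (n - 1) // (m - 1)


def is_valid_schedule(rounds, items, m):
    """True if every round partitions items into groups of m and no pair repeats."""
    items = set(items)
    seen = set()  # pairs that have already shared a group
    for groups in rounds:
        covered = [x for g in groups for x in g]
        # Every item must appear exactly once per round.
        if len(covered) != len(items) or set(covered) != items:
            return False
        for g in groups:
            if len(g) != m:
                return False
            for pair in combinations(sorted(g), 2):
                if pair in seen:
                    return False
                seen.add(pair)
    return True
```

For example, `is_valid_schedule` accepts the five pair partitions listed above and rejects any schedule that tries to add a second split into triples.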


My question, now, is this: Is there a way of doing this efficiently? Are there tricks to speed up the generation of these sets?

So far, I've been treating this as an exact cover problem, and solving it with a backtracking algorithm (a variant of DLX). This works extremely well for pairs, but as the groups become larger the number of possibilities the algorithm has to consider explodes, and processing becomes very unwieldy.

What I'm looking for are tricks to speed things up. Any ideas are very welcome, in particular (but not limited to):

  • Optimizations and heuristics to reduce the number of possibilities that need to be considered prior to solving (for example, it is clear from the examples above that the first split can be made arbitrarily, and the first set of each partition [the first column above] can be generated automatically).
  • Are there variants of backtracking that can cope with huge numbers of candidates? (i.e. not needing to generate all possibilities beforehand)
  • Other algorithms, approaches or mathematical concepts I should consider?
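To illustrate the second bullet, here is a minimal sketch of a plain depth-first search that builds groups on the fly instead of enumerating every candidate group up front. It is not DLX and not a tuned solver; the function name `find_schedule` and the choice of integers 0..n-1 as items are assumptions made for the example. The only symmetry breaking is that each new group always starts with the smallest unassigned item.

```python
from itertools import combinations


def find_schedule(n, m, num_rounds):
    """Search for num_rounds partitions of range(n) into groups of m, no pair twice."""
    if n % m != 0:
        raise ValueError("n must be divisible by m")
    if num_rounds == 0:
        return []
    used = set()   # pairs (as frozensets) that already share a group
    schedule = []  # completed rounds so far

    def fill_round(remaining, groups):
        if not remaining:
            # Round complete: stop, or continue with the next round.
            schedule.append(groups)
            if len(schedule) == num_rounds or fill_round(list(range(n)), []):
                return True
            schedule.pop()
            return False
        # Symmetry breaking: the smallest unassigned item opens the next group.
        first, rest = remaining[0], remaining[1:]
        for mates in combinations(rest, m - 1):
            group = (first,) + mates
            pairs = [frozenset(p) for p in combinations(group, 2)]
            if any(p in used for p in pairs):
                continue  # some pair has met before; candidates are generated lazily
            used.update(pairs)
            left = [x for x in rest if x not in mates]
            if fill_round(left, groups + [group]):
                return True
            used.difference_update(pairs)  # backtrack
        return False

    return schedule if fill_round(list(range(n)), []) else None
```

Calling `find_schedule(6, 2, 5)` returns five rounds of pairs, while `find_schedule(6, 3, 2)` returns `None`, matching the two examples in the question. The running time still grows quickly with n and m, so this only shows the lazy candidate generation, not a remedy for the combinatorial explosion.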

Any ideas and suggestions are very welcome. Thank you very much for considering this!


Update

Ok, so this has been a while, but I spent a lot more time on this and wanted to get back to you. @david-eisenstat put me on the right path by giving me the correct search term (thank you so much!) -- I've since read quite a bit on the Social Golfer Problem.

One of the best resources that I found, that I would like to share here, is the work of Markus Triska, who discusses several approaches (and then goes on to present a very nice algorithm) in his thesis. This is highly recommended if anybody runs into a similar problem!

asked Sep 26 '15 21:09 by mezzopiano



1 Answer

This problem is studied under the name Social Golfer Problem. The literature has nontrivial size, but there are three main approaches:

  1. Local search methods, which can handle the cases where many pairs are not present.

  2. Complete search methods like your reduction to exact cover. From what I remember, the research here revolves around efficient methods for symmetry breaking, of which your idea of fixing the first row is probably the simplest.

  3. Mathematical constructions. When q is a prime power, there's a construction for q groups of q involving finite affine planes that's not too awful to implement. Beyond that, there are a lot of one-off constructions. The Handbook of Combinatorial Designs is probably your best bet for summarizing what's known.
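The affine plane construction from point 3 can be sketched as follows for the special case where q is prime, so that arithmetic modulo q forms a field. General prime powers need proper finite field arithmetic, which this sketch does not attempt. The q * q items are labelled by coordinate pairs (x, y); each round is one class of parallel lines, giving q + 1 rounds of q groups of q. The function name `affine_plane_rounds` is made up for illustration.

```python
def affine_plane_rounds(q):
    """q + 1 rounds of q groups of q items, for prime q; items are (x, y) pairs."""
    if q < 2 or any(q % d == 0 for d in range(2, int(q ** 0.5) + 1)):
        raise ValueError("this sketch only handles prime q")
    rounds = []
    # One round per slope: the lines y = slope * x + b (mod q), for b = 0..q-1.
    for slope in range(q):
        rounds.append(
            [[(x, (slope * x + b) % q) for x in range(q)] for b in range(q)]
        )
    # The final round consists of the vertical lines x = c.
    rounds.append([[(c, y) for y in range(q)] for c in range(q)])
    return rounds


if __name__ == "__main__":
    for groups in affine_plane_rounds(3):
        print(groups)
```

Two distinct points lie on exactly one line, so every pair of items shares a group exactly once across the q + 1 rounds; for q = 3 this gives four rounds of three triples on nine items.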

answered Sep 30 '22 23:09 by David Eisenstat