Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Algorithm: identify pairwise conflicts in a set of objects, given only "there is a conflict in this list" oracle

Tags:

algorithm

I have an (unordered) set of objects. I also have an oracle, which takes an ordered list of objects and returns true if there was at least one ordered conflict within that list, false otherwise. An ordered conflict is an ordered pair of objects (A,B) such that the oracle returns true for any input list [..., A, ..., B, ...]. (A,B) being an ordered conflict does not necessarily imply that (B,A) is an ordered conflict.

I want to identify all unordered conflicts within the set: that is, find all pairs {x, y} such that either (x, y) or (y, x) is an ordered conflict as defined above. The oracle is slow (tens to hundreds of milliseconds per invocation) so it is essential to minimize the number of oracle invocations; the obvious naive algorithm (feed every possible ordered pair of set elements to the oracle; O(n²) invocations) is unacceptable. There are hundreds of set elements, and there are expected to be fewer than ten conflicts overall.

This is as far as I've gotten: If the oracle returns true for a two-element list, then obviously the elements in the list constitute a conflict. If the oracle returns false for any list, then there are no ordered conflicts in that list; if the oracle returns false for both list L and the reversal of list L, then there are no unordered conflicts in L. So a divide-and-conquer algorithm not entirely unlike the below ought to work:

Put all the set elements in a list L (choose any convenient order).
Invoke the oracle on L.
  If the oracle returns false, invoke the oracle on rev(L).
    If the oracle again returns false, there are no unordered conflicts within L.

# At this point the oracle has returned true for either L or rev(L).
If L is a two-element list, the elements of L constitute an unordered conflict.

Otherwise, somehow divide the set in half and recurse on each

I'm stuck on the "divide the set in half and recurse" part. There are two complications. First, it is not sufficient to take the top half and then the bottom half of the ordered list, because the conflict(s) might be eliminated by the split (consider [...A1, A2, ... An ...][...B1, B2, ...Bn ...]). Enumerating all subsets of size n/2 should work, but I don't know how to do that efficiently. Second, a naive recursion may repeat a great deal of work due to implicit state on the call stack -- suppose we have identified that A conflicts with B, then any oracle invocation with a list containing both A and B is wasted, but we still need to rule out other conflicts {A, x} and {B, x}. I can maintain a memo matrix such that M[a][b] is true if and only if (A, B) has already been tested, but I don't know how to make that play nice with the recursion.

Additional complications due to the context: If any object appears more than once in the list, the second and subsequent instances are ignored. Furthermore, some objects have dependencies: if (P,Q) is a dependency, then any oracle input in which Q appears before the first appearance of P (if any) will spuriously report a conflict. All dependencies have already been identified before this algorithm starts. If P conflicts with A, it is not possible to know whether Q also conflicts with A, but this is an acceptable limitation.

(Context: This is for identifying pairs of C system headers which cannot be included in the same source file. The "oracle" is the compiler.)

like image 898
zwol Avatar asked Jul 15 '13 19:07

zwol


1 Answers

A few suggestions:

Finding a single conflict

Suppose that you know there is a conflict in n items, you can use O(log n) operations to find the location of a single conflict by first bisecting on the end point, and then bisecting on the start point.

For example, this might look like:

Test 1,2,3,4,5,6,7,8,9,10 -> Conflict
Test 1,2,3,4,5 -> Good
Test 1,2,3,4,5,6,7 -> Conflict
Test 1,2,3,4,5,6 -> Good (Now deduce Endpoint is 7, the last end with a conflict)
Test 3,4,5,6 -> Conflict
Test 5,6 -> Good
Test 4,5,6 -> Conflict (Now deduce Startpoint is 4.)

You now know that 4,5,6,7 is tight (i.e. cannot be made any smaller without removing a conflict), so we can deduce that 4 and 7 must conflict.

Finding more conflicts

Once you have found a problem, you can remove one of the offending items, and test the remaining set. If this still conflicts, you can use the bisection method to identify another conflict.

Repeat until no more conflicts are found.

Finding remaining conflicts

You should now have a large set of not conflicting items, and a few items that have been removed which may have additional conflicts.

To find remaining conflicts you might want to try taking one of the removed items, and then reinserting all items (except those that we already know conflict with it). This should either identify another conflict, or prove that all conflicts with that item have been found.

You can repeat this process with each of the removed items to find all remaining conflicts.

like image 161
Peter de Rivaz Avatar answered Sep 26 '22 02:09

Peter de Rivaz