I need to select one number from each set in a list of sets of positive integers, such that the difference between the smallest and largest number in the output list is minimized. For example:
numbers = [
(0, 4, 9),
(3, 5),
(7, 8, 9)
]
In this case, the list containing one number from each set and minimizing range is [4, 5, 7], which has a range of 3.
I've thought of a few basic approaches so far, all of which involve starting at the first (top) set and working downwards to build the resulting list. A basic approach seems to be trying each possible path in a depth-first search, while keeping track of the minimum range achieved so far and halting any paths that exceed the current minimum range.
A second approach might be to binary search each set for the numbers closest to the previous number (selected from the last set), selecting the two closest numbers with one greater than and one less than the previous number, and then using a depth-first search on this (significantly smaller) search space. I'm not totally sure this approach would lead to a globally optimal solution though.
I expect this problem can almost certainly be mapped to a well-known search algorithm but I'm having trouble finding the best algorithm. What existing algorithms are well-suited to solve such a problem?
Here's an idea. Not sure if it is optimal but it should be correct.
of course, you'd insert tuples of (value, parent_set) into the heap so you know which value came from which set once they're popped from the heap and needs to be replaced
import heapq
from heapq import heapreplace
def minimize_range(sets: list[set[int]]) -> list[int]:
"""Finds the minimum range that contains one element from each set."""
sets = [list(s) for s in sets] # convert to lists to allow sorting
for s in sets:
s.sort(reverse=True) # reverse so .pop() gives the smallest element
heap = [(s.pop(), i) for i, s in enumerate(sets)]
heapq.heapify(heap)
# to track minimum range
range_hi = max(v[0] for v in heap)
min_range = range_hi - heap[0][0] # smallest item in heap is at index 0
best_result = heap.copy()
while True:
min_val, min_set_idx = heap[0]
if not sets[min_set_idx]:
break # the set is empty, we're done
# replace the smallest element in the heap with the next element from the same set
next_val = sets[min_set_idx].pop()
heapreplace(heap, (next_val, min_set_idx))
# calculate the new range
range_hi = max(range_hi, next_val)
new_range = range_hi - heap[0][0]
if new_range < min_range:
min_range = new_range
best_result = heap.copy()
return [v[0] for v in best_result]
if __name__ == '__main__':
numbers = [
{0, 4, 9},
{3, 5},
{7, 8, 9}
]
result = minimize_range(numbers)
print(result) # Output: [4, 5, 7]
Same idea as Akioweh's, i.e., merging the sorted sets with a heap containing one element from each set. But with two main differences:
lo = 4 and then build [4, 5, 7] by taking the smallest number ≥4 from each set. This way I only need 0.3 seconds for my input sets = [set(range(50000))] + [{50000}] * 50000 for which Akioweh's took 15 seconds.heapq.merge, which is implemented with a heap similarly, and I wrap each sorted set with a tracker that updates the high point. I wouldn't rely on implementation details like that in important code, though.from heapq import merge
def minimize_range(sets: list[set[int]]) -> list[int]:
hi = -1e999
def track_hi(s):
nonlocal hi
for x in sorted(s):
hi = max(hi, x)
yield x
hi = 1e999
lo = min(
(hi - lo, lo)
for lo in merge(*map(track_hi, sets))
)[1]
return [
min(x for x in s if x >= lo)
for s in sets
]
if __name__ == '__main__':
numbers = [
{0, 4, 9},
{3, 5},
{7, 8, 9}
]
result = minimize_range(numbers)
print(result) # Output: [4, 5, 7]
Attempt This Online!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With