Fastest way to sample most numbers with minimum difference larger than a value from a Python list

Given a list of 20 floats, I want to find a largest subset in which any two elements differ by more than mindiff = 1. Right now I am using a brute-force method that searches from the largest to the smallest subsets using itertools.combinations. As shown below, the code finds a subset after about 4 s for a list of 20 numbers.

from itertools import combinations
import random
from time import time

mindiff = 1.
length = 20
random.seed(99)
lst = [random.uniform(1., 10.) for _ in range(length)]

t0 = time()
n = len(lst)
sample = []
found = False
while not found:
    # get all subsets with size n
    subsets = list(combinations(lst, n))
    # shuffle to ensure randomness
    random.shuffle(subsets)
    for subset in subsets:
        # sort the subset numbers
        ss = sorted(subset)
        # calculate the differences between every two adjacent numbers
        diffs = [j-i for i, j in zip(ss[:-1], ss[1:])]
        # all() also accepts a one-element subset, where min([]) would raise ValueError
        if all(d > mindiff for d in diffs):
            sample = set(subset)
            found = True
            break
    # no valid subset of this size was found; try subsets one element smaller
    n -= 1

print(sample)
print(time()-t0)

Output:

{2.3704888087015568, 4.365818049020534, 5.403474619948962, 6.518944556233767, 7.8388969285727015, 9.117993839791751}
4.182451486587524

However, in reality I have a list of 200 numbers, for which brute-force enumeration is infeasible. I want a fast algorithm that samples just one random largest subset in which every pairwise difference is larger than 1. Note that each sample should be both random and of maximum size. Any suggestions?

Shaun Han asked Jun 18 '21 18:06


1 Answer

I probably don't fully understand the question, because as stated the solution is quite trivial. EDIT: yes, I did misunderstand. The OP does not just want an optimal solution but wants to sample randomly from the set of optimal solutions. This answer is not incorrect, but it answers a different question from the one the OP is interested in.


Simply sort the numbers and greedily construct the subset:

def mindist_subset(xs, mindist):
    result = []
    for x in sorted(xs):
        if not result or x - result[-1] > mindist:
            result.append(x)
    return result
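
As a quick sanity check, the greedy result can be compared against exhaustive search on short lists. The sketch below repeats mindist_subset so that it runs on its own. The helper brute_force_size is a hypothetical name introduced only for this check, and the list length of 8 and the 200 trials are arbitrary choices.

```python
from itertools import combinations
import random


def mindist_subset(xs, mindist):
    # Greedy: walk the sorted values and keep each one that lies more
    # than mindist above the last value kept.
    result = []
    for x in sorted(xs):
        if not result or x - result[-1] > mindist:
            result.append(x)
    return result


def brute_force_size(xs, mindist):
    # Hypothetical helper: size of a largest valid subset by exhaustive
    # search, from large to small. Only feasible for short lists.
    xs = sorted(xs)
    for k in range(len(xs), 0, -1):
        for combo in combinations(xs, k):
            # combo is sorted, so checking adjacent gaps is enough
            if all(b - a > mindist for a, b in zip(combo, combo[1:])):
                return k
    return 0


print(mindist_subset([1.0, 1.5, 3.0, 3.2, 5.0], 1.0))  # [1.0, 3.0, 5.0]

# Compare sizes on random short lists (fixed seed for repeatability).
rng = random.Random(0)
for _ in range(200):
    xs = [rng.uniform(1.0, 10.0) for _ in range(8)]
    assert len(mindist_subset(xs, 1.0)) == brute_force_size(xs, 1.0)
```

The check compares sizes only. The greedy always returns the same subset for a given input, while other subsets of the same size may exist.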

Sketch of proof of correctness.

Suppose we have a solution S of optimal size for the input array A. If S does not contain min(A), we can remove min(S) from S and add min(A) instead: since min(A) <= min(S), this can only increase the distance to the second-smallest number in S, so the result is still valid and has the same size. Conclusion: we can assume without loss of generality that min(A) is part of an optimal solution.

Now we can apply this argument recursively. We add min(A) to the solution and remove all elements too close to min(A) (those within mindist of it), giving the remaining elements A'. We are then left with a subproblem to which exactly the same argument applies: we can choose min(A') as the next element of the solution, and so on.
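
The greedy returns one fixed optimum. For the random sampling the OP asks about, one possible approach is a small dynamic program over the sorted values: count how many largest subsets start at each value, then walk forward, choosing each next element with probability proportional to its count. This is only a sketch under assumptions. The names sample_max_subset and _weighted_pick are hypothetical, duplicate values are treated as distinct elements, and the O(n^2) loops are written for clarity rather than speed.

```python
import random


def _weighted_pick(indices, count, rng):
    # Pick one index with probability proportional to count[index],
    # using exact integer arithmetic.
    r = rng.randrange(sum(count[j] for j in indices))
    for j in indices:
        r -= count[j]
        if r < 0:
            return j


def sample_max_subset(xs, mindiff, rng=random):
    # Draw one largest subset whose pairwise differences all exceed
    # mindiff, uniformly among all such largest subsets.
    a = sorted(xs)
    n = len(a)
    if n == 0:
        return []
    best = [1] * n   # best[i]: size of a largest valid subset whose minimum is a[i]
    count = [1] * n  # count[i]: number of such largest subsets
    succ = [[] for _ in range(n)]  # succ[i]: next elements that keep the size optimal
    for i in range(n - 1, -1, -1):
        later = [j for j in range(i + 1, n) if a[j] - a[i] > mindiff]
        if later:
            top = max(best[j] for j in later)
            succ[i] = [j for j in later if best[j] == top]
            best[i] = 1 + top
            count[i] = sum(count[j] for j in succ[i])
    top = max(best)
    choices = [i for i in range(n) if best[i] == top]
    result = []
    while choices:
        i = _weighted_pick(choices, count, rng)
        result.append(a[i])
        choices = succ[i]
    return result


print(sample_max_subset([1.0, 1.5, 3.0, 3.2, 5.0], 1.0))
```

For this small input there are four largest subsets (one of 1.0 or 1.5, one of 3.0 or 3.2, then 5.0), and each should come up about a quarter of the time. For 200 numbers the quadratic loops are cheap. Passing a seeded random.Random instance as rng makes the draws reproducible.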

orlp answered Oct 13 '22 03:10