Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Efficient algorithm to get all the combinations of numbers that are within a certain range from 2 lists in python

Suppose I have two lists list_1 and list_2

list_1 = [1, 5, 10] list_2 = [3, 4, 15]

I want to get a list of tuples containing elements from both list_1 and list_2 such that the difference between the numbers in a tuple is under a some constant c.

E.g. suppose c is 2 then the tuples I would have would be: [(1, 3), (5, 3), (5, 4)]

Of course one can iterate over both lists and check that the difference between 2 elements is less than c, but that has a complexity of n^2 and I would rather reduce that complexity.

like image 754
knowledge_seeker Avatar asked Oct 28 '21 17:10

knowledge_seeker


People also ask

How do you generate all possible combinations with two lists in Python?

The unique combination of two lists in Python can be formed by pairing each element of the first list with the elements of the second list. Method 1 : Using permutation() of itertools package and zip() function. Approach : Import itertools package and initialize list_1 and list_2.

What is a combination in Python?

The combination is a mathematical technique which calculates the number of possible arrangements in a collection of items or list. In combination order of selection doesn’t matter. The unique combination of two lists in Python can be formed by pairing each element of the first list with the elements of the second list.

How to find the unique combination of two lists in Python?

The unique combination of two lists in Python can be formed by pairing each element of the first list with the elements of the second list. Example: List_1 = ["a","b"] List_2 = [1,2] Unique_combination =

How to get the number of possible combinations of a list?

Approach : 1 Import itertools package and initialize list_1 and list_2. 2 Create an empty list of ‘unique_combinations’ to store the resulting combinations so obtained. 3 Call itertools.permutations ( ) which will return permutations of list_1 with length of list_2. ... More items...

How to access all combinations of the list indexes in Python?

We can access all combinations of the list using two loops to iterate over list indexes. If both the index counters are on the same index value, we skip it, else we print the element at index i followed by the element at index j in order. The time complexity of this method is O (n 2) since we require two loops to iterate over lists.


Video Answer


2 Answers

Here is an implementation of the idea of Marat from the comments:

import bisect

def close_pairs(list1,list2,c):
  #assumes that list2 is sorted
  for x in list1:
    i = bisect.bisect_left(list2,x-c)
    j = bisect.bisect_right(list2,x+c)
    yield from ((x,y) for y in list2[i:j])

list_1 = [1, 5, 10]
list_2 = [3, 4, 15]
print(list(close_pairs(list_1,list_2,2)))
#prints [(1, 3), (5, 3), (5, 4)]

To demonstrate the potential improvement of this strategy over what might be thought of as the "naive" approach, let's timeit.

import timeit

setup_naive = '''
import numpy
list_a = numpy.random.randint(0, 2500, 500).tolist()
list_b = numpy.random.randint(0, 2500, 500).tolist()
c = 2
def close_pairs(list_a, list_b, c):
    yield from ((x,y) for x in list_a for y in list_b if abs(x-y) <= c)
'''

setup_john_coleman = '''
import bisect
import numpy
list_a = numpy.random.randint(0, 2500, 500).tolist()
list_b = numpy.random.randint(0, 2500, 500).tolist()
c = 2
def close_pairs(list_a, list_b, c):
    list_a = sorted(list_a)
    list_b = sorted(list_b)
    for x in list_a:
        i = bisect.bisect_left(list_b,x-c)
        j = bisect.bisect_right(list_b,x+c)
        yield from ((x,y) for y in list_b[i:j])
'''

print(f"john_coleman: {timeit.timeit('list(close_pairs(list_a, list_b, c))', setup=setup_john_coleman, number=1000):.2f}")
print(f"naive: {timeit.timeit('list(close_pairs(list_a, list_b, c))', setup=setup_naive, number=1000):.2f}")

On a handy laptop that gives result like:

john_coleman: 0.50
naive: 18.35
like image 78
John Coleman Avatar answered Oct 23 '22 17:10

John Coleman


If the lists are sorted as your example suggests, then remove the sorting and then this has runtime complexity O(M+N+P) where M and N are the list sizes and P is the number of close pairs. It keeps an index i so that ys[i] is the smallest y-value not too small, and then walks over ys[i:...] as long as they're not too large, yielding each pair.

def close_pairs(xs, ys, c):
    xs = sorted(xs)
    ys = sorted(ys) + [float('inf')]
    i = 0
    for x in xs:
        while x - ys[i] > c:
            i += 1
        j = i
        while ys[j] - x <= c:
            yield x, ys[j]
            j += 1

Benchmark results with lists/ranges 1000 times larger than your example:

 904.4 ms  close_pairs_naive
   4.9 ms  close_pairs_John_Coleman
   1.8 ms  close_pairs_Kelly_Bundy

Benchmark code:

from timeit import timeit
import random
import bisect
from collections import deque

def close_pairs_naive(list_a, list_b, c):
    yield from ((x,y) for x in list_a for y in list_b if abs(x-y) <= c)

def close_pairs_John_Coleman(list_a, list_b, c):
    list_a = sorted(list_a)
    list_b = sorted(list_b)
    for x in list_a:
        i = bisect.bisect_left(list_b,x-c)
        j = bisect.bisect_right(list_b,x+c)
        yield from ((x,y) for y in list_b[i:j])

def close_pairs_Kelly_Bundy(xs, ys, c):
    xs = sorted(xs)
    ys = sorted(ys) + [float('inf')]
    i = 0
    for x in xs:
        while x - ys[i] > c:
            i += 1
        j = i
        while ys[j] - x <= c:
            yield x, ys[j]
            j += 1

funcs = [
    close_pairs_naive,
    close_pairs_John_Coleman,
    close_pairs_Kelly_Bundy,
]

xs = random.choices(range(15000), k=3000)
ys = random.choices(range(15000), k=3000)
c = 2
args = xs, ys, c

expect = sorted(funcs[0](*args))
for func in funcs:
    result = sorted(func(*args))
    print(result == expect, func.__name__, len(result))
print()

for _ in range(3):
    for func in funcs:
        t = timeit(lambda: deque(func(*args), 0), number=1)
        print('%6.1f ms ' % (t * 1e3), func.__name__)
    print()
like image 4
Kelly Bundy Avatar answered Oct 23 '22 18:10

Kelly Bundy