Suppose I have two lists list_1
and list_2
list_1 = [1, 5, 10]
list_2 = [3, 4, 15]
I want to get a list of tuples containing elements from both list_1 and list_2 such that the difference between the numbers in a tuple is under a some constant c.
E.g. suppose c is 2 then the tuples I would have would be:
[(1, 3), (5, 3), (5, 4)]
Of course one can iterate over both lists and check that the difference between 2 elements is less than c, but that has a complexity of n^2 and I would rather reduce that complexity.
The unique combination of two lists in Python can be formed by pairing each element of the first list with the elements of the second list. Method 1 : Using permutation() of itertools package and zip() function. Approach : Import itertools package and initialize list_1 and list_2.
The combination is a mathematical technique which calculates the number of possible arrangements in a collection of items or list. In combination order of selection doesn’t matter. The unique combination of two lists in Python can be formed by pairing each element of the first list with the elements of the second list.
The unique combination of two lists in Python can be formed by pairing each element of the first list with the elements of the second list. Example: List_1 = ["a","b"] List_2 = [1,2] Unique_combination =
Approach : 1 Import itertools package and initialize list_1 and list_2. 2 Create an empty list of ‘unique_combinations’ to store the resulting combinations so obtained. 3 Call itertools.permutations ( ) which will return permutations of list_1 with length of list_2. ... More items...
We can access all combinations of the list using two loops to iterate over list indexes. If both the index counters are on the same index value, we skip it, else we print the element at index i followed by the element at index j in order. The time complexity of this method is O (n 2) since we require two loops to iterate over lists.
Here is an implementation of the idea of Marat from the comments:
import bisect
def close_pairs(list1,list2,c):
#assumes that list2 is sorted
for x in list1:
i = bisect.bisect_left(list2,x-c)
j = bisect.bisect_right(list2,x+c)
yield from ((x,y) for y in list2[i:j])
list_1 = [1, 5, 10]
list_2 = [3, 4, 15]
print(list(close_pairs(list_1,list_2,2)))
#prints [(1, 3), (5, 3), (5, 4)]
To demonstrate the potential improvement of this strategy over what might be thought of as the "naive" approach, let's timeit
.
import timeit
setup_naive = '''
import numpy
list_a = numpy.random.randint(0, 2500, 500).tolist()
list_b = numpy.random.randint(0, 2500, 500).tolist()
c = 2
def close_pairs(list_a, list_b, c):
yield from ((x,y) for x in list_a for y in list_b if abs(x-y) <= c)
'''
setup_john_coleman = '''
import bisect
import numpy
list_a = numpy.random.randint(0, 2500, 500).tolist()
list_b = numpy.random.randint(0, 2500, 500).tolist()
c = 2
def close_pairs(list_a, list_b, c):
list_a = sorted(list_a)
list_b = sorted(list_b)
for x in list_a:
i = bisect.bisect_left(list_b,x-c)
j = bisect.bisect_right(list_b,x+c)
yield from ((x,y) for y in list_b[i:j])
'''
print(f"john_coleman: {timeit.timeit('list(close_pairs(list_a, list_b, c))', setup=setup_john_coleman, number=1000):.2f}")
print(f"naive: {timeit.timeit('list(close_pairs(list_a, list_b, c))', setup=setup_naive, number=1000):.2f}")
On a handy laptop that gives result like:
john_coleman: 0.50
naive: 18.35
If the lists are sorted as your example suggests, then remove the sorting and then this has runtime complexity O(M+N+P) where M and N are the list sizes and P is the number of close pairs. It keeps an index i
so that ys[i]
is the smallest y-value not too small, and then walks over ys[i:...]
as long as they're not too large, yielding each pair.
def close_pairs(xs, ys, c):
xs = sorted(xs)
ys = sorted(ys) + [float('inf')]
i = 0
for x in xs:
while x - ys[i] > c:
i += 1
j = i
while ys[j] - x <= c:
yield x, ys[j]
j += 1
Benchmark results with lists/ranges 1000 times larger than your example:
904.4 ms close_pairs_naive
4.9 ms close_pairs_John_Coleman
1.8 ms close_pairs_Kelly_Bundy
Benchmark code:
from timeit import timeit
import random
import bisect
from collections import deque
def close_pairs_naive(list_a, list_b, c):
yield from ((x,y) for x in list_a for y in list_b if abs(x-y) <= c)
def close_pairs_John_Coleman(list_a, list_b, c):
list_a = sorted(list_a)
list_b = sorted(list_b)
for x in list_a:
i = bisect.bisect_left(list_b,x-c)
j = bisect.bisect_right(list_b,x+c)
yield from ((x,y) for y in list_b[i:j])
def close_pairs_Kelly_Bundy(xs, ys, c):
xs = sorted(xs)
ys = sorted(ys) + [float('inf')]
i = 0
for x in xs:
while x - ys[i] > c:
i += 1
j = i
while ys[j] - x <= c:
yield x, ys[j]
j += 1
funcs = [
close_pairs_naive,
close_pairs_John_Coleman,
close_pairs_Kelly_Bundy,
]
xs = random.choices(range(15000), k=3000)
ys = random.choices(range(15000), k=3000)
c = 2
args = xs, ys, c
expect = sorted(funcs[0](*args))
for func in funcs:
result = sorted(func(*args))
print(result == expect, func.__name__, len(result))
print()
for _ in range(3):
for func in funcs:
t = timeit(lambda: deque(func(*args), 0), number=1)
print('%6.1f ms ' % (t * 1e3), func.__name__)
print()
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With