I would like to intersect two lists in Python (2.7). I need the result to be iterable:
list1 = [1,2,3,4]
list2 = [3,4,5,6]
result = (3,4) # any kind of iterable
Given that a full iteration will be performed immediately after the intersection, which of the following is more efficient?
Using a generator:
result = (x for x in list1 if x in list2)
Using filter():
result = filter(lambda x: x in list2, list1)
Other suggestions?
Thanks in advance,
Amnon
Neither of these. The best way is to use sets.
list1 = [1,2,3,4]
list2 = [3,4,5,6]
result = set(list1).intersection(list2)
Sets are iterable, so no need to convert the result into anything.
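One caveat worth keeping in mind with the set approach: a set stores each value once and has no guaranteed order, so duplicates and the original ordering are lost. A minimal sketch (the input with a repeated 3 is made up for illustration):

```python
# Illustrative input; the repeated 3 is an assumption added for this example.
list1 = [3, 1, 3, 4]
list2 = [3, 4, 5, 6]

result = set(list1).intersection(list2)

# The set can be iterated directly; sorted() is used here only to get a
# predictable order for display.
for x in sorted(result):
    print(x)
```

Here the result contains 3 only once, even though it appears twice in list1.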
Your solution has a complexity of O(m*n), where m and n are the respective lengths of the two lists. You can improve the complexity to O(m+n) by using a set for one of the lists:
s = set(list1)
result = [x for x in list2 if x in s]
In cases where speed matters more than readability (that is, almost never), you can also use
result = filter(set(a).__contains__, b)
which is about 20 percent faster than the other solutions on my machine.
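For reference, a small runnable version of this one-liner. The lists a and b are example data, not taken from the answer. On Python 2.7 filter() returns a list; on Python 3 it returns a lazy iterator, so the sketch wraps it in list() to behave the same on both.

```python
a = [1, 2, 3, 4]
b = [3, 4, 5, 6]

# set(a).__contains__ is a bound method, so filter() calls it directly
# for each element of b without going through a lambda.
result = list(filter(set(a).__contains__, b))
print(result)  # [3, 4]
```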
I tried to compare the speed of 3 methods of list intersection:
import random
a = [random.randint(0, 1000) for _ in range(1000)]
b = [random.randint(0, 1000) for _ in range(1000)]
Method 1, list comprehension. Time elapsed: 8.95265507698059 seconds
import time
start = time.time()
for _ in range(1000):
    result = [x for x in a if x in b]
elapse = time.time() - start
print(elapse)
Method 2, set.intersection. Time elapsed: 0.09089064598083496 seconds
start = time.time()
for _ in range(1000):
    result = set.intersection(set(a), set(b))
elapse = time.time() - start
print(elapse)
Method 3, np.intersect1d. Time elapsed: 0.323300838470459 seconds
import numpy as np
start = time.time()
for _ in range(1000):
    result = np.intersect1d(a, b)
elapse = time.time() - start
print(elapse)
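Based on these timings, I think set.intersection is the fastest way. Note that the results are not identical: the list comprehension keeps duplicates from a, while the set and NumPy versions return unique values only.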
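The hand-rolled time.time() loops could also be written with the standard timeit module, which handles the repetition itself. The following is a sketch under assumptions: the seed, the repeat count of 10, and the helper function names are made up here, NumPy is left out to keep it dependency-free, and no particular timings are claimed since they vary by machine.

```python
import random
import timeit

random.seed(0)  # fixed seed so the example data is reproducible
a = [random.randint(0, 1000) for _ in range(1000)]
b = [random.randint(0, 1000) for _ in range(1000)]


def with_list_comprehension():
    # O(len(a) * len(b)): every membership test scans the list b.
    return [x for x in a if x in b]


def with_set_intersection():
    # Roughly O(len(a) + len(b)): hashing replaces the linear scans.
    return set(a).intersection(b)


for func in (with_list_comprehension, with_set_intersection):
    seconds = timeit.timeit(func, number=10)
    print("%s: %.4f s for 10 runs" % (func.__name__, seconds))
```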