Given two lists, list1
and list2
list3 = filter(lambda x: x in list1,list2)
This returns the intersection of the two lists.
How can I find the complexity of this algorithm? I have found out that the time complexity of x in list1
is O(n) where n is number of elements in the list, but how about the filter
?
Your code does O(len(list1) * len(list2))
comparison operations for elements.
Your lambda function is executed O(len(list2))
times, once per each element filtered. See documentation for filter
in Python 3 (Python 2):
filter(function, iterable)
Construct an
iterator
from those elements ofiterable
for which function returns true.iterable
may be either a sequence, a container which supports iteration, or an iterator
(emphasis mine)
Clearly the function is called at least 1 time for each (distinct) element in iterable - knowing when not to need call it would mean also solving the Halting problem in general case, which not even the Python core developers are yet to solve ;-). In practice in CPython 3 the filter
builtin creates an iterator which when advanced, executes the function once for each element (distinct or not) in the iteration order.
The x in list1
does O(len(list1))
comparisons in average and worst case, as documented.
To speed it up, use a set
; also you do not need a lambda function at all (use __contains__
magic method)
list3 = filter(set(list1).__contains__, list2)
This builds a set
of the list1
once in O(len(list1))
time and runs the filter against it with O(len(list2))
average complexity for O(len(list1) + len(list2))
If the ordering of elements of list2
does not matter then you can also do
set(list1).intersection(list2)
which should have lower constants than doing the filter
above; and for really fast code, you should order the lists so that the smaller is turned into a set (since both intersection and set building have documented average complexity of O(n)
, but set building most probably will have larger constants due to resizing the set
, thus it will make sense to build the set from the smaller to decrease the weight of these constants):
smaller, larger = sorted([list1, list2], key=len)
result = set(smaller).intersection(larger)
Notice that Python 2 and 3 differ from each other. filter
in Python 3 returns a generator, and the actual running time depends on the number of elements consumed from the resulting generator, whereas in Python 2 a list will be generated upfront, which might be more costly if you need only the first values.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With