Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

O(n) complexity of Python list comprehension with zip() function

I am currently working through a tutorial with the following code:

# numpy where
A = np.array([1,2,3,4])
B = np.array([100, 200, 300, 400])
condition = np.array([True, True, False, False])

answer = [(A_val if cond else B_val) for A_val, B_val, cond in zip(A, B, condition)]

answer
# Out: [1, 2, 300, 400]

Question: What may be the complexity of this python this construct, the mixture of list comprehension and zip() function?

Does every variable passed to zip() act like another for-loop? and what about the list comprehension itself?

Thanks for your help!

like image 601
Lucien Chardon Avatar asked Mar 05 '23 15:03

Lucien Chardon


2 Answers

You're iterating over lists A,B,condition once adding an element to answer with every step taken, so the complexity is O(n) where n is the the size of shortest list

Does every variable passed to zip() act like another for-loop?

Think of it as one loop, adding 1 more argument to zip will add O(1) in each iteration, or O(n) over all n iterations. (assuming smallest arguments is of size n)
For example, for zip(X1,X2,...,Xm) you're doing O(mn) work, but m is constant so it's O(n). (again assuming the smallest size of the arguments is n)

like image 76
sam46 Avatar answered Apr 02 '23 14:04

sam46


The running time of zip(*args) is O(len(args) * min(len(a) for a in args). You can't necessarily boil that down to O(n) without specific assumptions about the arguments (for instance, that the number of lists (len(args)) is constant).

Now, often you can make such assumptions. If the lists are all of the same length, you can use that single length n in place of the the minimum length calculation and use m to stand in for the number of lists, and write it as O(m * n). If the number of lists won't change, then the factor of m is a constant and can be dropped, leaving just O(n).

But if you can't make those specific assumptions, thinking of it as O(n) may mislead you.

If one list can sometimes be shorter than the others, then the length of the longer lists doesn't matter as their last items will never be yielded by zip. If the number of lists is variable, then you can't ignore the m term.

like image 30
Blckknght Avatar answered Apr 02 '23 13:04

Blckknght