Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Big-O complexity of random.choice(list) in Python3

What is Big-O complexity of random.choice(list) in Python3, where n is amount of elements in a list?

Edit: Thank You all for give me the answer, now I understand.

like image 261
mil Avatar asked Oct 19 '16 23:10

mil


People also ask

What is time complexity of random choice?

The complexity of random. choice(list) is O(log n) where n is the number of elements in the list. The cpython implementation uses _randbelow(len(seq)) to get a pseudo-random index and then returns the item at that index.

How do you find the time complexity of a Python program?

Time complexity is commonly estimated by counting the number of elementary operations performed by the algorithm, supposing that each elementary operation takes a fixed amount of time to perform. When analyzing the time complexity of an algorithm we may find three cases: best-case, average-case and worst-case.

What is the complexity of in keyword in Python?

Here is the summary for in : list - Average: O(n) set/dict - Average: O(1), Worst: O(n)


2 Answers

O(1). Or to be more precise, it's equivalent to the big-O random access time for looking up a single index in whatever sequence you pass it, and list has O(1) random access indexing (as does tuple). Simplified, all it does is seq[random.randrange(len(seq))], which is obviously equivalent to a single index lookup operation.

An example where it would be O(n) is collections.deque, where indexing in the middle of the deque is O(n) (with a largish constant divisor though, so it's not that expensive unless the deque is reaching the thousands of elements range or higher). So basically, don't use a deque if it's going to be large and you plan to select random elements from it repeatedly, stick to list, tuple, str, byte/bytearray, array.array and other sequence types with O(1) indexing.

like image 174
ShadowRanger Avatar answered Sep 20 '22 04:09

ShadowRanger


Though the question is about random.choice and previous answers on it have several explanations, when I searched for the complexity of np.random.choice, I didn't find an answer, so I decide to explain about np.random.choice.

choice(a, size=None, replace=True, p=None). Assume a.shape=(n,) and size=m.

When with replacement:

The complexity for np.random.choice is O(m) if p is not specified (assuming it as uniform distribution), and is O(n + n log m ) if p is specified.

The github code can be find here np.random.choice.

When p is not specified, choice generates an index array by randint and returns a[index], so the complexity is O(m). (I assume the operation of generating a random integer by randint is O(1).)

When p is specified, the function first computes prefix sum of p. Then it draws m samples from [0, 1), followed by using binary search to find a corresponding interval in the prefix sum for each drawn sample. The evidence to use binary search can be found here. So this process is O(n + m log n). If you need a faster method in this situation, you can use Alias Method, which needs O(n) time for preprocessing and O(m) time to sample m items.


When without replacement: (It's kind of complicated, and maybe I'll finish it in the future.)

If p is not specified, the complexity is the same as np.permutation(n), even when m is only 1. See more here.

If p is specified, the expected complexity is at least $n \log n \log\frac{n}{n + 1 - m}$. (This is an upperbound, but not tight.)

like image 33
Muzhi Avatar answered Sep 20 '22 04:09

Muzhi