Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Which is the more efficient way to choose a random pair of objects from a list of lists or tuples?

I have got a list of 2d coordinates with this structure:

coo = [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0)]

Where coo[0] is the first coordinate stored in a tuple.

I would like to choose two different random coordinates. I can of course use this method:

import numpy  as np
rndcoo1 = coo[np.random.randint(0,len(coo))]
rndcoo2 = coo[np.random.randint(0,len(coo))]
if rndcoo1 != rndcoo2:
     #do something

But because I have to repeat this operation 1'000'000 times I was wondering if there is a faster method to do that. np.random.choice() can't be used for 2d array is there any alternative that I can use?

like image 724
G M Avatar asked Nov 29 '16 11:11

G M


People also ask

Which is more efficient tuple or list?

The key takeaways are: The key difference between the tuples and lists is that while the tuples are immutable objects the lists are mutable. This means that tuples cannot be changed while the lists can be modified. Tuples are more memory efficient than the lists.

Why are tuples more efficient?

Tuple is stored in a single block of memory. Creating a tuple is faster than creating a list. Creating a list is slower because two memory blocks need to be accessed. An element in a tuple cannot be removed or replaced.

What is the advantage of using tuples over lists?

The advantages of tuples over the lists are as follows: Tuples are faster than lists. Tuples make the code safe from any accidental modification. If a data is needed in a program which is not supposed to be changed, then it is better to put it in 'tuples' than in 'list'.

Which is faster tuple or set?

In Python 2, set is always the slowest. or is the fastest for 2 to 3 literals, and tuple and list are faster with 4 or more literals.


2 Answers

import random
result = random.sample(coo, 2)

will give you the expected output. And it is (probably) as fast as you can get with Python.

like image 186
freakish Avatar answered Oct 23 '22 11:10

freakish


Listed in this post is a vectorized approach that gets us a number of such random choices for a number of iterations in one go without looping through those many times of iterations. The idea uses np.argpartition and is inspired by this post.

Here's the implementation -

def get_items(coo, num_items = 2, num_iter = 10):
    idx = np.random.rand(num_iter,len(coo)).argpartition(num_items,axis=1)[:,:2]
    return np.asarray(coo)[idx]

Please note that we would return a 3D array with the first dimension being the number of iterations, second dimension being the number of choices to be made at each iteration and the last dimension is the length of each tuple.

A sample run should present a bit more clearer picture -

In [55]: coo = [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0)]

In [56]: get_items(coo, 2, 5)
Out[56]: 
array([[[2, 0],
        [1, 1]],

       [[0, 0],
        [1, 1]],

       [[0, 2],
        [2, 0]],

       [[1, 1],
        [1, 0]],

       [[0, 2],
        [1, 1]]])

Runtime test comparing a loopy implementation with random.sample as listed in @freakish's post -

In [52]: coo = [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0)]

In [53]: %timeit [random.sample(coo, 2) for i in range(10000)]
10 loops, best of 3: 34.4 ms per loop

In [54]: %timeit get_items(coo, 2, 10000)
100 loops, best of 3: 2.81 ms per loop
like image 42
Divakar Avatar answered Oct 23 '22 09:10

Divakar