Shuffling NumPy array along a given axis

Given the following NumPy array,

> a = array([[1, 2, 3, 4, 5], [1, 2, 3, 4, 5],[1, 2, 3, 4, 5]]) 

it's simple enough to shuffle a single row,

> shuffle(a[0])
> a
array([[4, 2, 1, 3, 5],[1, 2, 3, 4, 5],[1, 2, 3, 4, 5]])

Is it possible to use indexing notation to shuffle each of the rows independently, or do you have to iterate over the array? I had in mind something like,

> numpy.shuffle(a[:])
> a
array([[4, 2, 3, 5, 1],[3, 1, 4, 5, 2],[4, 2, 1, 3, 5]]) # Not the real output

though this clearly doesn't work.

asked Feb 18 '11 11:02 by lafras


People also ask

How do I shuffle data in NumPy array?

You can use numpy.random.shuffle(). This function only shuffles the array along the first axis of a multi-dimensional array.
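To make the "first axis only" behaviour concrete, here is a minimal sketch. The array and the seed are made up for illustration; the point is only that whole rows move while the contents of each row stay as they were.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only so the run is repeatable

a = np.arange(12).reshape(4, 3)  # illustrative 4x3 array
b = a.copy()

# In-place shuffle: on a 2-D array this reorders rows (axis 0) as units.
np.random.shuffle(b)

# Every original row still exists, unchanged, somewhere in b.
rows_before = sorted(map(tuple, a))
rows_after = sorted(map(tuple, b))
print(rows_before == rows_after)
```

So calling it on the whole 2-D array never mixes values within a row, which is why it does not answer the question above on its own.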

How do I randomly shuffle two NumPy arrays together?

Suppose we have two arrays of the same length or with the same leading dimension, and we want to shuffle both so that corresponding elements stay paired. In that case, we can use the shuffle() function from the sklearn.utils module.
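If scikit-learn is not available, the same pairing can probably be kept with plain NumPy by indexing both arrays with one shared permutation. This is a swapped-in technique, not sklearn.utils.shuffle itself; the names x, y and perm and the data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed for repeatability

x = np.arange(10).reshape(5, 2)  # 5 samples, x[i] = [2*i, 2*i + 1]
y = np.arange(5) * 10            # matching labels, y[i] = 10*i

# One permutation of the sample indices, applied to both arrays.
perm = rng.permutation(len(x))
x_shuf, y_shuf = x[perm], y[perm]

# Pairing survives: each label still equals 5 * first feature of its row.
print(np.array_equal(x_shuf[:, 0] * 5, y_shuf))
```

Fancy indexing returns copies, so x and y themselves are left untouched.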

What does np.random.shuffle do?

It modifies a sequence in-place by shuffling its contents. The function only shuffles the array along the first axis of a multi-dimensional array: the order of sub-arrays is changed, but their contents remain the same.


1 Answer

Vectorized solution with rand+argsort trick

We could generate unique indices along the specified axis and index into the input array with advanced indexing. To generate the unique indices, we draw random floats and argsort them along that axis: the sort order of independent random floats is a random permutation, which gives us a vectorized solution. np.take_along_axis then lets us generalize to n-dimensional arrays and arbitrary axes. The final implementation would look something like this -

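import numpy as np

def shuffle_along_axis(a, axis):
    # argsort of random floats gives an independent permutation per slice along `axis`
    idx = np.random.rand(*a.shape).argsort(axis=axis)
    return np.take_along_axis(a, idx, axis=axis)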

Note that this shuffle won't be in-place and returns a shuffled copy.
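If an in-place shuffle is what you are after, two options come to mind. This is a sketch, assuming NumPy 1.20 or later for Generator.permuted; the seed and the variable names a and b are only for illustration. The second half is the plain loop the question mentions, kept as a baseline.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed

a = np.array([[1, 2, 3, 4, 5],
              [1, 2, 3, 4, 5],
              [1, 2, 3, 4, 5]])

# Shuffle each row independently and write the result back into `a`.
rng.permuted(a, axis=1, out=a)

# Loop baseline: each `row` is a view into `b`, so shuffling it
# modifies `b` in place, one row at a time.
b = np.tile(np.arange(1, 6), (3, 1))
for row in b:
    rng.shuffle(row)

print(a)
print(b)
```

The loop is fine for a handful of rows; for many rows the vectorized call avoids the Python-level iteration.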

Sample run of shuffle_along_axis -

In [33]: a
Out[33]: 
array([[18, 95, 45, 33],
       [40, 78, 31, 52],
       [75, 49, 42, 94]])

In [34]: shuffle_along_axis(a, axis=0)
Out[34]: 
array([[75, 78, 42, 94],
       [40, 49, 45, 52],
       [18, 95, 31, 33]])

In [35]: shuffle_along_axis(a, axis=1)
Out[35]: 
array([[45, 18, 33, 95],
       [31, 78, 52, 40],
       [42, 75, 94, 49]])
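A quick way to convince yourself that the shuffle only moves values along the chosen axis is to sort the result along that axis and compare with the sorted input. The sketch below repeats the function with one assumed change: it takes a seeded Generator (the rng argument is my addition, not part of the answer's signature) so the check is repeatable.

```python
import numpy as np

def shuffle_along_axis(a, axis, rng):
    # Same rand+argsort idea, but drawing floats from a passed-in Generator.
    idx = rng.random(a.shape).argsort(axis=axis)
    return np.take_along_axis(a, idx, axis=axis)

rng = np.random.default_rng(42)  # arbitrary seed
a = np.array([[18, 95, 45, 33],
              [40, 78, 31, 52],
              [75, 49, 42, 94]])

checks = []
for axis in (0, 1):
    out = shuffle_along_axis(a, axis, rng)
    # Sorting along the shuffled axis must recover the sorted input.
    same = np.array_equal(np.sort(out, axis=axis), np.sort(a, axis=axis))
    checks.append(same)
    print(axis, same)
```

Both lines should report True: with axis=0 every column keeps its own values, and with axis=1 every row does.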
answered Oct 08 '22 17:10 by Divakar