
Python: why does random.shuffle change the array?

I'm using random.shuffle to shuffle a 2D NumPy array, and I ran into the following problem:

import numpy as np
from random import shuffle as sf

# 1D array: shuffling works as expected
b = np.array([1, 2, 3, 4, 5])
print(b)
# [1 2 3 4 5]
sf(b)
print(b)
# [1 4 5 3 2]

# 2D array: a row is lost and another is duplicated
a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(a)
# [[1 2 3]
#  [4 5 6]
#  [7 8 9]]
sf(a)
print(a)
# [[1 2 3]
#  [4 5 6]
#  [1 2 3]]

The result shows that shuffling a 1D array works correctly, but shuffling a 2D array gives a strange result.

Why is the third row of the original array thrown away and the first row duplicated?

I know there are ways around this problem, such as first shuffling a 1D array of row ids and then extracting the rows of the 2D array in the order of the shuffled ids. But I want to understand what random.shuffle actually does here, or what is wrong with my code.
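
For reference, the row-id workaround described above might look like the following minimal sketch. The names `row_ids` and `shuffled` are chosen here for illustration only; `np.random.permutation` is used as one way to get a shuffled 1D array of row ids.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Shuffle the row ids rather than the rows themselves.
row_ids = np.random.permutation(a.shape[0])

# Indexing with an integer array copies the selected rows into a new array,
# so every original row appears exactly once in the result.
shuffled = a[row_ids]
print(shuffled)
```

This leaves `a` unchanged and returns the reordered rows as a new array.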

pfc asked Jul 05 '17 05:07


1 Answer

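shuffle from the random module isn't designed for NumPy arrays, which do not behave exactly like nested Python lists. It swaps elements in place with an assignment of the form x[i], x[j] = x[j], x[i]. Indexing a 2D NumPy array returns a view of a row rather than a copy, so the first half of the swap overwrites a row before its old contents have been saved, and one row ends up duplicated. Use shuffle from the numpy.random module instead; it shuffles the array in place along its first axis.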

import numpy as np
from numpy.random import shuffle

arr = np.array([[1,2,3],[4,5,6],[7,8,9]])
shuffle(arr)
print(arr)
# example output (the row order is random):
# [[4 5 6]
#  [1 2 3]
#  [7 8 9]]
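
To see why the random module's shuffle loses a row, the sketch below performs by hand the same kind of tuple swap on two rows of a 2D array, and then on a nested list for comparison. The variable names are illustrative, and the swap of rows 0 and 2 is just one swap that a shuffle could perform.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Indexing one row of a 2D array gives a view into a's memory, not a copy.
row0 = a[0]
print(np.shares_memory(row0, a))
# True

# The right-hand side holds two views. Assigning to a[0] overwrites row 0,
# and the view that was supposed to keep the old row 0 now shows the new
# data, so row 2 receives the same values.
a[0], a[2] = a[2], a[0]
print(a)
# [[7 8 9]
#  [4 5 6]
#  [7 8 9]]

# With nested lists the same swap only exchanges references to the inner
# lists, so nothing is overwritten.
nested = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
nested[0], nested[2] = nested[2], nested[0]
print(nested)
# [[7, 8, 9], [4, 5, 6], [1, 2, 3]]
```

A 1D array is not affected because indexing it returns a scalar value rather than a view.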

Taku answered Sep 21 '22 13:09