I am writing a program compatible with both Python 2.7 and 3.5. Some parts of it rely on stochastic process. My unit tests use an arbitrary seed, which leads to the same results across executions and languages... except for the code using random.shuffle
.
Example in Python 2.7:
In[]: import random
random.seed(42)
print(random.random())
l = list(range(20))
random.shuffle(l)
print(l)
Out[]: 0.639426798458
[6, 8, 9, 15, 7, 3, 17, 14, 11, 16, 2, 19, 18, 1, 13, 10, 12, 4, 5, 0]
Same input in Python 3.5:
In []: import random
random.seed(42)
print(random.random())
l = list(range(20))
random.shuffle(l)
print(l)
Out[]: 0.6394267984578837
[3, 5, 2, 15, 9, 12, 16, 19, 6, 13, 18, 14, 10, 1, 11, 4, 17, 7, 8, 0]
Note that the pseudo-random number is the same, but the shuffled lists are different. As expected, reexecuting the cells does not change their respective output.
How could I write the same test code for the two versions of Python?
Use random. seed() instead before calling random. shuffle() with just one argument. See Python shuffle(): Granularity of its seed numbers / shuffle() result diversity.
Python Random shuffle() Method The shuffle() method takes a sequence, like a list, and reorganize the order of the items. Note: This method changes the original list, it does not return a new list.
To shuffle strings or tuples, use random. sample() , which creates a new object. random. sample() returns a list even when a string or tuple is specified to the first argument, so it is necessary to convert it to a string or tuple.
shuffle() functions together. The primary purpose of using the seed() and shuffle() function together is to produce the same result every time after each shuffle. If we set the same seed value every time before calling the shuffle() function, we will get the same item sequence.
In Python 3.2 the random module was refactored a little to make the output uniform across architectures (given the same seed), see issue #7889. The shuffle()
method was switched to using Random._randbelow()
.
However, the _randbelow()
method was also adjusted, so simply copying the 3.5 version of shuffle()
is not enough to fix this.
That said, if you pass in your own random()
function, the implementation in Python 3.5 is unchanged from the 2.7 version, and thus lets you bypass this limitation:
random.shuffle(l, random.random)
Note however, than now you are subject to the old 32-bit vs 64-bit architecture differences that #7889 tried to solve.
Ignoring several optimisations and special cases, if you include _randbelow()
the 3.5 version can be backported as:
import random
import sys
if sys.version_info >= (3, 2):
newshuffle = random.shuffle
else:
try:
xrange
except NameError:
xrange = range
def newshuffle(x):
def _randbelow(n):
"Return a random int in the range [0,n). Raises ValueError if n==0."
getrandbits = random.getrandbits
k = n.bit_length() # don't use (n-1) here because n can be 1
r = getrandbits(k) # 0 <= r < 2**k
while r >= n:
r = getrandbits(k)
return r
for i in xrange(len(x) - 1, 0, -1):
# pick an element in x[:i+1] with which to exchange x[i]
j = _randbelow(i+1)
x[i], x[j] = x[j], x[i]
which gives you the same output on 2.7 as 3.5:
>>> random.seed(42)
>>> print(random.random())
0.639426798458
>>> l = list(range(20))
>>> newshuffle(l)
>>> print(l)
[3, 5, 2, 15, 9, 12, 16, 19, 6, 13, 18, 14, 10, 1, 11, 4, 17, 7, 8, 0]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With