Why would I want to use a custom RNG for Array#shuffle?

Question

The documentation for Array#shuffle states:

shuffle(random: rng) → new_ary

The optional rng argument will be used as the random number generator.
a.shuffle(random: Random.new(1))  #=> [1, 3, 2]

What does that mean and why would I want to do that?

shivam · Accepted Answer

Passing an RNG that was created with a fixed seed produces the same shuffle order every time.

Let's try shuffle without the rng argument; we should get a different order on each call:

a = [ 1, 2, 3 ] 
a.shuffle
# => [3, 2, 1]
a.shuffle
# => [2, 3, 1]

Now with rng:

a.shuffle(random: Random.new(1))
# => [1, 3, 2] 
a.shuffle(random: Random.new(1))
# => [1, 3, 2]

As you can see, the shuffled array always comes out in the same order ([1, 3, 2] in this case), because each call uses a new generator started from the same seed.

why would I want to do that?

(As mentioned in comments below)

Reproducible randomness is very valuable. It comes in handy in tests, games, etc.
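One common way to use this in a test is sketched below: pick a fresh seed, print it, and build the generator from it, so a failing run can be replayed later with the same seed. The variable names are only illustrative.

```ruby
# Pick a fresh seed, but keep it so a failing run can be replayed.
seed = Random.new_seed
puts "shuffle seed: #{seed}"

deck = (1..10).to_a

first_run  = deck.shuffle(random: Random.new(seed))
second_run = deck.shuffle(random: Random.new(seed))

# Two generators built from the same seed yield the same order.
puts first_run == second_run # prints true
```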

Neil Slater · Answer

Internally the Array#shuffle method needs a source of random numbers. When you provide the optional RNG parameter, you are telling the method to use that object as the data source.

It is not directly for reproducibility. By default, shuffle uses the shared RNG behind Kernel#rand, which can be seeded using srand.

You can reproduce shuffles as follows:

srand(30)
[0,1,2,3,4,5,6].shuffle
# => [3, 1, 2, 0, 4, 6, 5]

srand(30)
[0,1,2,3,4,5,6].shuffle
# => [3, 1, 2, 0, 4, 6, 5]

If all you need is repeatability for tests, then srand will cover your needs.

So what is it for?

Shuffling an array requires a source of random numbers in order to work. By allowing you to override the default Kernel#rand, the design gives you control over how these are sourced. Other methods that require a source of randomness allow similar overrides, e.g. Array#sample.
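As a rough illustration of overriding the source, the sketch below passes a hand-written object instead of a Random instance. CountingRng is a hypothetical class made up for this example; it assumes that shuffle and sample obtain their numbers by calling rand with an integer upper bound on whatever object they are given, which is how MRI behaves as far as I know.

```ruby
# Hypothetical wrapper: counts how often a number is requested
# and delegates the actual work to a seeded Random.
class CountingRng
  attr_reader :calls

  def initialize(seed)
    @inner = Random.new(seed)
    @calls = 0
  end

  # Assumed protocol: the caller invokes rand(limit) and expects
  # an integer in 0...limit.
  def rand(*args)
    @calls += 1
    @inner.rand(*args)
  end
end

rng   = CountingRng.new(42)
items = %w[a b c d e]

shuffled = items.shuffle(random: rng)
picked   = items.sample(random: rng)

puts "shuffled: #{shuffled.inspect}, picked: #{picked}, rand calls: #{rng.calls}"
```

The same hook is what lets you plug in a different generator altogether, as long as it answers rand in this way.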

Having this level of control lets you shuffle arrays independently of any other parts of your code that rely on random numbers. Reproducible output is one useful outcome. Another is independence from other parts of a program that use random numbers, which may or may not need reproducible results, or may run at times you cannot control.
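A small sketch of that independence: with srand, any unrelated rand call between seeding and shuffling draws from the same stream, while a dedicated Random object is unaffected. The bare rand calls here stand in for other code you do not control.

```ruby
items = (0..6).to_a

# Global RNG: the extra rand call consumes a number from the shared
# stream, so the second shuffle will usually differ from the first.
srand(30)
global_a = items.shuffle
srand(30)
rand # stands in for unrelated code drawing a random number
global_b = items.shuffle

# Dedicated RNG: unrelated calls to Kernel#rand do not touch it.
private_a = items.shuffle(random: Random.new(30))
rand
private_b = items.shuffle(random: Random.new(30))

puts "global runs equal?  #{global_a == global_b}"
puts "private runs equal? #{private_a == private_b}"
```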

In addition, shuffling algorithms have a problem producing an even distribution when the list is long. If you are shuffling N items, your RNG must be able to produce factorial(N), or N!, distinct sequences of numbers; otherwise it cannot possibly produce all allowed arrangements. For Ruby's built-in RNG, this limit is reached, in theory, when shuffling around 2000 items, provided the srand value has been chosen from a high-quality original random source. You can do better by using an RNG that has an even higher limit, or a "true" RNG that sources its data from a physical system.
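To make the "around 2000" figure concrete, here is a back-of-the-envelope check. It assumes the built-in generator is the Mersenne Twister (MT19937) with a period of 2**19937 - 1, and treats that period as the number of distinct sequences available; it then finds the largest N for which N! still fits.

```ruby
# Assumption: the generator can produce at most 2**19937 - 1 distinct
# sequences (the period of MT19937).
states = 2**19937 - 1

n = 1
permutations = 1 # running value of n!
# Grow n while (n + 1)! still fits within the number of states.
while permutations * (n + 1) <= states
  n += 1
  permutations *= n
end

puts n # prints 2080
```

Under that assumption, a list of 2081 or more items has more orderings than the generator can ever reach.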

Tags:

random

ruby

awendt
