Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is Collections.shuffle() really random enough? Practical examples seem to deny this statement

I have 1000 unique objects in a java.util.List, each referring to an image, each image in the 1000-list is unique and now I'd like to shuffle them, so that I can use the first 20 objects and present them to the website-user. The user can then click a button saying "Shuffle", and I retrieve the 1000 images again from scratch and calling again shuffle(). However, it seems that out of 1000 image objects, I very often see the same image again and again between the 20-image-selections.

Something seems to be wrong, any better suggestion, advices?

My code is very simple:

List<String> imagePaths = get1000Images();
Collections.shuffle(imagePaths);

int i = 0;
for (String path: imagePaths) {
  ... do something with the path ...
  i++;
  if (i >= 20) break;
}

I know that Collections.shuffle() is well distributed: see for instance http://blog.ryanrampersad.com/2012/03/03/more-on-shuffling-an-array-correctly/

However, I just have the feeling that the probability of seeing the same image over and over again in a set of 20 images out of 1000 should be much less...

Inputs highly appreciated.

like image 336
basZero Avatar asked Mar 14 '12 12:03

basZero


People also ask

Is shuffle method available in Collection class?

The shuffle() is a Java Collections class method which works by randomly permuting the specified list elements. There is two different types of Java shuffle() method which can be differentiated depending on its parameter. These are: Java Collections shuffle(list) Method.

Can you shuffle a set in Java?

Set is unordered, so randomizing an unordered Collection doesn't make any logical sense. An ordered Set is ordered using a Comparator which means it has a fixed order, you can't shuffle it, that has no meaning as the order is determined by the Comparator or the compare() method.

How do you random shuffle an Array?

Shuffle Array using Random Class We can iterate through the array elements in a for loop. Then, we use the Random class to generate a random index number. Then swap the current index element with the randomly generated index element. At the end of the for loop, we will have a randomly shuffled array.

How do you randomly shuffle elements in an Array Java?

Use the shuffle() Method to Shuffle an Array in Java The shuffle() function of the Collection class takes a list given by the user and shuffles it randomly.


2 Answers

If you're showing 20 images out of 1000 the probability of seeing any one of that 20 repeated in the next iteration is approximately 0.34 so you shouldn't be surprised to see images repeating.

The chances of seeing a specific image is still one in a thousand, but if you're looking for twenty images the chances are much higher.

We can calculate the probability of none of the previous 20 images repeating as:

 980   979         961
———— × ——— × ... × ——— ≈ 0.66
1000   999         981

And so the probability of seeing a repeat is one minus this, or approximately 0.34.

And the probability of seeing an image repeated in either of the next two iterations is:

1 - (0.66 × 0.66) ≈ 0.56

In other words, it's more likely than not that you'll see a repeated image over the two following cycles. (And this isn't including images repeated from the second cycle in the third which will only make it more likely.)

For what it's worth, here's some Java code to do the above calculation:

float result = 1.0f;
int totalImages = 1000;
int displayedImages = 20;

for (int i = 0; i < displayedImages; i++) {
  result = result * (totalImages - displayedImages - i) / (totalImages - i);
}

System.out.println(result);
like image 83
Dave Webb Avatar answered Oct 13 '22 10:10

Dave Webb


Its human nature to see patterns which are not there. Many people see patterns in the planets and stars as guiding their life.

In the first 1000 digits of PI there are six nines in a row. Does that mean the digits of PI are not random? no. The pattern doesn't occur again any more than your might expect.

Having said that, Random is not completely random and it will repeat after 2^48 calls. (it uses a 48-bit seed) This means its not possible to produce every possible long or double using it. If you want more randomness you can use SecureRandom with shuffle instead.

It sounds like what you want is something like this

List<String> imagePaths = new ArrayList<>();

// called repeatedly
if (imagePaths.size() <= 500) {
    imagePaths = get1000Images();
    Collections.shuffle(imagePaths);
}

for (String path: imagePaths.subList(0, 20)) {
  ... do something with the path ...
}

imagePaths = imagePaths.subList(20, imagePaths.size());

This will ensure that you don't see the same image in the last 500 calls.

like image 39
Peter Lawrey Avatar answered Oct 13 '22 09:10

Peter Lawrey