Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Get a random element from a sequential collection

Tags:

java

iterator

I talk to an API that gives me an java.util.Iterator over a collection. That means I can iterate over it but I can't get direct/random access to the elements.

Now to my problem: I want to get one random element from this collection. How do i do that? I guess I could build a new collection that allows direct access, but isn't that a little memory consuming? I could also iterate over the entire collection and for each element "roll a dice" to see if I should take that element and quit iteration or continue. But then I need the size of the collection and I can't get that from the Iterator.

Thanks in advance.

like image 939
Sven Avatar asked Jan 04 '11 16:01

Sven


People also ask

How do you generate a random value from a HashSet?

In order to get random elements from the HashSet object, we need to generate a random number between 0 (inclusive) and the size of the HashSet (exclusive). And then iterate through the set till we reach the element located at the random number position as given below.

How do you take a random item from a List in Python?

Using random. randrange() to select random value from a list. random. randrange() method is used to generate a random number in a given range, we can specify the range to be 0 to the length of the list, and get the index, and then the corresponding value.


1 Answers

There's a way to do it on one pass through the collection that doesn't use a lot of extra memory (just the size of one element of the collection plus a float). In pseudocode:

  • Iterate through the collection.
  • For each item, generate a random float.
  • If the float is the lowest (or highest, it doesn't matter) one you've seen so far, store the current item from the collection in a temporary variable. (Also store the new lowest random value.)
  • Once you reach the end of the collection, you have a random item in the temp variable.

Obviously this has the drawback of iterating through the entire collection every time you call it, but you don't have a lot of choice with the constraints you're facing.

Update: The name of this type of problem finally came back to me. This is called Reservoir sampling.

like image 103
Bill the Lizard Avatar answered Sep 19 '22 23:09

Bill the Lizard