Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Randomly select n elements from Array

Tags:

random

scala

I want to select n unique elements from array where size of my array is normally 1000 and value of n is 3. I want to implement this in iterative algorithm where iterations are around 3000000 and I have to get n unique elements in each iteration. Here are some available solutions I fond but I can not use them due to their draw backs as discussed below.

import scala.util.Random
val l = Seq("a", "b", "c", "d", "e")
val ran = l.map(x => (Random.nextFloat(), x)).sortBy(_._1).map(_._2).take(3)

This method is slower as have to create three arrays and Sorting the array too.

 val list = List(1,2,3,4,5,1,2,3,4,5)
 val uniq = list.distinct
 val shuffled = scala.util.Random.shuffle(uniq)
 val sampled = shuffled.take(n)

Two arrays are generated and Shuffling the large array is slower process.

 val arr = Array.fill(1000)(math.random )
 for (i <- 1 to n; r = (Math.random * xs.size).toInt) yield arr(r)

This is faster tecnique but some times returns same element more than one time. Here is an output.

val xs = List(60, 95, 24, 85, 50, 62, 41, 68, 34, 57)
for (i <- 1 to n; r = (Math.random * xs.size).toInt) yield xs(r)

res: scala.collection.immutable.IndexedSeq[Int] = Vector( 24 , 24 , 41)

It can be observed that 24 is returned 2 times.

How can I change last method to get unique elements? Is there any other more optimal way to perform same task?

like image 685
Asif Avatar asked Jul 28 '20 06:07

Asif


2 Answers

Here is a recursive routine that does the job more efficiently than the other answers.

It builds a list of indices and then checks that the values are distinct. In the rare cases where there are duplicates, these are removed and new values added until there is a distinct set of indices.

The other answers check that the list is distinct each time an element is added.

def randomIndices(arraySize: Int, nIndices: Int): List[Int] = {
  def loop(done: Int, res: List[Int]): List[Int] =
    if (done < nIndices) {
      loop(done + 1, (Math.random * arraySize).toInt +: res)
    } else {
      val d = res.distinct
      val dSize = d.size

      if (dSize < nIndices) {
        loop(dSize, d)
      } else {
        res
      }
    }

  if (nIndices > arraySize) {
    randomIndices(arraySize, arraySize)
  } else {
    loop(0, Nil)
  }
}

randomIndices(xs.size, 3).map(xs)

This should be efficient when the number of elements is small compared to the size of the array.

like image 181
Tim Avatar answered Nov 07 '22 16:11

Tim


Your examples don't seem to be related (Strings? Ints?) but maybe something like this will work.

import scala.util.Random.nextInt

val limit = 1000  // 0 to 999 inclusive
val n = 3
Iterator.iterate(Set.fill(n)(nextInt(limit)))(_ + nextInt(limit))
        .dropWhile(_.size < n)
        .next()
//res0: Set[Int] = Set(255, 426, 965)
like image 4
jwvh Avatar answered Nov 07 '22 15:11

jwvh