How do you efficiently generate a list of K non-repeating integers between 0 and an upper bound N [duplicate]

Tags:

The question gives all necessary data: what is an efficient algorithm to generate a sequence of K non-repeating integers within a given interval [0,N-1]. The trivial algorithm (generating random numbers and, before adding them to the sequence, looking them up to see if they were already there) is very expensive if K is large and near enough to N.

The algorithm provided in Efficiently selecting a set of random elements from a linked list seems more complicated than necessary, and requires some implementation. I've just found another algorithm that seems to do the job fine, as long as you know all the relevant parameters, in a single pass.

595

asked Oct 01 '08 17:10

tucuxi

2 Answers

In The Art of Computer Programming, Volume 2: Seminumerical Algorithms, Third Edition, Knuth describes the following selection sampling algorithm:

Algorithm S (Selection sampling technique). To select n records at random from a set of N, where 0 < n ≤ N.

S1. [Initialize.] Set t ← 0, m ← 0. (During this algorithm, m represents the number of records selected so far, and t is the total number of input records that we have dealt with.)

S2. [Generate U.] Generate a random number U, uniformly distributed between zero and one.

S3. [Test.] If (N – t)U ≥ n – m, go to step S5.

S4. [Select.] Select the next record for the sample, and increase m and t by 1. If m < n, go to step S2; otherwise the sample is complete and the algorithm terminates.

S5. [Skip.] Skip the next record (do not include it in the sample), increase t by 1, and go back to step S2.

An implementation may be easier to follow than the description. Here is a Common Lisp implementation that select n random members from a list:

(defun sample-list (n list &optional (length (length list)) result)   (cond ((= length 0) result)         ((< (* length (random 1.0)) n)          (sample-list (1- n) (cdr list) (1- length)                       (cons (car list) result)))         (t (sample-list n (cdr list) (1- length) result))))

And here is an implementation that does not use recursion, and which works with all kinds of sequences:

(defun sample (n sequence)   (let ((length (length sequence))         (result (subseq sequence 0 n)))     (loop        with m = 0        for i from 0 and u = (random 1.0)        do (when (< (* (- length i) u)                     (- n m))             (setf (elt result m) (elt sequence i))             (incf m))        until (= m n))     result))

192

answered Sep 22 '22 09:09

Vebjorn Ljosa

The random module from Python library makes it extremely easy and effective:

from random import sample print sample(xrange(N), K)

sample function returns a list of K unique elements chosen from the given sequence.
xrange is a "list emulator", i.e. it behaves like a list of consecutive numbers without creating it in memory, which makes it super-fast for tasks like this one.

answered Sep 24 '22 09:09

DzinX

Related questions
                            
                                PHP - Convert multidimensional array to 2D array with dot notation keys
                            
                                How to find the size of integer array
                            
                                c++ optimize array of ints
                            
                                ArrayList of String Arrays
                            
                                Change a value in mutable list in Kotlin
                            
                                How do I replace an array's element?
                            
                                Creating Byte[] in PowerShell
                            
                                Flatten multidimensional array concatenating keys [duplicate]
                            
                                Svelte 3 - How to loop each block X amount of times
                            
                                Process CSV Into Array With Column Headings For Key
                            
                                Why Selection sort can be stable or unstable
                            
                                Getting the first and last element of an array in CoffeeScript
                            
                                Comparing two integer arrays in Java
                            
                                Why is a fixed size buffers (arrays) must be unsafe?
                            
                                How does GCC implement variable-length arrays?
                            
                                Notice: Array to string conversion in
                            
                                Array.length gives incorrect length
                            
                                Php multi-dimensional array from mysql result
                            
                                PHP - define static array of objects
                            
                                Does the C standard require the size of an array of n elements to be n times the size of an element?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

How do you efficiently generate a list of K non-repeating integers between 0 and an upper bound N [duplicate]

Tags:

arrays

algorithm

random

permutation

tucuxi

People also ask

2 Answers

Vebjorn Ljosa

DzinX

Recent Activity

Donate For Us