To retrieve k random items from an array (or stream) whose size is not known in advance, we use a technique called reservoir sampling. Can anybody briefly explain how it works, with sample code?
In biased reservoir sampling (Alg. 3.1, [2]), the probability of a data point x(t) being in the reservoir R is a decreasing function of the time it has lingered in R. So the probability of finding points from recent history in R is high, while very old data points will be in R with very low probability.
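As a rough illustration of the idea, here is a minimal sketch of one well-known biased scheme, in which every new point is always stored and an old point is evicted with probability equal to the current fill fraction. Whether this matches the cited Alg. 3.1 exactly is an assumption, and the names `biased_reservoir` and `capacity` are made up for the example.

```python
import random

def biased_reservoir(stream, capacity, rng=random):
    """Keep a recency-biased sample of at most `capacity` items (illustrative sketch)."""
    reservoir = []
    for item in stream:
        # Fraction of the reservoir that is currently occupied, in [0, 1].
        fill_fraction = len(reservoir) / capacity
        if rng.random() < fill_fraction:
            # Overwrite a uniformly chosen old item, so older items decay away.
            reservoir[rng.randrange(len(reservoir))] = item
        else:
            # Otherwise just grow the reservoir; the new item is always kept.
            reservoir.append(item)
    return reservoir
```

Because each arriving item is always inserted and each stored item faces an eviction chance at every later step, the longer an item has been in the reservoir, the less likely it is to still be there.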
There are two basic ways of generating a random sample of any data set – sampling without replacement and sampling with replacement. Consider a data stream with N elements and a sample size n. In random sampling with replacement, each element of the sample is chosen at random from among all N elements of the data set.
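For data that already fits in memory, the two modes can be sketched with Python's standard `random` module. The wrapper names below are hypothetical and exist only for the example.

```python
import random

def sample_with_replacement(data, n):
    # Each of the n draws picks from all elements, so repeats are possible.
    return random.choices(data, k=n)

def sample_without_replacement(data, n):
    # Each element can be picked at most once; requires n <= len(data).
    return random.sample(data, k=n)
```

With `data = list(range(10))` and `n = 5`, the first function may return the same value more than once, while the second always returns five distinct values.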
There are two types of sampling methods: Probability sampling involves random selection, allowing you to make strong statistical inferences about the whole group. Non-probability sampling involves non-random selection based on convenience or other criteria, allowing you to easily collect data.
In weighted random sampling (WRS), the items are weighted and the probability of each item being selected is determined by its weight relative to the others.
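One way to combine weights with a reservoir is the key-based approach usually attributed to Efraimidis and Spirakis (often called A-Res): give each item the key u^(1/w) for a fresh uniform u and keep the k items with the largest keys. The sketch below assumes the input is an iterable of `(item, weight)` pairs with positive weights; the function name is invented for the example.

```python
import heapq
import random

def weighted_reservoir(pairs, k, rng=random):
    """Pick k items without replacement, favouring larger weights (A-Res style sketch)."""
    if k <= 0:
        return []
    heap = []  # min-heap of (key, index, item); the smallest key sits at heap[0]
    for index, (item, weight) in enumerate(pairs):
        if weight <= 0:
            continue  # assumption: non-positive weights are simply skipped
        key = rng.random() ** (1.0 / weight)
        entry = (key, index, item)  # index breaks ties so items are never compared
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif key > heap[0][0]:
            # New key beats the weakest kept item, so swap it in.
            heapq.heapreplace(heap, entry)
    return [item for _, _, item in heap]
```

For sampling with replacement from an in-memory list, `random.choices(items, weights=weights, k=k)` is a simpler option.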
I actually did not realize there was a name for this, so I proved and implemented this from scratch:
    import random

    def random_subset(iterator, K):
        result = []
        N = 0

        for item in iterator:
            N += 1
            if len(result) < K:
                result.append(item)
            else:
                s = int(random.random() * N)
                if s < K:
                    result[s] = item

        return result
From: http://web.archive.org/web/20141026071430/http://propersubset.com:80/2010/04/choosing-random-elements.html
With a proof near the end.
Following Knuth's (1981) description more closely, Reservoir Sampling (Algorithm R) could be implemented as follows:
    import random

    def sample(iterable, n):
        """
        Returns @param n random items from @param iterable.
        """
        reservoir = []
        for t, item in enumerate(iterable):
            if t < n:
                reservoir.append(item)
            else:
                m = random.randint(0, t)
                if m < n:
                    reservoir[m] = item
        return reservoir
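A short induction sketch of why the `sample` function above is uniform, using its own symbols `n`, `t` and `m`. The claim is that once items 0..t have been seen (with t ≥ n − 1), each of them is in the reservoir with probability n/(t+1). This is a standard argument written out here for convenience, not the proof from the linked page.

```latex
% Base case: at t = n - 1 the reservoir holds all n items, so the probability is n/n = 1.
% Inductive step (t >= n): m is uniform on {0, ..., t}, which has t + 1 values.
\begin{align*}
\Pr[\text{item } t \text{ enters}]
  &= \Pr[m < n] = \frac{n}{t+1}, \\
\Pr[\text{item } i < t \text{ is kept after step } t]
  &= \underbrace{\frac{n}{t}}_{\text{in reservoir before step } t}
     \cdot
     \underbrace{\left(1 - \frac{1}{t+1}\right)}_{\text{its slot is not drawn}}
   = \frac{n}{t} \cdot \frac{t}{t+1}
   = \frac{n}{t+1}.
\end{align*}
```

After the last item (index N − 1), every item has therefore been kept with probability n/N.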