
Why does runif() not predict the interval maximum value?

Tags:

r

I was responding to a question posed over at Reddit's AskScience when I came across something odd about the behavior of runif(). I was attempting to sample uniformly from the integers 1 to 52. My first thought was to use runif():

as.integer(runif(n, min = 1, max = 52))

However, I found that the operation never produced a value of 52. For example:

length(unique(as.integer(runif(1000000, 1, 52))))
[1] 51

For my purposes, I just turned to sample() instead:

sample(52, n, replace = TRUE)

In the runif() documentation it states:

runif will not generate either of the extreme values unless max = min or max-min is small compared to min, and in particular not for the default arguments.

I'm wondering why runif() acts this way. It seems like it should be able to produce the 'extreme values' of the interval if it's attempting to generate samples uniformly. Is this a feature, and if so, why?

user2059737 asked Sep 06 '17 01:09



3 Answers

This is indeed a feature. The C source of runif contains the following:

/* This is true of all builtin generators, but protect against
   user-supplied ones */
do {u = unif_rand();} while (u <= 0 || u >= 1);
return a + (b - a) * u;

This implies that unif_rand() could return 0 or 1, but runif() is engineered to skip those (unlikely) cases.
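The rejection loop can be mimicked in R to see its effect. This is only a sketch: `runif_open` and `make_bad_gen` are hypothetical names, and `gen` stands in for `unif_rand()`; none of this is part of R itself.

```r
# Hypothetical helper mirroring the rejection loop in the C snippet above.
# `gen` stands in for unif_rand(); it must return one number per call.
runif_open <- function(a, b, gen = function() runif(1)) {
  repeat {
    u <- gen()
    if (u > 0 && u < 1) break  # discard exact 0 or 1, as the C loop does
  }
  a + (b - a) * u
}

# A deliberately bad "user-supplied" generator that yields 0, then 1, then 0.5.
make_bad_gen <- function() {
  vals <- c(0, 1, 0.5)
  i <- 0
  function() {
    i <<- i + 1
    vals[i]
  }
}

# The 0 and the 1 are rejected; the result comes from u = 0.5.
runif_open(1, 52, gen = make_bad_gen())
# [1] 26.5
```

With the boundary draws discarded, the result always lies strictly between a and b.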

My guess would be that this is done to protect user code that would fail in the edge cases (values exactly on the boundaries of the range).

This feature was implemented by Brian Ripley on Sep 19 2006 (from the comments it seems that 0<u<1 is automatically true of the built-in uniform generator, but might not be true for user-supplied ones).

sample(1:52,size=n,replace=TRUE) is an idiomatic (although not necessarily the most efficient) way to achieve your goal.
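For comparison, base R also provides sample.int(), which draws directly from the integers 1 to n. A minimal sketch follows; the seed and the sample size are arbitrary choices for illustration, and whether this is measurably faster than sample(1:52, ...) is not something claimed here.

```r
set.seed(1)   # arbitrary seed, for reproducibility only
n <- 10       # illustrative sample size
draws <- sample.int(52, size = n, replace = TRUE)
draws         # ten integers, each between 1 and 52 inclusive
```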

Ben Bolker answered Sep 30 '22 13:09


as.integer works like trunc: it forms an integer by truncating the given value toward 0. Since runif never returns exactly 52 here (see Ben's answer), every value is strictly below 52 and is truncated to an integer between 1 and 51.
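A small illustration of truncation toward 0 versus floor, using made-up input values. The two only differ for negative numbers:

```r
x <- c(51.999, 1.2, -1.7)  # illustrative values
as.integer(x)  # 51  1 -1  (truncates toward 0)
trunc(x)       # 51  1 -1  (same rule, returns doubles)
floor(x)       # 51  1 -2  (rounds toward -Inf)
```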

You would see a different result with floor (or ceiling). Note that you have to adjust the max of runif by adding 1 (or adjust min by subtracting 1 in the case of ceiling). Also note that in this case, since both min and max are above 0, you could replace floor with trunc or as.integer too.

set.seed(42)
x = floor(runif(n = 1000000, min = 1, max = 52 + 1))
plot(prop.table(table(x)), las = 2, cex.axis = 0.75)

[Plot of prop.table(table(x)): proportion of draws for each integer value of x]
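The ceiling variant mentioned above can be sketched the same way. Here min is lowered to 0 rather than max being raised; the seed and the sample size are arbitrary.

```r
set.seed(42)  # arbitrary seed, for reproducibility only
n <- 100000   # illustrative sample size
# runif() returns values strictly between 0 and 52, so ceiling() gives 1..52.
y <- ceiling(runif(n, min = 0, max = 52))
range(y)           # smallest and largest value drawn
length(unique(y))  # number of distinct integers observed
```

Because runif() never returns exactly 0, ceiling() never produces 0 here.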

d.b answered Sep 30 '22 13:09


as.integer(51.999)
[1] 51

This happens because of how as.integer works: it truncates toward zero.

If you want to draw from a discrete distribution, then use sample. runif is not for discrete distributions.

Suren answered Sep 30 '22 11:09