Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

generate random integers between two values with a given probability using R

Tags:

random

r

I have the following four number sets:

A=[1,207];
B=[208,386];
C=[387,486];
D=[487,586].

I need to generate 20000 random numbers between 1 and 586 in which the probability that the generated number belongs to A is 1/2 and to B,C,D is 1/6.

in which way I can do this using R?

like image 685
dp.carlo Avatar asked Jun 08 '13 17:06

dp.carlo


People also ask

How do you generate random numbers according to a given distribution in R?

Random numbers from a normal distribution can be generated using rnorm() function. We need to specify the number of samples to be generated. We can also specify the mean and standard deviation of the distribution. If not provided, the distribution defaults to 0 mean and 1 standard deviation.

Does R have a random number generator?

Random Number Generator in R is the mechanism which allows the user to generate random numbers for various applications such as representation of an event taking various values, or samples with random numbers, facilitated by functions such as runif() and set.


2 Answers

You can directly use sample, more specifcally the probs argument. Just divide the probability over all the 586 numbers. Category A get's 0.5/207 weight each, etc.

A <- 1:207
B <- 208:386
C <- 387:486
D <- 487:586
L <- sapply(list(A, B, C, D), length)

x <- sample(c(A, B, C, D),
            size = 20000,
            prob = rep(c(1/2, 1/6, 1/6, 1/6) / L, L),
            replace = TRUE)
like image 118
Paul Hiemstra Avatar answered Oct 23 '22 15:10

Paul Hiemstra


I would say use the Roulette selection method. I will try to give a brief explanation here. Take a line of say length 1 unit. Now break this in proportion of the probability values. So in our case, first piece will be of 1.2 length and next three pieces will be of 1/6 length. Now sample a number between 0,1 from uniform distribution. As all the number have same probability of occurring, a sampled number belonging to a piece will be equal to length of the piece. Hence which ever piece the number belongs too, sample from that vector. (I will give you the R code below you can run it for a huge number to check if what I am saying is true. I might not be doing a good job of explaining it here.)

It is called Roulette selection because another analogy for the same situation can be, take a circle and split it into sectors where the angle of each sector is proportional to the probability values. Now sample a number again from uniform distribution and see which sector it falls in and sample from that vector with the same probability

A <- 1:207
B <- 208:386
C <- 387:486
D <- 487:586

cumList <- list(A,B,C,D)

probVec <- c(1/2,1/6,1/6,1/6)

cumProbVec <- cumsum(probVec)

ret <- NULL

for( i in 1:20000){

  rand <- runif(1)

  whichVec <- which(rand < cumProbVec)[1] 

  ret <- c(ret,sample(cumList[[whichVec]],1))

}

#Testing the results

length(which(ret %in% A)) # Almost 1/2*20000 of the values

length(which(ret %in% B)) # Almost 1/6*20000 of the values

length(which(ret %in% C)) # Almost 1/6*20000 of the values

length(which(ret %in% D)) # Almost 1/6*20000 of the values
like image 2
Avinash Avatar answered Oct 23 '22 13:10

Avinash