I believe there should be a function for this in R, but I am not able to find it. I need to generate a vector in which each value appears in exact proportion to a given probability. I thought sample could do this, but it is not exactly what I want.
sample(c(1, 2, 3, 4), size = 4, prob=c(0.25, 0.25, 0.25, 0.25))
gives
# [1] 1 3 4 2
which is correct.
Then I try
sample(c(1, 2, 3, 4), size = 8, replace = T, prob=c(0.25, 0.25, 0.25, 0.25))
# [1] 1 4 4 3 2 3 1 3
What I actually need is something like
#[1] 1 4 4 2 2 3 1 3
OR
#[1] 2 3 1 1 4 4 2 3
OR something similar, where the output is divided exactly according to the given probabilities. In this example, each element of c(1, 2, 3, 4) should make up exactly 0.25 of the output vector. So if size = 8, then 0.25 of it is 2, which is the number of times each element of c(1, 2, 3, 4) should appear. Is there already a function in R for this, or would I have to write a custom one?
Since you want the number of repetitions of each value to be deterministic, rather than random, use rep (instead of sample) to repeat each value in proportion to its probability in prob. Then you can create random permutations of the resulting vector.
x = c(1,2,3,4)
prob = c(0.1,0.2,0.3,0.4)
# Total sample size
n = 20
result = rep(x, round(n * prob))
# [1] 1 1 2 2 2 2 3 3 3 3 3 3 4 4 4 4 4 4 4 4
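As a quick sanity check, the proportions in result can be compared against prob. This is only a small sketch that reuses the same x, prob and n as above; the names counts and proportions are introduced here for illustration.

```r
x <- c(1, 2, 3, 4)
prob <- c(0.1, 0.2, 0.3, 0.4)
n <- 20

# Repeat each value in proportion to its probability
result <- rep(x, round(n * prob))

# Count how often each value occurs and convert the counts to proportions
counts <- table(result)
proportions <- as.vector(counts) / n

# The observed proportions should match prob (up to floating-point tolerance)
print(isTRUE(all.equal(proportions, prob)))
```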
Then to create, say, 100 random permutations:
replicate(100, sample(result))
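One caveat: round(n * prob) is only guaranteed to add up to n when every n * prob is a whole number. For example, with three values at probability 1/3 each and n = 10, every count rounds to 3 and the result has length 9. A possible workaround, sketched below, is largest-remainder rounding: take the floor of each count and hand the leftover slots to the values with the largest fractional parts. The function name exact_sample is hypothetical, not something that ships with R.

```r
# Hypothetical helper (not part of base R): repeat the values of x in
# proportion to prob, using largest-remainder rounding so that the counts
# always sum to n, then shuffle the result.
exact_sample <- function(x, n, prob) {
  stopifnot(length(x) == length(prob), all(prob >= 0), sum(prob) > 0)
  target <- n * prob / sum(prob)   # ideal, possibly fractional, counts
  counts <- floor(target)          # start from the rounded-down counts
  leftover <- n - sum(counts)      # slots still to be handed out
  if (leftover > 0) {
    # Give one extra slot to the values with the largest fractional parts
    extra <- order(target - counts, decreasing = TRUE)[seq_len(leftover)]
    counts[extra] <- counts[extra] + 1
  }
  out <- rep(x, counts)
  # Permute by index; sample(out) would misbehave if out had length 1
  out[sample.int(length(out))]
}

# Usage: three equally likely values, 10 slots
res <- exact_sample(c(1, 2, 3), n = 10, prob = rep(1/3, 3))
length(res)   # 10
table(res)    # two values appear 3 times, one appears 4 times
```

When every n * prob is already a whole number, as in the example above with n = 20, this gives the same counts as rep(x, round(n * prob)).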