sample vector exactly according to the probability given

Question

I believe R should have a function for this, but I cannot find one. I need to generate a vector whose elements appear in exactly the proportions given by a probability vector. I thought sample could do this, but it does not do exactly what I want.

sample(c(1, 2, 3, 4), size = 4, prob=c(0.25, 0.25, 0.25, 0.25))

gives

# [1] 1 3 4 2

which is correct.

Then I try

sample(c(1, 2, 3, 4), size = 8, replace = TRUE, prob = c(0.25, 0.25, 0.25, 0.25))

# [1] 1 4 4 3 2 3 1 3

What I actually need is something like

# [1] 1 4 4 2 2 3 1 3

OR

# [1] 2 3 1 1 4 4 2 3

or something similar, where the output is divided exactly according to the probabilities given. In this example, each element of c(1, 2, 3, 4) should make up exactly 0.25 of the output vector. So if size = 8, then 0.25 of it is 2, which is the number of times each element of c(1, 2, 3, 4) should appear. Is there already a function in R for this, or do I have to write a custom one?

eipi10 · Accepted Answer

Since you want the number of repetitions of each value to be deterministic, rather than random, use rep (instead of sample) to repeat each value in proportion to its probability in prob. Then you can create random permutations of the resulting vector.

x = c(1, 2, 3, 4)
prob = c(0.1, 0.2, 0.3, 0.4)

# Total sample size
n = 20

# Repeat each value n * prob times (2, 4, 6 and 8 times here)
result = rep(x, round(n * prob))
result
# [1] 1 1 2 2 2 2 3 3 3 3 3 3 4 4 4 4 4 4 4 4
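One caveat: round(n * prob) is only guaranteed to sum to n when every n * prob is a whole number. The sketch below is one possible way to handle other cases, using a largest-remainder allocation. It is an assumption added for illustration, not part of the original answer, and the helper name exact_counts is hypothetical.

```r
# Hypothetical helper: integer counts that sum to n and stay close to n * prob.
exact_counts <- function(n, prob) {
  prob <- prob / sum(prob)       # normalise the weights, as sample() does
  raw <- n * prob
  counts <- floor(raw + 1e-9)    # small tolerance guards against floating-point error
  leftover <- n - sum(counts)
  if (leftover > 0) {
    # Hand the remaining units to the entries with the largest fractional parts
    idx <- order(raw - counts, decreasing = TRUE)[seq_len(leftover)]
    counts[idx] <- counts[idx] + 1
  }
  counts
}

counts_a <- exact_counts(20, c(0.1, 0.2, 0.3, 0.4))  # whole-number case
counts_b <- exact_counts(10, c(1, 1, 1) / 3)         # 10 cannot be split evenly in three
sum(counts_b)                                        # still adds up to 10
```

The counts can then be passed to rep in place of round(n * prob), for example rep(c(1, 2, 3), exact_counts(10, c(1, 1, 1) / 3)).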

Then to create, say, 100 random permutations:

replicate(100, sample(result))
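Applied to the example from the question (size 8, equal probabilities), the same idea could look like the sketch below. The seed value is arbitrary and only there to make the shuffle reproducible.

```r
x <- c(1, 2, 3, 4)
prob <- c(0.25, 0.25, 0.25, 0.25)
n <- 8

set.seed(42)  # arbitrary seed, assumption for reproducibility only

# Deterministic counts via rep(), random order via sample()
shuffled <- sample(rep(x, round(n * prob)))
table(shuffled)  # every value appears exactly twice

# replicate() collects the permutations as columns of a matrix
perms <- replicate(100, sample(rep(x, round(n * prob))))
dim(perms)       # 8 rows, 100 columns
```

Each column of perms is one permutation, so perms[, 1] gives the first shuffled vector.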

Tags: r, sample

Ronak Shah