I have a number of individuals
that I want to - randomly - divide in subgroups of size groupsize
. This process I want to repeat n_group
times - with no repeating group constellation.
How can I achieve this in R?
I tried the following so far:
set.seed(1)
individuals <- 1:6
groupsize <- 3
n_groups <- 4
for(i in 1:n_groups) { print(sample(individuals, groupsize))}
[1] 1 4 3
[1] 1 2 6
[1] 3 2 6
[1] 3 1 5
..but am not sure whether that really does not lead to repeating constellations..?
Edit: After looking at the first suggestions and answers I realized, that another restriction could be interesting to me (sorry for not seeing it upfront..).
Is there (in the concrete example above) a way to ensure, that every individual was in contact with every other individual?
Based on your edited question, I assuma that you want to make sure that all indivuals are in at least one subgroup?
Then this might be the solution:
individuals <- 1:6
groupsize <- 3
n_groups <- 4
#sample groups
library(RcppAlgos)
#initialise
answer <- matrix()
# If the length of all unique elements in the answer is smaller than
# the number of individuals, take a new sample
while (length(unique(as.vector(answer))) < length(individuals)) {
answer <- comboSample(individuals, groupsize, n = n_groups)
# Line below isfor demonstration only
#answer <- comboSample(individuals, groupsize, n = n_groups, seed = 123)
}
# sample answer with seed = 123 (see commented line above)
# [,1] [,2] [,3]
# [1,] 1 3 4
# [2,] 1 3 6
# [3,] 2 3 5
# [4,] 2 3 4
test for groups that contain not every individual
# Test with the following matrix
# [,1] [,2] [,3]
# [1,] 1 2 3
# [2,] 1 3 4
# [3,] 1 4 5
# [4,] 2 3 4
# Note that individual '6' is not present
answer <- matrix(c(1,2,3,1,3,4,1,4,5,2,3,4), nrow = 4, ncol = 3)
while (length(unique(as.vector(answer))) < length(individuals)) {
answer <- comboSample(individuals, groupsize, n = n_groups)
}
# is recalculated to (in this case) the following answer
# [,1] [,2] [,3]
# [1,] 4 5 6
# [2,] 3 4 5
# [3,] 1 3 6
# [4,] 2 4 5
PASSED ;-)
You can use while
to dynamically update your combination set, which avoids duplicates, e.g.,
res <- c()
while (length(res) < pmin(n_groups, choose(length(individuals), groupsize))) {
v <- list(sort(sample(individuals, groupsize)))
if (!v %in% res) res <- c(res, v)
}
which gives
> res
[[1]]
[1] 2 5 6
[[2]]
[1] 2 3 6
[[3]]
[1] 1 5 6
[[4]]
[1] 1 2 6
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With