
Sampling a number of individuals in subgroups with no repeating group constellation in R

Tags: r

I have a number of individuals that I want to randomly divide into subgroups of size groupsize. I want to repeat this process n_groups times, with no repeating group constellation.

How can I achieve this in R?


I tried the following so far:

set.seed(1)
individuals <- 1:6
groupsize <- 3
n_groups <- 4

for(i in 1:n_groups) { print(sample(individuals, groupsize))}

[1] 1 4 3
[1] 1 2 6
[1] 3 2 6
[1] 3 1 5

...but I am not sure whether this really avoids repeating constellations.
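Independent calls to sample() know nothing about earlier draws, so repeats are possible. A minimal sketch of how one might check for them, assuming that two groups count as the same constellation when they have the same members in any order (the names draws, keys, has_repeat and p_repeat are illustrative only):

```r
set.seed(1)
individuals <- 1:6
groupsize <- 3
n_groups <- 4

# Same draws as in the loop above, but stored (sorted) instead of printed
draws <- lapply(seq_len(n_groups), function(i) sort(sample(individuals, groupsize)))

# Collapse each sorted group to a key such as "1-3-4" so groups compare as sets
keys <- vapply(draws, paste, character(1), collapse = "-")

# TRUE if any constellation occurs more than once
has_repeat <- anyDuplicated(keys) > 0

# Chance that n_groups independent draws contain at least one repeat
n_combos <- choose(length(individuals), groupsize)  # 20 possible groups
p_repeat <- 1 - prod((n_combos - seq_len(n_groups) + 1) / n_combos)
```

For 6 individuals, groups of 3 and 4 draws, p_repeat works out to roughly 0.27, so a repeat is far from unlikely.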


Edit: After looking at the first suggestions and answers, I realized that another restriction could be interesting to me (sorry for not seeing it upfront).

Is there (in the concrete example above) a way to ensure that every individual was in contact with every other individual?

symbolrush asked Mar 01 '23 15:03


2 Answers

Based on your edited question, I assume that you want to make sure that every individual is in at least one subgroup?

Then this might be the solution:

individuals <- 1:6
groupsize <- 3
n_groups <- 4

# sample groups
library(RcppAlgos)
# initialise with a 1x1 NA matrix so the loop runs at least once
answer <- matrix()
# If the number of unique elements in the answer is smaller than
# the number of individuals, take a new sample
while (length(unique(as.vector(answer))) < length(individuals)) {
  answer <- comboSample(individuals, groupsize, n = n_groups)
  # Line below is for demonstration only
  #answer <- comboSample(individuals, groupsize, n = n_groups, seed = 123)
}
# sample answer with seed = 123 (see commented line above)
#      [,1] [,2] [,3]
# [1,]    1    3    4
# [2,]    1    3    6
# [3,]    2    3    5
# [4,]    2    3    4

Test with a set of groups that does not contain every individual:

# Test with the following matrix
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    1    3    4
# [3,]    1    4    5
# [4,]    2    3    4
# Note that individual '6' is not present
# byrow = TRUE so that each group of three fills one row, as shown above
answer <- matrix(c(1,2,3,1,3,4,1,4,5,2,3,4), nrow = 4, ncol = 3, byrow = TRUE)
while (length(unique(as.vector(answer))) < length(individuals)) {
  answer <- comboSample(individuals, groupsize, n = n_groups)
}
# is recalculated to (in this case) the following answer
#      [,1] [,2] [,3]
# [1,]    4    5    6
# [2,]    3    4    5
# [3,]    1    3    6
# [4,]    2    4    5

PASSED ;-)
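Note that the loop above only guarantees that every individual appears in some group. If the edit is read more strictly, as every pair of individuals sharing a group at least once, a simple count shows that the concrete example cannot satisfy it: 4 groups of 3 contain at most 4 * 3 = 12 distinct pairs, while 6 individuals form choose(6, 2) = 15 pairs. A minimal sketch of such a pairwise check in base R follows; covers_all_pairs() is a hypothetical helper written for this illustration, not part of RcppAlgos, and it expects one group per row, like the comboSample() output:

```r
individuals <- 1:6

# Hypothetical helper: TRUE if every pair of individuals shares at least one group
covers_all_pairs <- function(groups, individuals) {
  # all pairs within x, as keys such as "1-3"
  pair_keys <- function(x) combn(sort(x), 2, FUN = paste, collapse = "-")
  # pairs that occur inside at least one group (one group per row)
  met <- unique(as.vector(apply(groups, 1, pair_keys)))
  all(pair_keys(individuals) %in% met)
}

# The seed = 123 sample from above: every individual appears, but not every pair
four_groups <- rbind(c(1, 3, 4), c(1, 3, 6), c(2, 3, 5), c(2, 3, 4))
covers_all_pairs(four_groups, individuals)  # FALSE, e.g. 1 and 2 never meet

# A hand-built set of six groups in which every pair does meet
six_groups <- rbind(c(1, 2, 3), c(1, 2, 6), c(1, 4, 5),
                    c(2, 4, 5), c(3, 4, 6), c(3, 5, 6))
covers_all_pairs(six_groups, individuals)   # TRUE
```

The same function could replace the condition in the while loop, provided n_groups is raised enough for a full cover to exist.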

Wimpel answered Mar 04 '23 05:03


You can use a while loop to build up your set of combinations step by step, adding a new draw only if it is not already in the set, e.g.,

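# assumes individuals, groupsize and n_groups are defined as in the question
res <- list()
# stop at n_groups, or earlier if fewer distinct groups exist
while (length(res) < min(n_groups, choose(length(individuals), groupsize))) {
  # sort so that the same members in a different order count as the same group
  v <- list(sort(sample(individuals, groupsize)))
  if (!(v %in% res)) res <- c(res, v)
}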

which gives

> res
[[1]]
[1] 2 5 6

[[2]]
[1] 2 3 6

[[3]]
[1] 1 5 6

[[4]]
[1] 1 2 6
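
When the number of possible groups is small, a different technique avoids the retry loop altogether: list every combination once with combn() and sample distinct column indices. This is a sketch of that alternative, not the approach used above; the names all_groups, n_pick and picked are illustrative only.

```r
individuals <- 1:6
groupsize <- 3
n_groups <- 4

# All possible groups, one per column (choose(6, 3) = 20 columns here)
all_groups <- combn(individuals, groupsize)

# Distinct column indices, so no constellation can be picked twice
n_pick <- min(n_groups, ncol(all_groups))
picked <- sample.int(ncol(all_groups), n_pick)

# One sampled group per row
res <- t(all_groups[, picked, drop = FALSE])
```

Because every combination is materialised first, this only suits cases where choose(length(individuals), groupsize) is modest; for large inputs the while loop or comboSample() is the safer choice.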

ThomasIsCoding answered Mar 04 '23 04:03