R: how to sample without replacement AND without consecutive same values

Tags: r, sample

I have spent over a day trying to accomplish what seems to be a very simple thing. I have to create 300 'random' sequences in which the numbers 1, 2, 3 and 4 each appear exactly 12 times, but the same number never appears twice in a row (consecutively).

My best attempts (I guess) were:

1. Have R sample 48 items without replacement, test whether there are consecutive values with rle, then use only the sequences that do not contain consecutive values. Problem: there are almost no random sequences that meet this criterion, so it takes forever.
2. Have R create sequences without consecutive values (see code).

pop <- rep(1:4, 12)
y <- c()
while (length(y) != 48) {
  # top up to 48 values, then drop every value equal to its predecessor
  y <- c(y, sample(pop, 48 - length(y), replace = FALSE))
  y <- y[!c(FALSE, diff(y) == 0)]
}

Problem: this creates sequences with varying numbers of each value. I then tried to use only those sequences with exactly 12 of each value, but that only brought me back to problem 1: takes forever.

There must be some easy way to do this, right? Any help is greatly appreciated!


asked Oct 24 '19 11:10

CookieMons

2 Answers

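Another option is a Markov chain Monte Carlo (MCMC) approach: start from a valid sequence, repeatedly pick two positions at random and swap their values, and move to the new sequence only when 1) the two values differ (so the swap actually changes the sequence) and 2) no two identical numbers end up adjacent. Because consecutive states of the chain are highly correlated, we generate many more sequences than needed and then randomly select 300 of them:

v <- rep(1:4, 12)                 # 1,2,3,4,1,2,... is a valid starting sequence
l <- 48
nr <- 3e5                         # number of accepted states to record
m <- matrix(0, nrow=nr, ncol=l)
count <- 0
while(count < nr) {
    i <- sample(l, 2)             # two distinct positions
    # compare the values, not the positions: sample() already returns
    # distinct positions, so testing i[1L] != i[2L] would always be TRUE
    if (v[i[1L]] != v[i[2L]]) {
        v[i] <- v[i[2:1]]         # propose the swap
        if (!any(diff(v)==0)) {
            count <- count + 1    # accept: record the new state
            m[count, ] <- v
        } else {
            v[i] <- v[i[2:1]]     # reject: undo the swap
        }
    }
}
a <- m[sample(nr, 300),]          # thin the chain: keep 300 random rows
a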
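A quick way to confirm that a result meets both requirements is sketched below. The helper name `is_valid_seq` and its arguments are made up for illustration and are not part of the answer above; it counts each value with `tabulate()` and looks for runs with `rle()`.

```r
# Hypothetical helper (not part of the original answer): TRUE if a sequence
# uses each of `values` exactly `times` times and never repeats a value
# back to back.
is_valid_seq <- function(s, values = 1:4, times = 12) {
  counts_ok  <- all(tabulate(match(s, values), nbins = length(values)) == times)
  no_repeats <- all(rle(s)$lengths == 1)   # every run has length 1
  counts_ok && no_repeats
}

good <- rep(1:4, 12)          # 1,2,3,4,1,2,... satisfies both constraints
bad  <- rep(1:4, each = 12)   # right counts, but long runs of equal values

is_valid_seq(good)   # TRUE
is_valid_seq(bad)    # FALSE
```

For a matrix with one sequence per row, such as `a` above, `all(apply(a, 1, is_valid_seq))` should then return TRUE.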


answered Oct 11 '22 12:10

chinsoon12

Maybe using replicate() with a repeat loop is faster. Here is an example with 3 sequences. Extrapolating from the timing below (about 14.9 seconds for 3 sequences), 300 sequences would take approx. 1490 seconds (not tested).

set.seed(42)
seqc <- rep(1:4, each=12)  # starting sequence

system.time(
  res <- replicate(3, {
    repeat {
      seqcs <- sample(seqc, 48, replace=FALSE) 
      if (!any(diff(seqcs) == 0)) break
    }
    seqcs
  })
)
#  user  system elapsed 
# 14.88    0.00   14.90 

res[1:10, ]
#       [,1] [,2] [,3]
#  [1,]    4    2    3
#  [2,]    1    1    4
#  [3,]    3    2    1
#  [4,]    1    1    4
#  [5,]    2    3    1
#  [6,]    4    1    2
#  [7,]    3    4    4
#  [8,]    2    1    1
#  [9,]    3    4    4
# [10,]    4    3    2

answered Oct 11 '22 10:10

jay.sf
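Both answers spend most of their time on candidates that get thrown away. A different idea, given here only as a rough sketch and not as a tested solution, is to build each sequence one position at a time: draw the next value from those that still have copies left, excluding the previous value, and start over if no value is available. The function name `gen_seq` and its arguments are hypothetical. Weighting the draw by the remaining counts is an assumption made to reduce dead ends, and the resulting sequences are not guaranteed to be uniformly distributed over all valid sequences.

```r
# Hypothetical sketch: constructive sampling with restart on dead ends.
# Assumption: draws are weighted by the number of copies still available.
# The output is always valid, but NOT guaranteed to be uniformly distributed.
gen_seq <- function(values = 1:4, times = 12) {
  n <- length(values) * times
  repeat {
    remaining <- rep(times, length(values))  # copies left of each value
    out  <- integer(n)
    prev <- 0L                               # index of previous value (0 = none)
    ok   <- TRUE
    for (pos in seq_len(n)) {
      w <- remaining
      if (prev > 0L) w[prev] <- 0            # forbid repeating the previous value
      if (sum(w) == 0) {                     # dead end: nothing left to place
        ok <- FALSE
        break
      }
      idx <- sample.int(length(values), 1L, prob = w)
      out[pos] <- values[idx]
      remaining[idx] <- remaining[idx] - 1
      prev <- idx
    }
    if (ok) return(out)                      # otherwise restart from scratch
  }
}

set.seed(1)
seqs <- t(replicate(300, gen_seq()))         # 300 rows, one sequence per row
dim(seqs)
```

Each row of `seqs` contains every value exactly 12 times with no equal neighbours by construction. If uniformity over all valid sequences matters, the rejection or MCMC approaches above are the safer choice.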
