Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

R - generate unique sequences of list with reoccuring items

I wish to generate the unique sequences of elements in a list where some elements are not unique in R

sequence <- c(1,0,1,0)

e.g:

result<-function(sequence)  
result:
  seq1 seq2 seq3 seq4 seq5 seq6
1    1    1    0    0    0    1
2    0    1    0    1    1    0
3    1    0    1    0    1    0
4    0    0    1    1    0    1

notice that all sequences contain every element from the original sequence, such that the sum of the sequence is always 2

gtools returns "too few different elements"

result <- gtools::permutations(4, 4, coseq)

I am not finding any SO post that directly solve this, but instead allow element repeats:Creating combination of sequences achievable with expand.grid and different lengths of sequences.

EDIT: The above is a minimal example, ideally it would work on the sequence:

 sequence = c(0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1)

It is somewhat important that the solution does not generate duplicates that are then subsequently removed, since a longer sequence, say 20 or 30 will be very computationally demanding if duplicates are generated.

like image 736
Timo Kvamme Avatar asked Jan 19 '26 22:01

Timo Kvamme


1 Answers

There are a couple of packages specifically built for this.

First the arrangements package:

## sequence is a bad name as it is a base R function so we use s instead
s <- c(1,0,1,0)
arrangements::permutations(unique(s), length(s), freq = table(s))
     [,1] [,2] [,3] [,4]
[1,]    1    1    0    0
[2,]    1    0    1    0
[3,]    1    0    0    1
[4,]    0    1    1    0
[5,]    0    1    0    1
[6,]    0    0    1    1

Next, we have RcppAlgos (I am the author):

RcppAlgos::permuteGeneral(unique(s), length(s), freqs = table(s))
     [,1] [,2] [,3] [,4]
[1,]    1    1    0    0
[2,]    1    0    1    0
[3,]    1    0    0    1
[4,]    0    1    1    0
[5,]    0    1    0    1
[6,]    0    0    1    1

They are both very efficient as well. To give you an idea, for the actual need by the OP, the other methods will fail (I think there is a limit on the number of rows for a matrix ... 2^31 - 1, not sure though) or take a very long time as they will have to generate 16! ~= 2.092e+13 permutations before any further processing. However, with these two packages, the return is instant:

## actual example needed by OP
sBig <- c(0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1)

system.time(a <- arrangements::permutations(unique(sBig), length(sBig), freq = table(sBig)))
user  system elapsed 
0.001   0.001   0.002 

system.time(b <- RcppAlgos::permuteGeneral(unique(sBig), length(sBig), freqs = table(sBig)))
user  system elapsed 
0.001   0.001   0.002 

identical(a, b)
[1] TRUE

dim(a)
[1] 11440    16
like image 186
Joseph Wood Avatar answered Jan 22 '26 11:01

Joseph Wood



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!