Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Generate N distinct permutations of a very large list in R?

So I have a vector of numbers, say 1:8000. I want generate exactly n distinct permutations of this vector. What is a way to do this without having to calculate all permutations (since 8000! is huge and can't fit in RAM).

function distinct_permutations(L, N){
  # Return N distinct permutations of L as a list of lists
  return(x)
}

x <- seq(1:8000)

like image 419
statsman Avatar asked Oct 17 '25 05:10

statsman


2 Answers

Using RcppAlgos::permuteSample, here demonstrated with n=3 samples of 10 out of 10.

set.seed(42)
RcppAlgos::permuteSample(v=10, m=10, n=3)
#      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
# [1,]    9    8    5    3    4    1   10    6    2     7
# [2,]    1    3    5    9    7    6    8   10    2     4
# [3,]    2    8    5    9    4    7    3   10    6     1
like image 125
jay.sf Avatar answered Oct 18 '25 23:10

jay.sf


I will implement my first idea from 3 proposed in comments and compared with the already provided by other User. My solution is more universal as any set of values could be provided (other ones have to be treated as indexes), I provide a list of vectors and my solution is faster for many scenarios.

Current solutions:

  • perm_n my own solution powered by sample and unique
  • RcppAlgos::permuteSample
  • arrangements::permutations

EDIT: added additional method

perm_n <- function(vec, n) {
  # sample 2 * N samples as a risk of duplicates for bigger samples
  unique(
    lapply(seq_len(n * 2), function(x) sample(vec, length(vec)))
  )[seq_len(n)]
}

microbenchmark::microbenchmark(
  RcppAlgos::permuteSample(v=10, m=10, n=10),
  perm_n(1:10, 10),
  arrangements::permutations(10, 10, nsample = 10)
)
#> Warning in microbenchmark::microbenchmark(RcppAlgos::permuteSample(v = 10, :
#> less accurate nanosecond times to avoid potential integer overflows
#> Unit: microseconds
#>                                              expr     min       lq      mean
#>  RcppAlgos::permuteSample(v = 10, m = 10, n = 10) 230.174 249.7105 1605.4817
#>                                  perm_n(1:10, 10)  65.559  68.0190  182.8387
#>  arrangements::permutations(10, 10, nsample = 10)   3.649   4.1205  238.7746
#>    median       uq      max neval
#>  1038.961 2292.556 28600.70   100
#>    71.217   82.861 10617.61   100
#>     4.633    7.298 23295.99   100

microbenchmark::microbenchmark(
  RcppAlgos::permuteSample(v=8000, m=8000, n=10),
  perm_n(1:8000, 10),
  arrangements::permutations(8000, 8000, nsample = 10),
  times = 10
)
#> Unit: milliseconds
#>                                                  expr      min         lq
#>  RcppAlgos::permuteSample(v = 8000, m = 8000, n = 10) 199.5939 199.858108
#>                                    perm_n(1:8000, 10)   4.3911   4.428697
#>  arrangements::permutations(8000, 8000, nsample = 10) 593.0577 594.338993
#>       mean     median         uq        max neval
#>  200.44507 200.282540 200.944936 201.960178    10
#>    4.49205   4.457807   4.555182   4.663914    10
#>  600.67245 598.277678 604.146808 619.717132    10

microbenchmark::microbenchmark(
  RcppAlgos::permuteSample(v=8000, m=8000, n=300),
  perm_n(1:8000, 300),
  arrangements::permutations(8000, 8000, nsample = 300),
  times = 3
)
#> Unit: milliseconds
#>                                                   expr        min         lq
#>  RcppAlgos::permuteSample(v = 8000, m = 8000, n = 300)  5939.2831  5965.0268
#>                                    perm_n(1:8000, 300)   136.7174   137.3693
#>  arrangements::permutations(8000, 8000, nsample = 300) 17875.7788 17877.2049
#>       mean     median         uq        max neval
#>   5984.365  5990.7704  6006.9057  6023.0410     3
#>    138.048   138.0212   138.7133   139.4053     3
#>  17891.089 17878.6310 17898.7438 17918.8566     3

Created on 2022-11-23 with reprex v2.0.2

like image 40
polkas Avatar answered Oct 18 '25 23:10

polkas



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!