Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to generate permutations or combinations of object in R?

How to generate sequences of r objects from n objects? I'm looking for a way to do either permutations or combinations, with/without replacement, with distinct and non-distinct items (aka multisets).

This is related to twelvefold way. The "distinct" solutions could be included in twelvefold way, while the "non-distinct" are not included.

like image 779
Randy Lai Avatar asked Mar 21 '14 20:03

Randy Lai


People also ask

How do you do permutations and combinations in r?

Combinat package in R programming language can be used to calculate permutations and combinations of the numbers. It provides routines and methods to perform combinatorics. combn() method in R language belonging to this package is used to generate all combinations of the elements of x taken m at a time.

How do you make all combinations of a vector in r?

To create combination of multiple vectors, we can use expand. grid function. For example, if we have six vectors say x, y, z, a, b, and c then the combination of vectors can be created by using the command expand. grid(x,y,z,a,b,c).

What is the formula for finding the number of combinations of r objects from a set of n objects?

The number of combinations of n objects taken r at a time is determined by the following formula: C(n,r)=n! (n−r)!


2 Answers

A Walk Through a Slice of Combinatorics in R*

Below, we examine packages equipped with the capabilities of generating combinations & permutations. If I have left out any package, please forgive me and please leave a comment or better yet, edit this post.

Outline of analysis:

  1. Introduction
  2. Combinations
  3. Permutations
  4. Multisets
  5. Summary
  6. Memory

Before we begin, we note that combinations/permutations with replacement of distinct vs. non-distint items chosen m at a time are equivalent. This is so, because when we have replacement, it is not specific. Thus, no matter how many times a particular element originally occurs, the output will have an instance(s) of that element repeated 1 to m times.

1. Introduction

  1. gtools v 3.8.1
  2. combinat v 0.0-8
  3. multicool v 0.1-10
  4. partitions v 1.9-19
  5. RcppAlgos v 2.0.1 (I am the author)
  6. arrangements v 1.1.0
  7. gRbase v 1.8-3

I did not include permute, permutations, or gRbase::aperm/ar_perm as they are not really meant to attack these types of problems.

|--------------------------------------- OVERVIEW ----------------------------------------|

|_______________| gtools | combinat | multicool | partitions |  |      comb rep |  Yes   |          |           |            |  |   comb NO rep |  Yes   |   Yes    |           |            |  |      perm rep |  Yes   |          |           |            |   |   perm NO rep |  Yes   |   Yes    |    Yes    |    Yes     | | perm multiset |        |          |    Yes    |            |   | comb multiset |        |          |           |            |   |accepts factors|        |   Yes    |           |            |   |   m at a time |  Yes   |  Yes/No  |           |            |   |general vector |  Yes   |   Yes    |    Yes    |            | |    iterable   |        |          |    Yes    |            | |parallelizable |        |          |           |            | |  big integer  |        |          |           |            |  |_______________| iterpc | arrangements | RcppAlgos | gRbase | |      comb rep |  Yes   |     Yes      |    Yes    |        | |   comb NO rep |  Yes   |     Yes      |    Yes    |  Yes   |    |      perm rep |  Yes   |     Yes      |    Yes    |        | |   perm NO rep |  Yes   |     Yes      |    Yes    |   *    | | perm multiset |  Yes   |     Yes      |    Yes    |        | | comb multiset |  Yes   |     Yes      |    Yes    |        | |accepts factors|        |     Yes      |    Yes    |        | |   m at a time |  Yes   |     Yes      |    Yes    |  Yes   | |general vector |  Yes   |     Yes      |    Yes    |  Yes   | |    iterable   |        |     Yes      | Partially |        | |parallelizable |        |     Yes      |    Yes    |        | |  big integer  |        |     Yes      |           |        | 

The tasks, m at a time and general vector, refer to the capability of generating results "m at a time" (when m is less than the length of the vector) and rearranging a "general vector" as opposed to 1:n. In practice, we are generally concerned with finding rearrangements of a general vector, therefore all examinations below will reflect this (when possible).

All benchmarks were ran on 3 different set-ups.

  1. Macbook Pro i7 16Gb
  2. Macbook Air i5 4Gb
  3. Lenovo Running Windows 7 i5 8Gb

The listed results were obtained from setup #1 (i.e. MBPro). The results for the other two systems were similar. Also, gc() is periodically called to ensure all memory is available (See ?gc).

2. Combinations

First, we examine combinations without replacement chosen m at a time.

  1. RcppAlgos
  2. combinat (or utils)
  3. gtools
  4. arrangements
  5. gRbase

How to:

library(RcppAlgos) library(arrangements) library(microbenchmark) options(digits = 4) set.seed(13) testVector1 <- sort(sample(100, 17)) m <- 9 t1 <- comboGeneral(testVector1, m)  ## returns matrix with m columns t3 <- combinat::combn(testVector1, m)  ## returns matrix with m rows t4 <- gtools::combinations(17, m, testVector1)  ## returns matrix with m columns identical(t(t3), t4) ## must transpose to compare #> [1] TRUE t5 <- combinations(testVector1, m) identical(t1, t5) #> [1] TRUE t6 <- gRbase::combnPrim(testVector1, m) identical(t(t6)[do.call(order, as.data.frame(t(t6))),], t1) #> [1] TRUE 

Benchmark:

microbenchmark(cbRcppAlgos = comboGeneral(testVector1, m),                cbGRbase = gRbase::combnPrim(testVector1, m),                cbGtools = gtools::combinations(17, m, testVector1),                cbCombinat = combinat::combn(testVector1, m),                cbArrangements = combinations(17, m, testVector1),                unit = "relative") #> Unit: relative #>            expr     min      lq    mean  median      uq    max neval #>     cbRcppAlgos   1.064   1.079   1.160   1.012   1.086  2.318   100 #>        cbGRbase   7.335   7.509   5.728   6.807   5.390  1.608   100 #>        cbGtools 426.536 408.807 240.101 310.848 187.034 63.663   100 #>      cbCombinat  97.756  97.586  60.406  75.415  46.391 41.089   100 #>  cbArrangements   1.000   1.000   1.000   1.000   1.000  1.000   100 

Now, we examine combinations with replacement chosen m at a time.

  1. RcppAlgos
  2. gtools
  3. arrangements

How to:

library(RcppAlgos) library(arrangements) library(microbenchmark) options(digits = 4) set.seed(97) testVector2 <- sort(rnorm(10)) m <- 8 t1 <- comboGeneral(testVector2, m, repetition = TRUE) t3 <- gtools::combinations(10, m, testVector2, repeats.allowed = TRUE) identical(t1, t3) #> [1] TRUE ## arrangements t4 <- combinations(testVector2, m, replace = TRUE) identical(t1, t4) #> [1] TRUE 

Benchmark:

microbenchmark(cbRcppAlgos = comboGeneral(testVector2, m, TRUE),                cbGtools = gtools::combinations(10, m, testVector2, repeats.allowed = TRUE),                cbArrangements = combinations(testVector2, m, replace = TRUE),                unit = "relative") #> Unit: relative #>            expr     min      lq   mean  median      uq     max neval #>     cbRcppAlgos   1.000   1.000  1.000   1.000   1.000 1.00000   100 #>        cbGtools 384.990 269.683 80.027 112.170 102.432 3.67517   100 #>  cbArrangements   1.057   1.116  0.618   1.052   1.002 0.03638   100 

3. Permutations

First, we examine permutations without replacement chosen m at a time.

  1. RcppAlgos
  2. gtools
  3. arrangements

How to:

library(RcppAlgos) library(arrangements) library(microbenchmark) options(digits = 4) set.seed(101) testVector3 <- as.integer(c(2, 3, 5, 7, 11, 13, 17, 19, 23, 29))  ## RcppAlgos... permuteGeneral same as comboGeneral above t1 <- permuteGeneral(testVector3, 6) ## gtools... permutations same as combinations above t3 <- gtools::permutations(10, 6, testVector3) identical(t1, t3) #> [1] TRUE ## arrangements t4 <- permutations(testVector3, 6) identical(t1, t4) #> [1] TRUE 

Benchmark:

microbenchmark(cbRcppAlgos = permuteGeneral(testVector3, 6),                cbGtools = gtools::permutations(10, 6, testVector3),                cbArrangements = permutations(testVector3, 6),                unit = "relative") #> Unit: relative #>            expr     min     lq   mean median     uq   max neval #>     cbRcppAlgos   1.079  1.027  1.106  1.037  1.003  5.37   100 #>        cbGtools 158.720 92.261 85.160 91.856 80.872 45.39   100 #>  cbArrangements   1.000  1.000  1.000  1.000  1.000  1.00   100 

Next, we examine permutations without replacement with a general vector (returning all permutations).

  1. RcppAlgos
  2. gtools
  3. combinat
  4. multicool
  5. arrangements

How to:

library(RcppAlgos) library(arrangements) library(microbenchmark) options(digits = 4) set.seed(89) testVector3 <- as.integer(c(2, 3, 5, 7, 11, 13, 17, 19, 23, 29)) testVector3Prime <- testVector3[1:7] ## For RcppAlgos, & gtools (see above)  ## combinat t4 <- combinat::permn(testVector3Prime) ## returns a list of vectors ## convert to a matrix t4 <- do.call(rbind, t4) ## multicool.. we must first call initMC t5 <- multicool::allPerm(multicool::initMC(testVector3Prime)) ## returns a matrix with n columns all.equal(t4[do.call(order,as.data.frame(t4)),],           t5[do.call(order,as.data.frame(t5)),]) #> [1] TRUE 

Benchmark:

microbenchmark(cbRcppAlgos = permuteGeneral(testVector3Prime, 7),                cbGtools = gtools::permutations(7, 7, testVector3Prime),                cbCombinat = combinat::permn(testVector3Prime),                cbMulticool = multicool::allPerm(multicool::initMC(testVector3Prime)),                cbArrangements = permutations(x = testVector3Prime, k = 7),                unit = "relative") #> Unit: relative #>            expr      min       lq     mean   median       uq     max neval #>     cbRcppAlgos    1.152    1.275   0.7508    1.348    1.342  0.3159   100 #>        cbGtools  965.465  817.645 340.4159  818.137  661.068 12.7042   100 #>      cbCombinat  280.207  236.853 104.4777  238.228  208.467  9.6550   100 #>     cbMulticool 2573.001 2109.246 851.3575 2039.531 1638.500 28.3597   100 #>  cbArrangements    1.000    1.000   1.0000    1.000    1.000  1.0000   100 

Now, we examine permutations without replacement for 1:n (returning all permutations).

  1. RcppAlgos
  2. gtools
  3. combinat
  4. multicool
  5. partitions
  6. arrangements

How to:

library(RcppAlgos) library(arrangements) library(microbenchmark) options(digits = 4) set.seed(89) t1 <- partitions::perms(7)  ## returns an object of type 'partition' with n rows identical(t(as.matrix(t1)), permutations(7,7)) #> [1] TRUE 

Benchmark:

microbenchmark(cbRcppAlgos = permuteGeneral(7, 7),                cbGtools = gtools::permutations(7, 7),                cbCombinat = combinat::permn(7),                cbMulticool = multicool::allPerm(multicool::initMC(1:7)),                cbPartitions = partitions::perms(7),                cbArrangements = permutations(7, 7),                unit = "relative") #> Unit: relative #>            expr      min       lq     mean   median       uq      max #>     cbRcppAlgos    1.235    1.429    1.412    1.503    1.484    1.720 #>        cbGtools 1152.826 1000.736  812.620  939.565  793.373  499.029 #>      cbCombinat  347.446  304.866  260.294  296.521  248.343  284.001 #>     cbMulticool 3001.517 2416.716 1903.903 2237.362 1811.006 1311.219 #>    cbPartitions    2.469    2.536    2.801    2.692    2.999    2.472 #>  cbArrangements    1.000    1.000    1.000    1.000    1.000    1.000 #>  neval #>    100 #>    100 #>    100 #>    100 #>    100 #>    100 

Lastly, we examine permutations with replacement.

  1. RcppAlgos
  2. iterpc
  3. gtools
  4. arrangements

How to:

library(RcppAlgos) library(arrangements) library(microbenchmark) options(digits = 4) set.seed(34) testVector3 <- as.integer(c(2, 3, 5, 7, 11, 13, 17, 19, 23, 29)) t1 <- permuteGeneral(testVector3, 5, repetition = TRUE) t3 <- gtools::permutations(10, 5, testVector3, repeats.allowed = TRUE) t4 <- permutations(x = testVector3, k = 5, replace = TRUE) 

This next benchmark is a little surprising given the results up until now.

microbenchmark(cbRcppAlgos = permuteGeneral(testVector3, 5, TRUE),                cbGtools = gtools::permutations(10, 5, testVector3, repeats.allowed = TRUE),                cbArrangements = permutations(x = testVector3, k = 5, replace = TRUE),                unit = "relative") #> Unit: relative #>            expr   min     lq  mean median    uq   max neval #>     cbRcppAlgos 1.106 0.9183 1.200  1.030 1.063 1.701   100 #>        cbGtools 2.426 2.1815 2.068  1.996 2.127 1.367   100 #>  cbArrangements 1.000 1.0000 1.000  1.000 1.000 1.000   100 

That is not a typo... gtools::permutations is almost as fast as the other compiled functions. I encourage the reader to go check out the source code of gtools::permutations as it is one of the most elegant displays of programming out there (R or otherwise).

4. Multisets

First, we examine combinations of multisets.

  1. RcppAlgos
  2. arrangements

To find combinations/permutations of multisets, with RcppAlgos use the freqs arguments to specify how many times each element of the source vector, v, is repeated.

library(RcppAlgos) library(arrangements) library(microbenchmark) options(digits = 4) set.seed(496) myFreqs <- sample(1:5, 10, replace = TRUE) ## This is how many times each element will be repeated myFreqs #>  [1] 2 4 4 5 3 2 2 2 3 4 testVector4 <- as.integer(c(1, 2, 3, 5, 8, 13, 21, 34, 55, 89)) t1 <- comboGeneral(testVector4, 12, freqs = myFreqs) t3 <- combinations(freq = myFreqs, k = 12, x = testVector4) identical(t1, t3) #> [1] TRUE 

Benchmark:

microbenchmark(cbRcppAlgos = comboGeneral(testVector4, 12, freqs = myFreqs),                cbArrangements = combinations(freq = myFreqs, k = 12, x = testVector4),                unit = "relative") #> Unit: relative #>            expr   min    lq  mean median    uq   max neval #>     cbRcppAlgos 1.000 1.000 1.000  1.000 1.000 1.000   100 #>  cbArrangements 1.254 1.221 1.287  1.259 1.413 1.173   100 

For permutations of multisets chosen m at a time, we have:

  1. RcppAlgos
  2. arrangements

How to:

library(RcppAlgos) library(arrangements) library(microbenchmark) options(digits = 4) set.seed(8128) myFreqs <- sample(1:3, 5, replace = TRUE) testVector5 <- sort(runif(5)) myFreqs #> [1] 2 2 2 1 3 t1 <- permuteGeneral(testVector5, 7, freqs = myFreqs) t3 <- permutations(freq = myFreqs, k = 7, x = testVector5) identical(t1, t3) #> [1] TRUE 

Benchmark:

microbenchmark(cbRcppAlgos = permuteGeneral(testVector5, 7, freqs = myFreqs),                cbArrangements = permutations(freq = myFreqs, k = 7, x = testVector5),                unit = "relative") #> Unit: relative #>            expr   min    lq  mean median    uq   max neval #>     cbRcppAlgos 1.461 1.327 1.282  1.177 1.176 1.101   100 #>  cbArrangements 1.000 1.000 1.000  1.000 1.000 1.000   100 

For permutations of multisets returning all permutations, we have:

  1. RcppAlgos
  2. multicool
  3. arrangements

How to:

library(RcppAlgos) library(arrangements) library(microbenchmark) options(digits = 4) set.seed(8128) myFreqs2 <- c(2,1,2,1,2) testVector6 <- (1:5)^3 ## For multicool, you must have the elements explicitly repeated testVector6Prime <- rep(testVector6, times = myFreqs2) t3 <- multicool::allPerm(multicool::initMC(testVector6Prime))  ## for comparison t1 <- permuteGeneral(testVector6, freqs = myFreqs2) identical(t1[do.call(order,as.data.frame(t1)),],           t3[do.call(order,as.data.frame(t3)),]) #> [1] TRUE 

Benchmark:

microbenchmark(cbRcppAlgos = permuteGeneral(testVector6, freqs = myFreqs2),                cbMulticool = multicool::allPerm(multicool::initMC(testVector6Prime)),                cbArrangements = permutations(freq = myFreqs2, x = testVector6),                unit = "relative") #> Unit: relative #>            expr      min       lq    mean   median      uq     max neval #>     cbRcppAlgos    1.276    1.374   1.119    1.461    1.39  0.8856   100 #>     cbMulticool 2434.652 2135.862 855.946 2026.256 1521.74 31.0651   100 #>  cbArrangements    1.000    1.000   1.000    1.000    1.00  1.0000   100 

5. Summary

Both gtools and combinat are well established packages for rearranging elements of a vector. With gtools there are a few more options (see the overview above) and with combinat, you can rearrange factors. With multicool, one is able to rearrange multisets. Although partitions and gRbase are limited for the purposes of this question, they are powerhouses packed with highly efficient functions for dealing with partitions and array objects respectively.

arrangements

  1. The output is in dictionary order.
  2. Allows the user to specify the format via the layout argument (r = row-major, c = column-major, and l = list).
  3. Offers convenient methods such as collect & getnext when working with iterators.
  4. Allows for the generation of more than 2^31 - 1 combinations/permutations via getnext. N.B. RcppAlgos (via lower/upper see below) and multicool (via nextPerm) are also capable of doing this.
  5. Speaking of getnext, this function, allows for a specific number of results by utilizing the d argument.
  6. Supports gmp's big integers to compute number of combinations/permutations.

Observe:

library(arrangements) icomb <- icombinations(1000, 7) icomb$getnext(d = 5) #>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] #> [1,]    1    2    3    4    5    6    7 #> [2,]    1    2    3    4    5    6    8 #> [3,]    1    2    3    4    5    6    9 #> [4,]    1    2    3    4    5    6   10 #> [5,]    1    2    3    4    5    6   11 

This feature is really nice when you only want a few combinations/permutations. With traditional methods, you would have to generate all combinations/permutations and then subset. This would render the previous example impossible as there are more than 10^17 results (i.e. ncombinations(1000, 7, bigz = TRUE) = 194280608456793000).

This feature along with the improvements to the generators in arrangements, allow it to be very efficient with respect to memory.

RcppAlgos

  1. The output is in dictionary order.
  2. There are convenient constraint features that we will not discuss here as they are off-topic for this question. I will only note that the types of problems that can be solved by utilizing these features were the motivation for creating this package.
  3. There is an argument upper (formally rowCap) that is analogous to the d argument of getnext.

Observe:

library(RcppAlgos) comboGeneral(1000, 7, upper = 5) #>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] #> [1,]    1    2    3    4    5    6    7 #> [2,]    1    2    3    4    5    6    8 #> [3,]    1    2    3    4    5    6    9 #> [4,]    1    2    3    4    5    6   10 #> [5,]    1    2    3    4    5    6   11 
  1. Additionally, as of 2.0.0, there is an argument called lower that allows one to start generation at a specific combination/permutation. This sets up nicely for parallelization and allows for fast generation beyond 2^31 - 1 as chunks are generated independently.

Parallel example with more than 6 billion combinations:

system.time(parallel::mclapply(seq(1,6397478649,4390857), function(x) {         a <- comboGeneral(25, 15, freqs = c(rep(1:5, 5)), lower = x, upper = x + 4390856)         ## do something         x     }, mc.cores = 7)) #>     user  system elapsed  #>  510.623 140.970 109.496 

In case you were wondering how each package scales, I will leave you with this final example that measures how fast each package can generate over 100 million results (N.B. gtools::combinations is left out here as it will throw the error: evaluation nested too deeply...). Also, we are explicitly calling combn from the utils package because I was unable to get a successful run from combinat::combn. The differences in memory usage between these two is quite bizarre given that they are only marginally different (see ?utils::combn under the "Authors" section).

Observe:

library(RcppAlgos) library(arrangements) library(microbenchmark) options(digits = 4) set.seed(2187) testVector7 <- sort(sample(10^7, 10^3)) system.time(utils::combn(testVector7, 3)) #>    user  system elapsed  #> 179.956   5.687 187.159 system.time(RcppAlgos::comboGeneral(testVector7, 3)) #>    user  system elapsed  #>   1.136   0.758   1.937 system.time(arrangements::combinations(x = testVector7, k = 3)) #>    user  system elapsed  #>   1.963   0.930   2.910 system.time(RcppAlgos::permuteGeneral(testVector7[1:500], 3)) #>    user  system elapsed  #>   1.095   0.631   1.738 system.time(arrangements::permutations(x = testVector7[1:500], k = 3)) #>    user  system elapsed  #>   1.399   0.584   1.993 

6. Memory

When executing comboGeneral as well as arrangements::combinations, the memory will jump almost 2 Gbs before calling gc. This seems about right as #rows * #nols * bytesPerCell / 2^30 bytes = choose(1000,3) * 3 * 4 / 2^30 bytes = (166167000 * 3 * 4)/2^30 = 1.857 Gbs). However, when executing combn, the memory behavior was eratic (e.g. sometimes it would use all 16 Gb of memory and other times it would only spike a couple of Gbs). When I tested this on the Windows set-up, it would often crash.

We can confirm this using Rprof along with summaryRporf. Observe:

Rprof("RcppAlgos.out", memory.profiling = TRUE) t1 <- RcppAlgos::comboGeneral(testVector7, 3) Rprof(NULL) summaryRprof("RcppAlgos.out", memory = "both")$by.total                           total.time total.pct mem.total self.time self.pct "CombinatoricsRcpp"              1.2       100    1901.6       1.2      100 "RcppAlgos::comboGeneral"        1.2       100    1901.6       0.0        0  Rprof("arrangements.out", memory.profiling = TRUE) t3 <- arrangements::combinations(10^3, 3, testVector7) Rprof(NULL) summaryRprof("arrangements.out", memory = "both")$by.total                              total.time total.pct mem.total self.time self.pct ".Call"                            2.08     99.05    1901.6      2.08    99.05 

With RcppAlgos & arrangements, mem.total registers just over 1900 Mb.

And here is the memory profile on a smaller vector comparing gtools, utils, and combinat.

testVector7Prime <- testVector7[1:300]  Rprof("combinat.out", memory.profiling = TRUE) t3 <- combinat::combn(testVector7Prime, 3) Rprof(NULL) summaryRprof("combinat.out", memory = "both")$by.total                   total.time total.pct mem.total self.time self.pct "combinat::combn"       3.98    100.00    1226.9      3.72    93.47  Rprof("utils.out", memory.profiling = TRUE) t4 <- utils::combn(testVector7Prime, 3) Rprof(NULL) summaryRprof("utils.out", memory = "both")$by.total                total.time total.pct mem.total self.time self.pct "utils::combn"       2.52    100.00    1952.7      2.50    99.21  Rprof("gtools.out", memory.profiling = TRUE) t5 <- gtools::combinations(300, 3, testVector7Prime) Rprof(NULL) summaryRprof("gtools.out", memory = "both")$by.total                       total.time total.pct mem.total self.time self.pct "rbind"                     4.94     95.00    6741.6      4.40    84.62 

Interestingly, utils::combn and combinat::combn use different amounts of memory and take different amounts of time to execute. This does not hold up with smaller vectors:

microbenchmark(combinat::combn(2:13, 6), utils::combn(2:13, 6)) Unit: microseconds                     expr     min      lq     mean  median       uq      max neval combinat::combn(2:13, 6) 527.378 567.946 629.1268 577.163 604.3270 1816.744   100    utils::combn(2:13, 6) 663.150 712.872 750.8008 725.716 771.1345 1205.697   100 

And with gtools the total memory used is a little over 3x as much as utils. It should be noted that for these 3 packages, I obtained different results every-time I ran them (e.g. for combinat::combn sometimes I would get 9000 Mb and then I would get 13000 Mb).

Still, none can match RcppAlgos OR arrangements. Both only use 51 Mb when ran on the example above.

benchmark script: https://gist.github.com/randy3k/bd5730a6d70101c7471f4ae6f453862e (rendered by https://github.com/tidyverse/reprex)

*: An homage to A Walk through Combinatorics by Miklós Bóna

like image 163
13 revs, 2 users 82% Avatar answered Sep 29 '22 12:09

13 revs, 2 users 82%


EDIT: I have updated the answer to use a more efficient package arrangements

Getting start of using arrangement

arrangements contains some efficient generators and iterators for permutations and combinations. It has been demonstrated that arrangements outperforms most of the existing packages of similar kind. Some benchmarks could be found here.

Here are the answers to the above questions

# 1) combinations: without replacement: distinct items  combinations(5, 2)        [,1] [,2]  [1,]    1    2  [2,]    1    3  [3,]    1    4  [4,]    1    5  [5,]    2    3  [6,]    2    4  [7,]    2    5  [8,]    3    4  [9,]    3    5 [10,]    4    5   # 2) combinations: with replacement: distinct items  combinations(5, 2, replace=TRUE)        [,1] [,2]  [1,]    1    1  [2,]    1    2  [3,]    1    3  [4,]    1    4  [5,]    1    5  [6,]    2    2  [7,]    2    3  [8,]    2    4  [9,]    2    5 [10,]    3    3 [11,]    3    4 [12,]    3    5 [13,]    4    4 [14,]    4    5 [15,]    5    5    # 3) combinations: without replacement: non distinct items  combinations(x = c("a", "b", "c"), freq = c(2, 1, 1), k = 2)       [,1] [,2] [1,] "a"  "a"  [2,] "a"  "b"  [3,] "a"  "c"  [4,] "b"  "c"     # 4) combinations: with replacement: non distinct items  combinations(x = c("a", "b", "c"), k = 2, replace = TRUE)  # as `freq` does not matter       [,1] [,2] [1,] "a"  "a"  [2,] "a"  "b"  [3,] "a"  "c"  [4,] "b"  "b"  [5,] "b"  "c"  [6,] "c"  "c"   # 5) permutations: without replacement: distinct items  permutations(5, 2)        [,1] [,2]  [1,]    1    2  [2,]    1    3  [3,]    1    4  [4,]    1    5  [5,]    2    1  [6,]    2    3  [7,]    2    4  [8,]    2    5  [9,]    3    1 [10,]    3    2 [11,]    3    4 [12,]    3    5 [13,]    4    1 [14,]    4    2 [15,]    4    3 [16,]    4    5 [17,]    5    1 [18,]    5    2 [19,]    5    3 [20,]    5    4    # 6) permutations: with replacement: distinct items  permutations(5, 2, replace = TRUE)        [,1] [,2]  [1,]    1    1  [2,]    1    2  [3,]    1    3  [4,]    1    4  [5,]    1    5  [6,]    2    1  [7,]    2    2  [8,]    2    3  [9,]    2    4 [10,]    2    5 [11,]    3    1 [12,]    3    2 [13,]    3    3 [14,]    3    4 [15,]    3    5 [16,]    4    1 [17,]    4    2 [18,]    4    3 [19,]    4    4 [20,]    4    5 [21,]    5    1 [22,]    5    2 [23,]    5    3 [24,]    5    4 [25,]    5    5   # 7) permutations: without replacement: non distinct items  permutations(x = c("a", "b", "c"), freq = c(2, 1, 1), k = 2)       [,1] [,2] [1,] "a"  "a"  [2,] "a"  "b"  [3,] "a"  "c"  [4,] "b"  "a"  [5,] "b"  "c"  [6,] "c"  "a"  [7,] "c"  "b"     # 8) permutations: with replacement: non distinct items  permutations(x = c("a", "b", "c"), k = 2, replace = TRUE)  # as `freq` doesn't matter        [,1] [,2]  [1,] "a"  "a"   [2,] "a"  "b"   [3,] "a"  "c"   [4,] "b"  "a"   [5,] "b"  "b"   [6,] "b"  "c"   [7,] "c"  "a"   [8,] "c"  "b"   [9,] "c"  "c"  

Compare to other packages

There are few advantages of using arrangements over the existing packages.

  1. Integral framework: you don't have to use different packages for different methods.

  2. It is very efficient. See https://randy3k.github.io/arrangements/articles/benchmark.html for some benchmarks.

  3. It is memory efficient, it is able to generate all 13! permutation of 1 to 13, existing packages will fail to do so because of the limitation of matrix size. The getnext() method of the iterators allow users to get the arrangements one by one.

  4. The generated arrangements are in dictionary order which may be desired for some users.

like image 23
17 revs Avatar answered Sep 29 '22 13:09

17 revs