Problem:
Is there a simple way to get all combinations of two (or more) identical vectors, but only show the unique combinations?
Reproducible example:
library(tidyr)
x = 1:3
expand_grid(a = x,
b = x,
c = x)
# A tibble: 27 x 3
a b c
<int> <int> <int>
1 1 1 1
2 1 1 2
3 1 1 3
4 1 2 1
5 1 2 2
6 1 2 3
7 1 3 1
8 1 3 2
9 1 3 3
10 2 1 1
# ... with 17 more rows
But if row 1 2 1 exists, then I do not want to see 1 1 2 or 2 1 1, i.e. show only unique combinations of the three vectors (in any order).
To find the unique pairwise combinations of the values in an R data frame column, we can use the combn function along with the unique function.
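As a minimal sketch of that idea, assuming a hypothetical data frame `df` with a column `id` (both names are made up for illustration), base R's combn() applied to the unique column values lists each unordered pair once:

```r
# Hypothetical data frame; `df` and `id` are illustrative names only
df <- data.frame(id = c(1, 2, 2, 3))

# unique() drops repeated values; combn(., 2) returns each unordered pair
# as one column, so t() turns the pairs into rows
pairs <- t(combn(unique(df$id), 2))
pairs
```

Note that combn() never repeats an element within a combination, so on its own it would not produce rows such as 1 1 2 from the question.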
The expand.grid() function in R creates a data frame containing all possible combinations of the vectors or factors passed to it as arguments.
What is the expand.grid() function? It is part of base R, so it is available as soon as R is installed and needs no additional package. Its documentation describes it as a function to “Create a Data Frame from All Combinations of Factor Variables”.
The expand.grid function returns the Cartesian product, which is fundamentally different from a set of combinations. The Cartesian product operates on multiple objects, which may or may not be the same. Generally speaking, combination functions are applied to a single vector.
‘expand.grid’ from the base package is a useful function in its own right, perhaps best known for generating hyperparameter tuning grids for machine learning models. ‘expand.grid’ returns a data frame with one combination per row, whereas ‘combn’ returns a matrix with one combination per column.
There is also a more recent adaptation of expand.grid() in the tidyr package, tidyr::expand_grid(), which takes care of some annoying side effects and also allows expanding data frames.
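To make the difference between a Cartesian product and a set of combinations concrete, here is a small base R sketch that only counts results for the vector from the question (the variable name `grid` is illustrative):

```r
x <- 1:3

# Cartesian product: every ordered triple, so 3^3 = 27 rows
grid <- expand.grid(a = x, b = x, c = x)
nrow(grid)

# Combinations without repetition: unordered pairs, choose(3, 2) = 3 columns
ncol(combn(x, 2))

# Unordered triples with repetition allowed: choose(3 + 3 - 1, 3) = 10
choose(length(x) + 3 - 1, 3)
```

The last count, 10, is the number of rows the question is asking for.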
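library(gtools)
x = 1:3
# n = number of values to choose from, r = size of each combination;
# repeats.allowed = TRUE permits rows such as 1 1 2
df <- as.data.frame(combinations(n = 3, r = 3, v = x, repeats.allowed = TRUE))
df
Output: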
V1 V2 V3
1 1 1 1
2 1 1 2
3 1 1 3
4 1 2 2
5 1 2 3
6 1 3 3
7 2 2 2
8 2 2 3
9 2 3 3
10 3 3 3
You can just sort row-wise and remove duplicates. Continuing from your expand_grid():
df <- tidyr::expand_grid(a = x,
b = x,
c = x)
data.frame(unique(t(apply(df, 1, sort))))
X1 X2 X3
1 1 1 1
2 1 1 2
3 1 1 3
4 1 2 2
5 1 2 3
6 1 3 3
7 2 2 2
8 2 2 3
9 2 3 3
10 3 3 3
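A related sketch, not taken from the answer above: instead of sorting every row, one could keep only the rows of the full grid that are already in non-decreasing order, since each unordered combination has exactly one such arrangement. This assumes three columns named a, b and c as in the question, and uses base expand.grid() so that no package is needed:

```r
x <- 1:3
grid <- expand.grid(a = x, b = x, c = x)

# Keep rows with a <= b <= c; every unordered triple appears exactly once
uniq <- grid[grid$a <= grid$b & grid$b <= grid$c, ]
nrow(uniq)
```

The filter has to be written out per column, so it is less general than the sort-based version for a varying number of vectors.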
Using comboGeneral from the RcppAlgos package. It's implemented in C++ and pretty fast.
x <- 1:3
RcppAlgos::comboGeneral(x, repetition=TRUE)
# [,1] [,2] [,3]
# [1,] 1 1 1
# [2,] 1 1 2
# [3,] 1 1 3
# [4,] 1 2 2
# [5,] 1 2 3
# [6,] 1 3 3
# [7,] 2 2 2
# [8,] 2 2 3
# [9,] 2 3 3
# [10,] 3 3 3
Note: If you're running Linux, you will need gmp installed, e.g. for Ubuntu do:
sudo apt install libgmp3-dev