I'm trying to get all combinations of the values in one column with themselves, while keeping the corresponding values from a second column.
library(dplyr)
library(tidyr)
dt0 <-
  data.frame(
    row = letters[1:10],
    n1 = c(2, 2, 1, 3, 1, 5, 1, 3, 2, 2)
  )
dt0 |>
  expand(
    row1 = row,
    row2 = row
  ) |>
  filter(row1 < row2) |>
  left_join(
    dt0 |>
      rename(n1.x = n1),
    by = join_by(row1 == row)
  ) |>
  left_join(
    dt0 |>
      rename(n1.y = n1),
    by = join_by(row2 == row)
  )
The expected result is:
# A tibble: 45 × 4
row1 row2 n1.x n1.y
<chr> <chr> <dbl> <dbl>
1 a b 2 2
2 a c 2 1
3 a d 2 3
4 a e 2 1
5 a f 2 5
6 a g 2 1
7 a h 2 3
8 a i 2 2
9 a j 2 2
10 b c 2 1
# ℹ 35 more rows
# ℹ Use `print(n = ...)` to see more rows
But I don't know how to generalize this to generate all combinations of the rows of the data.frame taken m at a time, so my question is: how can I generalize this pattern to any number of row arguments in expand(...)? For example, with three:
dt0 |>
  expand(
    row1 = row,
    row2 = row,
    row3 = row
  ) |>
  filter(row1 < row2) |>
  filter(row2 < row3) |>
  left_join(
    dt0 |>
      rename(n1.x = n1),
    by = join_by(row1 == row)
  ) |>
  left_join(
    dt0 |>
      rename(n1.y = n1),
    by = join_by(row2 == row)
  ) |>
  left_join(
    dt0 |>
      rename(n1.z = n1),
    by = join_by(row3 == row)
  )
# A tibble: 120 × 6
row1 row2 row3 n1.x n1.y n1.z
<chr> <chr> <chr> <dbl> <dbl> <dbl>
1 a b c 2 2 1
2 a b d 2 2 3
3 a b e 2 2 1
4 a b f 2 2 5
5 a b g 2 2 1
6 a b h 2 2 3
7 a b i 2 2 2
8 a b j 2 2 2
9 a c d 2 1 3
10 a c e 2 1 1
# ℹ 110 more rows
# ℹ Use `print(n = ...)` to see more rows
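As a sanity check on the two sizes above, and assuming the values in row are unique: the filters row1 < row2 (and row2 < row3) keep exactly one ordering of each set of m distinct values, so the result should have choose(n, m) rows. A minimal sketch in base R:

```r
# Expected number of rows when choosing m of n distinct values
n <- 10                       # number of rows in dt0
pairs   <- choose(n, 2)       # 45, matches the first tibble
triples <- choose(n, 3)       # 120, matches the second tibble

# combn() returns one combination per column, so ncol() gives the same counts
n_pairs_combn   <- ncol(combn(letters[1:n], 2))
n_triples_combn <- ncol(combn(letters[1:n], 3))
```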
I guess combn rather than expand fits your purpose better:
f <- function(dt0, k) {
  with(
    dt0,
    cbind(
      setNames(data.frame(t(combn(row, k))), paste0("row", seq(k))),
      setNames(data.frame(t(combn(n1, k))), paste0("n1.", seq(k)))
    )
  )
}
or, applied to every column of dt0:
f <- function(dt0, k) {
  do.call(
    cbind,
    lapply(
      seq_along(dt0),
      \(i) setNames(
        data.frame(t(combn(dt0[[i]], k))),
        paste0(names(dt0[i]), ".", seq(k))
      )
    )
  )
}
such that (output shown for the second definition; the first names the row columns row1, row2, ... instead)
> head(f(dt0, 2), 10)
row.1 row.2 n1.1 n1.2
1 a b 2 2
2 a c 2 1
3 a d 2 3
4 a e 2 1
5 a f 2 5
6 a g 2 1
7 a h 2 3
8 a i 2 2
9 a j 2 2
10 b c 2 1
> head(f(dt0, 3), 10)
row.1 row.2 row.3 n1.1 n1.2 n1.3
1 a b c 2 2 1
2 a b d 2 2 3
3 a b e 2 2 1
4 a b f 2 2 5
5 a b g 2 2 1
6 a b h 2 2 3
7 a b i 2 2 2
8 a b j 2 2 2
9 a c d 2 1 3
10 a c e 2 1 1
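A possible variation on the same idea, as a minimal sketch rather than a definitive implementation: take the combinations of row indices once, then look up every column at those indices. This calls combn only once and keeps all columns aligned by construction. The helper name comb_rows and its column naming scheme are assumptions made for illustration; they are not part of the answer above.

```r
dt0 <- data.frame(
  row = letters[1:10],
  n1 = c(2, 2, 1, 3, 1, 5, 1, 3, 2, 2)
)

# comb_rows is a hypothetical helper name used only for this sketch
comb_rows <- function(df, k) {
  # k x choose(n, k) matrix of row indices, one combination per column
  idx <- combn(nrow(df), k)
  cols <- lapply(names(df), function(nm) {
    # look up this column at every index, then restore the matrix shape
    vals <- matrix(df[[nm]][idx], nrow = k)
    # one row per combination, columns named <column>.1, <column>.2, ...
    setNames(as.data.frame(t(vals)), paste0(nm, ".", seq_len(k)))
  })
  do.call(cbind, cols)
}

res <- comb_rows(dt0, 3)
dim(res)  # 120 6
```

Because the combinations are built from row positions rather than from the values themselves, this sketch does not depend on the values in row being unique or sortable.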