I have the following working code:
library(dplyr)
library(tidyr)
test_hierarchie <- tribble(
  ~child, ~parent,
  "A", "B",
  "B", "C",
  "D", "E"
)
test_hierarchie_transformed <- test_hierarchie %>%
  left_join(test_hierarchie, by = c("parent" = "child"), suffix = c("", "_grant")) %>%
  left_join(test_hierarchie, by = c("parent_grant" = "child"), suffix = c("", "_grant")) %>%
  left_join(test_hierarchie, by = c("parent_grant_grant" = "child"), suffix = c("", "_grant")) %>%
  left_join(test_hierarchie, by = c("parent_grant_grant_grant" = "child"), suffix = c("", "_grant")) %>%
  left_join(test_hierarchie, by = c("parent_grant_grant_grant_grant" = "child"), suffix = c("", "_grant")) %>%
  pivot_longer(names_to = "relation", cols = contains("parent"), values_to = "parent") %>%
  filter(!is.na(parent))
With result:
# A tibble: 4 x 3
child relation parent
<chr> <chr> <chr>
1 A parent B
2 A parent_grant C
3 B parent C
4 D parent E
This is the desired result. The large number of left_joins is there because I don't know the maximum depth of the hierarchy in the real data.
My question is: is there a more succinct and dynamic way to do this? Thanks!
EDIT 1: Yes, I do mean 'grand' instead of 'grant', haha.
EDIT 2: Great solution, exactly what I was looking for! Thanks everyone for pitching in. The other day I was thinking about another project, and igraph does seem very helpful for that.
Following the suggestion by @zx8754, one option to achieve your desired result is to do the left_joins via a recursive function that stops when a join adds no more matches:
library(dplyr)
library(tidyr)
test_hierarchie <- tribble(
~child, ~parent,
"A", "B",
"B", "C",
"D", "E"
)
left_join_recursive <- function(x, by) {
  # Join the hierarchy onto itself once more, matching the current parent column to child
  x <- left_join(x, test_hierarchie, by = setNames("child", by), suffix = c("", "_grant"))
  # Name of the parent column that this join just added
  byby <- paste0(by, "_grant")
  if (!all(is.na(x[[byby]]))) {
    # At least one match was found, so go one level further up
    left_join_recursive(x, byby)
  } else {
    x
  }
}
test_hierarchie_transformed <- left_join_recursive(test_hierarchie, "parent") %>%
  pivot_longer(names_to = "relation", cols = contains("parent"), values_to = "parent") %>%
  filter(!is.na(parent))
test_hierarchie_transformed
#> # A tibble: 4 × 3
#> child relation parent
#> <chr> <chr> <chr>
#> 1 A parent B
#> 2 A parent_grant C
#> 3 B parent C
#> 4 D parent E
To check whether the approach works in a more general case, I added another row to your example data:
test_hierarchie <- add_row(test_hierarchie, child = "C", parent = "D")
test_hierarchie_transformed <- left_join_recursive(test_hierarchie, "parent") %>%
  pivot_longer(names_to = "relation", cols = contains("parent"), values_to = "parent") %>%
  filter(!is.na(parent))
test_hierarchie_transformed
#> # A tibble: 10 × 3
#> child relation parent
#> <chr> <chr> <chr>
#> 1 A parent B
#> 2 A parent_grant C
#> 3 A parent_grant_grant D
#> 4 A parent_grant_grant_grant E
#> 5 B parent C
#> 6 B parent_grant D
#> 7 B parent_grant_grant E
#> 8 D parent E
#> 9 C parent D
#> 10 C parent_grant E
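The same idea can also be written as a loop that builds the long format directly, so no wide columns and no pivot_longer are needed. This is only a minimal sketch under a few assumptions: the helper name ancestors_long, the temporary column names via and next_parent, and the max_depth guard are all made up for illustration, and the guard is there because a hierarchy containing a cycle would otherwise never stop producing matches.

```r
library(dplyr)

test_hierarchie <- tribble(
  ~child, ~parent,
  "A", "B",
  "B", "C",
  "D", "E"
)

# Hypothetical helper (not part of dplyr): climbs one level per iteration.
# max_depth is an assumed safety limit against cyclic data.
ancestors_long <- function(edges, max_depth = 100) {
  # Rename the lookup table up front so the join needs no suffix handling
  lookup <- rename(edges, via = child, next_parent = parent)
  current <- edges %>%
    mutate(relation = "parent") %>%
    select(child, relation, parent)
  result <- current
  depth <- 1
  while (nrow(current) > 0 && depth < max_depth) {
    # Keep only rows whose current parent has a parent of its own
    current <- current %>%
      inner_join(lookup, by = c("parent" = "via")) %>%
      mutate(relation = paste0(relation, "_grant"), parent = next_parent) %>%
      select(child, relation, parent)
    result <- bind_rows(result, current)
    depth <- depth + 1
  }
  arrange(result, child, relation)
}

result <- ancestors_long(test_hierarchie)
result
```

Because each iteration carries only the rows that still have a match, the loop ends on its own once the top of every branch is reached.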
As was mentioned, you can use the igraph package, but it probably only pays off for more complex cases:
library(tidyverse)
library(igraph)
test_hierarchie <- tribble(~child, ~parent,
"A", "B",
"B", "C",
"D", "E"
)
g <- graph_from_data_frame(test_hierarchie)
finals <- V(g)[degree(g, mode = "out") == 0]
starts <- V(g)[!V(g) %in% finals]
#starts <- V(g)[degree(g, mode = "in") == 0] # use this to avoid sub-paths
imap_dfr(starts,
         ~enframe(all_simple_paths(g, from = starts[[.y]], to = finals)[[1]],
                  name = "parent") %>%
           mutate(child = .y)) %>%
  filter(child != parent) %>%
  select(-value) %>%
  group_by(child) %>%
  mutate(nr = row_number() - 1) %>%
  ungroup() %>%
  mutate(relation = map_chr(nr, ~str_c("parent", str_c(rep("_grant", .x), collapse = "")))) %>%
  select(child, relation, parent)
# # A tibble: 4 x 3
# child relation parent
# <chr> <chr> <chr>
# 1 A parent B
# 2 A parent_grant C
# 3 B parent C
# 4 D parent E
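A possibly shorter igraph variant is to read the generation count straight from the matrix of path lengths. This is a sketch, not a drop-in replacement for the code above: it assumes every child has at most one parent (with several routes to the same ancestor, distances() reports only the shortest one), and the column name steps is made up for illustration.

```r
library(igraph)

test_hierarchie <- data.frame(
  child  = c("A", "B", "D"),
  parent = c("B", "C", "E")
)

# Edges point from child to parent
g <- graph_from_data_frame(test_hierarchie)

# Number of edges from each vertex (rows) up to each ancestor (columns);
# unreachable pairs are Inf, a vertex to itself is 0
d <- distances(g, mode = "out")

# Matrix to long format: one row per (child, ancestor) pair
long <- as.data.frame(as.table(d), stringsAsFactors = FALSE)
names(long) <- c("child", "parent", "steps")
long <- long[is.finite(long$steps) & long$steps > 0, ]

# One "_grant" per generation beyond the direct parent
long$relation <- paste0("parent", strrep("_grant", long$steps - 1))
long <- long[order(long$child, long$steps), c("child", "relation", "parent")]
rownames(long) <- NULL
long
```

This avoids enumerating paths, which may matter on larger hierarchies, at the cost of losing the path itself.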