I have a data.frame with two columns representing a hierarchical tree: parents and their child nodes.
I want to transform it into a structure that I can use as input for the function d3Tree from the d3Network package.
Here's my data frame:
df <- data.frame(c("Canada","Canada","Quebec","Quebec","Ontario","Ontario"),c("Quebec","Ontario","Montreal","Quebec City","Toronto","Ottawa"))
names(df) <- c("parent","child")
And I want to transform it into this structure:
Canada_tree <- list(name = "Canada", children = list(
  list(name = "Quebec",
       children = list(list(name = "Montreal"), list(name = "Quebec City"))),
  list(name = "Ontario",
       children = list(list(name = "Toronto"), list(name = "Ottawa")))))
I have successfully transformed this particular case using the code below:
# is.leaf() is a helper that is not shown here
fill_list <- function(df, node) {
  node <- as.character(node)
  if (is.leaf(df, node) == TRUE) {
    return(list(name = node))
  } else {
    new_node <- df[df[, 1] == node, 2]
    return(list(name = node, children = list(fill_list(df, new_node[1]), fill_list(df, new_node[2]))))
  }
}
The problem is that it only works with trees in which every parent node has exactly two children. You can see I hard-coded the two children (new_node[1] and new_node[2]) as inputs for my recursive function.
I'm trying to figure out a way to call the recursive function once for each child of the parent node. Example:
fill_list(df,new_node[1]),...,fill_list(df,new_node[length(new_node)])
I tried these three possibilities, but none of them worked:
First: creating a string containing the whole call, with function name and arguments, and then evaluating it. It returns this error: could not find function "fill_functional(df,new_node[1])". I assume that's because my function didn't exist yet at the point where I called it.
fill_functional <- function(df,node) {
  node <- as.character(node)
  if (is.leaf(df,node)==TRUE){
    return (list(name = node))
  }
  else {
    new_node = df[df[,1] == node,2]
    level <- length(new_node)
    xxx <- paste0("(df,new_node[",seq(level),"])")
    lapply(xxx,function(x) eval(call(paste("fill_functional",x,sep=""))))
  }
}
Second: Using a for loop. But I only got the children of my root node.
L <- list()
fill_list <- function(df,node) {
  node <- as.character(node)
  if (is.leaf(df,node)==TRUE){
    return (list(name = node))
  }
  else {
    new_node = df[df[,1] == node,2]
    for (i in 1:length(new_node)){
      L[i] <- (fill_list(df,new_node[i]))
    }
    return (list(name = node, children = L))
  }
}
Third: Creating a function that populates a list with elements that are functions, and just changing the arguments. But I wasn't able to accomplish anything interesting, and I'm afraid I'll have the same problem as I did on my first try described above.
Here is a recursive definition:
maketreelist <- function(df, root = df[1, 1]) {
  if(is.factor(root)) root <- as.character(root)
  r <- list(name = root)
  children = df[df[, 1] == root, 2]
  if(is.factor(children)) children <- as.character(children)
  if(length(children) > 0) {
    r$children <- lapply(children, maketreelist, df = df)
  }
  r
}
canadalist <- maketreelist(df)
That produces what you want. The function assumes that the first column of the data.frame (or matrix) you pass in contains the parent and the second column contains the child. It also takes a root parameter, which lets you specify a starting point; it defaults to the first parent in the data.
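For comparison, the for-loop attempt from the question can be repaired along the same lines. The likely culprit is the single-bracket assignment: L[i] <- x, where x is a list with several elements, stores only the first element of x (with a warning), so each subtree collapses to its name. The sketch below uses double brackets and a list that is local to each call. The names fill_list_loop and df2, and the extra city in the sample data, are made up for illustration.

```r
# Hypothetical repair of the loop-based attempt; fill_list_loop is an
# illustrative name, not something from the question or the answer.
fill_list_loop <- function(df, node) {
  node <- as.character(node)
  kids <- as.character(df[df[, 1] == node, 2])
  if (length(kids) == 0) {
    return(list(name = node))                 # leaf: node never appears as a parent
  }
  children <- vector("list", length(kids))    # local list, one slot per child
  for (i in seq_along(kids)) {
    children[[i]] <- fill_list_loop(df, kids[i])  # [[ ]] keeps the whole subtree
  }
  list(name = node, children = children)
}

# Made-up data: parents with one, two and three children.
df2 <- data.frame(
  parent = c("Canada", "Canada", "Quebec", "Ontario", "Ontario", "Ontario"),
  child  = c("Quebec", "Ontario", "Montreal", "Toronto", "Ottawa", "Hamilton"),
  stringsAsFactors = FALSE
)

tree <- fill_list_loop(df2, "Canada")
str(tree, max.level = 2)
```

With the same sample data, maketreelist(df2, root = "Ontario") should return only the Ontario subtree, which is a quick way to try out the root parameter.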
But if you are really interested in playing around with trees, the igraph package might be of interest:
library(igraph)
g <- graph_from_data_frame(df)  # current name for graph.data.frame() in igraph >= 1.0
plot(g)
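Since the data is a rooted tree, a top-down tree layout may be easier to read than the default one. This is only a sketch, assuming igraph 1.0 or later is installed; the plot settings are arbitrary choices, and the data frame from the question is repeated so the snippet runs on its own.

```r
library(igraph)

# Same parent/child pairs as in the question.
df <- data.frame(
  parent = c("Canada", "Canada", "Quebec", "Quebec", "Ontario", "Ontario"),
  child  = c("Quebec", "Ontario", "Montreal", "Quebec City", "Toronto", "Ottawa"),
  stringsAsFactors = FALSE
)

g <- graph_from_data_frame(df, directed = TRUE)  # edges run parent -> child
root <- V(g)[degree(g, mode = "in") == 0]        # the vertex that has no parent
lay <- layout_as_tree(g, root = root)            # coordinates with the root on top
plot(g, layout = lay, vertex.size = 30, edge.arrow.size = 0.5)
```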