Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Column to nested list separated by /

Tags:

list

r

I have a column where each row element is serpearted by "/":

data.frame(column=c("a","a/air","a/aero/breath","b","b/boy","b/bag/band/brand"))

How can I convert it into nested lists after each "/". So the aim is to get:

list(a=list("air"=1,aero=list("breath"=1)),b=list("boy"=1,bag=list(band=list("brand"=1)))) 

I need this for the shinyTree package to make a tree out of the column.

I have added the "=1" at the end of the last elements in the hierarchy as it is required to show up in the shinyTree output. The list can then be put in the code below to get the tree:

library(shiny)
library(shinyTree)

tree <- list(a=list("air"=1,aero=list("breath"=1)),b=list("boy"=1,bag=list(band=list("brand"=1)))) 


typeof(tree)

ui <- fluidPage(
  fluidPage(
    sidebarLayout(
      sidebarPanel(
        actionButton('reset', 'Reset nodes')
      ),
      mainPanel(
        shinyTree("tree", ),
        hr(),
        "Selected nodes:",
        verbatimTextOutput("idSelected")#,
      )
    )
  )
)

server <- function(input, output, session) {
  
  treeSelection <- reactiveVal(list())
  
  output$tree = renderTree({
    tree
  })
  
  observeEvent(input$reset, {
    updateTree(session, "tree", data = tree)
    treeSelection(list())
  })
  
  observeEvent(input$tree, {
    treeSelection(get_selected(input$tree, format = "classid"))
  })
  
  output$idSelected <- renderPrint({
    treeSelection()
  })
  
}

shinyApp(ui, server)
like image 491
Sahib Avatar asked Feb 17 '21 12:02

Sahib


2 Answers

Since the variables look like paths, I created the sample data as vector like

paths <- c(
  "a",
  "a/air",
  "a/aero/breath",
  "b",
  "b/boy",
  "b/bag/band/brand"
)

Then you can use the following function to get your nested list. I hope the choice of variablenames is explanatory enough.

pathsToNestedList <- function(x) {
  pathSplit <- strsplit(x,"/")
  pathStarts <- sapply(pathSplit,"[[",1)
  uniquePathStarts <- unique(pathStarts)
  
  pathEnds <- sapply(pathSplit, function(pathParts) {
    if(length(pathParts) <= 1) return("")
    paste0(pathParts[2:length(pathParts)],collapse="/")
  })
  
  splitLengths <- sapply(pathSplit,length)
  stillToParse <- unique(pathStarts[splitLengths > 1])
  
  endedIndices <- pathEnds == ""
  endedHere <- pathStarts[endedIndices]
  endedHere <- setdiff(endedHere,stillToParse)
  
  if(length(endedHere)) {
    pathEnds <- pathEnds[!endedIndices]
    pathStarts <- pathStarts[!endedIndices]
    uniquePathStarts <- unique(pathStarts)
    return(c(
      setNames(as.list(rep(1,length(endedHere))),endedHere),
      setNames(lapply(uniquePathStarts, function(ps) {
        pathsToNestedList(pathEnds[pathStarts == ps])
      }),uniquePathStarts)
    ))
  } else {
    return(
      setNames(lapply(uniquePathStarts, function(ps) {
        pathsToNestedList(pathEnds[!endedIndices & (pathStarts == ps)])
      }),uniquePathStarts))
  }
}

Note: I updated my answer according to your updated question.

Update: The function can be simplified to:

pathsToNestedList <- function(x) {
  nonNaIndices <- !is.na(x)
  nonEmptyIndices <- x != ""
  x <- x[nonNaIndices & nonEmptyIndices]
  if(!length(x)) return()
  
  pathSplit <- strsplit(x,"/")
  pathStarts <- sapply(pathSplit,"[[",1)
  
  pathEnds <- sapply(pathSplit, function(pathParts) {
    if(length(pathParts) <= 1) return("")
    paste0(pathParts[2:length(pathParts)],collapse="/")
  })
  
  splitLengths <- sapply(pathSplit,length)
  stillToParse <- unique(pathStarts[splitLengths > 1])
  
  endedIndices <- pathEnds == ""
  endedHere <- pathStarts[endedIndices]
  endedHere <- setdiff(endedHere,stillToParse)
  
  pathEnds <- pathEnds[!endedIndices]
  pathStarts <- pathStarts[!endedIndices]
  uniquePathStarts <- unique(pathStarts)
  
  #Concatenate the list of paths that ended with a list that is parsed again.
  #If one of those lists is empty, the concatenation behaves like
  #one would expect: It does nothing.
  return(
    c(setNames(as.list(rep(1,length(endedHere))),endedHere),
      setNames(lapply(uniquePathStarts, function(ps) {
        pathsToNestedList(pathEnds[pathStarts == ps])
      }),uniquePathStarts)
    )
  )
}

Moreover I recognized that it crashes with NA and empty strings. Hence I added a removal part in the beginning of the function.

like image 130
Jonas Avatar answered Sep 22 '22 07:09

Jonas


Another option is to use rrapply() in the rrapply-package, which has a dedicated option how = "unmelt" to unmelt a data.frame to a nested list:

library(rrapply)
library(data.table)

paths <- c("a","a/air","a/aero/breath","b","b/boy","b/bag/band/brand")

## create data.frame/data.table with node paths
paths_melt <- as.data.table(tstrsplit(paths[grepl("/", paths)], split = "/"))
paths_melt[, value := 1L]
paths_melt
#>    V1   V2     V3    V4 value
#> 1:  a  air   <NA>  <NA>     1
#> 2:  a aero breath  <NA>     1
#> 3:  b  boy   <NA>  <NA>     1
#> 4:  b  bag   band brand     1

## unmelt to nested list
rrapply(paths_melt, how = "unmelt")
#> $a
#> $a$air
#> [1] 1
#> 
#> $a$aero
#> $a$aero$breath
#> [1] 1
#> 
#> 
#> 
#> $b
#> $b$boy
#> [1] 1
#> 
#> $b$bag
#> $b$bag$band
#> $b$bag$band$brand
#> [1] 1
like image 35
Joris C. Avatar answered Sep 20 '22 07:09

Joris C.