
Names of nested list containing dots (e.g. "c.2")

Tags: list, r

How can I get the names of the leaves of a nested list (containing a data frame)

p <- list(a=1,b=list(b1=2,b2=3),c=list(c1=list(c11='a',c12='x'),c.2=data.frame("t"=1)))

into a vector format:

[[1]]
[1] "a"
[[2]]
[1] "b" "b1"
[[3]]
[1] "b" "b2"
[[4]]
[1] "c" "c1" "c11"
[[5]]
[1] "c" "c1" "c12"
[[6]]
[1] "c" "c.2"

The problem is that my list contains names with a dot (e.g. "c.2"). With unlist, one gets "c.c.2", and neither I nor strsplit can tell whether a dot is a delimiter inserted by unlist or part of a name. That is how this differs from this question.
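To make the ambiguity concrete, here is a minimal sketch with two made-up lists (`x1` and `x2` are illustrative names, not part of the original data). They have different structures, yet unlist gives their leaves the same name, so the structure cannot be recovered from the flattened name alone:

```r
# Two different nestings (hypothetical examples)
x1 <- list(c = list(c.2 = 1))      # path: "c" -> "c.2"
x2 <- list(c.c = list(`2` = 1))    # path: "c.c" -> "2"

# unlist() joins the names along each path with "."
names(unlist(x1))
names(unlist(x2))

# Both flattened names are the same string, so splitting on "." is ambiguous
identical(names(unlist(x1)), names(unlist(x2)))
```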

It should ignore data.frames. My approach so far is adapted from here, but it struggles with the dots inserted by unlist:

listNames = function(l, maxDepth = 2) {
  listNames_rec = function(l, n) {
    # stop at non-lists, data frames, or once the maximum depth is reached
    if (!is.list(l) || is.data.frame(l) || n >= maxDepth) TRUE
    else lapply(l, listNames_rec, n + 1)
  }
  # unlist() joins the names along each path with "."
  names(unlist(listNames_rec(l, 0)))
}
listNames(p, maxDepth = 3)
[1] "a"        "b.b1"     "b.b2"     "c.c1.c11" "c.c1.c12" "c.c.2"  
asked Aug 18 '21 08:08 by sequoia


2 Answers

Like this?

subnames <- function(L, s) {
  if (!is.list(L) || is.data.frame(L)) return(L)
  names(L) <- gsub(".", s, names(L), fixed = TRUE)
  lapply(L, subnames, s)
}

res <- listNames(subnames(p, ":"), maxDepth = 3)
gsub(":", ".",
  gsub(".", "$", res, fixed = TRUE),
  fixed = TRUE
)
#[1] "a"        "b$b1"     "b$b2"     "c$c1$c11" "c$c1$c12" "c$c.2" 
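The question asks for a list of character vectors rather than delimited strings. A possible last step, sketched here under the assumption that no name in the list contains "$", is to split the result on that delimiter. The vector `out` below simply hard-codes the output shown above so the snippet runs on its own:

```r
# Output of the gsub() calls above, written out literally
out <- c("a", "b$b1", "b$b2", "c$c1$c11", "c$c1$c12", "c$c.2")

# Split each path on the literal "$" delimiter (fixed = TRUE disables regex)
paths <- strsplit(out, "$", fixed = TRUE)

paths[[6]]
# [1] "c"   "c.2"
```

The same caveat applies to the temporary ":" placeholder: if a name already contains ":", a different placeholder would be needed.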
answered Nov 18 '22 17:11 by Roland


Not a full answer, but I imagine the rrapply package could help you here. One option could be to extract all names:

library(rrapply)
library(dplyr)
rrapply(p, how = "melt") %>% 
  select(-value)
#   L1   L2   L3
# 1  a <NA> <NA>
# 2  b   b1 <NA>
# 3  b   b2 <NA>
# 4  c   c1  c11
# 5  c   c1  c12
# 6  c  c.2    t

The problem here is that the column names of the data.frame are included above too (the "t" in row 6), so you could extract the data.frame names separately:

#extract data frame name
rrapply(p, classes = "data.frame", how = "melt") %>% 
  select(-value)
#   L1  L2
# 1  c c.2

Then you could play around with these two datasets and perhaps remove duplicates while keeping the data.frame names:

rrapply(p, how = "melt") %>%  
  bind_rows(rrapply(p, classes = "data.frame", how = "melt")) 
  #then filter etc...
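As an alternative that sidesteps the filtering, the paths can also be built directly with a plain base R recursion, so no delimiter is ever introduced. This is only a sketch, not part of rrapply: `leaf_paths` is a hypothetical helper name, it has no maxDepth argument, and empty lists produce no path.

```r
# Hypothetical helper: collect the name path to every leaf of a nested list.
# Anything that is not a list, and any data frame, counts as a leaf.
leaf_paths <- function(x, prefix = character(0)) {
  if (!is.list(x) || is.data.frame(x)) {
    return(list(prefix))
  }
  out <- list()
  for (i in seq_along(x)) {
    # extend the current path with this element's name and recurse
    out <- c(out, leaf_paths(x[[i]], c(prefix, names(x)[i])))
  }
  out
}

p <- list(a = 1, b = list(b1 = 2, b2 = 3),
          c = list(c1 = list(c11 = 'a', c12 = 'x'), c.2 = data.frame("t" = 1)))

paths <- leaf_paths(p)
paths[[6]]
# [1] "c"   "c.2"
```

Because each path is kept as a character vector, a dot inside a name such as "c.2" is never confused with a separator.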
answered Nov 18 '22 18:11 by user63230