Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to run function on the deepest level only in a nested list?

Tags:

list

r

Given a nested list for example as below

lst <- list(
    1,
    list(list(c(4, 5, 4)), list(c(6, 7))),
    list(c(2, 3, 3)),
    list(list(c(5, 5, 6)), list(c(7, 7, 7)))
)

> str(lst)
List of 4
 $ : num 1
 $ :List of 2
  ..$ :List of 1
  .. ..$ : num [1:3] 4 5 4
  ..$ :List of 1
  .. ..$ : num [1:2] 6 7
 $ :List of 1
  ..$ : num [1:3] 2 3 3
 $ :List of 2
  ..$ :List of 1
  .. ..$ : num [1:3] 5 5 6
  ..$ :List of 1
  .. ..$ : num [1:3] 7 7 7

Let's say its deepest level is 3, e.g., depths of vectors 4 5 4, 6 7, 5 5 6 and 7 7 7 in lst.

I am wondering if there is a way that only runs a certain function over those deepest levels, while other levels are untouched. For example, if the the function is unique, then my expected output is

lstout <- list(
    1,
    list(list(c(4, 5)),list(c(6,7))),
    list(c(2, 3, 3)),
    list(list(c(5, 6)), list(7))
)

> str(lstout)
List of 4
 $ : num 1
 $ :List of 2
  ..$ :List of 1
  .. ..$ : num [1:2] 4 5
  ..$ :List of 1
  .. ..$ : num [1:2] 6 7
 $ :List of 1
  ..$ : num [1:3] 2 3 3
 $ :List of 2
  ..$ :List of 1
  .. ..$ : num [1:2] 5 6
  ..$ :List of 1
  .. ..$ : num 7

It seems rapply cannot run the function selectively only on the deepest level. I have no clue how to make it.

Any base R idea or solution would be greatly appreciated!

like image 734
ThomasIsCoding Avatar asked May 10 '20 13:05

ThomasIsCoding


Video Answer


2 Answers

We can recursively descend lst to find the maximum depth and then use that to recursively descend again applying unique only at the maximum depth. No packages are used.

maxDepth <- function(x, depth = 0) {
   if (is.list(x)) max(sapply(x, maxDepth, depth+1))
   else depth
}

lstUnique <- function(x, depth = maxDepth(x)) {
  if (depth == 0) unique(x)
  else if (is.list(x)) lapply(x, lstUnique, depth-1)
  else x
}

lstUnique(lst)

Variation using rapply

A variation of the above is to recursively add a class to each leaf equal to its depth. Then we can use rapply three times. First use rapply to extract the classes and take the maximum to find the maximum depth. second use rapply to apply unique on just the nodes having the maximum depth class. Third, remove any remaining classes that were not removed by unique because the node was not at maximum depth. (The third rapply, i.e. the last line of code below, could be omitted if it is ok to leave some leaves with the classes we added.)

addDepth <- function(x, depth = 0) {
  if (is.list(x)) lapply(x, addDepth, depth+1)
  else structure(x, class = format(depth))
}
lst2 <- addDepth(lst)

mx <- max(as.numeric(rapply(lst2, class))) # max depth
lst3 <- rapply(lst2, unique, classes = format(mx), how = "replace")
rapply(lst3, as.vector, how = "replace")

Note on rapply

Note that if you alternately wanted to run unique on all leaves rather than just on the maximum depth leaves then rapply in base R would work.

rapply(lst, unique, how = "replace")

data.tree

This alternative does require the use of a package. First we create a data.tree dt and then traverse it applying unique to the nodes that satisfy the filterFun.

library(data.tree)

dt <- as.Node(lst)
dt$Do(function(x) x$"1" <- unique(x$"1"), 
  filterFun = function(x) x$level == dt$height)
print(dt, "1")

rrapply

The rrapply package provides an enhancement to rapply which can also pass a position vector whose length equals the depth so we can use it first to calculate the maximum depth mx and then again to apply unique only at that depth. (Have updated rrapply call to use how = "unlist" as opposed to applying unlist afterwards as per suggestion in comments.)

library(rrapply)

mx <- max(rrapply(lst, f = function(x, .xpos) length(.xpos), how = "unlist"))
uniq_mx <- function(x, .xpos) if (length(.xpos) == mx) unique(x) else x
rrapply(lst, is.numeric, uniq_mx)
like image 133
G. Grothendieck Avatar answered Oct 31 '22 16:10

G. Grothendieck


Cannot think of a base R option, but with purrr, you can a get close solution:

modify_depth(lst, 3, unique, .ragged = TRUE)

[[1]]
[1] 1

[[2]]
[[2]][[1]]
[[2]][[1]][[1]]
[1] 4 5


[[2]][[2]]
[[2]][[2]][[1]]
[1] 6 7



[[3]]
[[3]][[1]]
[1] 2 3 3


[[4]]
[[4]][[1]]
[[4]][[1]][[1]]
[1] 5 6


[[4]][[2]]
[[4]][[2]][[1]]
[1] 7
like image 37
tmfmnk Avatar answered Oct 31 '22 16:10

tmfmnk