I am trying to achieve the functionality similar to unlist, with the exception that types are not coerced to a vector, but the list with preserved types is returned instead. For instance: <pre class="prettyprint"><code>flatten(list(NA, list("TRUE", list(FALSE), 0L)) </code></pre> should return <pre class="prettyprint"><code>list(NA, "TRUE", FALSE, 0L) </code></pre> instead of <pre class="prettyprint"><code>c(NA, "TRUE", "FALSE", "0") </code></pre> which would be returned by <code>unlist(list(list(NA, list("TRUE", list(FALSE), 0L))</code>. As it is seen from the example above, the flattening should be recursive. Is there a function in standard R library which achieves this, or at least some other function which can be used to easily and efficiently implement this? UPDATE: I don't know if it is clear from the above, but non-lists should not be flattened, i.e. <code>flatten(list(1:3, list(4, 5)))</code> should return <code>list(c(1, 2, 3), 4, 5)</code>.

Interesting non-trivial problem! MAJOR UPDATE With all that's happened, I've rewrote the answer and removed some dead ends. I also timed the various solutions on different cases. Here's the first, rather simple but slow, solution: <pre class="prettyprint"><code>flatten1 <- function(x) { y <- list() rapply(x, function(x) y <<- c(y,x)) y } </code></pre> <code>rapply</code> lets you traverse a list and apply a function on each leaf element. Unfortunately, it works exactly as <code>unlist</code> with the returned values. So I ignore the result from <code>rapply</code> and instead I append values to the variable <code>y</code> by doing <code><<-</code>. Growing <code>y</code> in this manner is not very efficient (it's quadratic in time). So if there are many thousands of elements this will be very slow. A more efficient approach is the following, with simplifications from @JoshuaUlrich: <pre class="prettyprint"><code>flatten2 <- function(x) { len <- sum(rapply(x, function(x) 1L)) y <- vector('list', len) i <- 0L rapply(x, function(x) { i <<- i+1L; y[[i]] <<- x }) y } </code></pre> Here I first find out the result length and pre-allocate the vector. Then I fill in the values. As you can will see, this solution is much faster. Here's a version of @JoshO'Brien great solution based on <code>Reduce</code>, but extended so it handles arbitrary depth: <pre class="prettyprint"><code>flatten3 <- function(x) { repeat { if(!any(vapply(x, is.list, logical(1)))) return(x) x <- Reduce(c, x) } } </code></pre> Now let the battle begin! <pre class="prettyprint"><code># Check correctness on original problem x <- list(NA, list("TRUE", list(FALSE), 0L)) dput( flatten1(x) ) #list(NA, "TRUE", FALSE, 0L) dput( flatten2(x) ) #list(NA, "TRUE", FALSE, 0L) dput( flatten3(x) ) #list(NA_character_, "TRUE", FALSE, 0L) # Time on a huge flat list x <- as.list(1:1e5) #system.time( flatten1(x) ) # Long time system.time( flatten2(x) ) # 0.39 secs system.time( flatten3(x) ) # 0.04 secs # Time on a huge deep list x <-'leaf'; for(i in 1:11) { x <- list(left=x, right=x, value=i) } #system.time( flatten1(x) ) # Long time system.time( flatten2(x) ) # 0.05 secs system.time( flatten3(x) ) # 1.28 secs </code></pre> ...So what we observe is that the <code>Reduce</code> solution is faster when the depth is low, and the <code>rapply</code> solution is faster when the depth is large! As correctness goes, here are some tests: <pre class="prettyprint"><code>> dput(flatten1( list(1:3, list(1:3, 'foo')) )) list(1L, 2L, 3L, 1L, 2L, 3L, "foo") > dput(flatten2( list(1:3, list(1:3, 'foo')) )) list(1:3, 1:3, "foo") > dput(flatten3( list(1:3, list(1:3, 'foo')) )) list(1L, 2L, 3L, 1:3, "foo") </code></pre> Unclear what result is desired, but I lean towards the result from <code>flatten2</code>...

How to flatten a list to a list without coercion?

Tags:

list

flatten

r

standard-library

type-coercion

I am trying to achieve the functionality similar to unlist, with the exception that types are not coerced to a vector, but the list with preserved types is returned instead. For instance:

flatten(list(NA, list("TRUE", list(FALSE), 0L))

should return

list(NA, "TRUE", FALSE, 0L)

instead of

c(NA, "TRUE", "FALSE", "0")

which would be returned by unlist(list(list(NA, list("TRUE", list(FALSE), 0L)).

As it is seen from the example above, the flattening should be recursive. Is there a function in standard R library which achieves this, or at least some other function which can be used to easily and efficiently implement this?

UPDATE: I don't know if it is clear from the above, but non-lists should not be flattened, i.e. flatten(list(1:3, list(4, 5))) should return list(c(1, 2, 3), 4, 5).

678

asked Nov 15 '11 16:11

eold

2 Answers

Interesting non-trivial problem!

MAJOR UPDATE With all that's happened, I've rewrote the answer and removed some dead ends. I also timed the various solutions on different cases.

Here's the first, rather simple but slow, solution:

flatten1 <- function(x) {   y <- list()   rapply(x, function(x) y <<- c(y,x))   y }

rapply lets you traverse a list and apply a function on each leaf element. Unfortunately, it works exactly as unlist with the returned values. So I ignore the result from rapply and instead I append values to the variable y by doing <<-.

Growing y in this manner is not very efficient (it's quadratic in time). So if there are many thousands of elements this will be very slow.

A more efficient approach is the following, with simplifications from @JoshuaUlrich:

flatten2 <- function(x) {   len <- sum(rapply(x, function(x) 1L))   y <- vector('list', len)   i <- 0L   rapply(x, function(x) { i <<- i+1L; y[[i]] <<- x })   y }

Here I first find out the result length and pre-allocate the vector. Then I fill in the values. As you can will see, this solution is much faster.

Here's a version of @JoshO'Brien great solution based on Reduce, but extended so it handles arbitrary depth:

flatten3 <- function(x) {   repeat {     if(!any(vapply(x, is.list, logical(1)))) return(x)     x <- Reduce(c, x)   } }

Now let the battle begin!

# Check correctness on original problem  x <- list(NA, list("TRUE", list(FALSE), 0L)) dput( flatten1(x) ) #list(NA, "TRUE", FALSE, 0L) dput( flatten2(x) ) #list(NA, "TRUE", FALSE, 0L) dput( flatten3(x) ) #list(NA_character_, "TRUE", FALSE, 0L)  # Time on a huge flat list x <- as.list(1:1e5) #system.time( flatten1(x) )  # Long time system.time( flatten2(x) )  # 0.39 secs system.time( flatten3(x) )  # 0.04 secs  # Time on a huge deep list x <-'leaf'; for(i in 1:11) { x <- list(left=x, right=x, value=i) } #system.time( flatten1(x) ) # Long time system.time( flatten2(x) )  # 0.05 secs system.time( flatten3(x) )  # 1.28 secs

...So what we observe is that the Reduce solution is faster when the depth is low, and the rapply solution is faster when the depth is large!

As correctness goes, here are some tests:

> dput(flatten1( list(1:3, list(1:3, 'foo')) )) list(1L, 2L, 3L, 1L, 2L, 3L, "foo") > dput(flatten2( list(1:3, list(1:3, 'foo')) )) list(1:3, 1:3, "foo") > dput(flatten3( list(1:3, list(1:3, 'foo')) )) list(1L, 2L, 3L, 1:3, "foo")

Unclear what result is desired, but I lean towards the result from flatten2...

186

answered Sep 21 '22 12:09

Tommy

For lists that are only a few nestings deep, you could use Reduce() and c() to do something like the following. Each application of c() removes one level of nesting. (For fully general solution, see EDITs below.)

L <- (list(NA, list("TRUE", list(FALSE), 0L))) Reduce(c, Reduce(c, L)) [[1]] [1] NA  [[2]] [1] "TRUE"  [[3]] [1] FALSE  [[4]] [1] 0    # TIMING TEST x <- as.list(1:4e3) system.time(flatten(x))   # Using the improved version     # user  system elapsed  # 0.14    0.00    0.13  system.time(Reduce(c, x)) # user  system elapsed  # 0.04    0.00    0.03

EDIT Just for fun, here's a version of @Tommy's version of @JoshO'Brien's solution that does work for already flat lists. FURTHER EDIT Now @Tommy's solved that problem as well, but in a cleaner way. I'll leave this version in place.

flatten <- function(x) {     x <- list(x)     repeat {         x <- Reduce(c, x)         if(!any(vapply(x, is.list, logical(1)))) return(x)     } }  flatten(list(3, TRUE, 'foo')) # [[1]] # [1] 3 #  # [[2]] # [1] TRUE #  # [[3]] # [1] "foo"

answered Sep 24 '22 12:09

Josh O'Brien

Related questions
                            
                                Why are difference lists more efficient than regular concatenation in Haskell?
                            
                                Dynamically adding elements to ArrayList in Groovy
                            
                                Grouping Python dictionary keys as a list and create a new dictionary with this list as a value
                            
                                Making all possible combinations of a list
                            
                                How to convert BASE64 string into Image with Flutter?
                            
                                List with duplicated values and suffix
                            
                                GIT list of new/modified/deleted files
                            
                                Initialize list with same bool value
                            
                                ArrayList vs List<object>
                            
                                UnsupportedOperationException in AbstractList.remove() when operating on ArrayList
                            
                                How can I combine multiple rows into a comma-delimited list in SQL Server 2005?
                            
                                Access item in a list of lists
                            
                                Groovy Method to combine list of lists
                            
                                Using LINQ to convert List<U> to List<T>
                            
                                What is the difference between sets and lists in Python?
                            
                                LINQ: How to skip one then take the rest of a sequence
                            
                                Slicing a list using a variable, in Python
                            
                                Getting the array as GET query parameters in Python
                            
                                How specify a list value as variable in ansible inventory file?
                            
                                Queue<T> vs List<T>

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With