
Convert a list of lists to a character vector

I have a list of lists of characters. For example:

l <- list(list("A"),list("B"),list("C","D"))

So as you can see some elements are lists of length > 1.

I want to convert this list of lists to a character vector, but I'd like the lists with length > 1 to appear as a single element in the character vector.

The unlist function does not achieve that; it flattens everything instead:

> unlist(l)
[1] "A" "B" "C" "D"

Is there anything faster than:

sapply(l,function(x) paste(unlist(x),collapse=""))

To get my desired result:

"A"  "B"  "CD"
asked Jan 06 '16 01:01 by dan



2 Answers

You can skip the unlist step. paste0 (like paste) coerces each sub-list to a character vector itself, and collapse = "" then "binds" the sequential elements of that vector together into a single string:

> sapply( l, paste0, collapse="")
[1] "A"  "B"  "CD"
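
As a minimal sketch of why the unlist step is unnecessary: paste0 runs as.character on each argument, which turns a sub-list of single strings into a character vector before collapse joins it. The vapply variant below (the same call as funv in the benchmark further down) additionally enforces that each result is exactly one string. It assumes every sub-list holds only length-one character elements, as in the question.

```r
l <- list(list("A"), list("B"), list("C", "D"))

# as.character() flattens a list of single strings into a character vector
as.character(l[[3]])
# [1] "C" "D"

# vapply() checks that every result is exactly one string
res <- vapply(l, paste0, character(1L), collapse = "")
res
# [1] "A"  "B"  "CD"
```

If a sub-list could contain longer vectors or nested lists, keeping the explicit unlist(x) from the question is the safer choice.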
answered Oct 11 '22 06:10 by IRTFM


Here's a variation of @thela's suggestion, if you don't mind a multi-line approach:

x <- lengths(l)                                     ## Get the lengths of each list
l[x > 1] <- lapply(l[x > 1], paste0, collapse = "") ## Paste only those together
unlist(l, use.names = FALSE)                        ## Unlist the result
# [1] "A"  "B"  "CD"

Alternatively, if you don't mind using a package, look at the "stringi" package, specifically stri_flatten, as suggested by @Jota.
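
A short sketch of the stringi route on the list from the question, assuming the stringi package is installed. It uses vapply rather than sapply only so that the result type is checked; that choice is an assumption of this example, not part of the original suggestion.

```r
# Assumes the stringi package is installed: install.packages("stringi")
library(stringi)

l <- list(list("A"), list("B"), list("C", "D"))

# stri_flatten() joins the elements of each sub-list; collapse = "" is its default
res <- vapply(l, stri_flatten, character(1L), collapse = "")
res
# [1] "A"  "B"  "CD"
```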


Here's a performance comparison:

l <- list(list("A"), list("B"), list("B"), list("B"), list("B"),
          list("C","D"), list("E","F", "G", "H"), 
          as.list(rep(letters,10)), as.list(rep(letters,2)))
l <- unlist(replicate(1000, l, FALSE), recursive = FALSE)

funop <- function() sapply(l,function(x) paste(unlist(x),collapse=""))
fun42 <- function() sapply(l, paste0, collapse="")
funv  <- function() vapply(l, paste0, character(1L), collapse = "")
funam <- function() {
  x <- lengths(l)
  l[x > 1] <- lapply(l[x > 1], paste0, collapse = "")
  unlist(l, use.names = FALSE)
}
funj <- function() sapply(l, stri_flatten)
funamj <- function() {
  x <- lengths(l)
  l[x > 1] <- lapply(l[x > 1], stri_flatten)
  unlist(l, use.names = FALSE)
}

library(stringi)        ## Provides stri_flatten, used by funj() and funamj()
library(microbenchmark)
microbenchmark(funop(), fun42(), funv(), funam(), funj(), funamj(), times = 20)
# Unit: milliseconds
#      expr      min       lq     mean   median       uq      max neval   cld
#   funop() 78.21822 84.79588 85.30055 85.36399 86.90540 90.48321    20     e
#   fun42() 56.16938 57.35735 61.60008 58.04969 65.82836 81.46482    20    d 
#    funv() 54.64101 56.23245 60.07896 57.26049 63.96815 78.58043    20    d 
#   funam() 45.89760 46.89890 48.99810 47.29617 48.28764 56.92544    20   c  
#    funj() 28.73405 29.94041 32.00676 30.56711 31.11448 39.93765    20  b   
#  funamj() 18.64829 19.01328 21.05989 19.12468 19.52516 32.87569    20 a 

Note: The relative efficiency of the subsetting approach (funam and funamj) depends on how many list items have length(x) > 1. If most of them do, just go with @42-'s approach (fun42). stri_flatten only improves performance when there are long character vectors to paste together, as in the sample list used for the above benchmark; otherwise, it doesn't help.
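
One rough way to judge which case applies is to measure the share of sub-lists that have more than one element. The name share_multi below is purely illustrative, and any cutoff for "most" is a judgment call best confirmed by benchmarking on your own data.

```r
l <- list(list("A"), list("B"), list("C", "D"))

# Fraction of sub-lists with more than one element
share_multi <- mean(lengths(l) > 1)
share_multi
# [1] 0.3333333
```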

answered Oct 11 '22 07:10 (4 revs, 2 users 86%)