I have a list of lists of characters. For example:
l <- list(list("A"),list("B"),list("C","D"))
As you can see, some elements are lists of length > 1.
I want to convert this list of lists to a character vector, but I'd like the lists with length > 1 to appear as a single element in the character vector.
The unlist function does not achieve that; instead it returns:
> unlist(l)
[1] "A" "B" "C" "D"
Is there anything faster than:
sapply(l,function(x) paste(unlist(x),collapse=""))
To get my desired result:
"A" "B" "CD"
To convert a list to a vector in R, use the unlist() function. It takes the list as its argument and returns a vector, flattening the list while preserving all atomic components.
A list can hold elements of different types (numeric, character, logical, and so on), and it can be recursive, meaning a list may contain other lists. An atomic vector stores elements of a single type, coercing them to a common type where necessary, and cannot be nested.
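As a small illustration of the flattening and coercion described here, the sketch below runs unlist() on a made-up nested list (the `mixed` object is an assumption for illustration, not part of the question's data):

```r
# A nested list with mixed types (hypothetical example data)
mixed <- list(1L, "a", TRUE, list(2.5, "b"))

# unlist() flattens recursively and coerces everything to one common type
flat <- unlist(mixed)
print(flat)
print(class(flat))

# recursive = FALSE strips only one level of nesting, so the result is still a list
one_level <- unlist(list(list("A"), list("C", "D")), recursive = FALSE)
str(one_level)
```

Because the flattening is recursive by default, the grouping of "C" and "D" is lost, which is why unlist() alone does not give the result the question asks for.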
You can skip the unlist step. You already figured out that paste needs a collapse argument (here collapse = "") to "bind" sequential elements of a vector together, and paste0 accepts it too:
> sapply( l, paste0, collapse="")
[1] "A" "B" "CD"
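A short sketch of why the unlist step can be dropped: paste0 coerces its arguments with as.character, which already turns a list of single strings into a character vector. The `inner` and `nested` objects are made up for illustration. Note that this shortcut assumes every inner element is a single string; for deeper nesting the coercion deparses instead of flattening.

```r
inner <- list("C", "D")

# The coercion paste0 applies to a list argument
coerced <- as.character(inner)
print(coerced)

# So collapsing works directly on the list, no unlist() needed
joined <- paste0(inner, collapse = "")
print(joined)

# Caveat: an inner element that is itself a vector gets deparsed, not flattened
nested <- list(c("C", "D"))
deparsed <- paste0(nested, collapse = "")
print(deparsed)
```

If the inner lists might contain vectors or further lists, keeping the explicit unlist() from the question is the safer choice.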
Here's a variation of @thela's suggestion, if you don't mind a multi-line approach:
x <- lengths(l) ## Get the lengths of each list
l[x > 1] <- lapply(l[x > 1], paste0, collapse = "") ## Paste only those together
unlist(l, use.names = FALSE) ## Unlist the result
# [1] "A" "B" "CD"
Alternatively, if you don't mind using a package, look at the "stringi" package, specifically stri_flatten, as suggested by @Jota.
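A minimal sketch of the stri_flatten route, assuming the stringi package is installed. The explicit unlist() inside the anonymous function is a defensive choice of mine so that stri_flatten always receives a plain character vector; the benchmark further down passes the inner lists directly.

```r
l <- list(list("A"), list("B"), list("C", "D"))

if (requireNamespace("stringi", quietly = TRUE)) {
  # stri_flatten() joins a character vector into one string (collapse = "" by default)
  flat <- vapply(l, function(x) stringi::stri_flatten(unlist(x)), character(1L))
  print(flat)
} else {
  message("stringi is not installed; run install.packages('stringi') to try this")
}
```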
Here's a performance comparison:
l <- list(list("A"), list("B"), list("B"), list("B"), list("B"),
list("C","D"), list("E","F", "G", "H"),
as.list(rep(letters,10)), as.list(rep(letters,2)))
l <- unlist(replicate(1000, l, FALSE), recursive = FALSE)
funop <- function() sapply(l,function(x) paste(unlist(x),collapse=""))
fun42 <- function() sapply(l, paste0, collapse="")
funv <- function() vapply(l, paste0, character(1L), collapse = "")
funam <- function() {
x <- lengths(l)
l[x > 1] <- lapply(l[x > 1], paste0, collapse = "")
unlist(l, use.names = FALSE)
}
funj <- function() sapply(l, stri_flatten)
funamj <- function() {
x <- lengths(l)
l[x > 1] <- lapply(l[x > 1], stri_flatten)
unlist(l, use.names = FALSE)
}
library(stringi) # provides stri_flatten, used by funj() and funamj()
library(microbenchmark)
microbenchmark(funop(), fun42(), funv(), funam(), funj(), funamj(), times = 20)
# Unit: milliseconds
#      expr      min       lq     mean   median       uq      max neval cld
#   funop() 78.21822 84.79588 85.30055 85.36399 86.90540 90.48321    20   e
#   fun42() 56.16938 57.35735 61.60008 58.04969 65.82836 81.46482    20   d
#    funv() 54.64101 56.23245 60.07896 57.26049 63.96815 78.58043    20   d
#   funam() 45.89760 46.89890 48.99810 47.29617 48.28764 56.92544    20   c
#    funj() 28.73405 29.94041 32.00676 30.56711 31.11448 39.93765    20   b
#  funamj() 18.64829 19.01328 21.05989 19.12468 19.52516 32.87569    20   a
Note: The relative efficiency of this approach would depend on how many list items are going to have length(x) > 1. If most of them are going to be > 1 anyway, then just go with @42-'s approach. stri_flatten only improves performance if you have long character vectors to paste together, as in the sample list used for the above benchmark; otherwise, it doesn't help.
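To judge which case your own data falls into, one rough check is the share of inner lists with more than one element. The sketch below uses the small list from the question and also confirms that the one-line and multi-line approaches agree; the names `share_long`, `one_liner`, and `multi_step` are made up for illustration.

```r
l <- list(list("A"), list("B"), list("C", "D"))

# Fraction of elements that actually need pasting
share_long <- mean(lengths(l) > 1)
print(share_long)

# One-line approach: paste every element
one_liner <- sapply(l, paste0, collapse = "")

# Multi-line approach: paste only the elements longer than 1
multi_step <- local({
  x <- lengths(l)
  l[x > 1] <- lapply(l[x > 1], paste0, collapse = "")
  unlist(l, use.names = FALSE)
})

# Both should yield the same character vector
print(identical(one_liner, multi_step))
```

This only tells you how much work the multi-line version can skip; timing both on your real data is still the more reliable guide.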