I have a list of vectors (shown below). I would like to know which list elements each value in the vectors appears in. In other words, I would like to invert the list to make a new list whose names
are the values taken from the vectors and whose elements are the names of the original list elements that contain them.
What is the best way to do this?
lst <- list(a=c(2, 3, 6, 10, 15, 17), b=c(4, 6, 9, 7, 6, 4, 3, 10),
c=c(9, 2, 1, 4, 3), d=c(3, 6, 17))
lst
$a
[1] 2 3 6 10 15 17
$b
[1] 4 6 9 7 6 4 3 10
$c
[1] 9 2 1 4 3
$d
[1] 3 6 17
I would like to get the following answer.
$`1`
[1] "c"
$`10`
[1] "a" "b"
$`15`
[1] "a"
$`17`
[1] "a" "d"
$`2`
[1] "a" "c"
$`3`
[1] "a" "b" "c" "d"
$`4`
[1] "b" "b" "c"
$`6`
[1] "a" "b" "b" "d"
$`7`
[1] "b"
$`9`
[1] "b" "c"
Here's a base R way with stack and unstack:
unstack(stack(lst), ind ~ values)
# $`1`
# [1] "c"
#
# $`2`
# [1] "a" "c"
#
# $`3`
# [1] "a" "b" "c" "d"
#
# $`4`
# [1] "b" "b" "c"
#
# $`6`
# [1] "a" "b" "b" "d"
#
# $`7`
# [1] "b"
#
# $`9`
# [1] "b" "c"
#
# $`10`
# [1] "a" "b"
#
# $`15`
# [1] "a"
#
# $`17`
# [1] "a" "d"
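To see why this works, it can help to look at the intermediate long-format data frame that stack builds before unstack regroups it. The following is only a sketch on a cut-down list; the name small is made up for illustration and is not part of the question.

```r
# Illustration only: 'small' is a hypothetical cut-down version of lst
small <- list(a = c(2, 3, 6), b = c(4, 6))

# stack() gives one row per value: 'values' holds the numbers,
# 'ind' is a factor recording which list element each value came from
long <- stack(small)
long
str(long)

# unstack() with ind ~ values collects the element names under each value
inv <- unstack(long, ind ~ values)
inv
```

One caveat worth knowing: if every value happens to occur the same number of times, unstack returns a data frame rather than a list, so the split-based approaches may be safer when a list is always required.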
Here's an approach using split from base R after using melt from "reshape2":
library(reshape2)
x <- melt(lst)
split(x$L1, x$value)
# $`1`
# [1] "c"
#
# $`2`
# [1] "a" "c"
#
# $`3`
# [1] "a" "b" "c" "d"
#
# $`4`
# [1] "b" "b" "c"
#
# $`6`
# [1] "a" "b" "b" "d"
#
# $`7`
# [1] "b"
#
# $`9`
# [1] "b" "c"
#
# $`10`
# [1] "a" "b"
#
# $`15`
# [1] "a"
#
# $`17`
# [1] "a" "d"
Similarly, in base R with stack:
x <- stack(lapply(lst, c))
split(as.character(x$ind), x$values)
Or even more directly, if the elements of lst are already plain vectors (so the lapply(lst, c) step is not needed):
x <- stack(lst)
split(as.character(x$ind), x$values)
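As a side note on the as.character call: the ind column produced by stack is a factor, so splitting it directly yields factor pieces rather than plain character vectors. A minimal sketch, again on a made-up cut-down list called small:

```r
# Illustration only: 'small' is a hypothetical cut-down version of lst
small <- list(a = c(2, 3, 6), b = c(4, 6))
x <- stack(small)

# Splitting the factor directly keeps each piece as a factor
f6 <- split(x$ind, x$values)[["6"]]
is.factor(f6)

# Converting first gives plain character vectors, as in the desired output
c6 <- split(as.character(x$ind), x$values)[["6"]]
c6
```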
To elaborate on my comment, the more efficient way I was describing was:
split(rep(names(lst), lengths(lst)), unlist(lst, use.names = FALSE))
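The idea is to build two parallel vectors, one with every value and one with the name of the element it came from, and then split the second by the first. Here is a small sketch of those steps on a made-up list (small, vals and owner are illustrative names only); it uses lengths(), which returns the length of each list element, as the times argument to rep:

```r
# Illustration only: 'small' is a hypothetical cut-down version of lst
small <- list(a = c(2, 3, 6), b = c(4, 6))

# All values flattened into one vector, in list order
vals <- unlist(small, use.names = FALSE)

# Each name repeated once per value in its element, so owner[i]
# is the name of the element that vals[i] came from
owner <- rep(names(small), lengths(small))

# Group the owning names under each distinct value
inv <- split(owner, vals)
inv
```

No intermediate data frame is built here, which is presumably where the speed advantage comes from.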
Applied to a much bigger lst, we get the following:
fun1 <- function() split(rep(names(lst), lengths(lst)), unlist(lst, use.names = FALSE))
fun2 <- function() { x <- stack(lapply(lst, c)) ; split(as.character(x$ind), x$values) }
fun3 <- function() { x <- melt(lst) ; split(x$L1, x$value) }
fun4 <- function() unstack(stack(lapply(lst, as.vector)), ind ~ values)
## Make lst much bigger
lst <- unlist(replicate(10000, lst, simplify = FALSE), recursive=FALSE)
names(lst) <- make.unique(names(lst))
library(microbenchmark)
system.time(fun3())
# user system elapsed
# 48.338 0.000 47.643
microbenchmark(fun1(), fun2(), fun4(), times = 5)
# Unit: milliseconds
# expr min lq median uq max neval
# fun1() 454.5913 456.6793 473.901 555.8954 574.4394 5
# fun2() 922.1282 1028.4972 1034.872 1068.4761 1150.8072 5
# fun4() 1222.5296 1300.0643 1323.253 1339.2037 1421.1546 5
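Before trusting timings like these, it is worth confirming that the approaches agree. Below is a rough sketch of such a check on the original small list from the question, using only the base R variants (the melt version is left out to avoid the reshape2 dependency). The names small, r1, r2 and r3 are made up for illustration.

```r
# The small list from the question, under a separate name so the
# enlarged lst used for benchmarking is left alone
small <- list(a = c(2, 3, 6, 10, 15, 17), b = c(4, 6, 9, 7, 6, 4, 3, 10),
              c = c(9, 2, 1, 4, 3), d = c(3, 6, 17))

# rep + unlist + split
r1 <- split(rep(names(small), lengths(small)), unlist(small, use.names = FALSE))

# stack + split
x  <- stack(small)
r2 <- split(as.character(x$ind), x$values)

# stack + unstack
r3 <- unstack(x, ind ~ values)

# Compare the results
identical(r1, r2)
all.equal(r1, r3, check.attributes = FALSE)
```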