Using lapply, I've fed a vector of inputs into a function that, for each input, returns a list of two vectors: possible nth-grams and their probabilities. I end up with a list of lists (lol) with this structure:
> str(lol)
List of 3
$ :List of 2
..$ np1 : chr [1:7] "a" "years" "the" "my" ...
..$ probs: num [1:7] 0.1481 0.1357 0.0841 0.0698 0.0522 ...
$ :List of 2
..$ np1 : chr [1:167] "the" "a" "my" "years" ...
..$ probs: num [1:167] 0.2745 0.0924 0.0605 0.0437 0.0334 ...
$ :List of 2
..$ np1 : chr [1:9493] "the" "a" "my" "this" ...
..$ probs: num [1:9493] 0.267 0.0777 0.0239 0.0169 0.0158 ...
But what I'm aiming for is a single list in which all the $np1 vectors are concatenated and all the $probs vectors are as well. I tried using unlist(..., recursive = F) to get the list of two vectors, and it got me closer to what I'm looking for than using unlist without the recursive flag.
> str(unlist(lapply(inputs.list, function(x){...}), recursive = F))
List of 6
$ np1 : chr [1:7] "a" "years" "the" "my" ...
$ probs: num [1:7] 0.1481 0.1357 0.0841 0.0698 0.0522 ...
$ np1 : chr [1:167] "the" "a" "my" "years" ...
$ probs: num [1:167] 0.2745 0.0924 0.0605 0.0437 0.0334 ...
$ np1 : chr [1:9493] "the" "a" "my" "this" ...
$ probs: num [1:9493] 0.267 0.0777 0.0239 0.0169 0.0158 ...
But I'm not quite there...
Is there a method that would help me further consolidate the flattened list into a list of only two vectors, as described?
Here is a reproducible example to work with:
example1 <- list("time in"=list(np1=c("the", "a", "my", "years"), probs=c(0.2745, 0.0924, 0.0605, 0.0437)),"in"=list(np1=c("the", "a", "my", "this"), probs=c(0.267, 0.0777, 0.0239, 0.0169)))
> str(example1)
List of 2
$ time in:List of 2
..$ np1 : chr [1:4] "the" "a" "my" "years"
..$ probs: num [1:4] 0.2745 0.0924 0.0605 0.0437
$ in :List of 2
..$ np1 : chr [1:4] "the" "a" "my" "this"
..$ probs: num [1:4] 0.267 0.0777 0.0239 0.0169
Two lists can be combined in the desired way with Map, as in
Map(c, example1[[1]], example1[[2]])
# $np1
# [1] "the" "a" "my" "years" "the" "a" "my" "this"
#
# $probs
# [1] 0.2745 0.0924 0.0605 0.0437 0.2670 0.0777 0.0239 0.0169
So, to merge the whole list of lists, we can do
Reduce(function(...) Map(c, ...), example1[c(1, 1, 2)])
# $np1
# [1] "the" "a" "my" "years" "the" "a" "my" "years" "the" "a" "my" "this"
#
# $probs
# [1] 0.2745 0.0924 0.0605 0.0437 0.2745 0.0924 0.0605 0.0437 0.2670 0.0777 0.0239 0.0169
where I purposely made the input of length 3 to demonstrate that it works for more than two lists. In your case we need
Reduce(function(...) Map(c, ...), lol)
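As a possible variation on the Reduce/Map idea above, here is a minimal sketch that hands all the sublists to a single Map call through do.call, instead of folding them pairwise. The names merged_fold and merged_once are just illustrative, and unname() is my own precaution so that the outer list names ("time in", "in") do not end up as element names in the result.

```r
example1 <- list(
  "time in" = list(np1 = c("the", "a", "my", "years"),
                   probs = c(0.2745, 0.0924, 0.0605, 0.0437)),
  "in"      = list(np1 = c("the", "a", "my", "this"),
                   probs = c(0.267, 0.0777, 0.0239, 0.0169))
)

# Pairwise fold, as in the answer above
merged_fold <- Reduce(function(...) Map(c, ...), example1)

# One Map call over all sublists at once (illustrative alternative);
# unname() drops the outer names so they are not passed to c() as argument names
merged_once <- do.call(Map, c(f = c, unname(example1)))

str(merged_once)
```

On this small example both approaches should give the same two-element list; I have not checked how they compare on large inputs.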
Here's a solution using purrr:
library(tidyverse)
transpose(example1) %>% map(flatten) %>% map(unlist)
Output:
$np1
[1] "the" "a" "my" "years" "the" "a" "my" "this"
$probs
[1] 0.2745 0.0924 0.0605 0.0437 0.2670 0.0777 0.0239 0.0169
Here is an "unlist" solution that is similar to what you were working on. It relies on the vectors you are interested in always alternating (i.e., it is always np1 and then probs). Good luck, and let me know if it doesn't work for you!
unlist_ed <- unlist(example1, recursive = F)
list(
np1 = unlist(unlist_ed[c(T, F)]),
probs = unlist(unlist_ed[c(F, T)])
)
$np1
time in.np11 time in.np12 time in.np13 time in.np14 in.np11 in.np12 in.np13 in.np14
"the" "a" "my" "years" "the" "a" "my" "this"
$probs
time in.probs1 time in.probs2 time in.probs3 time in.probs4 in.probs1 in.probs2 in.probs3
0.2745 0.0924 0.0605 0.0437 0.2670 0.0777 0.0239
in.probs4
0.0169
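If the alternating order cannot be guaranteed, one way around it (a sketch, not something from the original answer) is to pull each vector out by name instead of by position. The fields vector and the merged name are illustrative, and use.names = FALSE is there only to avoid the long generated names shown in the output above.

```r
example1 <- list(
  "time in" = list(np1 = c("the", "a", "my", "years"),
                   probs = c(0.2745, 0.0924, 0.0605, 0.0437)),
  "in"      = list(np1 = c("the", "a", "my", "this"),
                   probs = c(0.267, 0.0777, 0.0239, 0.0169))
)

# Element names to collect (assumed to be present in every sublist)
fields <- c("np1", "probs")

# For each field, extract it from every sublist and concatenate the pieces
merged <- lapply(setNames(fields, fields), function(field) {
  unlist(lapply(example1, `[[`, field), use.names = FALSE)
})

str(merged)
```

This does not depend on the order of np1 and probs inside each sublist, only on the names being the same everywhere.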
EDIT: I thought of another solution that relies on the vector names being the same, but it is much faster (not that that is the goal). Wanted to update!
dplyr::bind_rows(example1)
# A tibble: 8 x 2
np1 probs
<chr> <dbl>
1 the 0.274
2 a 0.0924
3 my 0.0605
4 years 0.0437
5 the 0.267
6 a 0.0777
7 my 0.0239
8 this 0.0169
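Since the question asks for a list of two vectors rather than a data frame, the tibble can be turned back into a plain named list with as.list. This is a small sketch assuming dplyr is installed; as_two_vectors is just an illustrative name.

```r
example1 <- list(
  "time in" = list(np1 = c("the", "a", "my", "years"),
                   probs = c(0.2745, 0.0924, 0.0605, 0.0437)),
  "in"      = list(np1 = c("the", "a", "my", "this"),
                   probs = c(0.267, 0.0777, 0.0239, 0.0169))
)

# Stack the sublists as rows, then drop the data-frame wrapper again
as_two_vectors <- as.list(dplyr::bind_rows(example1))

str(as_two_vectors)
```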
Not a perfect benchmark:
example1 <- rapply(example1, function(x) rep(x, 1e4), how = "list")
example1 <- rep(example1, 100)
microbenchmark::microbenchmark(
o1 = {
Reduce(function(...) Map(c, ...), example1)
},
o2 = {
unlist_ed <- unlist(example1, recursive = F)
list(
np1 = unlist(unlist_ed[c(T, F)]),
probs = unlist(unlist_ed[c(F, T)])
)
},
o3 = {
transpose(example1) %>% map(flatten) %>% map(unlist)
},
o4 = {
binded <- dplyr::bind_rows(example1)
list(binded$np1,
binded$probs)
},
times = 1
)
Unit: milliseconds
expr min lq mean median uq max neval
o1 5022.25495 5022.25495 5022.25495 5022.25495 5022.25495 5022.25495 1
o2 5146.75265 5146.75265 5146.75265 5146.75265 5146.75265 5146.75265 1
o3 2491.21422 2491.21422 2491.21422 2491.21422 2491.21422 2491.21422 1
o4 83.32919 83.32919 83.32919 83.32919 83.32919 83.32919 1