I can use do.call to sum two vectors elementwise:
do.call(what = "+", args = list(c(0,0,1), c(1,2,3)))
>[1] 1 2 4
However, if I'd like to call the same operator with a list of three vectors, it fails:
do.call(what = "+", args = list(c(0,0,1), c(1,2,3), c(9,1,2)))
>Error in `+`(c(0, 0, 1), c(1, 2, 3), c(9, 1, 2)): operator needs one or two arguments
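The error follows from how do.call works: it builds a single call with every list element as an argument, and `+` is a binary operator. A minimal sketch of that behaviour (the names a, b, d, two, three and total are only illustrative):

```r
a <- c(0, 0, 1)
b <- c(1, 2, 3)
d <- c(9, 1, 2)

# Two arguments: do.call builds `+`(a, b), which is just a + b.
two <- do.call("+", list(a, b))

# Three arguments: do.call builds `+`(a, b, d), which `+` rejects.
# tryCatch captures the error message instead of stopping the script.
three <- tryCatch(
  do.call("+", list(a, b, d)),
  error = function(e) conditionMessage(e)
)

# sum() does accept any number of arguments, but it collapses
# everything to one number, so it is not an elementwise sum.
total <- do.call(sum, list(a, b, d))
```

Here two is the elementwise sum, three holds the error text, and total is a single scalar, which is why swapping in sum is not a fix.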
I could use Reduce:
Reduce(f = "+", x = list(c(0,0,1), c(1,2,3), c(9,1,2)))
>[1] 10 3 6
but I am aware of the overhead that Reduce incurs compared to do.call, and in my real application it isn't tolerable: I need to sum not a list of three short vectors, but a list of 10^5 vectors, each 10^4 elements long.
UPD: Reduce turned out to be the fastest method after all:
library(data.table)  # provides transpose()
lst <- list(1:10000, 10001:20000, 20001:30000)
lst2 <- lst[rep(seq.int(length(lst)), 1000)]
microbenchmark::microbenchmark(colSums(do.call(rbind, lst2)),
vapply(transpose(lst2), sum, 0),
Reduce(f = "+", x = lst2))
Unit: milliseconds
expr min lq mean median uq max neval cld
colSums(do.call(rbind, lst2)) 153.5086 194.9139 222.7954 198.1952 201.8152 915.6354 100 b
vapply(transpose(lst2), sum, 0) 398.9424 537.3834 732.4747 781.7255 813.7376 1538.4301 100 c
Reduce(f = "+", x = lst2) 101.5618 105.5864 139.8651 108.1204 112.7861 2567.1793 100 a
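One plausible reason Reduce holds up at this size is memory: it only keeps a running total and one vector at a time, whereas do.call(rbind, ...) first has to build the whole matrix. For the sizes in the question (10^5 vectors of length 10^4) that matrix would have 10^9 cells, roughly 8 GB as doubles. A rough sketch of the same running-total idea as an explicit loop; sum_list is a hypothetical helper, not something from the thread, and it assumes all vectors have the same length:

```r
# Hypothetical helper: elementwise sum of a list of equal-length
# numeric vectors, keeping only one accumulator in memory.
sum_list <- function(lst) {
  # Start from a double accumulator so integer inputs are promoted
  # to double as they are added.
  acc <- numeric(length(lst[[1L]]))
  for (v in lst) {
    acc <- acc + v
  }
  acc
}

lst <- list(c(0, 0, 1), c(1, 2, 3), c(9, 1, 2))
res_loop <- sum_list(lst)
res_reduce <- Reduce(`+`, lst)
```

Both results should agree; whether the loop is any faster than Reduce is something to benchmark on your own data rather than assume.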
As your list gets longer, you might find that this approach becomes the faster one:
# careful if you use the tidyverse that purrr does not mask transpose
library(data.table)
lst <- list(c(0,0,1), c(1,2,3), c(9, 1, 2))
vapply(transpose(lst), sum, 0)
# [1] 10 3 6
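To make the trick more visible, here is a small sketch of what transpose does to the list before vapply runs. It assumes the data.table package is installed, and it calls the function with an explicit namespace so that purrr cannot mask it (tl and res are only illustrative names):

```r
lst <- list(c(0, 0, 1), c(1, 2, 3), c(9, 1, 2))

# data.table::transpose regroups the list by position:
# tl[[1]] holds the first element of every vector, tl[[2]] the second, ...
tl <- data.table::transpose(lst)

# Summing each regrouped vector gives the elementwise sum.
# numeric(1) declares that every call must return one double.
res <- vapply(tl, sum, numeric(1))
```

So the cost is one pass to regroup the data plus one sum per position, which is why this does well when there are many short vectors.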
I benchmarked a few of the answers, since speed seems to be what you care about.
# make the list a bit bigger...
lst2 <- lst[rep(seq.int(length(lst)), 1000)]
microbenchmark::microbenchmark(Reduce(`+`, lst2),
colSums(do.call(rbind, lst2)),
vapply(transpose(lst2), sum, 0),
eval(str2lang(paste0(lst2, collapse = "+"))))
Unit: microseconds
expr min lq mean median uq max neval
Reduce(`+`, lst2) 954.9 1088.10 1341.271 1191.05 1389.00 6923.2 100
colSums(do.call(rbind, lst2)) 402.2 474.80 761.473 538.85 843.75 7079.7 100
vapply(transpose(lst2), sum, 0) 81.9 91.85 110.455 103.90 119.00 330.4 100
eval(str2lang(paste0(lst2, collapse = "+"))) 17489.2 18466.65 20767.888 19572.25 20809.80 57770.4 100
Here it is again with longer vectors, as in your use case. This benchmark takes a minute or two to run. Note that the unit is now milliseconds. I think the outcome will depend on how long the list is.
lst <- list(1:10000, 10001:20000, 20001:30000)
lst2 <- lst[rep(seq.int(length(lst)), 1000)]
microbenchmark::microbenchmark(colSums(do.call(rbind, lst2)),
vapply(transpose(lst2), sum, 0))
Unit: milliseconds
expr min lq mean median uq max neval
colSums(do.call(rbind, lst2)) 141.7147 146.6305 188.5108 163.4915 228.7852 270.5679 100
vapply(transpose(lst2), sum, 0) 261.8630 335.6093 348.6241 341.6958 348.6404 495.0994 100
You could use:
colSums(do.call(rbind, lst))
#[1] 10 3 6
Or similarly:
rowSums(do.call(cbind, lst))
where lst is:
lst <- list(c(0,0,1), c(1,2,3), c(9, 1, 2))
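As a quick sanity check, the matrix-based versions can be compared against Reduce on the same small list. This is only a sketch in base R, and the variable names are illustrative:

```r
lst <- list(c(0, 0, 1), c(1, 2, 3), c(9, 1, 2))

# Pairwise addition, one vector at a time.
by_reduce <- Reduce(`+`, lst)

# Stack the vectors as rows, then sum down each column.
by_cols <- colSums(do.call(rbind, lst))

# Stack the vectors as columns, then sum across each row.
by_rows <- rowSums(do.call(cbind, lst))

# All three should give the same elementwise sum.
all_agree <- isTRUE(all.equal(by_reduce, by_cols)) &&
  isTRUE(all.equal(by_cols, by_rows))
```

Note that the rbind and cbind versions build a full matrix first, so their memory use grows with the length of the list times the length of the vectors.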