I'm looking for as much speed as possible while staying in base R to do what expand.grid does. I have used outer for similar purposes in the past to create a vector; something like this:
v <- outer(letters, LETTERS, paste0)
unlist(v[lower.tri(v)])
Benchmarking has shown me that outer can be drastically faster than expand.grid, but this time I want to create two columns just like expand.grid does (all possible combinations of 2 vectors), and my methods using outer do not benchmark as fast this time.
I'm hoping to take 2 vectors and create every possible combination as two columns as fast as possible. I think outer may be the route, but I am wide open to any base method.
Here's the expand.grid method and the outer method.
dat <- cbind(mtcars, mtcars, mtcars)
expand.grid(seq_len(nrow(dat)), seq_len(ncol(dat)))
FOO <- function(x, y) paste(x, y, sep=":")
x <- outer(seq_len(nrow(dat)), seq_len(ncol(dat)), FOO)
apply(do.call("rbind", strsplit(x, ":")), 2, as.integer)
The microbenchmarking shows outer is slower:
#     expr      min        lq    median        uq      max
# EXPAND.G  812.743  838.6375  894.6245  927.7505 27029.54
#    OUTER 5107.871 5198.3835 5329.4860 5605.2215 27559.08
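For anyone wanting to reproduce this comparison, here is a minimal sketch of how the two approaches above might be wrapped and timed. The wrapper names expand_g and outer_m are made up for illustration, and the timing step assumes the microbenchmark package is installed (it is skipped otherwise). Timings will differ by machine, so none are shown.

```r
dat  <- cbind(mtcars, mtcars, mtcars)
rows <- seq_len(nrow(dat))
cols <- seq_len(ncol(dat))

# Hypothetical wrapper around the expand.grid approach
expand_g <- function() expand.grid(rows, cols)

# Hypothetical wrapper around the outer + paste + strsplit approach
FOO <- function(x, y) paste(x, y, sep = ":")
outer_m <- function() {
  x <- outer(rows, cols, FOO)
  apply(do.call("rbind", strsplit(x, ":")), 2, as.integer)
}

# Sanity check: both approaches yield the same pairs in the same order
eg <- expand_g()
om <- outer_m()
stopifnot(identical(om[, 1], eg$Var1),
          identical(om[, 2], eg$Var2))

# Timing step; assumes microbenchmark is available
if (requireNamespace("microbenchmark", quietly = TRUE)) {
  print(microbenchmark::microbenchmark(
    EXPAND.G = expand_g(),
    OUTER    = outer_m(),
    times    = 100L
  ))
}
```

The sanity check matters here because the outer version round-trips through character strings, so it is worth confirming the integer pairs come back unchanged before comparing speed.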
I think my use of outer is slow because I don't know how to make outer directly create length-2 vectors that I can bind together with do.call('rbind', ...). Instead I have to do a slow paste and a slow strsplit. How can I do this with outer (or other methods in base) in a way that's faster than expand.grid?
EDIT: Adding the microbenchmark results.
Unit: microseconds
      expr     min       lq  median      uq       max
1   ERNEST  34.993  39.1920  52.255  57.854 29170.705
2     JOHN  13.997  16.3300  19.130  23.329   266.872
3 ORIGINAL 352.720 372.7815 392.377 418.738 36519.952
4    TOMMY  16.330  19.5960  23.795  27.061  6217.374
5  VINCENT 377.447 400.3090 418.505 451.864 43567.334
The documentation for rep.int isn't quite complete. It isn't only the fastest option in the most common case: you can also pass a vector for the times argument, just like with rep. You can use it in a straightforward way for both sequences, reducing the time by another 40% or so compared with Tommy's.
expand.grid.jc <- function(seq1, seq2) {
  cbind(Var1 = rep.int(seq1, length(seq2)),
        Var2 = rep.int(seq2, rep.int(length(seq1), length(seq2))))
}
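As a quick illustration of what the two rep.int calls do, here is a small sketch with made-up inputs s1 and s2 (not from the question). The function is repeated so the block runs on its own, and the result is checked against expand.grid.

```r
# Definition repeated from above so this block is self-contained
expand.grid.jc <- function(seq1, seq2) {
  cbind(Var1 = rep.int(seq1, length(seq2)),
        Var2 = rep.int(seq2, rep.int(length(seq1), length(seq2))))
}

# Illustrative inputs (hypothetical, chosen to keep the output small)
s1 <- 1:3
s2 <- 1:2

# Scalar times: the whole vector is repeated end to end
a <- rep.int(s1, length(s2))                       # 1 2 3 1 2 3

# Vector times: each element is repeated its own number of times
b <- rep.int(s2, rep.int(length(s1), length(s2)))  # 1 1 1 2 2 2

res <- expand.grid.jc(s1, s2)
ref <- expand.grid(s1, s2)

# Same pairs, in the same order, as expand.grid
stopifnot(identical(res[, "Var1"], ref$Var1),
          identical(res[, "Var2"], ref$Var2))
```

Note that the result is a matrix rather than a data frame, so both inputs are coerced to a common type; that is fine for two integer sequences but worth keeping in mind for mixed types.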