I want to apply a function to all pairwise combinations of list elements.
Each element is a vector of the same length. I would like the output as an n x n
matrix, where n is the number of elements in my list.
Consider the following example:
# Generating data
l <- list()
for(i in 1:5) l[[i]] <- sample(0:9, 5, TRUE)
# Function to apply
foo <- function(x, y) 1 - sum(x * y) / sqrt(sum(x ^ 2) * sum(y ^ 2))
# Generating combinations
comb <- expand.grid(x = 1:5, y = 1:5)
This loop works, but it is slow and the output is not formatted as a matrix:
# Applying function
out <- list()
for(i in 1:nrow(comb)) {
out[[i]] <- foo(l[[comb[i, 'x']]], l[[comb[i, 'y']]])
}
Any idea?
A nested sapply would do the trick:
sapply(l, function(x) sapply(l, function(y) foo(x,y)))
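For reference, here is a self-contained sketch of the nested sapply on the question's data. The set.seed() call and the names res, out and res_loop are my additions for illustration. The check at the end compares the result with the same pairwise evaluation the question's loop performs, reshaped into a matrix.

```r
set.seed(42)  # assumption: fixed seed, only so the example is reproducible

# Data and function from the question
l <- lapply(1:5, function(i) sample(0:9, 5, TRUE))
foo <- function(x, y) 1 - sum(x * y) / sqrt(sum(x ^ 2) * sum(y ^ 2))

# Nested sapply: the inner call builds one column, the outer call binds the columns
res <- sapply(l, function(x) sapply(l, function(y) foo(x, y)))
dim(res)

# Same index pairs as the question's loop, collected into a numeric vector
comb <- expand.grid(x = 1:5, y = 1:5)
out <- vapply(seq_len(nrow(comb)),
              function(i) foo(l[[comb[i, "x"]]], l[[comb[i, "y"]]]),
              numeric(1))

# expand.grid varies x fastest, so a column-major fill puts x on the rows
res_loop <- matrix(out, nrow = 5, ncol = 5)

all.equal(res, res_loop)
```

Because foo is symmetric in x and y, it does not matter which of the two sapply calls ends up indexing rows and which indexes columns.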
I was interested in @A. Webb's solution, outer(l, l, Vectorize(foo)). Here is some benchmarking:
R> for(i in 1:50) l[[i]] <- sample(0:9, 5, T)
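R> # Note: 'time=1000' is a typo for 'times'. microbenchmark treats it as a third expression named 'time' (the last row of the output) and runs the default 100 evaluations.
R> microbenchmark(sapply(l, function(x) sapply(l, function(y) foo(x,y))), outer(l,l,Vectorize(foo)), time=1000)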
Unit: nanoseconds
                                                    expr     min        lq      mean    median        uq      max neval
 sapply(l, function(x) sapply(l, function(y) foo(x, y))) 7493739 8479127.0 1.042e+07 1.027e+07 1.155e+07 17982289   100
                             outer(l, l, Vectorize(foo)) 6778098 8316362.5 1.030e+07 1.002e+07 1.187e+07 16076063   100
                                                    time       5      48.5 1.672e+02 1.385e+02 1.875e+02      914   100
R> for(i in 1:500) l[[i]] <- sample(0:9, 5, T)
R> microbenchmark(sapply(l, function(x) sapply(l, function(y) foo(x,y))), outer(l,l,Vectorize(foo)), times=100)
Unit: milliseconds
                                                    expr   min    lq  mean median    uq  max neval
 sapply(l, function(x) sapply(l, function(y) foo(x, y))) 677.3 768.5 820.4  815.9 842.7 1278   100
                             outer(l, l, Vectorize(foo)) 828.6 903.0 958.3  930.7 960.5 1819   100
So for the 50-element list the two are effectively tied (median about 10.0 ms for outer versus 10.3 ms for the nested sapply), while for the 500-element list the nested sapply appears to be somewhat faster (median 815.9 ms versus 930.7 ms).
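As a quick sanity check that the two benchmarked expressions compute the same thing, the sketch below compares their results on a 50-element list. It also includes a matrix-algebra variant that is not from either answer: foo is the cosine distance, so the whole matrix can be obtained from a single tcrossprod() call. The seed and the names m, ss, res_sapply, res_outer and res_mat are assumptions for illustration, and I have not timed the third variant here.

```r
set.seed(1)  # assumption: fixed seed for reproducibility

l <- lapply(1:50, function(i) sample(0:9, 5, TRUE))
foo <- function(x, y) 1 - sum(x * y) / sqrt(sum(x ^ 2) * sum(y ^ 2))

# The two expressions from the benchmark
res_sapply <- sapply(l, function(x) sapply(l, function(y) foo(x, y)))
res_outer  <- outer(l, l, Vectorize(foo))

# Matrix-algebra variant (not from the answers): stack the vectors as rows
m  <- do.call(rbind, l)   # 50 x 5 matrix, one list element per row
ss <- rowSums(m ^ 2)      # squared norm of each row
res_mat <- 1 - tcrossprod(m) / sqrt(outer(ss, ss))

all.equal(res_sapply, res_outer)
all.equal(res_sapply, res_mat)
```

The variant only works because foo happens to be expressible with dot products. For an arbitrary function of two vectors, the nested sapply or outer with Vectorize remains the general approach.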