I am trying to find an elegant and fast way to vectorize the simple code below. It is basically a set of nested for loops, but the nesting is unusual: each inner loop starts at the current index of the loop enclosing it. The function special_print can be replaced by any function taking the same vector of arguments.
Thanks for the help, Patrick
vector <- 1:30
special_print = function(i,j,k) {
print(c(vector[i], vector[j], vector[k]))
}
for (i in 1:30) {
for (j in i:30) {
for (k in j:30){
special_print(i,j,k)
}
}
}
My question is how to generate a structure "index" that can be used in the following code
apply(index, special_print, MARGIN = 1)
and that generates the same output as above.
I have tried the following subroutines, but they seem to take too much time
is.increasing = function(x){
return( all(diff(x) >= 0))  # non-decreasing, to match the loops (j starts at i, k at j)
}
increasing_index = function(a) {
clean_index <- apply(a,is.increasing, MARGIN = 1)
b = a[clean_index == TRUE,]
return(b)
}
library(magrittr)  # provides the %>% pipe
data <- replicate(3, 1:30) %>% as.data.frame()  # replicate() takes the count first
a <- expand.grid(data) %>% as.data.frame()
index <- increasing_index(a)
One thing slowing you down is working with data frames. Don't do that. Matrices are faster, especially with row operations and apply(). Once you have a matrix, it is very fast to do a <- a[a[,1] <= a[,2] & a[,2] <= a[,3],].
So I'd write your code like this:
a <- as.matrix(expand.grid(1:30, 1:30, 1:30))
a <- a[a[,1] <= a[,2] & a[,2] <= a[,3],]
apply(a, special_print, MARGIN = 1)
This produces the rows in a different order than your for loops, with column 1 varying fastest. If that matters, you can do it this way:
a <- as.matrix(expand.grid(1:30, 1:30, 1:30))[,3:1]
a <- a[a[,1] <= a[,2] & a[,2] <= a[,3],]
apply(a, special_print, MARGIN = 1)
where the first line now reverses the order of the columns.
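If you want to convince yourself that the reversed-column version really visits the rows in loop order, a small check along these lines should do. It is only a sketch: n = 5 stands in for 30 to keep it quick, and loop_index and same_order are names made up for the illustration.

```r
n <- 5  # small stand-in for 30, chosen only for illustration

# Reference: record the (i, j, k) triples in the order the nested loops visit them
loop_index <- matrix(NA_integer_, nrow = choose(n + 2, 3), ncol = 3)
r <- 0
for (i in 1:n) for (j in i:n) for (k in j:n) {
  r <- r + 1
  loop_index[r, ] <- c(i, j, k)
}

# Grid approach: reverse the columns so the last column varies fastest, then filter
a <- as.matrix(expand.grid(1:n, 1:n, 1:n))[, 3:1]
a <- a[a[, 1] <= a[, 2] & a[, 2] <= a[, 3], ]

# Compare shape and values; column names differ, so compare element-wise
same_order <- all(dim(a) == dim(loop_index)) && all(a == loop_index)
print(same_order)
```

The filter keeps rows in their existing order, so the result depends only on how expand.grid() lays out the grid.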
EDITED to add:
Here's an even better way, inspired by @ThomasIsCoding's answer:
a <- t(combn(1:32, 3) - 0:2)
This takes 3 items from 32, then keeps the first, subtracts 1 from the second, and 2 from the third. That gives a non-decreasing sequence of 3 chosen from 1 to 30. It assumes combn() always returns the values in increasing order; I couldn't spot that as guaranteed in the docs, but it appears to be true in practice.
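Because the combn() ordering is an assumption rather than a documented guarantee, it may be worth comparing the result against the filtered grid on a small case before relying on it. The sketch below does that with n = 5 instead of 30; a_combn, a_grid, non_decreasing and matches_grid are illustrative names, not anything from the question.

```r
n <- 5  # small stand-in for 30, chosen only for illustration

# combn() trick: choose 3 of n + 2, then subtract 0, 1, 2 from each column's entries
a_combn <- t(combn(n + 2, 3) - 0:2)

# Reference: filtered grid with reversed columns, so the last column varies fastest
a_grid <- as.matrix(expand.grid(1:n, 1:n, 1:n))[, 3:1]
a_grid <- a_grid[a_grid[, 1] <= a_grid[, 2] & a_grid[, 2] <= a_grid[, 3], ]

# Every row should be non-decreasing and lie within 1..n
non_decreasing <- all(a_combn[, 1] <= a_combn[, 2] & a_combn[, 2] <= a_combn[, 3]) &&
  all(a_combn >= 1 & a_combn <= n)

# Same rows in the same order as the grid approach
matches_grid <- all(dim(a_combn) == dim(a_grid)) && all(a_combn == a_grid)

print(c(non_decreasing = non_decreasing, matches_grid = matches_grid))
```

If matches_grid ever came back FALSE, sorting the rows or falling back to the grid filter would be the safer route.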
I would see if mapply(…, ) would succeed.
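# note: this enumerates all 30^3 triples, not only those with i <= j <= k
mapply( special_print,  # pass the function itself, not a call to it
i=rep(1:30, each=30*30),
j=rep(1:30, each =30, times=30),
k=rep(1:30, times=30*30)
)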
The mapply function should be efficient, but be aware that your “special function” is probably responsible for any perceived inefficiency. You should probably examine its algorithms to see if they can be vectorized.
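As a rough illustration of both points, the sketch below feeds mapply() only the triples with i <= j <= k and then calls the same function once on whole columns. Note that mapply() passes i, j and k as separate arguments, which fits a three-argument function, whereas apply(..., MARGIN = 1) hands each row over as a single vector. special_sum is a hypothetical stand-in for the real function (it returns a value instead of printing), and the index is built with combn() on the assumption that this gives the wanted triples.

```r
vector <- 1:30

# Hypothetical stand-in for special_print: returns a value instead of printing
special_sum <- function(i, j, k) vector[i] + vector[j] + vector[k]

# Restricted index: one row per triple with i <= j <= k (assumed construction)
index <- t(combn(32, 3) - 0:2)

# Element-wise: one function call per row of the index
res_mapply <- mapply(special_sum, index[, 1], index[, 2], index[, 3])

# Vectorized: a single call on whole columns, possible because
# indexing and + already operate on vectors
res_vec <- special_sum(index[, 1], index[, 2], index[, 3])

print(length(res_mapply))
print(all(res_mapply == res_vec))
```

Whether the single vectorized call is possible depends entirely on what the real function does internally; when it is, it avoids the per-row function calls altogether.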