Efficient way to perform matrix multiplication repeatedly

Tags:

performance

loops

foreach

r

parallel-processing

I am trying to compute the matrix product S_g for every combination of i and g. This is what I have tried so far, but it takes a huge amount of time to complete. Is there a more computationally efficient way to do exactly the same thing?

The main thing to note is that S_g is built from X_gamma and Y[,i] by matrix multiplication, and X_gamma depends on the value of g. Therefore, for each i, I have to perform one set of matrix multiplications per value of g.

Here is the logic:

For each i, the computation needs to be done for each g. Then, for each g, X_gamma is selected as a subset of X. Here is how X_gamma is determined. Let's take g = 3. When we look at 'set[3,]', we have that column B is the only one with value != 0. Therefore, I select the column B in X, and that would be X_gamma.

My main problem is that, in reality, g runs up to 13,000 and i runs up to 700.

 library(zoo)        ## needed for the zoo() calls below
 library(foreach)
 library(doParallel) ## parallel backend for the foreach function
 registerDoParallel()

 T = 3
 c = 100

 X <- zoo(data.frame(A = c(0.1, 0.2, 0.3), B = c(0.4, 0.5, 0.6), C = c(0.7,0.8,0.9)),
     order.by = seq(from = as.Date("2013-01-01"), length.out = 3, by = "month")) 

 Y <- zoo(data.frame(Stock1 = rnorm(3,0,0.5), Stock2 = rnorm(3,0,0.5), Stock3 = rnorm(3,0,0.5)), 
    order.by = seq(from = as.Date("2013-01-01"), length.out = 3, by = "month"))

 l <- rep(list(0:1),ncol(X))
 set = do.call(expand.grid, l)
 colnames(set) <- colnames(X)

 I = diag(T)


 denom <- foreach(i=1:ncol(Y)) %dopar% {    
    library(zoo)
    library(stats)
    library(Matrix)
    library(base)

    result = c()
    for(g in 1:nrow(set)) {
        X_gamma = X[,which(colnames(X) %in% colnames(set[which(set[g,] != 0)]))]
        S_g = Y[,i] %*% (I - (c/(1+c))*(X_gamma %*% solve(crossprod(X_gamma)) %*% t(X_gamma))) %*% Y[,i] 
        result[g] = ((1+c)^(-sum(set[g,])/2)) * ((S_g)^(-T/2))
    }
    sum(result) 
 }

Thank you for your help!


asked Sep 10 '13 16:09

Mayou

2 Answers

The most obvious problem is that you fell victim to one of the classic blunders: not preallocating the output vector result. Appending one value at a time can be very inefficient for large vectors.

In your case, result doesn't need to be a vector: you can accumulate the results in a single value:

result = 0
for(g in 1:nrow(set)) {
    # snip
    result = result + ((1+c)^(-sum(set[g,])/2)) * ((S_g)^(-T/2))
}
result
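As a rough illustration of the point above, here is a minimal base-R sketch comparing the three patterns: growing a vector, preallocating it, and accumulating into a scalar. The function `term()` is a hypothetical stand-in for the per-g expression and is not part of the original code.

```r
# Hypothetical stand-in for the per-g term; not part of the original code.
term <- function(g) 1 / (g + 1)
n <- 1000

# 1) Growing the vector one element at a time (the pattern to avoid).
grown <- c()
for (g in seq_len(n)) grown[g] <- term(g)

# 2) Preallocating the full vector up front.
prealloc <- numeric(n)
for (g in seq_len(n)) prealloc[g] <- term(g)

# 3) Accumulating into a single scalar, as suggested above.
total <- 0
for (g in seq_len(n)) total <- total + term(g)

# All three give the same sum up to floating-point rounding.
stopifnot(isTRUE(all.equal(sum(grown), sum(prealloc))),
          isTRUE(all.equal(sum(prealloc), total)))
```

The scalar accumulator avoids storing the intermediate values at all, which is all the original loop needs since only the sum is returned.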

But I think the most important performance improvement that you could make is to precompute expressions that are currently being computed repeatedly in the foreach loop. You can do that with a separate foreach loop. I also suggest using solve differently to avoid the second matrix multiplication:

X_gamma_list <- foreach(g=1:nrow(set)) %dopar% {
  X_gamma <- X[, which(set[g,] != 0)]
  I - (c/(1+c)) * (X_gamma %*% solve(crossprod(X_gamma), t(X_gamma)))
}

These computations are now performed only once, rather than once for each column of Y, which is 700 times less work in your case.
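To see why the two-argument form of solve is a drop-in replacement, here is a small sketch on toy data (the random 6 x 2 matrix is only an assumption for illustration). `solve(A, B)` solves the linear system A Z = B directly instead of forming the inverse of A and then multiplying.

```r
set.seed(1)
X_gamma <- matrix(rnorm(12), nrow = 6, ncol = 2)  # toy data: 6 rows, 2 columns
A <- crossprod(X_gamma)                           # same as t(X_gamma) %*% X_gamma

# Explicit inverse followed by a matrix product (the form used in the question).
P1 <- X_gamma %*% solve(A) %*% t(X_gamma)

# Solving the linear system A Z = t(X_gamma) directly (the form used above).
P2 <- X_gamma %*% solve(A, t(X_gamma))

# Both give the same projection matrix up to rounding.
stopifnot(isTRUE(all.equal(P1, P2)))
```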

In a similar vein, it makes sense to factor out the expression ((1+c)^(-sum(set[g,])/2)), as suggested by tim riffe, as well as -T / 2 while we're at it:

a <- (1+c) ^ (-rowSums(set) / 2)
nT2 <- -T / 2
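A quick check, on the 3-column toy set from the question, that the vectorized rowSums form matches the per-row expression. The name `cc` is used here instead of `c` only to avoid masking `base::c` in this sketch.

```r
cc <- 100                                   # stands in for c in the question
set <- expand.grid(rep(list(0:1), 3))       # all 0/1 combinations of 3 columns

# Per-row computation, as in the original loop.
a_loop <- numeric(nrow(set))
for (g in seq_len(nrow(set))) {
  a_loop[g] <- (1 + cc)^(-sum(set[g, ]) / 2)
}

# Vectorized computation, done once for every row.
a_vec <- (1 + cc)^(-rowSums(set) / 2)

# rowSums() keeps row names, so drop them before comparing.
stopifnot(isTRUE(all.equal(a_loop, unname(a_vec))))
```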

To iterate over the columns of the zoo object Y, I suggest using the isplitCols function from the itertools package. Make sure you load itertools at the top of your script:

library(itertools)

isplitCols lets you send only the columns that are needed for each task, rather than sending the entire object to all workers. The only trick is that you need to remove the dim attribute from the resulting zoo objects for your code to work, since isplitCols uses drop=FALSE.

Finally, here's the main foreach loop:

denom <- foreach(Yi=isplitCols(Y, chunkSize=1), .packages='zoo') %dopar% {
  dim(Yi) <- NULL  # isplitCols uses drop=FALSE
  result <- 0
  for(g in seq_along(X_gamma_list)) {
    S_g <- Yi %*% X_gamma_list[[g]] %*% Yi
    result <- result + a[g] * S_g ^ nT2
  }
  result
}

Note that I would not perform the inner loop in parallel. That would only make sense if there weren't enough columns in Y to keep all of your processors busy. Parallelizing the inner loop could result in tasks that are too short, effectively unchunking the computation and making the code run much slower. It's much more important to perform the inner loop efficiently since g is large.
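One further way to make the per-g work cheaper, which is not part of the answer above and is offered only as a sketch: for a fixed g, the quadratic forms for all columns of Y can be computed in one shot with an element-wise product and colSums. The plain matrices, sizes, and the name `cc` below are assumptions for illustration, not the asker's zoo data.

```r
set.seed(42)
n_obs <- 5
Y <- matrix(rnorm(n_obs * 4), nrow = n_obs)        # 4 toy "stock" columns
X_gamma <- matrix(rnorm(n_obs * 2), nrow = n_obs)  # toy regressor subset
cc <- 100                                          # stands in for c

# The matrix that the answer precomputes once per g.
M <- diag(n_obs) -
  (cc / (1 + cc)) * X_gamma %*% solve(crossprod(X_gamma), t(X_gamma))

# One quadratic form per column, computed column by column.
S_loop <- vapply(seq_len(ncol(Y)),
                 function(i) drop(Y[, i] %*% M %*% Y[, i]),
                 numeric(1))

# All quadratic forms at once: element-wise product, then column sums.
S_vec <- colSums(Y * (M %*% Y))

stopifnot(isTRUE(all.equal(S_loop, S_vec)))
```

Whether this beats the column-wise loop in practice depends on the sizes involved and on how the work is split across workers, so it is worth timing on real data before adopting it.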


answered Sep 22 '22 08:09

Steve Weston

I second @eddi that you should give some objects so we can run code. The following remarks are based on staring:

1) You could save S_g in a preallocated vector and move that last line ( ((1+c)^(-sum(set[g,])/2)) * ((S_g)^(-T/2)) ) out of the loop, since rowSums(set) will give you what you need. That removes one instance of indexing with g.

2) Indexing is slowing you down. Don't use which(); logical vectors work just fine.

3) -T/2 is dangerous: T is also R's built-in shorthand for TRUE, so if T has not been reassigned, -T/2 evaluates to -0.5. If that's what you want, then just do 1/sqrt(S_g_vec) for speed.
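The last two remarks can be checked in a few lines of base R. The values in `S_g_vec` and the small matrix `X` are toy data made up for this sketch.

```r
# base::T is the built-in shorthand for TRUE, which is numerically 1.
stopifnot(isTRUE(base::T), -base::T / 2 == -0.5)

# With an exponent of -0.5, x^(-0.5) and 1/sqrt(x) agree.
S_g_vec <- c(0.5, 2, 8)   # toy values, purely illustrative
stopifnot(isTRUE(all.equal(S_g_vec^(-0.5), 1 / sqrt(S_g_vec))))

# Logical indexing selects the same columns as which().
X <- matrix(1:6, nrow = 2, dimnames = list(NULL, c("A", "B", "C")))
keep <- c(0, 1, 0) != 0   # like set[g, ] != 0 for a row that selects only B
stopifnot(identical(X[, which(keep), drop = FALSE],
                    X[, keep, drop = FALSE]))
```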

answered Sep 23 '22 08:09

tim riffe
