Efficiently replicate matrices in R

Q: How do you replicate a matrix in R?

A matrix is created with the matrix function, so building one by replicating a vector only requires getting the replication right. For example, to create a matrix whose two rows are both copies of a vector V, use matrix(rep(V, times = 2), nrow = 2, byrow = TRUE). Without byrow = TRUE, matrix fills column by column, so the rows would not be copies of V.
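A minimal sketch of the vector case, assuming a hypothetical vector V that is not part of the original text. Because matrix() fills column by column by default, byrow = TRUE is used so that each row is a copy of V:

```r
# V is a hypothetical example vector used only for illustration
V <- c(10, 20, 30)

# Repeat V twice and fill the matrix row by row, so each row equals V
M <- matrix(rep(V, times = 2), nrow = 2, byrow = TRUE)

# An equivalent formulation: replicate() returns the copies of V as
# columns, and t() turns those columns into rows
M2 <- t(replicate(2, V))
```

Both M and M2 should be 2 x 3 matrices whose rows are V.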

Q: What does replicate() do in R?

replicate(n, expr) evaluates the expression expr n times and collects the results, simplifying them to a vector or matrix where possible.
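As a small illustration of that behaviour (the expressions below are arbitrary examples, not taken from the original text):

```r
# Evaluate a constant expression three times; by default the results
# are simplified to a plain vector
x <- replicate(3, 1 + 1)

# With simplify = FALSE the results are returned as a list instead
y <- replicate(3, 1:2, simplify = FALSE)
```

Since the expression is re-evaluated on every repetition, replicate() is mostly useful for expressions whose value can change between calls, such as random draws. For a fixed object, rep() avoids the repeated evaluation.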

Q: How do you duplicate a row of a matrix in R?

For example, if a matrix has only one row and three columns, replicating it three times row-wise repeats that one row three times. This can be done by combining rep with matrix, for instance matrix(rep(M, each = 3), nrow = 3) for a 1 x 3 matrix M.
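A short sketch of the single-row case, assuming a hypothetical 1 x 3 matrix M. The row-indexing variant is an alternative that is not mentioned in the original text:

```r
# M is a hypothetical 1 x 3 matrix used only for illustration
M <- matrix(c(1, 2, 3), nrow = 1)

# rep(..., each = 3) repeats every element three times in place,
# so the default column-wise filling yields three identical rows
R1 <- matrix(rep(M, each = 3), nrow = 3)

# Alternative: select row 1 three times; drop = FALSE keeps a matrix
R2 <- M[rep(1, 3), , drop = FALSE]
```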

Q: How do I replicate a Dataframe in R?

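Replicates of a data frame can be created with sapply, passing rep.int as the function and using its times argument to set the number of repetitions, for instance as.data.frame(sapply(df, rep.int, times = 3)). Note that sapply returns a matrix here, so columns of different types are coerced to a common type.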
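The sketch below uses row indexing rather than the sapply approach described above. It is shown because indexing keeps each column's type; the data frame df and its columns are hypothetical:

```r
# df is a hypothetical data frame with mixed column types
df <- data.frame(id = 1:2, label = c("a", "b"), stringsAsFactors = FALSE)

# Build an index that walks through all rows three times
idx <- rep(seq_len(nrow(df)), times = 3)

# Select the rows by index; every column keeps its original type
df3 <- df[idx, , drop = FALSE]

# Reset the row names, which would otherwise look like "1.1", "2.1", ...
rownames(df3) <- NULL
```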

Tags: r, matrix, replication

I have a matrix and am looking for an efficient way to replicate it n times (where n is the number of observations in the dataset). For example, if I have a matrix A

A <- matrix(1:15, nrow=3)

then I want an output of the form

rbind(A, A, A, ...) #n times.

Obviously, there are many ways to construct such a large matrix, for example using a for loop or apply or similar functions. However, the call to the "matrix-replication-function" takes place in the very core of my optimization algorithm where it is called tens of thousands of times during one run of my program. Therefore, loops, apply-type of functions and anything similar to that are not efficient enough. (Such a solution would basically mean that a loop over n is performed tens of thousands of times, which is obviously inefficient.) I already tried to use the ordinary rep function, but haven't found a way to arrange the output of rep in a matrix of the desired format.

The solution do.call("rbind", replicate(n, A, simplify=F)) is also too inefficient because rbind is used too often in this case. (About 30% of the total runtime of my program is then spent performing the rbinds.)

Does anyone know a better solution?


asked Oct 23 '12 16:10

Wolfgang Pößnecker

2 Answers

Two more solutions:

The first is a modification of the example in the question

do.call("rbind", rep(list(A), n))

The second involves unrolling the matrix, replicating it, and reassembling it.

matrix(rep(t(A),n), ncol=ncol(A), byrow=TRUE)
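To see why the second form works, here is a small check under the assumption of the question's matrix A and an arbitrary n. R stores matrices column by column, so the elements of t(A) come out in the order of A read row by row, and byrow = TRUE reassembles them as rows:

```r
A <- matrix(1:15, nrow = 3)
n <- 4  # arbitrary small value chosen for illustration

# rep() repeats the row-by-row sequence of A n times;
# byrow = TRUE then rebuilds those values as rows of width ncol(A)
via_matrix <- matrix(rep(t(A), n), ncol = ncol(A), byrow = TRUE)

# The list-based rbind variant for comparison
via_rbind <- do.call("rbind", rep(list(A), n))

# TRUE if both approaches produce the same stacked matrix
same <- identical(dim(via_matrix), dim(via_rbind)) &&
  all(via_matrix == via_rbind)
```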

Since efficiency is what was requested, benchmarking is necessary:

library("rbenchmark")
A <- matrix(1:15, nrow=3)
n <- 10

benchmark(rbind(A, A, A, A, A, A, A, A, A, A),
          do.call("rbind", replicate(n, A, simplify=FALSE)),
          do.call("rbind", rep(list(A), n)),
          apply(A, 2, rep, n),
          matrix(rep(t(A),n), ncol=ncol(A), byrow=TRUE),
          order="relative", replications=100000)

which gives:

                                                 test replications elapsed
1                 rbind(A, A, A, A, A, A, A, A, A, A)       100000    0.91
3                   do.call("rbind", rep(list(A), n))       100000    1.42
5  matrix(rep(t(A), n), ncol = ncol(A), byrow = TRUE)       100000    2.20
2 do.call("rbind", replicate(n, A, simplify = FALSE))       100000    3.03
4                                 apply(A, 2, rep, n)       100000    7.75
  relative user.self sys.self user.child sys.child
1    1.000      0.91        0         NA        NA
3    1.560      1.42        0         NA        NA
5    2.418      2.19        0         NA        NA
2    3.330      3.03        0         NA        NA
4    8.516      7.73        0         NA        NA

So the fastest is the raw rbind call, but that assumes n is fixed and known ahead of time. If n is not fixed, then the fastest is do.call("rbind", rep(list(A), n)). These timings were for a 3x5 matrix and 10 replications. Different sized matrices might give different orderings.

EDIT:

For n=600, the results are in a different order (leaving out the explicit rbind version):

A <- matrix(1:15, nrow=3)
n <- 600

benchmark(do.call("rbind", replicate(n, A, simplify=FALSE)),
          do.call("rbind", rep(list(A), n)),
          apply(A, 2, rep, n),
          matrix(rep(t(A),n), ncol=ncol(A), byrow=TRUE),
          order="relative", replications=10000)

giving

                                                 test replications elapsed
4  matrix(rep(t(A), n), ncol = ncol(A), byrow = TRUE)        10000    1.74
3                                 apply(A, 2, rep, n)        10000    2.57
2                   do.call("rbind", rep(list(A), n))        10000    2.79
1 do.call("rbind", replicate(n, A, simplify = FALSE))        10000    6.68
  relative user.self sys.self user.child sys.child
4    1.000      1.75        0         NA        NA
3    1.477      2.54        0         NA        NA
2    1.603      2.79        0         NA        NA
1    3.839      6.65        0         NA        NA

If you include the explicit rbind version, it is slightly faster than the do.call("rbind", rep(list(A), n)) version, but not by much, and slower than either the apply or matrix versions. So a generalization to arbitrary n does not require a loss of speed in this case.
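Two further formulations that are not part of the benchmarks above are row indexing and a Kronecker product. The sketch below only checks that they produce the same matrix. Their speed relative to the benchmarked versions is not measured here and would need to be timed on your own data:

```r
A <- matrix(1:15, nrow = 3)
n <- 600

# Row indexing: select rows 1..nrow(A) over and over, n times
by_index <- A[rep(seq_len(nrow(A)), times = n), , drop = FALSE]

# Kronecker product of an n x 1 column of ones with A stacks n copies of A
by_kron <- kronecker(matrix(1L, nrow = n, ncol = 1), A)

# One of the benchmarked approaches, used as the reference result
reference <- matrix(rep(t(A), n), ncol = ncol(A), byrow = TRUE)
```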


answered Sep 21 '22 15:09

Brian Diggs

Probably this is more efficient:

apply(A, 2, rep, n)
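A brief sketch of what this call does, using the question's matrix A and a small n chosen for illustration:

```r
A <- matrix(1:15, nrow = 3)
n <- 2  # small value chosen for illustration

# apply() over MARGIN = 2 calls rep(column, n) for every column of A;
# the equal-length results are bound together as columns of a new matrix
stacked <- apply(A, 2, rep, n)
```

Each column of length 3 becomes a column of length 3 * n, so stacked should match rbind(A, A) for n = 2.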

answered Sep 22 '22 15:09

Sven Hohenstein
