Efficiently replicate matrix rows by group in R

Question

I am trying to find a way to efficiently replicate rows of a matrix in R based on a group. Let's say I have the following matrix a:

a <- matrix(
  c(1, 2, 3,
    4, 5, 6,
    7, 8, 9),
  ncol = 3, byrow = TRUE
)

I want to create a new matrix where each row in a is repeated based on a number specified in a vector (what I'm calling a "group"), e.g.:

reps <- c(2, 3, 4)

In this case, the resulting matrix would be:

     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    1    2    3
[3,]    4    5    6
[4,]    4    5    6
[5,]    4    5    6
[6,]    7    8    9
[7,]    7    8    9
[8,]    7    8    9
[9,]    7    8    9

This is the only solution I've come up with so far:

matrix(
  rep(a, times = rep(reps, times = 3)), 
  ncol = 3, byrow = FALSE
)

Notice that in this solution I have to use rep() twice - first to replicate the reps vector, and then again to actually replicate each row of a.
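To make it clearer why the inner rep() is needed, here is a minimal sketch that breaks the one-liner into steps. It relies on R storing matrices in column-major order; the intermediate names times_per_element and expanded are just illustrative and not part of my actual code.

```r
a <- matrix(
  c(1, 2, 3,
    4, 5, 6,
    7, 8, 9),
  ncol = 3, byrow = TRUE
)
reps <- c(2, 3, 4)

# rep(a, ...) sees a as the column-major vector 1 4 7 2 5 8 3 6 9,
# so the per-row counts have to be repeated once per column.
times_per_element <- rep(reps, times = ncol(a))  # 2 3 4 2 3 4 2 3 4

# Repeat each element by the count of the row it belongs to.
expanded <- rep(a, times = times_per_element)

# Refill column by column; each column now holds sum(reps) values.
result <- matrix(expanded, ncol = ncol(a), byrow = FALSE)
```

The second rep() call is the costly one, since it repeats every element of a individually.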

This solution works fine, but I'm looking for something more efficient: in my case it runs in every iteration of an optimization loop, and it's rather slow when a is large.

I'll note that this question is very similar, but it is about repeating each row the same number of times. This question is also about efficiency, but it deals with replicating entire matrices.

UPDATE

Since I'm interested in efficiency, here is a simple comparison of the solutions provided thus far. I'll update this as more come in, but in general it looks like the seq_along solution by F. Privé is the fastest.

library(dplyr)
library(tidyr)

a <- matrix(seq(9), ncol = 3, byrow = TRUE)
reps <- c(2, 3, 4)

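rbenchmark::benchmark(
  # Expand the column-major vector element by element, then refill the matrix
  "original solution" = {
    result <- matrix(rep(a, times = rep(reps, times = 3)),
      ncol = 3, byrow = FALSE)
  },
  # Build a vector of repeated row indices and subset once
  "seq_along" = {
    result <- a[rep(seq_along(reps), reps), ]
  },
  # tidyr::uncount() needs a data frame, so this returns a data frame, not a matrix
  "uncount" = {
    result <- as.data.frame(a) %>%
      uncount(reps)
  },
  replications = 1000,
  columns = c("test", "replications", "elapsed", "relative")
)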

               test replications elapsed relative
1 original solution         1000   0.004    1.333
2         seq_along         1000   0.003    1.000
3           uncount         1000   1.722  574.000
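The timings above are for a 3 x 3 matrix, so they mostly measure call overhead. A rough sketch of how the two base R approaches could be compared on a larger input follows. The sizes, the random data, and the helper names via_rep and via_index are all assumptions chosen for illustration, and no timings are quoted because they depend on the machine.

```r
set.seed(1)

# Hypothetical problem size; adjust to match the real use case.
n_rows <- 2000
n_cols <- 50
big_a <- matrix(rnorm(n_rows * n_cols), ncol = n_cols)
big_reps <- sample(1:5, n_rows, replace = TRUE)

# Original approach: repeat every element, then rebuild the matrix.
via_rep <- function(m, r) {
  matrix(rep(m, times = rep(r, times = ncol(m))), ncol = ncol(m))
}

# Index approach: repeat row indices, then subset once.
via_index <- function(m, r) {
  m[rep(seq_along(r), r), , drop = FALSE]
}

# Both approaches should produce exactly the same matrix.
same_result <- identical(via_rep(big_a, big_reps), via_index(big_a, big_reps))

# Elapsed seconds for 20 runs of each approach (base R only).
elapsed <- c(
  rep_twice = system.time(for (i in 1:20) via_rep(big_a, big_reps))[["elapsed"]],
  row_index = system.time(for (i in 1:20) via_index(big_a, big_reps))[["elapsed"]]
)
print(elapsed)
```

Checking same_result before timing guards against comparing two functions that do not compute the same thing.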

F. Privé · Accepted Answer

Simply use a[rep(seq_along(reps), reps), ].
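As a small worked sketch of why this works: rep(seq_along(reps), reps) builds one row index per output row, and subsetting with repeated indices copies the corresponding rows. The name idx and the drop = FALSE argument are additions for illustration (the latter keeps the result a matrix even when only one row is selected), and the sketch assumes length(reps) equals nrow(a).

```r
a <- matrix(1:9, ncol = 3, byrow = TRUE)
reps <- c(2, 3, 4)

# One index per output row: row 1 twice, row 2 three times, row 3 four times.
idx <- rep(seq_along(reps), times = reps)  # 1 1 2 2 2 3 3 3 3

# Subsetting with repeated indices duplicates the rows in a single step.
result <- a[idx, , drop = FALSE]
```

Only sum(reps) integers are generated before the subset, rather than one repeat count per matrix element.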

Tags: performance, r, matrix, replication
