Repeat rows of a data.frame N times

Tags: dataframe, r

I have the following data frame:

    data.frame(a = c(1,2,3),b = c(1,2,3))
      a b
    1 1 1
    2 2 2
    3 3 3

I want to repeat the rows n times. For example, here the rows are repeated 3 times:

      a b
    1 1 1
    2 2 2
    3 3 3
    4 1 1
    5 2 2
    6 3 3
    7 1 1
    8 2 2
    9 3 3

Is there an easy function to do this in R? Thanks!

581

asked Jan 06 '12 04:01

Michael

2 Answers

EDIT: updated to a better modern R answer.

You can use replicate() to make n copies of the data frame, then rbind them back together. The row names are automatically renumbered to run from 1 to the total number of rows.

    d <- data.frame(a = c(1,2,3),b = c(1,2,3))
    n <- 3
    do.call("rbind", replicate(n, d, simplify = FALSE))
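To see why this works, it may help to look at the intermediate value. The following is a minimal sketch reusing d and n from the answer; the names copies and out are introduced here for illustration only. replicate() with simplify = FALSE returns a plain list of n data frames, and do.call() passes all of them to a single rbind() call.

```r
d <- data.frame(a = c(1, 2, 3), b = c(1, 2, 3))
n <- 3

# replicate(..., simplify = FALSE) returns a list holding n copies of d
copies <- replicate(n, d, simplify = FALSE)
length(copies)   # 3

# do.call() hands every list element to one rbind() call
out <- do.call("rbind", copies)
dim(out)         # 9 2
rownames(out)    # "1" "2" ... "9"
```

Because d has automatic row names, the combined result is simply renumbered from 1 to 9.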

A more traditional way is to use indexing. Here the resulting row names are not quite so neat, but they are more informative, since each one records which original row it was copied from:

    d[rep(seq_len(nrow(d)), n), ]
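As a rough illustration of the row-name behaviour described above, the sketch below shows the names that indexing produces and one way to reset them. The names idx, out, out_reset, one_col and out_one are illustrative and not part of the original answer. The drop = FALSE argument is an addition worth considering when the data frame might have only one column, since plain indexing would otherwise return a vector.

```r
d <- data.frame(a = c(1, 2, 3), b = c(1, 2, 3))
n <- 3

idx <- rep(seq_len(nrow(d)), n)   # 1 2 3 1 2 3 1 2 3
out <- d[idx, ]

# Repeated rows get a ".k" suffix marking the k-th extra copy
rownames(out)
# "1" "2" "3" "1.1" "2.1" "3.1" "1.2" "2.2" "3.2"

# Optional: reset to plain 1..nrow(out) numbering
out_reset <- out
rownames(out_reset) <- NULL

# With a single-column data frame, drop = FALSE keeps the result
# a data frame instead of collapsing it to a vector
one_col <- d["a"]
out_one <- one_col[idx, , drop = FALSE]
```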

Here are some improvements on the above. The first two use purrr's functional programming style. Idiomatic purrr:

    purrr::map_dfr(seq_len(3), ~d)

and less idiomatic purrr (identical result, though more awkward):

    purrr::map_dfr(seq_len(3), function(x) d)

and finally via indexing rather than list apply, using dplyr:

    library(dplyr)
    d %>% slice(rep(row_number(), 3))

147

answered Sep 20 '22 17:09

mdsumner

For data.frame objects, this solution is several times faster than @mdsumner's and @wojciech-sobala's.

d[rep(seq_len(nrow(d)), n), ]

For data.table objects, @mdsumner's is a bit faster than applying the above after converting to data.frame. For large n this might flip.

[Figure: microbenchmark results, https://i.stack.imgur.com/hyrhD.png]

Full code:

    packages <- c("data.table", "ggplot2", "RUnit", "microbenchmark")
    lapply(packages, require, character.only = TRUE)

    Repeat1 <- function(d, n) {
      return(do.call("rbind", replicate(n, d, simplify = FALSE)))
    }

    Repeat2 <- function(d, n) {
      return(Reduce(rbind, list(d)[rep(1L, times=n)]))
    }

    Repeat3 <- function(d, n) {
      if ("data.table" %in% class(d)) return(d[rep(seq_len(nrow(d)), n)])
      return(d[rep(seq_len(nrow(d)), n), ])
    }

    Repeat3.dt.convert <- function(d, n) {
      if ("data.table" %in% class(d)) d <- as.data.frame(d)
      return(d[rep(seq_len(nrow(d)), n), ])
    }

    # Try with data.frames
    mtcars1 <- Repeat1(mtcars, 3)
    mtcars2 <- Repeat2(mtcars, 3)
    mtcars3 <- Repeat3(mtcars, 3)

    checkEquals(mtcars1, mtcars2)
    # Only difference is row.names having ".k" suffix instead of "k" from 1 & 2
    checkEquals(mtcars1, mtcars3)

    # Works with data.tables too
    mtcars.dt <- data.table(mtcars)
    mtcars.dt1 <- Repeat1(mtcars.dt, 3)
    mtcars.dt2 <- Repeat2(mtcars.dt, 3)
    mtcars.dt3 <- Repeat3(mtcars.dt, 3)

    # No row.names mismatch since data.tables don't have row.names
    checkEquals(mtcars.dt1, mtcars.dt2)
    checkEquals(mtcars.dt1, mtcars.dt3)

    # Time test
    res <- microbenchmark(Repeat1(mtcars, 10),
                          Repeat2(mtcars, 10),
                          Repeat3(mtcars, 10),
                          Repeat1(mtcars.dt, 10),
                          Repeat2(mtcars.dt, 10),
                          Repeat3(mtcars.dt, 10),
                          Repeat3.dt.convert(mtcars.dt, 10))
    print(res)
    ggsave("repeat_microbenchmark.png", autoplot(res))
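The row-name difference mentioned in the comments above can also be checked in base R alone, without RUnit. This is only a sketch: repeat_rbind and repeat_index are hypothetical helper names that mirror Repeat1 and the data.frame branch of Repeat3, and drop = FALSE is an added safeguard not present in the original functions.

```r
# Hypothetical helpers mirroring Repeat1 and Repeat3 (data.frame case)
repeat_rbind <- function(d, n) do.call("rbind", replicate(n, d, simplify = FALSE))
repeat_index <- function(d, n) d[rep(seq_len(nrow(d)), n), , drop = FALSE]

by_rbind <- repeat_rbind(mtcars, 3)
by_index <- repeat_index(mtcars, 3)

# mtcars has 32 rows, so row 33 is the first row of the second copy
name_rbind <- rownames(by_rbind)[33]   # "Mazda RX41"
name_index <- rownames(by_index)[33]   # "Mazda RX4.1"

# Once row names are dropped, both results hold the same data
rownames(by_rbind) <- NULL
rownames(by_index) <- NULL
same_data <- isTRUE(all.equal(by_rbind, by_index))
```

If the row names matter downstream, it seems safest to reset them explicitly like this rather than rely on either naming scheme.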

answered Sep 20 '22 17:09

Max Ghenis
