
Merge two data frames to get alternate rows of each data frame in sequence

Tags:

merge

r

My data.frame DATA is

  k    l   g
1 A 2004  12
2 B 2004 3.4
3 C 2004 4.5

Another data.frame DATA2 is

  i    d   t
1 A 2012  22
2 B 2012 4.8
3 C 2012 5.6

I want to get

1 A 2004  12
1 A 2012  22
2 B 2004 3.4
2 B 2012 4.8
3 C 2004 4.5
3 C 2012 5.6


asked Feb 12 '16 07:02

A.M.G16

2 Answers

We can try rbindlist from data.table. Place the datasets in a list, rbind them with rbindlist and order by the first column.

library(data.table)
rbindlist(list(df1, df2))[order(k)]
#   k    l    g
#1: A 2004 12.0
#2: A 2012 22.0
#3: B 2004  3.4
#4: B 2012  4.8
#5: C 2004  4.5
#6: C 2012  5.6

Or using dplyr

library(dplyr)
bind_rows(df1, setNames(df2, names(df1))) %>% 
           arrange(k)

NOTE: I used df1 and df2 in place of DATA and DATA2 as object names as it is easier to type.
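To run the snippets above, the sample data has to exist first. The following is a minimal sketch, with column types assumed from the printed output in the question. The last lines are a base R rendering of the same "stack, then sort on the first column" idea. They are an illustration, not part of the original answer.

```r
# Sample data reconstructed from the question; column types are assumed.
df1 <- data.frame(k = c("A", "B", "C"), l = 2004, g = c(12, 3.4, 4.5),
                  stringsAsFactors = FALSE)
df2 <- data.frame(i = c("A", "B", "C"), d = 2012, t = c(22, 4.8, 5.6),
                  stringsAsFactors = FALSE)

# Stack the frames under df1's column names, then sort on the first column.
# order() leaves ties in their original order, so df1's row precedes df2's.
stacked <- rbind(df1, setNames(df2, names(df1)))
sorted_by_k <- stacked[order(stacked$k), ]
sorted_by_k
```

This only yields the alternating layout because each key appears exactly once per data frame. If the first column is not a key, sorting on it will not interleave the rows.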

answered Oct 13 '22 21:10

akrun

You can try the interleave function from the "gdata" package. However, this requires that your inputs have the same column names and the same number of rows.

The approach would be:

library(gdata)      # for interleave
do.call(interleave, lapply(list(df1, df2), setNames, paste0("V", 1:ncol(df1))))
#    V1   V2   V3
# 1   A 2004 12.0
# 11  A 2012 22.0
# 2   B 2004  3.4
# 21  B 2012  4.8
# 3   C 2004  4.5
# 31  C 2012  5.6

Alternatively, as mentioned in my comment on @akrun's answer, depending on whether the first column is a grouping variable or not, you may want to modify his approach a little.

For instance, imagine there were a third data.frame, with a different number of rows than the others. interleave would not work on that, but the rbindlist approach would.

df3 <- do.call(rbind, lapply(list(df1, df2), setNames, c("A", "B", "Z")))

rbindlist(list(df1, df2, df3), idcol = TRUE)[, N := sequence(.N), by = .id][order(N)]
#     .id k    l    g N
#  1:   1 A 2004 12.0 1
#  2:   2 A 2012 22.0 1
#  3:   3 A 2004 12.0 1
#  4:   1 B 2004  3.4 2
#  5:   2 B 2012  4.8 2
#  6:   3 B 2004  3.4 2
#  7:   1 C 2004  4.5 3
#  8:   2 C 2012  5.6 3
#  9:   3 C 2004  4.5 3
# 10:   3 A 2012 22.0 4
# 11:   3 B 2012  4.8 5
# 12:   3 C 2012  5.6 6

Pay specific attention to the last three rows in comparison with @akrun's approach.
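The difference can be sketched in base R alone. The sample data below is reconstructed from the question (types assumed), and the names frames, pos, by_position and by_key are introduced here purely for illustration. Ordering by row position within each source keeps the extra rows of the longer frame at the end, whereas ordering by the first column folds them into the A, B and C groups.

```r
# Sample data reconstructed from the question; column types are assumed.
df1 <- data.frame(k = c("A", "B", "C"), l = 2004, g = c(12, 3.4, 4.5),
                  stringsAsFactors = FALSE)
df2 <- data.frame(i = c("A", "B", "C"), d = 2012, t = c(22, 4.8, 5.6),
                  stringsAsFactors = FALSE)
# A third, longer frame: df1 and df2 stacked (six rows).
df3 <- rbind(df1, setNames(df2, names(df1)))

frames <- list(df1, setNames(df2, names(df1)), df3)
stacked <- do.call(rbind, frames)

# Row position within each source frame: 1,2,3, 1,2,3, 1,2,3,4,5,6
pos <- unlist(lapply(frames, function(d) seq_len(nrow(d))))

by_position <- stacked[order(pos), ]        # interleave by row position
by_key      <- stacked[order(stacked$k), ]  # sort on the first column

by_position$k
by_key$k
```

In by_position the last three rows are rows 4 to 6 of the third frame. In by_key those same rows are absorbed into their respective groups.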

The equivalent in base R for that last "data.table" approach would be something like:

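x <- do.call(rbind, lapply(c("df1", "df2", "df3"), function(nm) {
  # Tag each row with its source name and give every frame the same column names
  setNames(cbind(rn = nm, get(nm)), c("id", paste0("V", seq_len(ncol(get(nm))))))
}))
# Order by row position within each source; seq_along() avoids coercing id to numeric
x[order(ave(seq_along(x$id), x$id, FUN = seq_along)), ]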

(So the moral is, use "data.table".)

answered Oct 13 '22 22:10

A5C1D2H2I1M1N2O1R2T1
