
Merge two data frames to get alternate rows of each data frame in sequence

Tags: merge, r

My data.frame DATA is

  k    l   g
1 A 2004  12
2 B 2004 3.4
3 C 2004 4.5

Another data.frame DATA2 is

  i    d   t
1 A 2012  22
2 B 2012 4.8
3 C 2012 5.6

I want to get

1 A 2004  12
1 A 2012  22
2 B 2004 3.4
2 B 2012 4.8
3 C 2004 4.5
3 C 2012 5.6

asked Feb 12 '16 at 07:02 by A.M.G16




2 Answers

We can try rbindlist from data.table. Place the datasets in a list, bind them with rbindlist (by position, since the column names differ), and order by the first column.

library(data.table)
rbindlist(list(df1, df2), use.names = FALSE)[order(k)]
#   k    l    g
#1: A 2004 12.0
#2: A 2012 22.0
#3: B 2004  3.4
#4: B 2012  4.8
#5: C 2004  4.5
#6: C 2012  5.6

Or using dplyr

library(dplyr)
bind_rows(df1, setNames(df2, names(df1))) %>%
  arrange(k)

NOTE: I used df1 and df2 as object names in place of DATA and DATA2 because they are easier to type.
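For anyone who wants to run the snippets above, the inputs can be rebuilt from the tables printed in the question. The sketch below does that and then applies the same idea (align the names, stack, sort by the key) in base R only. The column types are an assumption, since the question shows only printed output, and `stacked` and `result` are names made up for this example.

```r
# Inputs rebuilt from the question's printed tables
# (assumed types: character key, numeric values)
df1 <- data.frame(k = c("A", "B", "C"), l = 2004, g = c(12, 3.4, 4.5))
df2 <- data.frame(i = c("A", "B", "C"), d = 2012, t = c(22, 4.8, 5.6))

# Give df2 the column names of df1 so that rbind() accepts it, then stack
stacked <- rbind(df1, setNames(df2, names(df1)))

# order() keeps tied rows in their original order, so within each key
# the df1 row comes before the df2 row
result <- stacked[order(stacked$k), ]
result
```

Like the two approaches above, this relies on the first column being a key shared by both data frames.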

answered Oct 13 '22 at 21:10 by akrun


You can try the interleave function from the "gdata" package. However, this requires that your inputs have the same column names and the same number of rows.

The approach would be:

library(gdata)      # for interleave
do.call(interleave, lapply(list(df1, df2), setNames, paste0("V", 1:ncol(df1))))
#    V1   V2   V3
# 1   A 2004 12.0
# 11  A 2012 22.0
# 2   B 2004  3.4
# 21  B 2012  4.8
# 3   C 2004  4.5
# 31  C 2012  5.6
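
If installing "gdata" is not an option, interleaving by row position can also be sketched in base R. This is not how interleave is implemented, just a rough equivalent for this case; the data frames are rebuilt from the question (column types assumed), and `stacked`, `pos` and `interleaved` are names made up for the example.

```r
# Inputs rebuilt from the question's printed tables (assumed types)
df1 <- data.frame(k = c("A", "B", "C"), l = 2004, g = c(12, 3.4, 4.5))
df2 <- data.frame(i = c("A", "B", "C"), d = 2012, t = c(22, 4.8, 5.6))

# Stack the two data frames after aligning the column names
stacked <- rbind(df1, setNames(df2, names(df1)))

# Position of every row within its own source: 1, 2, 3 then 1, 2, 3
pos <- c(seq_len(nrow(df1)), seq_len(nrow(df2)))

# Sorting by position alternates the rows; ties keep their original
# order, so each df1 row comes before the matching df2 row
interleaved <- stacked[order(pos), ]
interleaved
```

Unlike sorting by the first column, this does not look at the values at all, so it also works when there is no shared key.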

Alternatively, as mentioned in my comment on @akrun's answer, depending on whether the first column is a grouping variable or not, you may want to modify his approach a little.

For instance, imagine there were a third data.frame, with a different number of rows than the others. interleave would not work on that, but the rbindlist approach would.

df3 <- do.call(rbind, lapply(list(df1, df2), setNames, c("A", "B", "Z")))

rbindlist(list(df1, df2, df3), idcol = TRUE)[, N := sequence(.N), by = .id][order(N)]
#     .id k    l    g N
#  1:   1 A 2004 12.0 1
#  2:   2 A 2012 22.0 1
#  3:   3 A 2004 12.0 1
#  4:   1 B 2004  3.4 2
#  5:   2 B 2012  4.8 2
#  6:   3 B 2004  3.4 2
#  7:   1 C 2004  4.5 3
#  8:   2 C 2012  5.6 3
#  9:   3 C 2004  4.5 3
# 10:   3 A 2012 22.0 4
# 11:   3 B 2012  4.8 5
# 12:   3 C 2012  5.6 6

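Pay specific attention to the last three rows in comparison with @akrun's approach: ordering by k would sort them in among the other A, B and C rows, whereas ordering by row position leaves them at the end.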
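
A small base R sketch makes that difference visible by building both orderings side by side. The inputs are rebuilt from the question (column types assumed). As a simplification, df3 here reuses the column names of df1 so that plain rbind() works; in the example above it is named A, B, Z. The names `frames`, `pos`, `by_key` and `by_pos` are made up for this illustration.

```r
# Inputs rebuilt from the question's printed tables (assumed types)
df1 <- data.frame(k = c("A", "B", "C"), l = 2004, g = c(12, 3.4, 4.5))
df2 <- data.frame(i = c("A", "B", "C"), d = 2012, t = c(22, 4.8, 5.6))
df2n <- setNames(df2, names(df1))

# Third data frame with six rows (simplification: same names as df1)
df3 <- rbind(df1, df2n)

frames <- list(df1, df2n, df3)
stacked <- do.call(rbind, frames)

# Position of each row within its own source data frame
pos <- unlist(lapply(frames, function(d) seq_len(nrow(d))))

by_key <- stacked[order(stacked$k), ]  # groups all A rows, then B, then C
by_pos <- stacked[order(pos), ]        # alternates sources row by row

as.character(by_key$k)
as.character(by_pos$k)
```

In `by_pos`, rows 4 to 6 of df3 have no partners in df1 or df2, so they end up at the bottom, as in the data.table output above.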


The equivalent in base R for that last "data.table" approach would be something like:

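x <- do.call(rbind, lapply(c("df1", "df2", "df3"), function(nm) {
  d <- get(nm)
  # Prepend the source name as "id" and rename the columns positionally
  setNames(cbind(id = nm, d), c("id", paste0("V", seq_len(ncol(d)))))
}))
# Number the rows within each source, then sort by that counter
x[order(ave(seq_along(x$id), x$id, FUN = seq_along)), ]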

(So the moral is, use "data.table".)

answered Oct 13 '22 at 22:10 by A5C1D2H2I1M1N2O1R2T1