Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

R - Concatenate two dataframes?

People also ask

How do I merge two Dataframes based on a column in R?

The merge() function in base R can be used to merge input dataframes by common columns or row names. The merge() function retains all the row names of the dataframes, behaving similarly to the inner join. The dataframes are combined in order of the appearance in the input function call.

What is merge () in R?

The R merge function allows merging two data frames by common columns or by row names. This function allows you to perform different database (SQL) joins, like left join, inner join, right join or full join, among others.


You want "rbind".

b$b <- NA
new <- rbind(a, b)

rbind requires the data frames to have the same columns.

The first line adds column b to data frame b.

Results

> a <- data.frame(a=c(0,1,2), b=c(3,4,5), c=c(6,7,8))
> a
  a b c
1 0 3 6
2 1 4 7
3 2 5 8
> b <- data.frame(a=c(9,10,11), c=c(12,13,14))
> b
   a  c
1  9 12
2 10 13
3 11 14
> b$b <- NA
> b
   a  c  b
1  9 12 NA
2 10 13 NA
3 11 14 NA
> new <- rbind(a,b)
> new
   a  b  c
1  0  3  6
2  1  4  7
3  2  5  8
4  9 NA 12
5 10 NA 13
6 11 NA 14

Try the plyr package:

rbind.fill(a,b,c)

you can use the function

bind_rows(a,b)

from the dplyr library


Here's a simple little function that will rbind two datasets together after auto-detecting what columns are missing from each and adding them with all NAs.

For whatever reason this returns MUCH faster on larger datasets than using the merge function.

fastmerge <- function(d1, d2) {
  d1.names <- names(d1)
  d2.names <- names(d2)

  # columns in d1 but not in d2
  d2.add <- setdiff(d1.names, d2.names)

  # columns in d2 but not in d1
  d1.add <- setdiff(d2.names, d1.names)

  # add blank columns to d2
  if(length(d2.add) > 0) {
    for(i in 1:length(d2.add)) {
      d2[d2.add[i]] <- NA
    }
  }

  # add blank columns to d1
  if(length(d1.add) > 0) {
    for(i in 1:length(d1.add)) {
      d1[d1.add[i]] <- NA
    }
  }

  return(rbind(d1, d2))
}

You may use rbind but in this case you need to have the same number of columns in both tables, so try the following:

b$b<-as.double(NA) #keeping numeric format is essential for further calculations
new<-rbind(a,b)