I have a list of data.frame objects which I would like to row-append to one another, i.e. merge(..., all=T). However, merge seems to remove the row names, which I need to keep intact. Any ideas? Example:
x = data.frame(a=1:2, b=2:3, c=3:4, d=4:5, row.names=c("row_1", "another_row1"))
y = data.frame(a=c(10,20), b=c(20,30), c=c(30,40), row.names=c("row_2", "another_row2"))
> merge(x, y, all=T, sort=F)
   a  b  c  d
1  1  2  3  4
2  2  3  4  5
3 10 20 30 NA
4 20 30 40 NA
To join two data frames (datasets) vertically, use the rbind function. The two data frames must have the same variables, but the variables do not have to be in the same order. If data frame A has variables that data frame B does not, then either delete the extra variables from data frame A or add them to data frame B, filled with NA, before calling rbind.
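As a rough illustration of the two points above (name-based matching, and padding a missing variable with NA), here is a minimal sketch. The data frames `scores_a` and `scores_b` and their columns are made up for the example.

```r
# Hypothetical data: same variables, different column order
scores_a <- data.frame(id = 1:2, score = c(90, 85))
scores_b <- data.frame(score = c(70, 60), id = 3:4)

# rbind() on data frames matches columns by name, not by position
stacked <- rbind(scores_a, scores_b)

# scores_a gains a variable that scores_b lacks:
# add it to scores_b, filled with NA, before binding
scores_a$grade <- c("A", "B")
scores_b$grade <- NA
stacked_with_grade <- rbind(scores_a, scores_b)
```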
To merge two data frames with a different number of rows, the full_join function from the dplyr package can also be used. It keeps every row from both inputs and fills the columns that have no match with NA.
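A possible sketch of this with the data from the question, assuming the dplyr package is installed. dplyr joins do not carry row names through, so the sketch first copies them into an ordinary column; the column name `id` is an arbitrary choice for illustration.

```r
library(dplyr)

x <- data.frame(a = 1:2, b = 2:3, c = 3:4, d = 4:5,
                row.names = c("row_1", "another_row1"))
y <- data.frame(a = c(10, 20), b = c(20, 30), c = c(30, 40),
                row.names = c("row_2", "another_row2"))

# Copy the row names into a regular column ("id" is a hypothetical name)
x_id <- cbind(id = rownames(x), x)
y_id <- cbind(id = rownames(y), y)

# Keep all rows from both sides; d is NA for the rows that come from y
joined <- full_join(x_id, y_id, by = c("id", "a", "b", "c"))
```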
To combine two data frames in R by matching values, use the merge() function. merge() is a base R function that joins two data frames on common columns or on row names.
Note that rbind() requires the column names to match. For example, if a column is called 'lastName' in the first data frame and 'surName' in the second, rbind() throws an error saying that the names do not match. So the column names in both data frames must be the same if you want to use rbind().
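A small sketch of that failure and one way around it. The data frames `people_1` and `people_2` and the `firstName` column are invented for the example; only the 'lastName' and 'surName' column names come from the text above.

```r
# Hypothetical data frames whose second column is named differently
people_1 <- data.frame(firstName = "Ada",  lastName = "Lovelace")
people_2 <- data.frame(firstName = "Alan", surName  = "Turing")

# rbind() stops with an error; capture the message instead of halting
msg <- tryCatch(rbind(people_1, people_2),
                error = function(e) conditionMessage(e))

# Renaming the column so both data frames agree makes rbind() work
names(people_2)[names(people_2) == "surName"] <- "lastName"
people <- rbind(people_1, people_2)
```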
Since you know you are not actually merging, but just rbind-ing, maybe something like this will work. It makes use of rbind.fill from "plyr". To use it, specify a list of the data.frames you want to rbind.
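library(plyr)

# Row-bind a list of data frames, filling missing columns with NA,
# then restore the original row names (rbind.fill does not keep them)
RBIND <- function(datalist) {
  temp <- rbind.fill(datalist)
  rownames(temp) <- unlist(lapply(datalist, row.names))
  temp
}
RBIND(list(x, y))
#               a  b  c  d
# row_1         1  2  3  4
# another_row1  2  3  4  5
# row_2        10 20 30 NA
# another_row2 20 30 40 NA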
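If adding a dependency on plyr is not an option, roughly the same result can be sketched in base R. This is an alternative technique, not the approach above: each data frame is padded with NA columns until all share the same variables, and then plain rbind is applied, which keeps non-automatic row names. The function name `rbind_keep_names` is hypothetical.

```r
x <- data.frame(a = 1:2, b = 2:3, c = 3:4, d = 4:5,
                row.names = c("row_1", "another_row1"))
y <- data.frame(a = c(10, 20), b = c(20, 30), c = c(30, 40),
                row.names = c("row_2", "another_row2"))

# Hypothetical helper: pad missing columns with NA, then rbind
rbind_keep_names <- function(datalist) {
  all_cols <- unique(unlist(lapply(datalist, names)))
  padded <- lapply(datalist, function(df) {
    for (col in setdiff(all_cols, names(df))) {
      df[[col]] <- NA          # add each missing variable as NA
    }
    df[all_cols]               # put columns in a common order
  })
  do.call(rbind, padded)       # rbind keeps the character row names
}

res <- rbind_keep_names(list(x, y))
```

This assumes the row names are unique across the list; rbind alters duplicated row names to make them unique.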
One way is to use row.names in merge so that you get it as an additional column.
> merge(x, y, by=c("row.names", "a","b","c"), all.x=T, all.y=T, sort=F)
#      Row.names  a  b  c  d
# 1        row_1  1  2  3  4
# 2 another_row1  2  3  4  5
# 3        row_2 10 20 30 NA
# 4 another_row2 20 30 40 NA
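If the names are needed as real row names rather than as a column, the Row.names column can be moved back afterwards. A minimal sketch, assuming the row names are unique across both data frames:

```r
x <- data.frame(a = 1:2, b = 2:3, c = 3:4, d = 4:5,
                row.names = c("row_1", "another_row1"))
y <- data.frame(a = c(10, 20), b = c(20, 30), c = c(30, 40),
                row.names = c("row_2", "another_row2"))

merged <- merge(x, y, by = c("row.names", "a", "b", "c"),
                all = TRUE, sort = FALSE)

# Promote the Row.names column back to row names, then drop the column
rownames(merged) <- as.character(merged$Row.names)
merged$Row.names <- NULL
```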
Edit: By looking at the merge function with getS3method('merge', 'data.frame'), the row.names are clearly set to NULL (the code is pretty long, so I won't paste it here).
# Commenting out
# lines 63 and 64
row.names(x) <- NULL
row.names(y) <- NULL
# and
# Line 141 (thanks Ananda for pointing out)
attr(res, "row.names") <- .set_row_names(nrow(res))
and creating a new function, say, MERGE, works as the OP intends for this example. Just an experiment.
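The line numbers quoted above depend on the R version, so it may be easier to search the method's source for the relevant statements. A rough sketch follows; note that the positions it reports refer to the deparsed text, which need not match the numbering seen when printing the function.

```r
# Deparse the S3 method into a character vector, one element per line
src <- deparse(getS3method("merge", "data.frame"))

# Find the lines that mention row names
hits <- grep("row.names", src, fixed = TRUE)
data.frame(line = hits, code = trimws(src[hits]))
```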