
Merging more than 2 dataframes in R by rownames

I gather data from four data frames and would like to merge them by row names. I am looking for an efficient way to do this. Below is a simplified version of my data.

df1           <- data.frame(N= sample(seq(9, 27, 0.5), 40, replace= T),
                            P= sample(seq(0.3, 4, 0.1), 40, replace= T),
                            C= sample(seq(400, 500, 1), 40, replace= T))
df2           <- data.frame(origin= sample(c("A", "B", "C", "D", "E"), 40,
                                           replace= T),
                            foo1= sample(c(T, F), 40, replace= T),
                            X= sample(seq(145600, 148300, 100), 40, replace= T),
                            Y= sample(seq(349800, 398600, 100), 40, replace= T))
df3           <- matrix(sample(seq(0, 1, 0.01), 40), 40, 100)
df4           <- matrix(sample(seq(0, 1, 0.01), 40), 40, 100)
rownames(df1) <- paste("P", sprintf("%02d", c(1:40)), sep= "")
rownames(df2) <- rownames(df1)
rownames(df3) <- rownames(df1)
rownames(df4) <- rownames(df1)

This is what I would normally do:

# merge df1 and df2
dat           <- merge(df1, df2, by= "row.names", all.x= F, all.y= F) #merge
rownames(dat) <- dat$Row.names #reset rownames
dat$Row.names <- NULL  #remove added rownames col

# merge dat and df3
dat           <- merge(dat, df3, by= "row.names", all.x= F, all.y= F) #merge
rownames(dat) <- dat$Row.names #reset rownames
dat$Row.names <- NULL  #remove added rownames col

# merge dat and df4
dat           <- merge(dat, df4, by= "row.names", all.x= F, all.y= F) #merge
rownames(dat) <- dat$Row.names #reset rownames
dat$Row.names <- NULL #remove added rownames col

As you can see, this requires a lot of code. My question is whether the same result can be achieved by simpler means. I tried the following, at first without success. UPDATE: this works now!

MyMerge       <- function(x, y){
  df            <- merge(x, y, by= "row.names", all.x= F, all.y= F)
  rownames(df)  <- df$Row.names
  df$Row.names  <- NULL
  return(df)
}
dat           <- Reduce(MyMerge, list(df1, df2, df3, df4))
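To check what the Reduce() approach does on data small enough to inspect by eye, here is a minimal sketch. The toy objects `a`, `b` and `m` and the function name `merge_by_rownames` are made up for illustration; the function body follows the same steps as MyMerge. Because this is an inner join, only row names present in every input should survive.

```r
# Toy inputs (hypothetical): two data frames and one matrix
# whose row names only partly overlap
a <- data.frame(N = c(10, 11, 12), row.names = c("P01", "P02", "P03"))
b <- data.frame(origin = c("A", "B", "C"),
                row.names = c("P02", "P03", "P04"),
                stringsAsFactors = FALSE)
m <- matrix(c(0.1, 0.2, 0.3, 0.4), nrow = 2,
            dimnames = list(c("P03", "P02"), c("m1", "m2")))

# Same steps as MyMerge: inner join on row names, then move the
# added Row.names column back into the row names
merge_by_rownames <- function(x, y) {
  out <- merge(x, y, by = "row.names", all = FALSE)
  rownames(out) <- out$Row.names
  out$Row.names <- NULL
  out
}

# Reduce applies the two-argument function pairwise, left to right
res <- Reduce(merge_by_rownames, list(a, b, m))
print(res)
```

Only P02 and P03 occur in all three inputs, so those are the rows to expect in `res`. The matrix is accepted because merge() coerces its arguments to data frames.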

Thanks in advance for any suggestions

Hans Roelofsen asked May 21 '13 09:05





1 Answer

join_all from plyr will probably do what you want. However, every input must be a data frame, and the row names have to be added as a column to join on.

library(plyr)

# join_all needs data frames, so convert the matrices first
df3 <- data.frame(df3)
df4 <- data.frame(df4)

# copy the row names into a column to join on
df1$rn <- rownames(df1)
df2$rn <- rownames(df2)
df3$rn <- rownames(df3)
df4$rn <- rownames(df4)

df <- join_all(list(df1,df2,df3,df4), by = 'rn', type = 'full')
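The conversion and the repeated `$rn` assignments can also be done in one pass over a list. The sketch below uses base R only; the helper name `add_rownames_column` and the toy inputs `d1` and `m1` are hypothetical. The resulting list is the kind of object that would then be passed to join_all with by = 'rn'.

```r
# Hypothetical helper: coerce the input to a data frame and copy
# its row names into a regular column (default name "rn")
add_rownames_column <- function(x, col = "rn") {
  x <- as.data.frame(x)     # matrices become data frames, row names are kept
  x[[col]] <- rownames(x)
  x
}

# Toy inputs (illustrative only): one data frame, one matrix
d1 <- data.frame(N = 1:2, row.names = c("P01", "P02"))
m1 <- matrix(1:4, nrow = 2, dimnames = list(c("P01", "P02"), NULL))

prepared <- lapply(list(d1, m1), add_rownames_column)
str(prepared)
```

One design note: as.data.frame() and data.frame() pick different default column names for a matrix without column names, so the names in `prepared` may differ from those produced by the data.frame() calls in the answer.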

The type argument ('full' here) should help even if the row names vary between the data frames and do not all match. If you do not want the row-name column afterwards:

df$rn <- NULL 
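Coming back to the type argument: the difference between a full and an inner join on non-matching row names can be illustrated without plyr. This is only a sketch, assuming that base merge() with all = TRUE is an acceptable stand-in for a full join; it is not how join_all is implemented, and the toy data frames `x1`, `x2` and `x3` are made up for the example.

```r
# Toy inputs (hypothetical): the key column "rn" differs between inputs
x1 <- data.frame(rn = c("P01", "P02"), N = c(10, 11), stringsAsFactors = FALSE)
x2 <- data.frame(rn = c("P02", "P03"), P = c(0.5, 0.7), stringsAsFactors = FALSE)
x3 <- data.frame(rn = c("P01", "P03"), C = c(410, 420), stringsAsFactors = FALSE)

# all = TRUE keeps every key seen in any input; missing cells become NA
full <- Reduce(function(a, b) merge(a, b, by = "rn", all = TRUE),
               list(x1, x2, x3))
print(full)

# all = FALSE (inner join) keeps only keys present in every input
inner <- Reduce(function(a, b) merge(a, b, by = "rn", all = FALSE),
                list(x1, x2, x3))
print(nrow(inner))
```

Here no key occurs in all three inputs, so the inner join is empty while the full join keeps P01, P02 and P03 with NA where a value is missing.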
Anto answered Sep 19 '22 12:09
