Merging two data.frames by key column

I have two dataframes. In the first one, I have a KEY/ID column and two variables:

KEY V1 V2
1   10  2
2   20  4
3   30  6   
4   40  8
5   50 10

In the second dataframe, I have a KEY/ID column and a third variable:

KEY V3 
1    5  
2   10  
3   20  

I would like to extract the rows of the first dataframe whose KEY also appears in the second dataframe, and add the V3 column to the final dataset:

KEY V1 V2 V3 
1   10  2  5
2   20  4 10 
3   30  6 20   

These are my attempts using the subset and merge functions:

subset(data1, data1$KEY == data2$KEY) 
merge(data1, data2, by.x = "KEY", by.y = "KEY")

Neither of them does the task.

Any hint would be appreciated. Thank you!

user3618451 asked May 09 '14 09:05


2 Answers

merge(data1, data2, by="KEY") should do it!
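As a minimal sketch of this, assuming data1 and data2 are plain data.frames built from the tables in the question (the object names result and matched are only illustrative):

```r
# Data from the question
data1 <- data.frame(KEY = 1:5,
                    V1 = c(10, 20, 30, 40, 50),
                    V2 = c(2, 4, 6, 8, 10))
data2 <- data.frame(KEY = 1:3, V3 = c(5, 10, 20))

# Inner join on KEY: keeps only keys present in both and brings V3 along
result <- merge(data1, data2, by = "KEY")
print(result)

# To filter data1 without adding V3, test membership with %in%.
# `==` compares element by element (recycling the shorter vector)
# rather than checking whether each key occurs in data2.
matched <- subset(data1, KEY %in% data2$KEY)
print(matched)
```

By default merge() keeps only rows whose key appears in both inputs, so no extra argument is needed for an inner join.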

Christian Borck answered Sep 28 '22 11:09


If what you want is an inner join, then your merge attempt should already do it. If it doesn't, check the class of the KEY column in both tables, e.g. with class(data1$KEY) and class(data2$KEY). Note that column names are case-sensitive.
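One way such a check might play out is sketched below. The problem case is hypothetical and not taken from the question: it assumes data2$KEY was read in as text with a trailing space, so no key matches until it is cleaned and converted.

```r
data1 <- data.frame(KEY = 1:5,
                    V1 = c(10, 20, 30, 40, 50),
                    V2 = c(2, 4, 6, 8, 10))
# Hypothetical problem case: keys stored as text with a trailing space
data2 <- data.frame(KEY = c("1 ", "2 ", "3 "), V3 = c(5, 10, 20),
                    stringsAsFactors = FALSE)

print(class(data1$KEY))
print(class(data2$KEY))

# No keys match as stored, so the join comes back empty
before <- merge(data1, data2, by = "KEY")

# Strip whitespace and convert to the same type as data1$KEY
data2$KEY <- as.integer(trimws(data2$KEY))
after <- merge(data1, data2, by = "KEY")
print(after)
```

If the two classes already agree and the values look clean, the cause is probably elsewhere, for example a misspelled column name.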

Apart from that and the merge suggested by Christian, you can use:

library(plyr)
join(data1, data2, by="KEY", type="inner")

or

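library(data.table)
dt1 <- as.data.table(data1)  # setkey() requires data.tables
dt2 <- as.data.table(data2)
setkey(dt1, KEY)
setkey(dt2, KEY)
dt1[dt2, nomatch = 0]  # keyed inner join on KEY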

RHelp answered Sep 28 '22 10:09