I have two dataframes both with columns that have ID numbers that should match. However, the two dataframes do not have the same number of ID's. One has 118 IDs and one has 103 ID's. The dataframe that has 118 ID's has all 103 ID's though, but some extras. I am wanting to take two columns from 118 row dataframe and merge them with the IDs that match in the 103 row dataframe based on ID number. I dont know the easiest way to go about this problem.
You should consider using merge
instruction. Assuming that you have 2 dataframes and they have the column ID
with the identifiers:
merge(dataframe1, dataframe2, by = "ID")
If they have identifiers but the name of the column is different (I used ID_dfX
) in each dataframe you should use:
merge(dataframe1, dataframe2, by.x = "ID_df1", , by.y = "ID_df2")
With this, you'll have a dataframe where the same ID is in both dataframes. The rest of information is deleted. This is an inner join
.
If you want to add the information of one dataframe to another you can use, assuming that you don't want to touch dataframe1 and you just want to add information of dataframe2 that share the ID (left join
):
merge(dataframe1, dataframe2, by.x = "ID_df1", , by.y = "ID_df2", all.x = TRUE)
Here you'll find some information about joins (merging):
(INNER) JOIN: Returns records that have matching values in both dataframes.
LEFT (OUTER) JOIN: Return all records from the left table, and the matched records from the right table.
RIGHT (OUTER) JOIN: Return all records from the right table, and the matched records from the left table.
FULL (OUTER) JOIN: Return all records when there is a match in either left or right table.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With