
Avoiding and renaming .x and .y columns when merging or joining in R

I often join two data frames that share column names. Is there a way to handle this within the join step so that I don't end up with a .x and a .y column? For example, so that the names might be 'original_mpg' and 'new_mpg'?

    library(dplyr)

    joined <- left_join(mtcars, mtcars[, c("mpg", "cyl")], by = "cyl")
    names(joined)  # ugh: includes "mpg.x" and "mpg.y"
runningbirds asked Mar 01 '16 20:03




2 Answers

At the time of writing, this is an open issue with dplyr. You'll either have to rename the columns before or after the join, or use merge from base R, which takes a suffixes argument.
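
The two workarounds mentioned above could look roughly like this. This is a minimal sketch using the question's mtcars example; the object names rhs, merged and joined are made up for illustration.

```r
library(dplyr)

# Right-hand table: the join key plus the column that would clash.
rhs <- mtcars[, c("mpg", "cyl")]

# Workaround 1: base R merge() with custom suffixes for clashing names.
merged <- merge(mtcars, rhs, by = "cyl",
                suffixes = c("_original", "_new"))

# Workaround 2: rename the clashing column on each side before the
# dplyr join, so no suffix is needed at all.
joined <- left_join(
  rename(mtcars, original_mpg = mpg),
  rename(rhs, new_mpg = mpg),
  by = "cyl"
)

names(merged)  # contains "mpg_original" and "mpg_new"
names(joined)  # contains "original_mpg" and "new_mpg"
```

Because cyl is not unique on either side, every row is matched against all rows with the same cyl value, and recent dplyr versions may warn about a many-to-many relationship. Renaming before the join is the only one of the two that gives prefixed names such as 'original_mpg' directly.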

Matthew Plourde answered Sep 29 '22 09:09



The default suffixes, c(".x", ".y"), can be overridden by passing a character vector of length 2 to the suffix argument:

    library(dplyr)

    left_join(mtcars, mtcars[, c("mpg", "cyl")],
              by = c("cyl"),
              suffix = c("_original", "_new")) %>%
      head()

Output

      mpg_original cyl disp  hp drat   wt  qsec vs am gear carb mpg_new
    1           21   6  160 110  3.9 2.62 16.46  0  1    4    4    21.0
    2           21   6  160 110  3.9 2.62 16.46  0  1    4    4    21.0
    3           21   6  160 110  3.9 2.62 16.46  0  1    4    4    21.4
    4           21   6  160 110  3.9 2.62 16.46  0  1    4    4    18.1
    5           21   6  160 110  3.9 2.62 16.46  0  1    4    4    19.2
    6           21   6  160 110  3.9 2.62 16.46  0  1    4    4    17.8
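
The question asks for names like 'original_mpg' and 'new_mpg', which are prefixes, while suffix only appends. One possible way to get there is to rename after the join. The sketch below assumes dplyr 1.0.0 or later (for rename_with); the regular expression and the object name joined are illustrative choices, not part of the original answer.

```r
library(dplyr)

joined <- left_join(mtcars, mtcars[, c("mpg", "cyl")],
                    by = "cyl",
                    suffix = c("_original", "_new")) %>%
  # Move the suffix to the front, e.g. "mpg_new" becomes "new_mpg".
  rename_with(~ sub("^(.*)_(original|new)$", "\\2_\\1", .x),
              .cols = matches("_(original|new)$"))

names(joined)  # contains "original_mpg" and "new_mpg"
```

Only columns whose names end in one of the two suffixes are touched, so the remaining mtcars columns keep their names.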
mpalanco answered Sep 29 '22 11:09
