
R programming: merge function returns column names with .x and .y

Tags:

r

While merging two tables, I can't control column names in the merge result. To explain my situation, let me use mtcars data:

#load mtcars data.frame
data(mtcars)

Add a new column called 'car' that I will use as the merge key:

mtcars$car <- row.names(mtcars)

Now create two mutually exclusive tables.

small <- mtcars[mtcars$cyl == 4, ]
med.large <- mtcars[mtcars$cyl > 4, ]

Now when I do a left merge, I should get 'small' table back as the two tables are mutually exclusive:

merge(x = small, y = med.large, by = 'car', all.x = TRUE)

This does return the 'small' table, but every column appears twice, once with a .x suffix and once with a .y suffix. The .y columns are all NA, since the two tables have no records in common. The result looks like this:

         car mpg.x cyl.x disp.x hp.x drat.x  wt.x qsec.x vs.x am.x gear.x carb.x mpg.y cyl.y
1 Datsun 710  22.8     4  108.0   93   3.85 2.320  18.61    1    1      4      1    NA    NA

How can I get each column name only once, with the values taken from the primary merge table, in this case the LEFT table ('small')? I don't know how to avoid the .x and .y suffixes.

seakyourpeak asked Jan 17 '16 01:01


1 Answer

If every column name appears in both tables, you can just merge on all of them:

merge(x = small, y = med.large, by = names(small), all.x = TRUE)
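As a quick check, here is a minimal sketch that rebuilds the question's two tables and confirms that merging on every column leaves no suffixed names. The variable name `res` is only for illustration.

```r
# Rebuild the tables from the question
data(mtcars)
mtcars$car <- row.names(mtcars)
small <- mtcars[mtcars$cyl == 4, ]
med.large <- mtcars[mtcars$cyl > 4, ]

# Merge on every column, so no non-key column can clash
res <- merge(x = small, y = med.large, by = names(small), all.x = TRUE)

names(res)                         # column names without .x / .y
setequal(names(res), names(small)) # TRUE
nrow(res) == nrow(small)           # TRUE
```

Note that merge sorts the result on the by columns by default (sort = TRUE), so the rows may come back in a different order than in 'small'.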

If only some column names are shared, you can build a vector of the names present in both with

intersect(names(small), names(med.large))

and pass that to by. Any column that exists in both data.frames but is not passed to by will end up with a .x or .y suffix.
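To illustrate the suffix rule, the sketch below first merges on 'car' alone, which reproduces the suffixes, and then shows one possible workaround if you still want 'car' as the only key: drop the overlapping non-key columns from the right-hand table before merging. That second step is an assumption of mine, not part of the answer above, and the names `shared`, `suffixed`, `y_keep`, and `clean` are hypothetical.

```r
# Rebuild the tables from the question
data(mtcars)
mtcars$car <- row.names(mtcars)
small <- mtcars[mtcars$cyl == 4, ]
med.large <- mtcars[mtcars$cyl > 4, ]

# Column names present in both tables
shared <- intersect(names(small), names(med.large))

# Shared columns that are not in `by` get suffixed
suffixed <- merge(small, med.large, by = "car", all.x = TRUE)
any(grepl("\\.[xy]$", names(suffixed)))  # TRUE

# Assumed workaround: keep only the key plus columns unique to the right table
y_keep <- c("car", setdiff(names(med.large), shared))
clean <- merge(small, med.large[, y_keep, drop = FALSE],
               by = "car", all.x = TRUE)
names(clean)                             # no .x / .y suffixes
```

With these two tables every column is shared, so `y_keep` reduces to just 'car' and the result carries only the columns of 'small'.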

alistaire answered Sep 21 '22 23:09