How does one merge dataframes by row name without adding a "Row.names" column?

Tags: merge, dataframe, r

I have two data frames:

df1 = data.frame(x=1:3,y=1:3,row.names=c('r1','r2','r3'))
df2 = data.frame(z=5:7,row.names=c('r5','r6','r7'))

which print as:

R> df1
   x y
r1 1 1
r2 2 2
r3 3 3

R> df2
   z
r5 5
r6 6
r7 7

I'd like to merge them by row names, keeping every row from both (an outer join, i.e. all=T). This does it:

merged.df <- merge(df1,df2,all=T,by='row.names')
R> merged.df
  Row.names  x  y  z
1        r1  1  1 NA
2        r2  2  2 NA
3        r3  3  3 NA
4        r5 NA NA  5
5        r6 NA NA  6
6        r7 NA NA  7

but I want the input row names to be the row names in the output dataframe (merged.df).

I can do:

rownames(merged.df) <- merged.df[[1]]
merged.df <- merged.df[-1]

which works, but seems inelegant and hard to remember. Anyone know of a cleaner way?

796

asked Jun 29 '13 01:06

user116293

2 Answers

Not sure if it's any easier to remember, but you can do it all in one step using transform.

transform(merge(df1,df2,by=0,all=TRUE), row.names=Row.names, Row.names=NULL)
#    x  y  z
#r1  1  1 NA
#r2  2  2 NA
#r3  3  3 NA
#r5 NA NA  5
#r6 NA NA  6
#r7 NA NA  7
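If the one-liner is hard to remember, the same steps can be wrapped in a small function. The following is only a sketch using base R: the name merge_by_rownames is hypothetical (it is not part of base R or any package), and it assumes the merged keys are unique and non-missing, as row names must be.

```r
# Hypothetical helper; merge_by_rownames is an illustrative name, not a base R function.
merge_by_rownames <- function(a, b, all = TRUE) {
  m <- merge(a, b, by = 0, all = all)  # by = 0 matches on row names
  rownames(m) <- m$Row.names           # restore the original row names
  m$Row.names <- NULL                  # drop the helper column added by merge
  m
}

# Same data as in the question
df1 <- data.frame(x = 1:3, y = 1:3, row.names = c("r1", "r2", "r3"))
df2 <- data.frame(z = 5:7, row.names = c("r5", "r6", "r7"))

merged.df <- merge_by_rownames(df1, df2)
merged.df
```

With the question's data this should give the same six rows as the transform call above, with r1 to r7 as row names and no Row.names column.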

166

answered Oct 25 '22 00:10

thelatemail

From the help of merge:

If the matching involved row names, an extra character column called Row.names is added at the left, and in all cases the result has ‘automatic’ row names.

So you can't avoid the Row.names column, at least not when using merge. To remove it afterwards, though, you can subset by name rather than by index. For example:

dd <- merge(df1,df2,by=0,all=TRUE) ## by=0 is easier to write than by='row.names';
                                   ## TRUE is cleaner than T

Then I drop the Row.names column by name and reuse its values as the row names:

res <- subset(dd,select=-c(Row.names))
rownames(res) <- dd[,'Row.names']
res
    x  y  z
r1  1  1 NA
r2  2  2 NA
r3  3  3 NA
r5 NA NA  5
r6 NA NA  6
r7 NA NA  7
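The question's data frames share no row names, so every merged row is padded with NA on one side. As a sketch of what the same approach does when some row names overlap, here is a variant with made-up data (the frames a and b are illustrative and not from the question):

```r
# Illustrative data (not from the question): 'r2' and 'r3' appear in both frames.
a <- data.frame(x = 1:3, row.names = c("r1", "r2", "r3"))
b <- data.frame(z = 5:7, row.names = c("r2", "r3", "r4"))

dd <- merge(a, b, by = 0, all = TRUE)   # outer join on row names
res <- subset(dd, select = -Row.names)  # drop the column by name
rownames(res) <- dd[, "Row.names"]      # reuse the merged keys as row names
res
```

Here rows r2 and r3 carry values from both inputs, while r1 and r4 are filled with NA for the column they lack.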

answered Oct 24 '22 23:10

agstudy
