
R - merge() returns NAs in ALL columns although all.x=T

Tags:

merge

r

na

I am new here and have searched the forum for my problem, but did not find a solution. I have two data frames which I want to merge on a common key field.

merge(x, y, by.x = "a", by.y = "b", all.x = TRUE, sort = FALSE)

Since my data frame x has more rows than my data frame y, I want to keep all rows from x, with NA in the columns that come from y but with the original values in the columns from x. Instead, this code gives me extra rows for the unmatched cases with NA in ALL columns (those from x and those from y). I would be really grateful if someone could help me out. Where is my mistake?

Example:

a <- data.frame(code = c(111, 222, 333, 444), value = c(1, 5, 3, 8))
b <- data.frame(code = c(111, 222), value = c(0.1, 0.4))
# all.x = TRUE keeps every row of a; unmatched codes get NA only in value.y
merged <- merge(a, b, by = "code", all.x = TRUE)

In this example it works properly. In my real data, I get NA in all columns in rows 3 and 4.
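One way to narrow this down, sketched on made-up data because the real data is not shown: check whether x already contains rows that are NA in every column (for example, left over from an earlier subsetting step). merge() with all.x=TRUE keeps such rows, and they stay NA throughout. This is only one possible cause, not something the post confirms; the names val, other, all_na_rows and unmatched are illustrative.

```r
# Hypothetical stand-ins for the real x and y (not from the original post)
x <- data.frame(a = c(111, 222, NA, NA), val = c(1, 5, NA, NA))
y <- data.frame(b = c(111, 222), other = c(0.1, 0.4))

# Rows of x that are NA in every column before any merge happens
all_na_rows <- rowSums(is.na(x)) == ncol(x)
which(all_na_rows)

# Key values in x that have no partner in y
unmatched <- setdiff(x$a, y$b)
unmatched

# The left join keeps the all-NA rows of x, so they show up as all-NA rows
merged <- merge(x, y, by.x = "a", by.y = "b", all.x = TRUE, sort = FALSE)
merged
```

If which(all_na_rows) is non-empty on the real data, the NA rows come from x itself rather than from merge().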

I hope you can understand my lousy example?!

Thank you! Jessica ;)

Jessica, asked Oct 18 '13 14:10


People also ask

What does merge () do in R?

The merge() function in R combines two data frames on one or more common key columns. The main requirement is that the key columns hold comparable values in both data frames, ideally of the same type. merge() is similar to a join in a relational database management system (RDBMS).

How do I merge columns in R?

How do I concatenate two columns in R? To concatenate two columns you can use the paste() function. For example, to combine the two columns A and B in the data frame df, you can use the following code: df['AB'] <- paste(df$A, df$B).
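A minimal sketch of the paste() approach on a made-up data frame (df and its columns A and B are illustrative). paste() inserts a space between the values by default; paste0() joins them with no separator.

```r
# Illustrative data, not from the original page
df <- data.frame(A = c("a", "b"), B = c("x", "y"))

# Default separator is a single space
df$AB <- paste(df$A, df$B)

# paste0() concatenates without a separator
df$AB_nosep <- paste0(df$A, df$B)

# A custom separator can be given via sep
df$AB_dash <- paste(df$A, df$B, sep = "-")
```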

How do I merge two Dataframes by columns in R?

The merge() function in base R merges two data frames by common columns or by row names. By default it behaves like an inner join and keeps only rows whose keys appear in both data frames; the all, all.x and all.y arguments keep unmatched rows as well. In the result, the key columns come first, followed by the remaining columns of the first data frame and then those of the second.
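The join types can be compared side by side. This is a sketch on made-up data (the column names value_x and value_y and the extra code 555 are illustrative):

```r
# Illustrative data, not from the original page
x <- data.frame(code = c(111, 222, 333, 444), value_x = c(1, 5, 3, 8))
y <- data.frame(code = c(111, 222, 555), value_y = c(0.1, 0.4, 0.9))

inner <- merge(x, y, by = "code")                # codes present in both
left  <- merge(x, y, by = "code", all.x = TRUE)  # every row of x
right <- merge(x, y, by = "code", all.y = TRUE)  # every row of y
full  <- merge(x, y, by = "code", all = TRUE)    # every row of either

# Row counts per join type
c(inner = nrow(inner), left = nrow(left), right = nrow(right), full = nrow(full))
```

In the left join, the rows for codes 333 and 444 keep their value_x and get NA only in value_y.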

How do I combine unequal data frames in R?

Use the left_join() function from the dplyr package to merge two data frames with different numbers of rows. It takes arguments similar to full_join(), but left_join() keeps all rows from the first data frame and includes all columns from both.
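A short sketch of left_join(), assuming the dplyr package is installed; the data and the column names value_x and value_y are made up for illustration.

```r
library(dplyr)  # assumption: dplyr is installed

# Illustrative data, not from the original page
x <- data.frame(code = c(111, 222, 333, 444), value_x = c(1, 5, 3, 8))
y <- data.frame(code = c(111, 222), value_y = c(0.1, 0.4))

# Keeps every row of x in its original order; unmatched codes get NA in value_y
joined <- left_join(x, y, by = "code")
joined
```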


1 Answer

Just set all=TRUE.

# Create your data
x<-data.frame(val1=c(2,8,6,3),a=c('h','k','b','e'))
y<-data.frame(val2=c(4,1),b=c('h','e'))
# Outer join
merge(x,y,by.x='a',by.y='b',all=TRUE)
#   a val1 val2
# 1 b    6   NA
# 2 e    3    1
# 3 h    2    4
# 4 k    8   NA
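As a side note, on this particular data all=TRUE and all.x=TRUE should give the same result, because every key in y also appears in x. The two only differ when y has keys that x lacks. A quick check, reusing the data above (the names outer and left are illustrative):

```r
# Same data as in the answer above
x <- data.frame(val1 = c(2, 8, 6, 3), a = c('h', 'k', 'b', 'e'))
y <- data.frame(val2 = c(4, 1), b = c('h', 'e'))

outer <- merge(x, y, by.x = 'a', by.y = 'b', all = TRUE)    # full outer join
left  <- merge(x, y, by.x = 'a', by.y = 'b', all.x = TRUE)  # left join

# TRUE here, since y has no keys that are missing from x
isTRUE(all.equal(outer, left))
```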
nograpes, answered Oct 20 '22 15:10