
R dplyr left join - multiple returned values and new rows: how to ask for the first match only?

Tags: r, dplyr

Let's say I have one table with suburb names and crime rates, and a separate table that maps each suburb to its council name.

Tables Picture

I know that left_join(table1, table2, by = "Suburb") will return a table with extra rows, because some suburbs match more than one council. The problem is that suburbs 3 and 4 each overlap two councils.

Is there a way to make the left join return only the first match, rather than creating new rows to accommodate the extra ones?

In addition, for Table 2, is there a function that keeps only the first row for each suburb and removes the second/third/fourth rows that appear when a suburb overlaps additional councils?

user7438322 asked Sep 20 '25 04:09



1 Answer

You can do this with the join() function from the plyr package. The equivalent of left_join(table1, table2, by = "Suburb") that uses only the first Suburb match from table2 is: join(table1, table2, by = "Suburb", type = "left", match = "first"). Note that the column name has to be passed as a quoted string. I'm not sure what the equivalent is in the dplyr package, though I would love to know myself.
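A minimal sketch of both ideas, assuming made-up data: the tables in the question are only available as a picture, so the column names CrimeRate and Council and all values below are hypothetical. The first call is the plyr join described above. The second is one possible dplyr route, not necessarily the only one: reduce table2 to the first row per suburb with distinct(..., .keep_all = TRUE), then do an ordinary left_join(). Functions are called with their package prefix so that plyr and dplyr do not mask each other.

```r
# Hypothetical stand-ins for the tables in the question
table1 <- data.frame(
  Suburb    = c("Suburb 1", "Suburb 2", "Suburb 3", "Suburb 4"),
  CrimeRate = c(0.10, 0.25, 0.40, 0.15),
  stringsAsFactors = FALSE
)
table2 <- data.frame(
  Suburb  = c("Suburb 1", "Suburb 2", "Suburb 3", "Suburb 3",
              "Suburb 4", "Suburb 4"),
  Council = c("Council A", "Council A", "Council B", "Council C",
              "Council C", "Council D"),
  stringsAsFactors = FALSE
)

# plyr: left join that keeps only the first matching row from table2
first_plyr <- plyr::join(table1, table2, by = "Suburb",
                         type = "left", match = "first")

# dplyr: keep the first row per suburb in table2, then join as usual
table2_first <- dplyr::distinct(table2, Suburb, .keep_all = TRUE)
first_dplyr  <- dplyr::left_join(table1, table2_first, by = "Suburb")
```

The distinct() step also covers the second part of the question, since table2_first holds one row per suburb. Which council counts as "first" depends entirely on the row order of table2, so sort it beforehand if a particular council should win. Newer dplyr releases (1.1.0 and later) also accept multiple = "first" in left_join(), which should give the same result without the extra step.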

Rebecca412 answered Sep 21 '25 18:09
