
*_join with empty suffix

Tags: r, dplyr

Fair warning: this can hang your operating system.

The *_join() functions from dplyr fail when either the left or the right suffix is specified as an empty string (''), e.g.

inner_join(data.frame(x=1, y=2),
           data.frame(x=1, y=3),
           by='x',
           suffix=c('', '.b'))

Whereas the following works fine:

inner_join(data.frame(x=1, y=2),
           data.frame(x=1, y=3),
           by='x',
           suffix=c('.a', '.b'))

Meanwhile, the S3 generic merge() (base) has no problem with empty suffixes:

merge(data.frame(x=1, y=2),
      data.frame(x=1, y=3),
      by='x',
      suffixes=c('', '.b'))

dplyr package info:

> packageVersion('dplyr')
[1] ‘0.5.0’

R version info:

> version

platform       x86_64-w64-mingw32          
arch           x86_64                      
os             mingw32                     
system         x86_64, mingw32             
status                                     
major          3                           
minor          3.0                         
year           2016                        
month          05                          
day            03                          
svn rev        70573                       
language       R                           
version.string R version 3.3.0 (2016-05-03)
nickname       Supposedly Educational 
stephematician asked Nov 14 '16 06:11


1 Answer

I stumbled across this bug too. The following workaround gets the desired result with dplyr, i.e. the suffixes '' and '.b': join with a placeholder suffix '.a', then strip that suffix from the column names.

library(dplyr)
inner_join(data.frame(x=1, y=2),
           data.frame(x=1, y=3),
           by='x',
           suffix=c('.a', '.b')) %>%
  setNames(gsub('\\.a$', '', names(.)))
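Note that the gsub() step above also strips '.a' from any original column whose name already ends in '.a'. A possible alternative, sketched here under the assumption that only the right-hand columns need a suffix, is to rename the clashing columns before the join so that dplyr never has to apply a suffix at all. The names left, right, by_cols, clash, is_clash and result are illustrative and not part of the original question.

```r
library(dplyr)

left  <- data.frame(x = 1, y = 2)
right <- data.frame(x = 1, y = 3)
by_cols <- 'x'

# Non-key columns that appear in both data frames
clash <- setdiff(intersect(names(left), names(right)), by_cols)

# Append '.b' to those columns in the right-hand frame only
is_clash <- names(right) %in% clash
names(right)[is_clash] <- paste0(names(right)[is_clash], '.b')

# No duplicated non-key names remain, so the suffix argument is not needed
result <- inner_join(left, right, by = by_cols)
names(result)
```

Because the renaming happens before the join, the left-hand column names are left untouched, which is the same effect as an empty left suffix.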
manotheshark answered Oct 19 '22 18:10