Pandas: simple 'join' not working?

Tags:

pandas

I like to think I'm not an idiot, but maybe I'm wrong. Can anyone explain to me why this isn't working? I can achieve the desired results using 'merge'. But I eventually need to join multiple pandas DataFrames so I need to get this method working.

In [2]: left = pandas.DataFrame({'ST_NAME': ['Oregon', 'Nebraska'], 'value': [4.685, 2.491]})

In [3]: right = pandas.DataFrame({'ST_NAME': ['Oregon', 'Nebraska'], 'value2': [6.218, 0.001]})

In [4]: left.join(right, on='ST_NAME', lsuffix='_left', rsuffix='_right')
Out[4]: 
  ST_NAME_left  value ST_NAME_right  value2
0       Oregon  4.685           NaN     NaN
1     Nebraska  2.491           NaN     NaN

367

asked Apr 11 '12 21:04

Phil

2 Answers

Try using merge:

In [14]: right
Out[14]: 
    ST_NAME  value2
0    Oregon   6.218
1  Nebraska   0.001

In [15]: pandas.merge(left, right)
Out[15]: 
    ST_NAME  value  value2
0  Nebraska  2.491   0.001
1    Oregon  4.685   6.218

In [18]: pandas.merge(left, right, on='ST_NAME', sort=False)
Out[18]: 
    ST_NAME  value  value2
0    Oregon  4.685   6.218
1  Nebraska  2.491   0.001

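DataFrame.join is a bit of a legacy method and apparently doesn't do column-on-column joins. Its on parameter matches the named column of the calling frame against the index of the other frame (that was the original index-on-column behavior, hence the "legacy" designation). That is why the example in the question yields NaN: the values in left's ST_NAME column never match right's default integer index.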
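If join is still preferred, one way to make it work is to move the key into the index of the right-hand frame first, so that on='ST_NAME' has something to match against. This is a minimal sketch that reuses the frames from the question; the name joined is only illustrative.

```python
import pandas as pd

left = pd.DataFrame({'ST_NAME': ['Oregon', 'Nebraska'], 'value': [4.685, 2.491]})
right = pd.DataFrame({'ST_NAME': ['Oregon', 'Nebraska'], 'value2': [6.218, 0.001]})

# join compares left['ST_NAME'] with the *index* of the other frame,
# so the key column of `right` has to become its index first.
joined = left.join(right.set_index('ST_NAME'), on='ST_NAME')
print(joined)
```

Because ST_NAME now lives only in the index of the right-hand frame, there are no overlapping column names and no lsuffix/rsuffix is needed.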

162

answered Oct 06 '22 19:10

Wes McKinney

I can confirm that the pandas join method is faulty. In my case both keys were long strings (18 characters), and the result looked as if pandas was matching only the first couple of characters. The merge function works properly. Please do not use join; it really should be removed from the available methods, otherwise it can mess things up big time.
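For the case in the question where several DataFrames eventually need to be combined, sticking with merge is still possible by folding it over a list of frames. The sketch below assumes every frame shares the ST_NAME key column; the frame named third and its values are made up purely for illustration.

```python
from functools import reduce

import pandas as pd

left = pd.DataFrame({'ST_NAME': ['Oregon', 'Nebraska'], 'value': [4.685, 2.491]})
right = pd.DataFrame({'ST_NAME': ['Oregon', 'Nebraska'], 'value2': [6.218, 0.001]})
# Hypothetical third frame, not part of the original question.
third = pd.DataFrame({'ST_NAME': ['Oregon', 'Nebraska'], 'value3': [1.0, 2.0]})

frames = [left, right, third]

# Merge the frames pairwise from left to right on the shared key column.
combined = reduce(lambda acc, df: pd.merge(acc, df, on='ST_NAME'), frames)
print(combined)
```

Each step is an ordinary column-on-column inner merge, so rows whose key is missing from any frame are dropped; passing how='outer' to pd.merge would keep them instead.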

answered Oct 06 '22 21:10

Donatas Svilpa
