Pandas join/merge/concat two dataframes

Tags:

pandas

I am having issues with joins in pandas and I am trying to figure out what is wrong. Say I have a dataframe x:

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 1941 entries, 2004-10-19 00:00:00 to 2012-07-23 00:00:00
Data columns:
close    1941  non-null values
high     1941  non-null values
low      1941  non-null values
open     1941  non-null values
dtypes: float64(4)

Shouldn't I be able to join it with y on the index using a simple join command, where y is the same as x except that each column name has a "2" appended?

 <class 'pandas.core.frame.DataFrame'>
 DatetimeIndex: 1941 entries, 2004-10-19 00:00:00 to 2012-07-23 00:00:00
 Data columns:
 close2    1941  non-null values
 high2     1941  non-null values
 low2      1941  non-null values
 open2     1941  non-null values
 dtypes: float64(4)

 y.join(x) or pandas.DataFrame.join(y, x) gives:
 <class 'pandas.core.frame.DataFrame'>
 DatetimeIndex: 34879 entries, 2004-12-16 00:00:00 to 2012-07-12 00:00:00
 Data columns:
 close2    34879  non-null values
 high2     34879  non-null values
 low2      34879  non-null values
 open2     34879  non-null values
 close     34879  non-null values
 high      34879  non-null values
 low       34879  non-null values
 open      34879  non-null values
 dtypes: float64(8)

I expect the final result to have 1941 non-null values in every column. I tried merge as well, but I get the same issue.

I had thought the right answer was pandas.concat([x,y]), but this does not do what I intend either.

In [83]: pandas.concat([x,y])
Out[83]:
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 3882 entries, 2004-10-19 00:00:00 to 2012-07-23 00:00:00
Data columns:
close2    3882  non-null values
high2     3882  non-null values
low2      3882  non-null values
open2     3882  non-null values
dtypes: float64(4)

Edit: If you are having issues with join, read Wes's answer below. I had one timestamp that was duplicated.


asked Jul 24 '12 18:07

2 Answers

Does your index have duplicates (check x.index.is_unique)? If so, that would explain the behavior you're seeing:

In [16]: left
Out[16]: 
            a
2000-01-01  1
2000-01-01  1
2000-01-01  1
2000-01-02  2
2000-01-02  2
2000-01-02  2

In [17]: right
Out[17]: 
            b
2000-01-01  3
2000-01-01  3
2000-01-01  3
2000-01-02  4
2000-01-02  4
2000-01-02  4

In [18]: left.join(right)
Out[18]: 
            a  b
2000-01-01  1  3
2000-01-01  1  3
2000-01-01  1  3
2000-01-01  1  3
2000-01-01  1  3
2000-01-01  1  3
2000-01-01  1  3
2000-01-01  1  3
2000-01-01  1  3
2000-01-02  2  4
2000-01-02  2  4
2000-01-02  2  4
2000-01-02  2  4
2000-01-02  2  4
2000-01-02  2  4
2000-01-02  2  4
2000-01-02  2  4
2000-01-02  2  4
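A minimal sketch of how one might confirm and work around this, assuming a reasonably recent pandas. The frames `left` and `right` are small stand-ins rebuilt from the example above, and keeping the first row per timestamp is only one possible deduplication policy, not necessarily the right one for your data:

```python
import pandas as pd

# Rebuild the example: each date appears three times in both frames.
idx = pd.to_datetime(["2000-01-01"] * 3 + ["2000-01-02"] * 3)
left = pd.DataFrame({"a": [1, 1, 1, 2, 2, 2]}, index=idx)
right = pd.DataFrame({"b": [3, 3, 3, 4, 4, 4]}, index=idx)

print(left.index.is_unique)   # False

# Every left row for a date pairs with every right row for that date.
joined = left.join(right)
print(len(joined))            # 18 (3 * 3 rows per date, 2 dates)

# Labels that repeat an earlier label in the index.
dupes = left.index[left.index.duplicated()]

# Assumption: keeping the first row per timestamp is acceptable here.
left_u = left[~left.index.duplicated(keep="first")]
right_u = right[~right.index.duplicated(keep="first")]

clean = left_u.join(right_u)
print(clean.shape)            # (2, 2)
```

Inspecting `dupes` before dropping anything is probably worthwhile, since a single repeated timestamp is enough to inflate the row count.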


answered Sep 23 '22 19:09

It sounds like maybe you want pandas.concat? merge and join do, well, joins, which means they will give you something based around the Cartesian product of the two inputs, but it sounds like you just want to paste them together into one big table.
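To make the "product" behaviour concrete, here is a small sketch with made-up frames (the names `left`, `right` and the keys `k1`, `k2` are purely illustrative). It assumes a pandas version in which `merge` accepts the `validate` argument, which can be used to fail loudly instead of silently multiplying rows:

```python
import pandas as pd
from pandas.errors import MergeError

# Hypothetical data: "k1" appears twice on the left and three times on the right.
left = pd.DataFrame({"a": [1, 2, 3]}, index=["k1", "k1", "k2"])
right = pd.DataFrame({"b": [10, 20, 30, 40]}, index=["k1", "k1", "k1", "k2"])

# Each key contributes (left count) * (right count) rows: 2 * 3 + 1 * 1.
merged = pd.merge(left, right, left_index=True, right_index=True)
print(len(merged))  # 7

# validate="one_to_one" checks that the keys are unique on both sides.
try:
    pd.merge(left, right, left_index=True, right_index=True,
             validate="one_to_one")
    raised = False
except MergeError:
    raised = True
print(raised)  # True
```

The product is taken per matching key, not across the whole tables, so frames with unique keys come out with the row count you would expect.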

Edit: did you try concat with axis=1? It seems to do what you're asking for:

>>> print(x)
          A         B         C
0  0.155614 -0.252148  0.861163
1  0.973517  1.156465 -0.458846
2  2.504356 -0.356371 -0.737842
3  0.012994  1.785123  0.161667
4  0.574578  0.123689  0.017598
>>> print(y)
         A2        B2        C2
0 -0.280993  1.278750 -0.704449
1  0.140282  1.955322 -0.953826
2  0.581997 -0.239829  2.227069
3 -0.876146 -1.955199 -0.155030
4 -0.518593 -2.630978  0.333264
>>> print(pandas.concat([x, y], axis=1))
          A         B         C        A2        B2        C2
0  0.155614 -0.252148  0.861163 -0.280993  1.278750 -0.704449
1  0.973517  1.156465 -0.458846  0.140282  1.955322 -0.953826
2  2.504356 -0.356371 -0.737842  0.581997 -0.239829  2.227069
3  0.012994  1.785123  0.161667 -0.876146 -1.955199 -0.155030
4  0.574578  0.123689  0.017598 -0.518593 -2.630978  0.333264
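For comparison, a short sketch of the two concat directions, using randomly generated stand-in frames (the seed and values are arbitrary, not the data shown above). The default `axis=0` stacks rows, which would account for a result with twice as many entries as either input:

```python
import numpy as np
import pandas as pd

# Stand-in frames with the same shape and column names as the example.
rng = np.random.default_rng(0)
x = pd.DataFrame(rng.standard_normal((5, 3)), columns=["A", "B", "C"])
y = pd.DataFrame(rng.standard_normal((5, 3)), columns=["A2", "B2", "C2"])

# axis=1 aligns on the index and places the columns side by side.
side_by_side = pd.concat([x, y], axis=1)
print(side_by_side.shape)  # (5, 6)

# The default axis=0 stacks the rows; columns missing from a frame become NaN.
stacked = pd.concat([x, y])
print(stacked.shape)       # (10, 6)
```

This only behaves as a clean side-by-side paste when both indexes are unique, so checking `x.index.is_unique` first still applies.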

answered Sep 22 '22 19:09

BrenBarn


Asked by Michael WS. Answers by Wes McKinney and BrenBarn.