Correlation between two dataframes

Tags: python, pandas, dataframe

Similar questions have been asked, but I've not seen a lucid answer. Forgive me for asking again. I have two dataframes, and I simply want the correlation of the single column in the first dataframe with each column in the second. Here is code which does exactly what I want:

import numpy as np
import pandas as pd

df1 = pd.DataFrame({'Y': np.random.randn(10)})
df2 = pd.DataFrame({'X1': np.random.randn(10), 'X2': np.random.randn(10), 'X3': np.random.randn(10)})
for col in df2:
    print(df1['Y'].corr(df2[col]))

but it doesn't seem like I should be looping through the dataframe. I was hoping that something as simple as

df1.corr(df2)

ought to get the job done. Is there a clear way to perform this function without looping?

206

asked Oct 15 '15 20:10

TPM

1 Answer

You can use corrwith:

>>> df2.corrwith(df1.Y)
X1    0.051002
X2   -0.339775
X3    0.076935
dtype: float64
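
As a quick sanity check, here is a minimal sketch comparing the loop from the question with corrwith. The seeded generator and the names looped and vectorised are my own additions for illustration, not part of the original post; the column names follow the question.

```python
import numpy as np
import pandas as pd

# Seeded generator so the comparison is reproducible (an assumption for this sketch).
rng = np.random.default_rng(0)
df1 = pd.DataFrame({"Y": rng.standard_normal(10)})
df2 = pd.DataFrame(rng.standard_normal((10, 3)), columns=["X1", "X2", "X3"])

# The loop from the question, collected into a Series keyed by column name.
looped = pd.Series({col: df1["Y"].corr(df2[col]) for col in df2})

# The corrwith call: one Pearson correlation per column of df2.
vectorised = df2.corrwith(df1["Y"])

# Both approaches should agree up to floating point error.
assert np.allclose(looped.to_numpy(), vectorised.to_numpy())
print(vectorised)
```

Note that corrwith aligns the two objects on their index before computing anything, so if df1 and df2 do not share an index, only the matching rows contribute to the result.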

127

answered Oct 18 '22 01:10

Alexander
