pandas correlation matrix between each pair groupby item

I have a csv file like this:

date,sym,close
2014.01.01,A,10
2014.01.02,A,11
2014.01.03,A,12
2014.01.04,A,13
2014.01.01,B,20
2014.01.02,B,22
2014.01.03,B,23
2014.01.01,C,33
2014.01.02,C,32
2014.01.03,C,31

Then I load it into a DataFrame named df with read_csv:

import numpy as np
import pandas as pd

df = pd.read_csv('daily.csv', index_col=[0])
# func is a per-group transformation that is not shown here
groups = df.groupby('sym')[['close']].apply(lambda x: func(x['close'].values))

The groups look like this:

sym
A    [nan,1.00,2.00,...]
B    [nan,1.00,2.00,...]
C    [nan,1.00,2.00,...]

How can I calculate the correlation between each pair of sym values?

AA,AB,AC,BB,BA,BC,CA,CB,CC

BTW, the number of items for each sym may not be the same.

551

asked Apr 14 '15 15:04

seizetheday

1 Answer

With df as above, make a pivot table:

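# Recent pandas versions require keyword arguments here.
# If 'date' is the index (as with index_col=[0]), call df.reset_index() first.
dfp = df.pivot(index='date', columns='sym')
print(dfp)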

           close        
sym            A   B   C
date                    
2014-01-01    10  20  33
2014-01-02    11  22  32
2014-01-03    12  23  31
2014-01-04    13 NaN  30

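pandas will calculate the pairwise correlation coefficients, skipping missing values pair by pair, so it does not matter that the syms have different numbers of rows: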

print(dfp.corr())

              close                    
sym               A         B         C
      sym                              
close A    1.000000  0.981981 -1.000000
      B    0.981981  1.000000 -0.981981
      C   -1.000000 -0.981981  1.000000
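
For a self-contained check, the sketch below rebuilds the question's data from an inline string (instead of daily.csv), pivots it with a flat column index by passing values='close', and then lists each pair in the AA, AB, ... form the question asks for. The names csv_text, wide, and pairs are only illustrative.

```python
import io

import pandas as pd

# The question's sample data, inlined so the example runs without daily.csv.
csv_text = """date,sym,close
2014.01.01,A,10
2014.01.02,A,11
2014.01.03,A,12
2014.01.04,A,13
2014.01.01,B,20
2014.01.02,B,22
2014.01.03,B,23
2014.01.01,C,33
2014.01.02,C,32
2014.01.03,C,31
"""

df = pd.read_csv(io.StringIO(csv_text))

# One column per sym; dates missing for a sym become NaN.
wide = df.pivot(index="date", columns="sym", values="close")

# Pearson correlation of every column pair, using only dates both syms share.
corr = wide.corr()

# Flatten the matrix into labelled pairs: AA, AB, AC, BA, ...
pairs = {a + b: corr.loc[a, b] for a in corr.index for b in corr.columns}

print(corr)
print(pairs)
```

Passing values='close' is a design choice: it yields plain A, B, C columns rather than the two-level (close, A) columns shown above, which makes lookups such as corr.loc['A', 'B'] simpler.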

But if you want to prettify it, check out seaborn:

import seaborn as sns
# corrplot has been removed from seaborn; heatmap on the correlation matrix replaces it
sns.heatmap(dfp.corr(), annot=True)

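
The question correlates the output of func rather than the raw close values. func is not defined there, so the sketch below assumes, purely for illustration, that it is a simple period-over-period return; the ret column name is likewise made up. The idea is to compute the transformed column per sym first and then pivot and correlate as before.

```python
import io

import pandas as pd

# The question's sample data, inlined so the example runs without daily.csv.
csv_text = """date,sym,close
2014.01.01,A,10
2014.01.02,A,11
2014.01.03,A,12
2014.01.04,A,13
2014.01.01,B,20
2014.01.02,B,22
2014.01.03,B,23
2014.01.01,C,33
2014.01.02,C,32
2014.01.03,C,31
"""

df = pd.read_csv(io.StringIO(csv_text)).sort_values(["sym", "date"])

# Assumed stand-in for func: return relative to the previous row of the same sym.
# The first row of each sym has no predecessor, so it becomes NaN.
prev_close = df.groupby("sym")["close"].shift(1)
df["ret"] = df["close"] / prev_close - 1

# Align the transformed values by date, one column per sym.
wide_ret = df.pivot(index="date", columns="sym", values="ret")

# min_periods=2 leaves NaN for any pair with fewer than two overlapping values.
corr_ret = wide_ret.corr(min_periods=2)
print(corr_ret)
```

With only a handful of rows per sym, as here, each pair overlaps on just two or three returns, so the coefficients themselves are not meaningful; the point is the shape of the workflow.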

164

answered Oct 01 '22 02:10

cphlewis

Tags: python, pandas, correlation