Pairwise correlation of Pandas DataFrame columns with custom function

Pandas pairwise correlation on a DataFrame comes in handy in many cases. However, in my specific case I would like to correlate two columns with a method that Pandas does not provide (something other than pearson, kendall or spearman). Is it possible to explicitly define the correlation function to use in this case?

The syntax I would like looks like this:

def my_method(x,y): return something
frame.corr(method=my_method)
Flo asked Aug 14 '13 14:08

1 Answer

For any kind of performance you would need to do this in Cython (with a cythonizable function). In plain Python, the pairwise loop looks like this:

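import numpy as np
from pandas import DataFrame

# func(x, y) takes two columns (Series) and returns a scalar
l = len(df.columns)
results = np.zeros((l, l))
for i, ac in enumerate(df):
    for j, bc in enumerate(df):
        # iterating a DataFrame yields column labels, so look up the data
        results[j, i] = func(df[ac], df[bc])
results = DataFrame(results, index=df.columns, columns=df.columns)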
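To make the loop above concrete, here is a minimal self-contained sketch of the same idea. The helper name `pairwise_apply`, the example metric `cosine_similarity`, and the sample data are all hypothetical and chosen only for illustration. The sketch also assumes the custom function is symmetric, so it evaluates each pair once and mirrors the result.

```python
import numpy as np
import pandas as pd


def pairwise_apply(df, func):
    """Apply func(x, y) to every pair of columns and return a square DataFrame.

    Hypothetical helper; assumes func is symmetric, i.e. func(x, y) == func(y, x).
    """
    cols = df.columns
    n = len(cols)
    out = np.empty((n, n))
    for i in range(n):
        x = df.iloc[:, i].to_numpy()
        for j in range(i, n):
            y = df.iloc[:, j].to_numpy()
            # Evaluate the upper triangle only and mirror it.
            out[i, j] = out[j, i] = func(x, y)
    return pd.DataFrame(out, index=cols, columns=cols)


def cosine_similarity(x, y):
    # Example metric used purely for illustration.
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))


# Made-up sample data.
df = pd.DataFrame(
    {"a": [1.0, 2.0, 3.0], "b": [2.0, 4.0, 6.0], "c": [3.0, -1.0, 0.5]}
)
result = pairwise_apply(df, cosine_similarity)
print(result)
```

Since pandas 0.24, `DataFrame.corr` also accepts a callable for `method`, so `df.corr(method=cosine_similarity)` should give the same matrix for this data. Note that pandas sets the diagonal to 1 and makes the result symmetric regardless of what the callable returns, so the explicit loop remains useful when the function is not a true correlation.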
Jeff answered Sep 27 '22 22:09