
Create clusters using correlation matrix in Python

I have a correlation matrix for 21 industry sectors. I want to split these 21 sectors into 4 or 5 groups so that sectors with similar behavior end up in the same group.

How can I do this in Python? Thanks in advance!

Jasper C. asked Oct 12 '18 21:10


1 Answer

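You could compute the correlation matrix with pandas DataFrame.corr (skip this step if you already have the matrix) and then group the sectors with SciPy's hierarchical clustering module, scipy.cluster.hierarchy: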

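import pandas as pd
import scipy.cluster.hierarchy as spc
from scipy.spatial.distance import pdist

# my_data is a placeholder for your observations, one column per sector
df = pd.DataFrame(my_data)
corr = df.corr().values

# Euclidean distance between each pair of rows of the correlation matrix
dists = pdist(corr)
# Complete-linkage hierarchical clustering on those distances
linkage = spc.linkage(dists, method='complete')
# Cut the tree at half the largest pairwise distance; idx[i] is the cluster label of sector i
idx = spc.fcluster(linkage, 0.5 * dists.max(), 'distance')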
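
If you want a fixed number of groups (4 or 5, as in the question) rather than a distance cutoff, one possible variation is to ask fcluster for at most k clusters with criterion='maxclust'. The sketch below also swaps in 1 - correlation as the dissimilarity, which is a common alternative to taking Euclidean distances between rows of the correlation matrix. It is only an illustration: the synthetic returns data, the sector_XX names, the choice of average linkage, and k = 4 are all assumptions, not something taken from your data.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

# Hypothetical stand-in data: 21 "sectors" driven by 4 latent factors plus noise.
# Replace this with your own DataFrame (or start from your correlation matrix).
rng = np.random.default_rng(0)
factors = rng.normal(size=(500, 4))
returns = pd.DataFrame(
    {f"sector_{i:02d}": factors[:, i % 4] + 0.5 * rng.normal(size=500)
     for i in range(21)}
)

corr = returns.corr()

# Turn correlation into a dissimilarity: highly correlated sectors are "close".
dist = 1.0 - corr.values
dist = (dist + dist.T) / 2.0      # remove tiny floating-point asymmetries
np.fill_diagonal(dist, 0.0)       # a sector is at distance 0 from itself

# linkage expects a condensed (1-D) distance vector, not a square matrix.
condensed = squareform(dist, checks=False)
Z = linkage(condensed, method="average")

# Ask for at most 4 flat clusters instead of choosing a distance threshold.
labels = fcluster(Z, t=4, criterion="maxclust")

# Map cluster labels back to sector names.
groups = pd.Series(labels, index=corr.columns, name="cluster")
print(groups.sort_values())
```

Changing t to 5 gives at most five groups. Note that 'maxclust' guarantees no more than t clusters, so you can occasionally get fewer.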
Wes Doyle answered Sep 28 '22 05:09