I am getting an error while plotting a dendrogram for a Spearman correlation matrix computed with scipy.stats.spearmanr. Below is the code I am using:
corr = np.round(scipy.stats.spearmanr(full_data[list_of_continous]).correlation, 4)
corr_condensed = hc.distance.squareform(1-corr)
z = hc.linkage(corr_condensed, method='average')
fig = plt.figure(figsize=(20,20))
dendrogram = hc.dendrogram(z, labels=full_data[list_of_continous].columns, orientation='left', leaf_font_size=30)
plt.show()
Below is the error I am getting:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-10-9873c0be8dc7> in <module>()
      1 corr = np.round(scipy.stats.spearmanr(full_data[list_of_continous]).correlation, 4)
----> 2 corr_condensed = hc.distance.squareform(1-corr)
      3 z = hc.linkage(corr_condensed, method='average')
      4 fig = plt.figure(figsize=(20,20))
      5 dendrogram = hc.dendrogram(z, labels=full_data[list_of_continous].columns, orientation='left', leaf_font_size=30)
/usr/local/anaconda/lib/python3.6/site-packages/scipy/spatial/distance.py in squareform(X, force, checks)
   1844             raise ValueError('The matrix argument must be square.')
   1845         if checks:
-> 1846             is_valid_dm(X, throw=True, name='X')
   1847 
   1848         # One-side of the dimensions is set here.
/usr/local/anaconda/lib/python3.6/site-packages/scipy/spatial/distance.py in is_valid_dm(D, tol, throw, name, warning)
   1920                 if name:
   1921                     raise ValueError(('Distance matrix \'%s\' must be '
-> 1922                                      'symmetric.') % name)
   1923                 else:
   1924                     raise ValueError('Distance matrix must be symmetric.')
ValueError: Distance matrix 'X' must be symmetric.
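The variable corr might contain NaN values. NaN never compares equal to itself, so a single NaN entry is enough to make the symmetry check in squareform fail.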
Try:
corr = np.nan_to_num(corr)
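To confirm that NaN values are the cause before replacing them, a quick diagnostic can help. The following is a minimal sketch: the helper name diagnose_corr and the small hand-built matrix are made up for illustration and stand in for the real corr from the question.

```python
import numpy as np

def diagnose_corr(corr):
    """Report why a correlation matrix might fail an exact symmetry check.

    Hypothetical helper for illustration, not part of SciPy.
    """
    corr = np.asarray(corr, dtype=float)
    return {
        "has_nan": bool(np.isnan(corr).any()),
        # NaN != NaN, so any NaN entry makes an exact comparison with the transpose fail.
        "is_symmetric": bool(np.array_equal(corr, corr.T)),
        # Columns that are entirely NaN (for example, a constant input column).
        "nan_columns": np.where(np.isnan(corr).all(axis=0))[0].tolist(),
    }

# Hand-built example: the third variable has an undefined correlation everywhere.
corr = np.array([
    [1.0, 0.3, np.nan],
    [0.3, 1.0, np.nan],
    [np.nan, np.nan, np.nan],
])

report = diagnose_corr(corr)
cleaned = np.nan_to_num(corr)  # replaces NaN with 0.0
print(report)
print(diagnose_corr(cleaned))
```

If nan_columns is not empty, dropping those columns from the input data may be a cleaner fix than replacing NaN with 0, since a 0 on the diagonal of corr gives a non-zero self-distance in 1-corr.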
Update:
Skipping the line
    corr_condensed = hc.distance.squareform(1-corr)
runs without any error for me.
So
corr = np.round(scipy.stats.spearmanr(full_data[list_of_continous]).correlation, 4)
z = hc.linkage(corr, method='average')
fig = plt.figure(figsize=(20,20))
dendrogram = hc.dendrogram(z, labels=full_data[list_of_continous].columns, orientation='left', leaf_font_size=30)
plt.show()
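should run for you too. Note, however, that when linkage receives a 2-D array it treats each row as an observation vector and computes Euclidean distances between the rows. It no longer uses 1-corr as the distance, so the resulting dendrogram can differ from the one you intended.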
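Passing the square matrix straight to linkage changes what is being clustered, not only whether an error is raised. The sketch below uses a small made-up correlation matrix to compare the two calls; the values are purely illustrative.

```python
import numpy as np
import scipy.cluster.hierarchy as hc
from scipy.spatial.distance import squareform

# Made-up symmetric correlation matrix with no NaN values.
corr = np.array([
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.2],
    [0.1, 0.2, 1.0],
])

# 2-D input: rows are treated as observation vectors (Euclidean distance between rows).
z_rows = hc.linkage(corr, method="average")

# 1-D condensed input: the values are used directly as pairwise distances.
z_dist = hc.linkage(squareform(1 - corr), method="average")

print(z_rows[:, 2])  # merge heights from row-wise Euclidean distances
print(z_dist[:, 2])  # merge heights from 1 - corr
```

The merge heights in the third column differ between the two results, which shows that the trees are built from different distances.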
If you are sure the matrix is symmetric, set checks=False to skip the validation:
corr_condensed = hc.distance.squareform(1-corr, checks=False)
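An alternative to disabling the checks is to repair the distance matrix so that it passes them. This is only a sketch under stated assumptions: the helper name corr_to_linkage is hypothetical, NaN correlations are treated as 0 (uncorrelated), and the example matrix is made up.

```python
import numpy as np
import scipy.cluster.hierarchy as hc
from scipy.spatial.distance import squareform

def corr_to_linkage(corr, method="average"):
    """Build a linkage matrix from a correlation matrix (hypothetical helper)."""
    # Assumption: an undefined correlation is treated as 0, i.e. uncorrelated.
    corr = np.nan_to_num(np.asarray(corr, dtype=float))
    dist = 1.0 - corr
    dist = (dist + dist.T) / 2.0   # enforce exact symmetry
    np.fill_diagonal(dist, 0.0)    # every variable is at distance 0 from itself
    condensed = squareform(dist)   # the default checks now pass
    return hc.linkage(condensed, method=method)

corr = np.array([
    [1.0, 0.3, np.nan],
    [0.3, 1.0, np.nan],
    [np.nan, np.nan, np.nan],
])

z = corr_to_linkage(corr)
print(z)
```

The returned z can be passed to hc.dendrogram in the same way as in the question.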