I am trying to run this code:
import pandas as pd
import seaborn as sns
df = pd.DataFrame(clusters, columns=cols)
sns.clustermap(df, cmap="vlag", vmin=0, vmax=1, metric="correlation",
               z_score=None, standard_scale=None, yticklabels=True,
               figsize=(size, size))
The value of clusters is:
clusters = [[0.89463602, 0., 0., 0.85185185, 0.9023569, 0.,
0., 0.83333333, 0., 0., 0., ],
[0.75, 0.66666667, 0., 0., 0.69444444, 0.,
0.89272031, 0., 0.69444444, 0., 0.69444444,],
[0.85185185, 0.88910175, 0., 0., 0.9043771, 0.,
0., 0., 0.89092141, 0.77777778, 0.69444444,],
[0.75, 0.89825458, 0., 0., 0.77777778, 0.,
0.8908046, 0., 0.75, 0.91550069, 0.8, ],]
And I get the following error:
in linkage
linkage_wrap(N, X, Z, mthidx[method])
FloatingPointError: NaN dissimilarity value.
Any ideas about what causes it?
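Two of your columns (col2 and col5) are all zeros. A column with no variation has zero standard deviation, so its correlation with any other column is undefined and comes out as NaN, which is the value linkage then rejects: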
cols = ["col"+str(i) for i in range(11)]
df = pd.DataFrame(clusters, columns=cols)
df.corr()
            col0       col1       col2       col3       col4       col5       col6       col7       col8       col9      col10
col0    1.000000  -0.652805        NaN   0.755353   0.914034        NaN  -0.971167   0.755353  -0.607892  -0.232318  -0.792705
col1   -0.652805   1.000000        NaN  -0.967396  -0.353987        NaN   0.461102  -0.967396   0.982783   0.761192   0.976659
col2         NaN        NaN        NaN        NaN        NaN        NaN        NaN        NaN        NaN        NaN        NaN
col3    0.755353  -0.967396        NaN   1.000000   0.537949        NaN  -0.577350   1.000000  -0.978166  -0.573568  -0.990826
col4    0.914034  -0.353987        NaN   0.537949   1.000000        NaN  -0.943651   0.537949  -0.352431   0.181392  -0.546475
col5         NaN        NaN        NaN        NaN        NaN        NaN        NaN        NaN        NaN        NaN        NaN
col6   -0.971167   0.461102        NaN  -0.577350  -0.943651        NaN   1.000000  -0.577350   0.401476   0.079648   0.627048
col7    0.755353  -0.967396        NaN   1.000000   0.537949        NaN  -0.577350   1.000000  -0.978166  -0.573568  -0.990826
col8   -0.607892   0.982783        NaN  -0.978166  -0.352431        NaN   0.401476  -0.978166   1.000000   0.665620   0.962359
col9   -0.232318   0.761192        NaN  -0.573568   0.181392        NaN   0.079648  -0.573568   0.665620   1.000000   0.636492
col10  -0.792705   0.976659        NaN  -0.990826  -0.546475        NaN   0.627048  -0.990826   0.962359   0.636492   1.000000
df[['col2','col5']]
   col2  col5
0   0.0   0.0
1   0.0   0.0
2   0.0   0.0
3   0.0   0.0
You can either drop those columns before plotting, or keep them and use a metric that is still defined for constant columns, such as euclidean or canberra.
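As a minimal sketch of the first option, the constant columns can be found and dropped programmatically instead of by name. The names constant_cols, filtered and plot_clustermap, and the default size=8, are assumptions made for this illustration (the question never shows the value of size). seaborn is imported inside the helper only so that the data preparation can be run on its own.

```python
import pandas as pd

clusters = [
    [0.89463602, 0., 0., 0.85185185, 0.9023569, 0.,
     0., 0.83333333, 0., 0., 0.],
    [0.75, 0.66666667, 0., 0., 0.69444444, 0.,
     0.89272031, 0., 0.69444444, 0., 0.69444444],
    [0.85185185, 0.88910175, 0., 0., 0.9043771, 0.,
     0., 0., 0.89092141, 0.77777778, 0.69444444],
    [0.75, 0.89825458, 0., 0., 0.77777778, 0.,
     0.8908046, 0., 0.75, 0.91550069, 0.8],
]
cols = ["col" + str(i) for i in range(11)]
df = pd.DataFrame(clusters, columns=cols)

# A column with at most one distinct value has zero variance, so its
# correlation with every other column is NaN.
constant_cols = df.columns[df.nunique() <= 1].tolist()

# Drop those columns before clustering with metric="correlation".
filtered = df.drop(columns=constant_cols)


def plot_clustermap(data, metric="correlation", size=8):
    """Hypothetical helper wrapping the call from the question.

    size=8 is an assumed figure size, not a value from the question.
    """
    import seaborn as sns  # imported lazily; only needed for plotting

    return sns.clustermap(data, cmap="vlag", vmin=0, vmax=1, metric=metric,
                          z_score=None, standard_scale=None,
                          yticklabels=True, figsize=(size, size))
```

Calling plot_clustermap(filtered) should then cluster without the NaN error, since no remaining column is constant. The second option corresponds to keeping all columns and calling plot_clustermap(df, metric="euclidean") instead.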