I'm playing around with the abalone dataset from UCI's machine learning repository. I want to display a correlation heatmap using matplotlib and imshow.
The first time I tried it, it worked fine. All the numeric variables plotted and labeled, seen here:
fig = plt.figure(figsize=(15,8))
ax1 = fig.add_subplot(111)
plt.imshow(df.corr(), cmap='hot', interpolation='nearest')
plt.colorbar()
labels = df.columns.tolist()
ax1.set_xticklabels(labels,rotation=90, fontsize=10)
ax1.set_yticklabels(labels,fontsize=10)
plt.show()
successful heatmap
Later, I used get_dummies() on my categorical variable, like so:
df = pd.get_dummies(df, columns = ['sex'])
resulting correlation matrix
So, if I reuse the code from before to generate a nice heatmap, it should be fine, right? Wrong!
What dumpster fire is this?
So my question is, where did my labels go, and how do I get them back?!
Thanks!
To get your labels back, you can force matplotlib to use enough xticks so that all your labels can be shown. This can be done by adding
ax1.set_xticks(np.arange(len(labels)))
ax1.set_yticks(np.arange(len(labels)))
before your statements ax1.set_xticklabels(labels,rotation=90, fontsize=10)
and ax1.set_yticklabels(labels,fontsize=10)
.
This results in the following plot:
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With