I have a plot that I am generating with the KMeans algorithm in scikit-learn. The clusters correspond to different colors. Here is the plot:
I need a legend for this plot which corresponds to the cluster number in the plot. Ideally, the legend should display the color of the cluster and the label should be the cluster number. Thanks.
EDIT: I think I should post some code, since people are downvoting this:
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

km = KMeans(n_clusters=20, init='random')
km.fit(df)  # df is the dataframe which contains points as coordinates
labels = km.labels_

plt.clf()
fig = plt.figure()
# facecolor replaces the older axisbg keyword, which newer Matplotlib releases no longer accept
ax = fig.add_subplot(111, facecolor='w', frame_on=True)
fig.set_size_inches(18.5, 10.5)

# Plot the clusters on the map
# m is a basemap object
m.scatter(
    [geom.x for geom in map_points],
    [geom.y for geom in map_points],
    20, marker='o', lw=.25,
    c=labels.astype(float),
    alpha=0.9, antialiased=True,
    zorder=3)
m.fillcontinents(color='#555555')
plt.show()
Adding a legend to a Matplotlib scatter plot
We can add a legend to a scatter plot colored by a variable by using Matplotlib's legend() function. In legend(), we specify the title and the handles, which are extracted from the plot as legend elements.
First, make a scatter plot with Matplotlib's scatter() function, using the "c" argument to color the data points by the species variable in the dataframe.
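A minimal sketch of that approach, assuming Matplotlib 3.1 or newer (where the collection returned by scatter() has a legend_elements() method). The random points and the four integer labels are synthetic stand-ins for the real coordinates and km.labels_, and a plain Axes is used instead of the basemap object:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-ins (assumptions): 200 random points and 4 cluster labels,
# in place of the real map coordinates and km.labels_.
rng = np.random.default_rng(0)
n_clusters = 4
xy = rng.normal(size=(200, 2))
labels = rng.integers(0, n_clusters, size=200)
labels[:n_clusters] = np.arange(n_clusters)  # make sure every cluster occurs

fig, ax = plt.subplots()
sc = ax.scatter(xy[:, 0], xy[:, 1], c=labels, cmap="rainbow", s=20)

# num=None asks for one legend entry per unique value in c
handles, texts = sc.legend_elements(num=None)
legend = ax.legend(handles, texts, title="Cluster", loc="best")
```

Call plt.show() or fig.savefig(...) afterwards to view the figure. With many clusters (the question uses 20), the legend may need several columns, for example via the ncol argument of legend().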
Creating a legend with color boxes
To create a legend with a color box, use the patches provided by the matplotlib.patches module. A patch is nothing but a 2D artist with a face color and an edge color. Below is a simple example of this:
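A minimal sketch of a patch-based legend, with one colored box per cluster. The cluster count of 4 and the "Cluster i" label text are illustrative assumptions; the colors are sampled from the rainbow colormap in the same way as in the answer further down:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.patches as mpatches
import matplotlib.pyplot as plt
import numpy as np

n_clusters = 4  # hypothetical value; the question uses 20
colors = plt.cm.rainbow(np.linspace(0, 1, n_clusters))  # one RGBA color per cluster

fig, ax = plt.subplots()

# Each Patch is a legend-only artist: a box with a face color and an edge color
patches = [
    mpatches.Patch(facecolor=colors[i], edgecolor="black", label=f"Cluster {i}")
    for i in range(n_clusters)
]
legend = ax.legend(handles=patches, title="Cluster")
```

Because the patches are independent of the plotted data, this works even when all points were drawn by a single scatter() call, as long as the same colors are used for the points.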
Before starting, it helps to understand what a "legend" is and how a scatter plot is created. A legend is an area that describes the elements of the plot. A scatter plot is a graph in which the values of two variables are plotted along two axes; it represents the relationship between two variables in a dataset.
I was able to make the legend correspond to the color. The key was using multiple scatterplots for each category in the data as mentioned by Rutger Kassies.
Here is the code:
import numpy as np
import matplotlib.pyplot as plt

# Setting various plot properties
plt.clf()
fig = plt.figure()
# facecolor replaces the older axisbg keyword, which newer Matplotlib releases no longer accept
ax = fig.add_subplot(111, facecolor='w', frame_on=True)
fig.set_size_inches(18.5, 10.5)

# Creating one discrete color per cluster
colors = plt.cm.rainbow(np.linspace(0, 1, 20))

current_plot_range = 0
previous_plot_range = 0
# Note: the slices below pick the right points only if map_points is ordered by cluster label
for i, c in enumerate(colors):
    previous_plot_range += current_plot_range
    current_plot_range = labels[labels == i].size  # number of points in cluster i
    m.scatter(
        [geom.x for geom in map_points[
            previous_plot_range:previous_plot_range + current_plot_range]],
        [geom.y for geom in map_points[
            previous_plot_range:previous_plot_range + current_plot_range]],
        20, lw=.25, marker='o', color=c, label=i, alpha=0.9, antialiased=True,
        zorder=3)

plt.legend()
m.fillcontinents(color='#555555')
The result looks something like this:
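A variant of the same one-scatter-per-cluster idea, sketched with a boolean mask instead of running slice offsets, so it does not depend on the points being ordered by label. The coordinates and labels here are synthetic stand-ins (assumptions), and a plain Axes replaces the basemap object m:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-ins (assumptions) for the map coordinates and km.labels_
rng = np.random.default_rng(0)
n_clusters = 5
x = rng.uniform(size=300)
y = rng.uniform(size=300)
labels = rng.integers(0, n_clusters, size=300)
labels[:n_clusters] = np.arange(n_clusters)  # make sure every cluster occurs

colors = plt.cm.rainbow(np.linspace(0, 1, n_clusters))

fig, ax = plt.subplots()
for i, c in enumerate(colors):
    mask = labels == i  # True for the points assigned to cluster i
    ax.scatter(x[mask], y[mask], s=20, lw=0.25, marker="o",
               color=c, label=str(i), alpha=0.9, zorder=3)

legend = ax.legend(title="Cluster")
```

With the real data, x and y would be NumPy arrays built from the geom.x and geom.y values, and ax.scatter would become m.scatter.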