I am trying to get the legend right on the figure below. It should be just 'green', 'blue' and 'red' with the corresponding color. But it is all over the place.

The code is below:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({
'category':['blue','green','red','blue','green','red','blue','green','red'],
'attempts':[8955,7881,6723,100,200,300,4567,876,54],
'success':[3000,7500,2000, 256,4567,4567,7665,543,43]
})
fig,ax = plt.subplots()
plt.scatter(df['attempts'],df['success'],c=df['category'],label=df['category'])
plt.legend(loc=2)
plt.savefig('scatter.png')
plt.show()
How do I get this right? (There is a similar one here: https://pythonspot.com/matplotlib-scatterplot/ in the second part "Scatter plot with groups", but this is not based on pandas dataframe).
You can use seaborn's scatterplot:
import seaborn as sns
fig,ax = plt.subplots()
sns.scatterplot(data=df, hue='category', x='attempts', y='success')
plt.legend(loc=2)
plt.savefig('scatter.png')
plt.show()

Or pure matplotlib:
fig,ax = plt.subplots()
for k,d in df.groupby('category'):
    # one scatter call per category, so each gets its own legend entry
    ax.scatter(d['attempts'], d['success'], label=k)
plt.legend(loc=2)
plt.savefig('scatter.png')
plt.show()
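
Note that the loop above lets matplotlib pick the colors from its default cycle, so the 'green' group is not necessarily drawn in green. Since the category names here happen to be valid matplotlib color names, a minimal sketch (an assumption that only holds for this example data) is to pass the group name as the color as well:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to view the plot with plt.show()
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    'category': ['blue', 'green', 'red', 'blue', 'green', 'red', 'blue', 'green', 'red'],
    'attempts': [8955, 7881, 6723, 100, 200, 300, 4567, 876, 54],
    'success': [3000, 7500, 2000, 256, 4567, 4567, 7665, 543, 43]
})

fig, ax = plt.subplots()
for name, group in df.groupby('category'):
    # 'blue', 'green' and 'red' are valid color names, so the group key
    # can serve as both the color and the legend label
    ax.scatter(group['attempts'], group['success'], color=name, label=name)
legend = ax.legend(loc=2)
```

If the categories were not color names, you would need an explicit mapping from category to color instead of reusing the name.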

If you want to use a single scatter call with matplotlib, it would look like this:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
df = pd.DataFrame({
'category':['blue','green','red','blue','green','red','blue','green','red'],
'attempts':[8955,7881,6723,100,200,300,4567,876,54],
'success':[3000,7500,2000, 256,4567,4567,7665,543,43]
})
u, inv = np.unique(df.category.values, return_inverse=True)
cmap = ListedColormap(u)
fig,ax = plt.subplots()
scatter = plt.scatter(df['attempts'],df['success'],c=inv, cmap=cmap)
plt.legend(scatter.legend_elements()[0], u, loc=2)
plt.savefig('scatter.png')
plt.show()
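
Another way to keep a single scatter call is to build the legend by hand from proxy handles. This is only a sketch: the `palette` dictionary below is a hypothetical mapping introduced for illustration (it is not part of the question), and it would also work for category labels that are not color names.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to view the plot with plt.show()
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.lines import Line2D

df = pd.DataFrame({
    'category': ['blue', 'green', 'red', 'blue', 'green', 'red', 'blue', 'green', 'red'],
    'attempts': [8955, 7881, 6723, 100, 200, 300, 4567, 876, 54],
    'success': [3000, 7500, 2000, 256, 4567, 4567, 7665, 543, 43]
})

# Hypothetical category -> color mapping (an assumption for this sketch)
palette = {'blue': 'tab:blue', 'green': 'tab:green', 'red': 'tab:red'}

fig, ax = plt.subplots()
# one scatter call; each point gets the color mapped from its category
point_colors = df['category'].map(palette).tolist()
ax.scatter(df['attempts'], df['success'], c=point_colors)

# proxy handles: empty Line2D objects that exist only to populate the legend
handles = [
    Line2D([], [], marker='o', linestyle='', color=color, label=name)
    for name, color in palette.items()
]
legend = ax.legend(handles=handles, loc=2)
```

The trade-off is that the legend is decoupled from the plotted data, so the mapping has to be kept in sync with the categories by hand.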
