I have a NumPy array, which I convert to a DataFrame to visualize it with matplotlib.pyplot:
import numpy as np
import pandas as pd
x = np.array([[10, 10, 1], [10, 20, 0], [10, 3, 0]])
df = pd.DataFrame(x, columns=["x", "y", "z"])
If I draw a scatter plot colored by column z, I get the following output (note that I am using different values):
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(15, 10))
# Map z to a color: red where z is 1, blue otherwise
df["val"] = df["z"].apply(lambda z: "red" if z == 1 else "blue")
ax.scatter(x=df["x"], y=df["y"], c=df["val"], s=100)
plt.show()

So the red points are not visible in the plot; they end up hidden behind the blue ones. How can I highlight them?
It works with seaborn, but is there a way to do this in matplotlib:
import seaborn as sns
sns.lmplot(x='x', y='y', data=df, hue='z', fit_reg=False)
fig = plt.gcf()
fig.set_size_inches(15, 10)
plt.show()

The similar question "How can I highlight a dot in a cloud of dots with Matplotlib?" only makes the dots bigger but does not put them in front, which is somewhat misleading.
Within a single scatter call the points are drawn in row order, so you can sort your data so that the foreground points are drawn last, as shown in the lower plot of the following example:
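import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# pd.np is no longer available in recent pandas versions; use numpy directly
df = pd.DataFrame(np.random.rand(1000, 3), columns=["x", "y", "z"])
# Roughly 10% of the rows get z == 1, the rest z == 0
df["z"] = df["z"].add(.1).astype(int)
df["val"] = df["z"].apply(lambda z: "red" if z == 1 else "blue")
# Sort so that rows with z == 1 come last and are drawn on top
df1 = df.sort_values("z")
fig, ax = plt.subplots(2)
ax[0].scatter(x=df["x"], y=df["y"], c=df["val"], s=100)    # original order
ax[1].scatter(x=df1["x"], y=df1["y"], c=df1["val"], s=100)  # sorted: red in front
plt.show()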
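As an alternative to sorting, you could split the data and draw each group with its own scatter call, using the zorder argument to control which group ends up on top. This is only a minimal sketch of that idea, not part of the answer above: the names background, highlight, bg and fg are made up for illustration, and the random data just mimics the example.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic data in the same shape as the example above (an assumption for illustration):
# roughly 10% of the rows end up with z == 1, the rest with z == 0.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((1000, 3)), columns=["x", "y", "z"])
df["z"] = (df["z"] + 0.1).astype(int)

# Split into the points to highlight and everything else.
background = df[df["z"] == 0]
highlight = df[df["z"] == 1]

fig, ax = plt.subplots(figsize=(15, 10))
# Artists with a higher zorder are drawn later, so the red points sit on top of the blue ones.
bg = ax.scatter(background["x"], background["y"], c="blue", s=100, zorder=1)
fg = ax.scatter(highlight["x"], highlight["y"], c="red", s=100, zorder=2)
```

Call plt.show() (or fig.savefig(...)) afterwards to display the figure. Compared with sorting, this keeps the DataFrame order untouched and also lets you give each group its own label for a legend.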
