
Highlight specific points (based on a condition) in a scatter plot

I have a numpy array, which I convert to a DataFrame to visualize it with matplotlib.pyplot:

import numpy as np
import pandas as pd
x = np.array([[10,10,1], [10,20,0], [10,3,0]])
df = pd.DataFrame(x, columns=["x", "y", "z"])

If I draw a scatter plot that uses column z for the color, I get the following output (note that I am using different values):

import matplotlib.pyplot as plt
fig, ax = plt.subplots()
# map the z flag to a color name
df["val"] = df['z'].apply(lambda x: "red" if x==1 else "blue")
ax.scatter(x=df["x"], y=df["y"], c=df["val"], s=100)
fig.set_size_inches(15, 10)
plt.show()

enter image description here

So not all of the red points are visible in the plot; some of them are covered by blue points. How can I bring them to the front?

With seaborn it works, but is there a way to do this in matplotlib?

import seaborn as sns
sns.lmplot(x='x', y='y', data=df, hue='z', fit_reg=False)
fig = plt.gcf()
fig.set_size_inches(15, 10)
plt.show()

enter image description here

The answers to the similar question "How can I highlight a dot in a cloud of dots with Matplotlib?" only make the dots bigger but do not put them in front, which is somewhat misleading.

PV8 asked Nov 25 '25 23:11


1 Answer

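Within a single scatter call, matplotlib draws the points in the order in which they appear in the data, so later points cover earlier ones. You can sort your data so that the foreground points are drawn last, as shown in the lower plot of the following example: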

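import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame(np.random.rand(1000,3), columns=["x", "y", "z"])
df["z"] = df["z"].add(.1).astype(int)  # roughly 10% of the rows get z == 1
df["val"] = df['z'].apply(lambda x: "red" if x==1 else "blue")

# sort so that the rows with z == 1 come last and are drawn last
df1 = df.sort_values('z')

fig, ax = plt.subplots(2)
ax[0].scatter(x=df["x"], y=df["y"], c=df["val"], s=100)     # original order: red points may be covered
ax[1].scatter(x=df1["x"], y=df1["y"], c=df1["val"], s=100)  # sorted: red points end up on top
plt.show()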

enter image description here
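
As an alternative to sorting (not part of the original answer), the two groups can be drawn with separate scatter calls and an explicit zorder, so that the highlighted points end up on top regardless of row order. The following is only a minimal sketch under that assumption; the helper name scatter_with_highlight and its flag_col parameter are made up for illustration, and the random data mirrors the example above.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def scatter_with_highlight(df, ax, flag_col="z"):
    """Hypothetical helper: draw rows with flag_col == 1 above all other rows."""
    mask = df[flag_col] == 1
    # Background points get the lower zorder and are rendered underneath.
    back = ax.scatter(df.loc[~mask, "x"], df.loc[~mask, "y"],
                      c="blue", s=100, zorder=1)
    # Highlighted points get the higher zorder and are rendered on top.
    front = ax.scatter(df.loc[mask, "x"], df.loc[mask, "y"],
                       c="red", s=100, zorder=2)
    return back, front


# Same kind of test data as in the example above (seed chosen arbitrarily).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((1000, 3)), columns=["x", "y", "z"])
df["z"] = (df["z"] + 0.1).astype(int)  # roughly 10% of the rows get z == 1

fig, ax = plt.subplots(figsize=(15, 10))
back, front = scatter_with_highlight(df, ax)
```

Call plt.show() afterwards to display the figure. Compared with sorting, this keeps the DataFrame order untouched and also makes it easy to give each group its own legend label, at the cost of one scatter call per group.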

Stef answered Nov 27 '25 13:11


