Highlight specific points (based on a condition) in a scatter plot

Question

I have a NumPy array, which I convert to a DataFrame so that I can visualize it with matplotlib.pyplot:

import numpy as np
import pandas as pd
x = np.array([[10, 10, 1], [10, 20, 0], [10, 3, 0]])
df = pd.DataFrame(x, columns=["x", "y", "z"])

If I draw a scatter plot that uses column z for the color, I get the following output (note that the plot was made with different values than the sample above):

import matplotlib.pyplot as plt
fig, ax = plt.subplots()
df["val"] = df['z'].apply(lambda x: "red" if x==1 else "blue")
ax.scatter(x=df["x"], y=df["y"], c=df["val"], s=100)
fig = plt.gcf()
fig.set_size_inches(15, 10)
plt.show()


As a result, the red points are not visible in the plot because the blue points cover them. How can I bring them to the front?

With seaborn it works, but is there a way to do this in matplotlib?

import seaborn as sns
sns.lmplot(x='x', y='y', data=df, hue='z', fit_reg=False)
fig = plt.gcf()
fig.set_size_inches(15, 10)
plt.show()


The similar question, How can I highlight a dot in a cloud of dots with Matplotlib?, only makes the dots bigger but does not put them in front, which is somewhat misleading.

Stef · Accepted Answer

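A single scatter call draws the points in row order, so later rows are painted over earlier ones. You can therefore sort your data so that the foreground points are drawn last, as shown in the lower plot of the following example: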

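import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# pd.np is not available in recent pandas versions, so use NumPy directly
df = pd.DataFrame(np.random.rand(1000, 3), columns=["x", "y", "z"])
# z becomes 1 for roughly 10% of the rows and 0 for the rest
df["z"] = df["z"].add(.1).astype(int)
df["val"] = df["z"].apply(lambda z: "red" if z == 1 else "blue")

# Sort so that rows with z == 1 come last and are therefore drawn on top
df1 = df.sort_values("z")

fig, ax = plt.subplots(2)
# Upper plot: original row order, red points can end up underneath blue ones
ax[0].scatter(x=df["x"], y=df["y"], c=df["val"], s=100)
# Lower plot: sorted data, red points are drawn last
ax[1].scatter(x=df1["x"], y=df1["y"], c=df1["val"], s=100)
plt.show()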
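As an alternative to sorting (not part of the answer above, just a sketch), the points can be split with a boolean mask and drawn in two scatter calls, with the highlighted layer given a higher zorder. The helper name scatter_highlight and its parameters are made up for illustration, and the sample data only mimics the shape of the example above:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def scatter_highlight(ax, df, mask, base_color="blue", highlight_color="red", size=100):
    """Draw df's x/y points in two layers so the rows selected by mask end up on top.

    Hypothetical helper for illustration; not part of matplotlib or pandas.
    """
    # Background layer: every row that does NOT satisfy the condition.
    background = ax.scatter(df.loc[~mask, "x"], df.loc[~mask, "y"],
                            c=base_color, s=size, zorder=1)
    # Foreground layer: rows that satisfy the condition, with a higher zorder
    # so they are rendered above the background layer.
    foreground = ax.scatter(df.loc[mask, "x"], df.loc[mask, "y"],
                            c=highlight_color, s=size, zorder=2)
    return background, foreground


# Sample data in the same shape as the example above (values are random).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((1000, 3)), columns=["x", "y", "z"])
df["z"] = (df["z"] + 0.1).astype(int)  # 1 for roughly 10% of the rows, else 0

fig, ax = plt.subplots(figsize=(15, 10))
background, foreground = scatter_highlight(ax, df, df["z"] == 1)
```

Call plt.show() afterwards to display the figure. Compared with sorting, this keeps the DataFrame in its original order and makes it easy to give the highlighted points their own size, marker, or legend entry.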

Tags: python, pandas, matplotlib, seaborn

PV8
