Is there a way to extract all outliers after plotting a Seaborn boxplot? For example, suppose I plot a boxplot for the data below:
client total
1 LA 1
2 Sultan 128
3 ElderCare 1
4 CA 3
5 More 900
I want the records below to be returned as outliers after the boxplot is plotted:
2 Sultan 128
5 More 900
Seaborn uses matplotlib to handle outlier calculations, meaning the key parameter, whis, is passed on to ax.boxplot. The specific function taking care of the calculation is documented here: https://matplotlib.org/api/cbook_api.html#matplotlib.cbook.boxplot_stats. Rather than extracting the outliers from the plot, you can calculate them yourself with matplotlib.cbook.boxplot_stats. The following code snippet shows the calculation and that its result matches the seaborn plot:
import matplotlib.pyplot as plt
from matplotlib.cbook import boxplot_stats
import pandas as pd
import seaborn as sns

data = [
    ('LA', 1),
    ('Sultan', 128),
    ('ElderCare', 1),
    ('CA', 3),
    ('More', 900),
]
df = pd.DataFrame(data, columns=('client', 'total'))

ax = sns.boxplot(data=df)

# boxplot_stats returns one dict per dataset; 'fliers' holds the outlier values
outliers = [y for stat in boxplot_stats(df['total']) for y in stat['fliers']]
print(outliers)

# Re-plot the computed outliers beside the box (pentagon markers) for comparison
for y in outliers:
    ax.plot(1, y, 'p')
ax.set_xlim(right=1.5)
plt.show()
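The snippet above yields outlier values, while the question asks for the matching records. As a rough illustration of what the calculation amounts to, here is a minimal standard-library sketch of the usual 1.5 × IQR rule applied to whole records. It assumes the default whis=1.5 and linearly interpolated quartiles; the helper name tukey_outliers is hypothetical and not part of seaborn or matplotlib, and the sketch may differ from boxplot_stats in edge cases.

```python
from statistics import quantiles

# Same sample records as in the question: (client, total)
records = [
    ('LA', 1),
    ('Sultan', 128),
    ('ElderCare', 1),
    ('CA', 3),
    ('More', 900),
]


def tukey_outliers(rows, whis=1.5):
    """Hypothetical helper: return rows whose value lies outside
    [q1 - whis * IQR, q3 + whis * IQR]."""
    values = [value for _, value in rows]
    # method='inclusive' interpolates linearly between the data points
    q1, _median, q3 = quantiles(values, n=4, method='inclusive')
    iqr = q3 - q1
    low, high = q1 - whis * iqr, q3 + whis * iqr
    return [row for row in rows if row[1] < low or row[1] > high]


flagged = tukey_outliers(records)
print(flagged)
```

Under these assumptions only the record with total 900 is flagged for the sample data: 128 coincides with the upper quartile, so it falls inside the fences. With the DataFrame from the snippet above, the same idea can be written as df[df['total'].isin(outliers)] to get the full rows back.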