Violinplots in seaborn not showing mean, percentiles nor sticks?

When I try to replicate the example here, my violin plots (with my data) don't show the mean and median, along with the 25th and 75th percentiles, but the original example does.

It also seems to ignore the "stick" argument I pass to it.

Here is what I tried:

sns.violinplot(df, "stick", color="pastel")

and here is what I get:

enter image description here

while the original looks like the following (for sns.violinplot(df, color="pastel")):

enter image description here

NOTE:

This problem does not affect boxplots.

Here is a minimal example which results in (very similar) shapes:

> df

                                A               B
0RS0NrQDHHx                   NaN        19.727869
0RS232Ak80k                   NaN        32.552973
0RSECe1NRShE                  NaN        44.369213
0RSHVQNT16d                   NaN        11.306910
0RSO4JcoLeb                   NaN        -7.935776
0RSOrrpKlRu                   NaN        39.489909
0RSVIHDWBR1                   NaN        52.830051
0RSWe5CE1Hk                   NaN        26.913323
0RSXhLG3Kp8             -1.921543              NaN
0RSc8uRSessd             27.028029             NaN
0RScRSZoDX72             12.713600             NaN
0RSdwNiizS0             28.859158              NaN
0RSeWHWRSww3             12.537717             NaN
0RSrs6jjCsM              5.135179              NaN
0RStNwVhvO1            -55.566641              NaN
0RStQI2VH5A            -15.119272              NaN
0RStWRWmH8V             -2.369918              NaN
0RSukeajMJy             -0.904298              NaN
0RSvJezMyrx             -1.105769              NaN
0RSx5WRStDjG             0.899420              NaN
Josh asked Feb 03 '26 14:02


1 Answer

Try sns.violinplot(df, inner="stick", color="pastel"). The second positional argument is a grouping variable, so passing "stick" in that position does not set the inner style. Note that inner="stick" shows each observation; if you want the 25th, 50th, and 75th percentiles, use inner="box".

Also, to handle a relatively sparse DataFrame with lots of NaNs, e.g.

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(20, 5), columns=list("ABCDE"))
for i, c in zip(range(5, 10), df.columns):
    df.loc[i, c] = np.nan

you could do

# drop the NaNs from each column separately
plot_vals = [v.dropna() for k, v in df.iteritems()]
sns.violinplot(plot_vals, names=df.columns)
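
If you are on a newer seaborn release, the same idea can probably be sketched with long-form data instead. This is only an illustration under assumptions that are not part of the answer above: it assumes a seaborn version whose violinplot accepts the data=, x= and y= keywords, and the helper names make_sparse_frame, to_long_form and plot_violins are made up for this example. The NaN handling itself is plain pandas (melt, then dropna).

```python
import numpy as np
import pandas as pd


def make_sparse_frame(seed=0):
    """Build a 20x5 frame with one NaN per column, like the example above."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame(rng.standard_normal((20, 5)), columns=list("ABCDE"))
    for i, c in zip(range(5, 10), df.columns):
        df.loc[i, c] = np.nan
    return df


def to_long_form(df):
    """Melt to one row per observation, then drop the rows whose value is NaN."""
    long_df = df.melt(var_name="column", value_name="value")
    return long_df.dropna(subset=["value"]).reset_index(drop=True)


def plot_violins(long_df, inner="box"):
    """Draw one violin per original column.

    Assumption: seaborn is installed and supports keyword-style
    violinplot(data=..., x=..., y=..., inner=...). It is imported here
    so the data preparation above runs without it.
    """
    import seaborn as sns

    return sns.violinplot(data=long_df, x="column", y="value", inner=inner)


df = make_sparse_frame()
long_df = to_long_form(df)
```

Calling plot_violins(long_df, inner="stick") should then draw one line per observation, and inner="box" a small box plot inside each violin.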
mwaskom answered Feb 05 '26 06:02
