
Plot swarmplot or boxplot in the same ax depending on number of datapoints

I have a dataframe with several columns, where every column has between 5 and 2535 entries (the rest are NaN). I want to plot a boxplot when a column has more than 9 numeric entries and a swarmplot otherwise. I used my mad paint skills to create an example.

[image]

The problem is that I am only able to plot both as overlays, as in this example. I tried using the positions keyword, but this only works for the boxplot, not for the swarmplot. So, how can this be done?

An example dataset can be produced like this:

import numpy as np
import pandas as pd

np.random.seed(1)
df = pd.DataFrame(np.nan, index=range(100), columns=range(11))
for i, column in enumerate(df.columns):
    if i % 2 == 0:
        # even columns: 1 to 10 values (swarmplot candidates)
        fill_till = np.random.randint(1,11)
        df.loc[:fill_till-1,column] = np.random.random(fill_till)
    else:
        # odd columns: 11 to 100 values (boxplot candidates)
        fill_till = np.random.randint(11,101)
        df.loc[:fill_till-1,column] = np.random.random(fill_till)
F. Jehn asked Mar 06 '23 22:03


1 Answer

You can create two copies of the data frame, one for the box plot and one for the swarm plot. Then, in each copy, set the columns you don't want drawn with that plot type to NaN.

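# True for columns with more than 9 non-NaN entries (these get a box plot)
col_mask = df.count() > 9
swarm_data = df.copy()
swarm_data.loc[:, col_mask] = np.nan   # blank out the box-plot columns
box_data = df.copy()
box_data.loc[:, ~col_mask] = np.nan    # blank out the swarm-plot columns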
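
To see what the masking does, here is a minimal sketch on a made-up two-column frame. The column names "few" and "many" and their sizes are illustrative assumptions, not part of the question's data.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame: "few" holds 3 values, "many" holds 12.
df = pd.DataFrame({
    "few": [0.1, 0.5, 0.9] + [np.nan] * 9,
    "many": np.linspace(0.0, 1.0, 12),
})

# True for columns with more than 9 non-NaN entries.
col_mask = df.count() > 9

swarm_data = df.copy()
swarm_data.loc[:, col_mask] = np.nan   # "many" is blanked out here

box_data = df.copy()
box_data.loc[:, ~col_mask] = np.nan    # "few" is blanked out here

# Both copies keep every column, so the column order is unchanged.
print(list(swarm_data.columns) == list(box_data.columns))
print(swarm_data.count())
print(box_data.count())
```

Each copy keeps all of the original columns; only the values differ, which is what lets the two plots line up later.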

Then pass each of the copied data frames to the appropriate seaborn function.

import matplotlib.pyplot as plt
import seaborn as sns

sns.swarmplot(data=swarm_data)
sns.boxplot(data=box_data)
plt.show()

When creating the swarm plot, seaborn plots nothing for the columns filled with NaN but still leaves space where they would be. The reverse happens with the box plot, so the column order is preserved and each column shows only one kind of plot.

The chart generated by the above code looks like this:

[image]

This approach also works for columns with non-numeric labels:

[image]

mostlyoxygen answered Mar 08 '23 13:03