
How to create a boxplot not showing the outliers using Python and Plotly?

Tags:

python

plotly


I have a full list of points that I use to create a box plot. The data has many outliers, and the resulting range is too large for the boxes to be compared.

I don't want the outliers in this list to appear on the box plot at all.

  1. Is there a way to not show outliers in the box plot?

Failing that, I tried removing the outliers from the data before plotting it. However, Plotly then marks some of the points that I did not remove as outliers.

  2. Is there a way to create a box plot where none of the elements are considered outliers?
asked Jul 19 '15 02:07 by pr338




2 Answers

Andrew from Plotly here.

  1. You can't selectively hide part of the data in the array. What you can do is set boxpoints: "all" to get a jitter of all the points, including the outliers. This leaves the box plot itself as-is, without outliers sitting on top of it. I'm guessing this isn't really what you want, though.

  2. To prevent any points from being flagged and drawn as outliers, set boxpoints: false.

So in Python, something like this should work:

import plotly.plotly as py
from plotly.graph_objs import Box, Figure

fig = Figure()
boxpoints_default = Box(y=[1, 2, 3, 2, 1, 10], name='default')
boxpoints_false = Box(y=[1, 2, 3, 2, 1, 10], boxpoints=False, name='no outliers')
boxpoints_all = Box(y=[1, 2, 3, 2, 1, 10], boxpoints='all', name='jitter boxpoints')

fig['data'].extend([boxpoints_default, boxpoints_false, boxpoints_all])
fig['layout'].update(title='Comparing boxplot "boxpoints" settings')

py.iplot(fig, filename='Stack Overflow 31497537')

Here's the resulting figure for that:

https://plot.ly/~theengineear/4936/comparing-boxplot-boxpoints-settings/

Here's a link to box plot tutorials in general with Plotly:

http://help.plot.ly/make-a-box-plot/

answered Oct 13 '22 00:10 by theengineear


Late to the game, but I found a (perhaps new) easy solution:

fig.update_traces(boxpoints=False) 

as shown here: https://plotly.com/python/reference/box/

Note that I had to delete the selector=dict(type='box') section, as it raised an error.

answered Oct 13 '22 00:10 by Tom