
Matplotlib - How do I set ylim() for a series of plots?

I have a series of box plots I am trying to make, each of which has a different range. I tried setting ylim by determining the max and min of each separate series. However, the min in many cases is an outlier, and so the plot is compressed. How can I select the same limit used by the 'whiskers' of the plot (plus a small margin)?

E.g., right now I'm doing this:

# Draw the box plot for one column, then set the limits from the raw min/max
ax = df.boxplot(column='feature')
ymax = df['feature'].max()
ymin = df['feature'].min()
ax.set_ylim([ymin, ymax])

I'd like to set ymax, ymin to be the whiskers of the box plot.

GPB asked Sep 05 '15 17:09




2 Answers

As an alternative to what @unutbu suggested, you could avoid plotting the outliers and then use ax.margins(y=0) (or some small eps) to scale the limits to the range of the whiskers.

For example:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame(np.random.poisson(5, size=(100, 5)))

fig, ax = plt.subplots()
# Note: showfliers=False is more readable, but requires a recent version, IIRC
box = df.boxplot(ax=ax, sym='')
ax.margins(y=0)
plt.show()


And if you'd like a bit of room around the longest whiskers, use ax.margins(y=0.05) to pad the y-axis by 5% of the range instead of 0%:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame(np.random.poisson(5, size=(100, 5)))

fig, ax = plt.subplots()
box = df.boxplot(ax=ax, sym='')
ax.margins(y=0.05)
plt.show()
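
If explicit limits are needed rather than margins (for a series of plots, say), one option is to read the whisker ends back from the artists that Matplotlib's Axes.boxplot returns. This is only a sketch, assuming one subplot per column; the column names, the random seed, and the 5% padding are illustrative choices and not part of the answer above:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Illustrative data: three columns of Poisson counts (hypothetical names)
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.poisson(5, size=(100, 3)), columns=["a", "b", "c"])

fig, axes = plt.subplots(1, len(df.columns), squeeze=False)
limits = {}
for ax, col in zip(axes[0], df.columns):
    # Axes.boxplot returns a dict of artists, including the whisker lines
    artists = ax.boxplot(df[col].to_numpy(), showfliers=False)
    # Each whisker line runs from the box edge to the whisker end
    ends = np.concatenate([line.get_ydata() for line in artists["whiskers"]])
    low, high = ends.min(), ends.max()
    pad = 0.05 * (high - low)  # small margin around the whiskers
    ax.set_ylim(low - pad, high + pad)
    ax.set_title(col)
    limits[col] = (low, high)
```

Each subplot then gets its own y-range based on its whiskers, and the limits dict keeps the raw whisker ends in case they are needed elsewhere.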


Joe Kington answered Sep 28 '22 14:09


You could set showfliers=False in the boxplot, so the outliers don't get plotted.

Since you ask specifically about the whiskers, this is how they are calculated, with a default of 1.5:

whis : float, sequence (default = 1.5) or string

As a float, determines the reach of the whiskers past the first and third quartiles (e.g., Q3 + whis*IQR, IQR = interquartile range, Q3-Q1). Beyond the whiskers, data are considered outliers and are plotted as individual points. Set this to an unreasonably high value to force the whiskers to show the min and max values. Alternatively, set this to an ascending sequence of percentile (e.g., [5, 95]) to set the whiskers at specific percentiles of the data. Finally, whis can be the string ‘range’ to force the whiskers to the min and max of the data. In the edge case that the 25th and 75th percentiles are equivalent, whis will be automatically set to ‘range’.
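
To see what the different forms of whis do to the whisker positions without drawing anything, Matplotlib's cbook.boxplot_stats helper can be called directly. A small sketch with made-up data follows. One assumption to flag: the text quoted above comes from an older release, and as far as I know newer Matplotlib versions no longer accept the 'range' string, so the pair (0, 100) is used here to get the same effect:

```python
import numpy as np
from matplotlib import cbook

# Made-up sample with one obvious outlier
data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])

# Default: whiskers reach the last data points within 1.5 * IQR of the box
default = cbook.boxplot_stats(data, whis=1.5)[0]

# Percentile pair: (0, 100) stretches the whiskers to the min and max
full = cbook.boxplot_stats(data, whis=(0, 100))[0]

print(default["whislo"], default["whishi"])
print(full["whislo"], full["whishi"])
```

The whislo and whishi entries are the values a box plot drawn with the same whis setting would use for its whisker ends.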

You could do the same calculation and set your ylim to that.
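
A minimal sketch of that calculation with NumPy, assuming the default float form of whis. The function name whisker_limits and the sample data are hypothetical, chosen only for illustration:

```python
import numpy as np

def whisker_limits(values, whis=1.5):
    """Return (low, high) whisker ends for the float form of whis."""
    values = np.asarray(values, dtype=float)
    values = values[~np.isnan(values)]  # ignore missing values
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    # Whiskers stop at the most extreme data points still inside the fences
    low = values[values >= q1 - whis * iqr].min()
    high = values[values <= q3 + whis * iqr].max()
    return low, high

# Made-up sample: 100 is an outlier and should not affect the limits
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
low, high = whisker_limits(data)
```

With these limits, something like ax.set_ylim(low - pad, high + pad), where pad is a small fraction of high - low, gives a y-range that follows the whiskers rather than the outliers.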

Paulo Almeida answered Sep 28 '22 16:09