
Is it possible to draw a matplotlib boxplot given the percentile values instead of the original inputs?


From what I can see, the boxplot() method expects a sequence of raw values (numbers) as input, from which it then computes the percentiles needed to draw the boxplot(s).

I would like to have a method by which I could pass in the percentiles and get the corresponding boxplot.

For example:

Assume that I have run several benchmarks, and for each benchmark I have measured latencies (floating-point values). In addition, I have precomputed the percentiles for these values.

Hence for each benchmark, I have the 25th, 50th, 75th percentile along with the min and max.

Now given these data, I would like to draw the box plots for the benchmarks.

Alex Averbuch asked Nov 30 '14 14:11




1 Answer

As of 2020, there is a better method than the one in the accepted answer.

The matplotlib.axes.Axes class provides a bxp method, which draws the boxes and whiskers directly from the percentile values. Raw data is needed only for the outliers (the 'fliers' entry), and those are optional.

Example:

    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    boxes = [
        {
            'label' : "Male height",
            'whislo': 162.6,    # Bottom whisker position
            'q1'    : 170.2,    # First quartile (25th percentile)
            'med'   : 175.7,    # Median         (50th percentile)
            'q3'    : 180.4,    # Third quartile (75th percentile)
            'whishi': 187.8,    # Top whisker position
            'fliers': []        # Outliers
        }
    ]
    ax.bxp(boxes, showfliers=False)
    ax.set_ylabel("cm")
    plt.savefig("boxplot.png")
    plt.close()

This produces the following image: [image: example boxplot]
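For the situation in the question (several benchmarks, each with a precomputed min, 25th, 50th, 75th percentile and max), the same approach might look like the sketch below. The helper summary_to_bxp_stats, the benchmark names and the latency numbers are all made up for illustration, and placing the whiskers at the min and max is an assumption about how the plot should look, not something bxp requires.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt


def summary_to_bxp_stats(label, minimum, q25, median, q75, maximum):
    """Build one stats dict for Axes.bxp from a five-number summary.

    Hypothetical helper; the whiskers are placed at the min and max.
    """
    return {
        "label": label,
        "whislo": minimum,  # bottom whisker at the minimum
        "q1": q25,          # 25th percentile
        "med": median,      # 50th percentile
        "q3": q75,          # 75th percentile
        "whishi": maximum,  # top whisker at the maximum
        "fliers": [],       # no raw data, so no outliers
    }


# Made-up latencies in milliseconds: (min, 25th, 50th, 75th, max)
benchmarks = {
    "bench A": (1.2, 2.0, 2.6, 3.4, 5.1),
    "bench B": (0.8, 1.5, 1.9, 2.2, 4.0),
    "bench C": (2.1, 3.3, 4.0, 4.8, 7.5),
}

stats = [summary_to_bxp_stats(name, *summary)
         for name, summary in benchmarks.items()]

fig, ax = plt.subplots()
artists = ax.bxp(stats, showfliers=False)  # one box per stats dict
ax.set_ylabel("latency (ms)")
plt.close(fig)
```

Each dict in the list becomes one box, labelled on the x-axis with its 'label' value. To write the figure to disk, call fig.savefig(...) before closing it, as in the example above.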

Vicariggio answered Oct 02 '22 10:10