Matplotlib - Boxplot calculated on log10 values but shown in logarithmic scale

I think this is a simple question, but I still can't think of a simple solution. I have a set of molecular abundance data, with values ranging over many orders of magnitude. I want to represent these abundances with boxplots (box-and-whiskers plots), and I want the boxes to be calculated on a log scale because of the wide range of values. I know I can calculate the log10 of the data and send it to matplotlib's boxplot, but the resulting plot no longer has a logarithmic axis.

So my question is basically this: once I have calculated a boxplot based on the log10 of my values, how do I convert the plot afterwards so that it is shown on a logarithmic scale instead of a linear scale of log10 values? I can change the tick labels to partly fix this, but I have no clue how to get a logarithmic scale back onto the plot.

Or is there another, more direct way of plotting this? Maybe a different package that already has this option included?

Many thanks for the help.

asked Jan 05 '16 09:01

Tobias

2 Answers

I'd advise against doing the boxplot on the raw values and setting the y-axis to logarithmic, because the boxplot function is not designed to work across orders of magnitude and you may get too many outliers (depends on your data, of course).

Instead, you can plot the logarithm of the data and manually adjust the y-labels.

Here is a very crude example:

import numpy as np
import matplotlib.pyplot as plt

np.random.seed(42)

# Log-uniform sample spanning six orders of magnitude
values = 10 ** np.random.uniform(-3, 3, size=100)

fig = plt.figure(figsize=(9, 3))

# Left: statistics computed on log10(values), ticks relabelled as powers of ten
ax = plt.subplot(1, 3, 1)
ax.boxplot(np.log10(values))
ax.set_yticks(np.arange(-3, 4))
ax.set_yticklabels(10.0**np.arange(-3, 4))
ax.set_title('log')

# Middle: statistics computed on the raw values, drawn on a log axis
ax = plt.subplot(1, 3, 2)
ax.boxplot(values)
ax.set_yscale('log')
ax.set_title('raw')

# Right: raw values again, whiskers at the 5th and 95th percentiles
ax = plt.subplot(1, 3, 3)
ax.boxplot(values, whis=[5, 95])
ax.set_yscale('log')
ax.set_title('5%')

plt.show()
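
As a possible refinement of the 'log' panel, the hard-coded tick labels could be generated by a formatter instead, so they stay correct if the data range changes. This is only a sketch: pow10_label is a made-up helper name, while FuncFormatter and MultipleLocator are the regular matplotlib.ticker classes.

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import FuncFormatter, MultipleLocator


def pow10_label(y, pos):
    """Render a tick located at log10 value y as the power of ten 10^y."""
    return rf"$10^{{{y:g}}}$"


np.random.seed(42)
values = 10 ** np.random.uniform(-3, 3, size=100)

fig, ax = plt.subplots()
ax.boxplot(np.log10(values))                      # statistics on log10 data
ax.yaxis.set_major_locator(MultipleLocator(1))    # one tick per decade
ax.yaxis.set_major_formatter(FuncFormatter(pow10_label))
# plt.show()  # uncomment to display the figure interactively
```

Note that the axis itself is still linear in log10 units, so there are no logarithmically spaced minor ticks; only the labels change.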

[Figure: three boxplot panels titled 'log', 'raw' and '5%']

The middle panel shows the box plot on the raw values. This leads to many outliers, because the maximum whisker length is computed as a multiple (default: 1.5) of the interquartile range (the box height), which does not scale across orders of magnitude.
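
To see the effect in numbers rather than in a picture, one can count the points that fall outside the whiskers for the raw and the log-transformed data. The sketch below uses matplotlib.cbook.boxplot_stats, which computes box statistics without drawing anything; count_fliers is just an illustrative helper name.

```python
import numpy as np
from matplotlib import cbook


def count_fliers(data, whis=1.5):
    """Number of points outside the whiskers for a single 1-D dataset."""
    stats = cbook.boxplot_stats(data, whis=whis)[0]
    return len(stats["fliers"])


np.random.seed(42)
values = 10 ** np.random.uniform(-3, 3, size=100)

n_raw = count_fliers(values)            # whiskers from the IQR of raw values
n_log = count_fliers(np.log10(values))  # whiskers from the IQR of log10 values

print(f"outliers on raw values:   {n_raw}")
print(f"outliers on log10 values: {n_log}")
```

For log-uniform data like this, the raw-value count should come out clearly larger, since the upper whisker can only reach 1.5 box heights above the third quartile.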

Alternatively, you can draw the whiskers at fixed percentiles: ax.boxplot(values, whis=[5, 95]). In this case a fixed fraction of the points (5%) ends up as outliers above the upper whisker, and the same fraction below the lower one.
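
If the goal is boxes calculated on log10 values but drawn on a real logarithmic axis, another option is to compute the statistics yourself and hand them to Axes.bxp, which draws a boxplot from precomputed statistics. The following is a minimal sketch, assuming all values are strictly positive; log_boxplot_stats is a hypothetical helper, not part of matplotlib.

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import cbook


def log_boxplot_stats(data, whis=1.5):
    """Box statistics computed on log10(data), mapped back to data units."""
    stats = cbook.boxplot_stats(np.log10(data), whis=whis)[0]
    # Keep only the entries bxp needs and undo the log10 transform.
    out = {k: float(10.0 ** stats[k])
           for k in ("med", "q1", "q3", "whislo", "whishi")}
    out["fliers"] = 10.0 ** np.asarray(stats["fliers"])
    return out


np.random.seed(42)
values = 10 ** np.random.uniform(-3, 3, size=100)

stats = log_boxplot_stats(values)

fig, ax = plt.subplots()
ax.bxp([stats])        # draw the box from the precomputed statistics
ax.set_yscale("log")   # true log axis, including minor ticks
# plt.show()  # uncomment to display the figure interactively
```

Because the quartiles and whiskers are determined in log space and only then converted back, the box has the same shape as in the 'log' panel, but the axis is a genuine logarithmic one.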

answered Sep 25 '22 01:09

MB-F

You can use plt.yscale:

plt.boxplot(data); plt.yscale('log')

answered Sep 24 '22 01:09

Ferro

Tags:

python

matplotlib

plot

logarithm

boxplot
