
Matplotlib - Boxplot calculated on log10 values but shown in logarithmic scale

I think this is a simple question, but I just still can't seem to think of a simple solution. I have a set of data of molecular abundances, with values ranging many orders of magnitude. I want to represent these abundances with boxplots (box-and-whiskers plots), and I want the boxes to be calculated on log scale because of the wide range of values. I know I can just calculate the log10 of the data and send it to matplotlib's boxplot, but this does not retain the logarithmic scale in plots later.

So my question is basically this: When I have calculated a boxplot based on the log10 of my values, how do I convert the plot afterward to be shown on a logarithmic scale instead of linear with the log10 values? I can change tick labels to partly fix this, but I have no clue how I get logarithmic scales back to the plot.

Or is there a more direct way to plot this? Perhaps a different package that already has this option built in?

Many thanks for the help.

Tobias asked Jan 05 '16 09:01



2 Answers

I'd advise against doing the boxplot on the raw values and setting the y-axis to logarithmic, because the boxplot function is not designed to work across orders of magnitude and you may get too many outliers (depending on your data, of course).

Instead, you can plot the logarithm of the data and manually adjust the y-labels.

Here is a very crude example:

import numpy as np
import matplotlib.pyplot as plt

np.random.seed(42)

# Sample data spanning six orders of magnitude
values = 10 ** np.random.uniform(-3, 3, size=100)

fig = plt.figure(figsize=(9, 3))

# Left: box plot of log10(values), with ticks relabelled in data units
ax = plt.subplot(1, 3, 1)
ax.boxplot(np.log10(values))
ax.set_yticks(np.arange(-3, 4))
ax.set_yticklabels(10.0**np.arange(-3, 4))
ax.set_title('log')

# Middle: box plot of the raw values on a logarithmic y-axis
ax = plt.subplot(1, 3, 2)
ax.boxplot(values)
ax.set_yscale('log')
ax.set_title('raw')

# Right: raw values again, with whiskers at the 5th and 95th percentiles
ax = plt.subplot(1, 3, 3)
ax.boxplot(values, whis=[5, 95])
ax.set_yscale('log')
ax.set_title('5%')

plt.show()

[Figure: three box plots side by side, titled 'log', 'raw' and '5%']

The middle panel (titled 'raw') shows the box plot on the raw values. This leads to many outliers, because the maximum whisker length is computed as a multiple (default: 1.5) of the interquartile range (the box height), which does not scale across orders of magnitude.
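To make the whisker argument concrete, here is a small sketch that counts how many points fall outside the 1.5 × IQR fences, once on the raw values and once on their log10. The helper name count_outliers is made up for this illustration, and the fences follow the usual Q1 - 1.5·IQR / Q3 + 1.5·IQR rule.

```python
import numpy as np

np.random.seed(42)

# Same kind of sample data as in the example above
values = 10 ** np.random.uniform(-3, 3, size=100)


def count_outliers(x, whis=1.5):
    """Count points outside [Q1 - whis*IQR, Q3 + whis*IQR] (hypothetical helper)."""
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    lower, upper = q1 - whis * iqr, q3 + whis * iqr
    return int(np.sum((x < lower) | (x > upper)))


# Fences computed in data units vs. in log10 units
n_raw = count_outliers(values)
n_log = count_outliers(np.log10(values))

print(f"outliers on raw values:   {n_raw}")
print(f"outliers on log10 values: {n_log}")
```

On the raw values the upper fence sits only a little above Q3 compared with the full range of the data, so everything in the top decades is flagged. In log space the same rule covers the whole sample.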

Alternatively, you can draw the whiskers at a given percentile range: ax.boxplot(values, whis=[5, 95]). In this case you get a fixed fraction of outliers (5% above and 5% below), as in the panel titled '5%'.
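If you would rather have a real logarithmic axis than relabelled ticks, one possible route is to compute the box statistics on the log10 values with matplotlib.cbook.boxplot_stats, map them back to data units, and draw them with ax.bxp. This is only a sketch: the helper name log_boxplot is made up for this illustration, it assumes all values are strictly positive, and I have not checked it across matplotlib versions.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cbook


def log_boxplot(ax, data, whis=1.5):
    """Draw a box plot whose statistics are computed on log10(data).

    Hypothetical helper, not part of matplotlib. Assumes data > 0.
    """
    data = np.asarray(data, dtype=float)
    # Quartiles, whiskers and fliers are computed in log10 space ...
    stats = cbook.boxplot_stats(np.log10(data), whis=whis)
    for s in stats:
        # ... and each positional statistic is mapped back to data units.
        # Note: the back-transformed "mean" is the geometric mean.
        for key in ("med", "q1", "q3", "whislo", "whishi", "cilo", "cihi", "mean"):
            s[key] = 10.0 ** s[key]
        s["fliers"] = 10.0 ** np.asarray(s["fliers"])
        s["iqr"] = s["q3"] - s["q1"]  # keep the entry consistent with data units
    ax.bxp(stats)          # draw from precomputed statistics
    ax.set_yscale("log")   # true logarithmic axis, no manual tick labels
    return stats


np.random.seed(42)
values = 10 ** np.random.uniform(-3, 3, size=100)

fig, ax = plt.subplots(figsize=(3, 3))
stats = log_boxplot(ax, values)
ax.set_title("log stats, log axis")
```

Call plt.show() or fig.savefig(...) afterwards to see the result. The box should look like the 'log' panel, but with ordinary log-scale ticks.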

MB-F answered Sep 25 '22 01:09


You can use plt.yscale:

plt.boxplot(data); plt.yscale('log')

Ferro answered Sep 24 '22 01:09