
Set y-axis scale for pandas Dataframe Boxplot(), 3 Deviations?

I'm trying to make one boxplot chart per month, with a separate boxplot for each industry (grouped and labeled by industry), and have the Y-axis use a scale that I dictate.

In a perfect world this would be dynamic, and I could set the axis limits to a certain number of standard deviations from the overall mean. I could live with another way of setting the y-axis dynamically, but it would need to be the same on all of the monthly grouped boxplots. I don't know the best way to handle this yet and am open to suggestions. All I know is that the values on the axis now are way too large for the charts to be meaningful.

I've tried all kinds of code and had no luck scaling the axis; the code below is as close as I could get to the graph.

Here's a link to some dummy data: https://drive.google.com/open?id=0B4xdnV0LFZI1MmlFcTBweW82V0k

And for the code I'm using Python 3.5:

import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
matplotlib.use('TkAgg')
import pylab    
df =  pd.read_csv('Query_Final_2.csv')
df['Ship_Date'] = pd.to_datetime(df['Ship_Date'], errors = 'coerce')
df1 = (df.groupby('Industry'))
print(
df1.boxplot(column='Gross_Margin',layout=(1,9), figsize=(20,10), whis=[5,95])
,pylab.show()
)
Python_Learner_DK asked Nov 30 '16 15:11



2 Answers

Here is a cleaned up version of your code with the solution:

import pandas as pd
import matplotlib.pyplot as plt

df =  pd.read_csv('Query_Final_2.csv')
df['Ship_Date'] = pd.to_datetime(df['Ship_Date'], errors = 'coerce')
df1 = df.groupby('Industry')

axes = df1.boxplot(column='Gross_Margin',layout=(1,9), figsize=(20,10),
                   whis=[5,95], return_type='axes')
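# Recent pandas versions return a Series of Axes here, so iterate over it directly.
# On older versions that return a dict, loop over axes.values() instead.
for ax in axes:
    ax.set_ylim(-2.5, 2.5)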

plt.show()

The key is to pass return_type='axes' so that boxplot() returns the subplot Axes objects, and then call set_ylim() on each one.
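The question also asks for one chart per month with the same scale on each. A minimal sketch of that idea follows, assuming synthetic data in place of Query_Final_2.csv (the values, the three industry names, and the `figures` dict are made up for illustration). The limits are computed once from the whole Gross_Margin column, at 3 standard deviations either side of the overall mean, and then applied to every subplot of every monthly figure.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for Query_Final_2.csv (hypothetical values and industries).
rng = np.random.default_rng(0)
n = 600
df = pd.DataFrame({
    "Ship_Date": pd.to_datetime("2016-01-01")
                 + pd.to_timedelta(rng.integers(0, 90, n), unit="D"),
    "Industry": rng.choice(["A", "B", "C"], n),
    "Gross_Margin": rng.normal(0.3, 0.5, n),
})

# Compute the limits once, from all rows, so every monthly chart shares them.
center = df["Gross_Margin"].mean()
spread = df["Gross_Margin"].std()
ymin, ymax = center - 3 * spread, center + 3 * spread

figures = {}
for month, month_df in df.groupby(df["Ship_Date"].dt.to_period("M")):
    # One figure per month, one subplot per industry.
    month_df.groupby("Industry").boxplot(
        column="Gross_Margin", layout=(1, 3), figsize=(12, 4), whis=[5, 95]
    )
    fig = plt.gcf()  # the figure that boxplot() just created
    fig.suptitle(str(month))
    for ax in fig.axes:
        ax.set_ylim(ymin, ymax)
    figures[str(month)] = fig
```

Looping over fig.axes avoids depending on the exact type that boxplot() returns. In an interactive session, calling plt.show() after the loop would display all of the monthly figures.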

AlexG answered Sep 28 '22 18:09


Once you have computed the mean and the standard deviation, derive ymin and ymax from them (for example, the mean minus and plus 3 standard deviations) and use:

plt.ylim(ymin, ymax)

to set the y-axis.
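As a rough sketch of how ymin and ymax might be derived, the helper below returns limits placed a chosen number of standard deviations around the mean. The function name `sigma_limits` and its `n_sigma` parameter are hypothetical, not part of pandas or matplotlib, and missing values are simply dropped before computing.

```python
import pandas as pd

def sigma_limits(values, n_sigma=3.0):
    """Return (ymin, ymax) placed n_sigma standard deviations around the mean."""
    s = pd.Series(values, dtype="float64").dropna()  # ignore missing values
    mean = s.mean()
    std = s.std()  # sample standard deviation (ddof=1), the pandas default
    return mean - n_sigma * std, mean + n_sigma * std

# Small worked example: mean is 3.0, sample std is sqrt(2.5).
ymin, ymax = sigma_limits([1.0, 2.0, 3.0, 4.0, 5.0], n_sigma=2)
```

With real data this would be called as sigma_limits(df['Gross_Margin']) and the result passed to plt.ylim(ymin, ymax). Note that plt.ylim() acts on the current axes only, so with several subplots it reaches all of them only when they share a y-axis.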

Padraig answered Sep 28 '22 19:09