
multiple boxplots from Pandas dataframe

I'm trying to plot a panel plot with multiple boxplots from data in a pandas DataFrame. The columns of the DataFrame look like this:

 data.columns 
 Index([u'SiteId', u'obs1', u'obs2', u'obs3', u'obs4', u'obs5', u'obs6', u'date', u'area'], dtype='object')

I want to create a panel of 9 different plots (since there are 9 distinct geographical areas) each of which has 12 boxplots for each month of the year. An example is shown below with the snippet used to create the plot:

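df = data.loc[:, ['obs1', 'date', 'area']]  # .ix has been removed from pandas; .loc selects by label
df = df.set_index('date')  # 'date' must hold datetimes for .index.month to work
colm = ['obs1']  # must name a column that exists in df
areas = df['area'].unique()  # the 9 distinct areas
for area in areas:
    df2 = df.loc[df.area == area]
    df2.boxplot(column=colm, by=df2.index.month, showmeans=True)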

The above code produces only one figure (with one boxplot per month), but I want nine such plots, one per area, as subplots of the same figure. In other words, I want to group the data first by area, then by month of the year, and plot the result as boxplots. Any thoughts on how I can get the desired plots? Any help is appreciated.

Also, how can I get rid of the "Boxplot grouped by [1 1 1 ...12 12 12]" and "1,1,1,1,1,1,1,1,1,....." both at the top and bottom of the plot?

I can't post images since stackoverflow rules don't allow me to. Thanks.

Vakratund asked Sep 23 '15 08:09

1 Answer

Does this do what you want?

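import matplotlib.pyplot as plt

# df and areas as defined in the question; one row of axes per area
fig, axs = plt.subplots(len(areas), 1, figsize=(5, 45))
for ax, area in zip(axs, areas):
    df2 = df.loc[df.area == area]
    # ax=ax draws each area's monthly boxplots into its own subplot
    df2.boxplot(column=['obs1'], by=df2.index.month, showmeans=True, ax=ax)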
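The loop above does not touch the second part of the question (the automatic "Boxplot grouped by ..." title and the long run of 1s and 12s used as the axis label). Here is a minimal sketch of one way that might handle both, laid out as a 3x3 grid. The synthetic `data` frame, the area names `area1`..`area9`, the helper column `month` and the output filename are all assumptions made up for illustration. The real `data` from the question would replace the stand-in.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's `data`: 9 areas, one value per day for a year.
rng = np.random.default_rng(0)
dates = pd.date_range("2014-01-01", "2014-12-31", freq="D")
data = pd.concat(
    [pd.DataFrame({"date": dates, "area": f"area{i}", "obs1": rng.normal(size=len(dates))})
     for i in range(1, 10)],
    ignore_index=True,
)

df = data.set_index("date")
# Grouping by a named column (instead of the raw df.index.month array)
# keeps pandas from printing the whole array as the axis label.
df["month"] = df.index.month

areas = df["area"].unique()
fig, axs = plt.subplots(3, 3, figsize=(15, 12), sharey=True)
for ax, area in zip(axs.ravel(), areas):
    df.loc[df["area"] == area].boxplot(column="obs1", by="month", showmeans=True, ax=ax)
    ax.set_title(str(area))  # replace the default per-axes title (the column name)
    ax.set_xlabel("")        # drop the automatic "month" label under each panel

fig.suptitle("")  # clear the automatic "Boxplot grouped by ..." figure title
fig.tight_layout()
fig.savefig("boxplots_by_area.png")  # hypothetical output filename
```

The 3x3 layout assumes exactly nine areas. With a different count, the grid shape passed to `plt.subplots` would need adjusting.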
Diziet Asahi answered Oct 09 '22 02:10