subplots from a multiindex pandas dataframe grouped by level

Tags: python, pandas, matplotlib, multi-index, subplot

How do I make multiple plots from a multi-indexed pandas DataFrame based on one of the levels of the MultiIndex?

I have results from a model describing the usage of different technologies in different scenarios. The results could look something like this:

import numpy as np
import pandas as pd

# 12 rows of random positive values, one column per year
df = pd.DataFrame(abs(np.random.randn(12, 4)), columns=[2011, 2012, 2013, 2014])
df['scenario'] = ['s1', 's1', 's1', 's2', 's2', 's3', 's3', 's3', 's3', 's4', 's4', 's4']
df['technology'] = ['t1', 't2', 't5', 't2', 't6', 't1', 't3', 't4', 't5', 't1', 't3', 't4']
# rows: years; columns: (scenario, technology) MultiIndex
dfg = df.groupby(['scenario', 'technology']).sum().transpose()

dfg holds the technologies employed each year in each scenario. I would like one subplot per scenario, with all subplots sharing a single legend.

If I simply pass subplots=True, pandas draws one subplot per column, i.e. every (scenario, technology) combination (12 subplots):

dfg.plot(kind='bar',stacked=True,subplots=True)

Based on this response I got closer to what I was looking for.

import matplotlib.pyplot as plt

f, a = plt.subplots(2, 2)
# one scenario per axes in the 2x2 grid
dfg['s1'].plot(kind='bar', ax=a[0, 0])
dfg['s2'].plot(kind='bar', ax=a[0, 1])
dfg['s3'].plot(kind='bar', ax=a[1, 0])
dfg['s4'].plot(kind='bar', ax=a[1, 1])
plt.tight_layout()

But the result is not ideal: each subplot has its own legend, which makes the figure quite difficult to read. There must be an easier way to draw subplots from a multi-indexed DataFrame. Thanks!

EDIT1: Ted Petrou proposed a nice solution using seaborn's factorplot, but I have two issues with it. First, I already have a style defined and would rather not use the seaborn style (one solution could be to change the seaborn parameters). Second, I want a stacked bar plot, which requires considerable extra tweaks in seaborn. Any chance I can do something similar with Matplotlib?

asked Jan 23 '17 16:01

Nabla

1 Answer

In my opinion, data analysis is easier when you 'tidy' your data so that each column represents one variable. Here, the four years are spread across separate columns. pandas offers one method and one function to turn wide (messy) data into long (tidy) data: df.stack and pd.melt(df). Once the data is tidy, you can take advantage of the excellent seaborn library, which expects tidy data, to plot most anything you want.

Tidy the data

df1 = pd.melt(df, id_vars=['scenario', 'technology'], var_name='year')
print(df1.head())

  scenario technology  year     value
0       s1         t1  2011  0.406830
1       s1         t2  2011  0.495418
2       s1         t5  2011  0.116925
3       s2         t2  2011  0.904891
4       s2         t6  2011  0.525101
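The paragraph above also mentions df.stack as a route to tidy data. A minimal sketch of that alternative follows; the seeded random generator is an assumption added only so the example is reproducible, and the row order differs from the pd.melt result (stack walks row by row rather than year by year).

```python
import numpy as np
import pandas as pd

# Rebuild the question's frame; the fixed seed is an assumption for reproducibility.
rng = np.random.default_rng(0)
df = pd.DataFrame(np.abs(rng.standard_normal((12, 4))),
                  columns=[2011, 2012, 2013, 2014])
df['scenario'] = ['s1', 's1', 's1', 's2', 's2', 's3', 's3', 's3', 's3', 's4', 's4', 's4']
df['technology'] = ['t1', 't2', 't5', 't2', 't6', 't1', 't3', 't4', 't5', 't1', 't3', 't4']

# Move the identifier columns into the index, then stack the year columns
# into a third index level and turn everything back into ordinary columns.
tidy = (
    df.set_index(['scenario', 'technology'])
      .stack()
      .rename_axis(['scenario', 'technology', 'year'])
      .reset_index(name='value')
)
print(tidy.head())
```

Either route gives one row per (scenario, technology, year) observation; pd.melt is usually the shorter call when the identifier columns are ordinary columns, as they are here.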

Use Seaborn

import seaborn as sns

# factorplot was renamed to catplot in seaborn 0.9; newer releases only ship catplot
sns.catplot(x='year', y='value', hue='technology',
            col='scenario', data=df1, kind='bar', col_wrap=2,
            sharey=False)

(Figure: the resulting seaborn bar plots, one panel per scenario: https://i.stack.imgur.com/W70tw.png)
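For the Matplotlib-only variant asked about in the question's edit (stacked bars, one legend for the whole figure, no seaborn style), one possible sketch is below. It is not part of the original answer: the seeded data, the 'tab10' colormap, the figure size, and the legend placement are all assumptions chosen for illustration. The idea is to give each technology a fixed colour, switch off the per-axes legends, and build a single figure-level legend by hand.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.patches import Patch

# Rebuild the question's data; the fixed seed is an assumption for reproducibility.
rng = np.random.default_rng(0)
df = pd.DataFrame(np.abs(rng.standard_normal((12, 4))),
                  columns=[2011, 2012, 2013, 2014])
df['scenario'] = ['s1', 's1', 's1', 's2', 's2', 's3', 's3', 's3', 's3', 's4', 's4', 's4']
df['technology'] = ['t1', 't2', 't5', 't2', 't6', 't1', 't3', 't4', 't5', 't1', 't3', 't4']
dfg = df.groupby(['scenario', 'technology']).sum().transpose()

scenarios = dfg.columns.get_level_values('scenario').unique()
technologies = sorted(dfg.columns.get_level_values('technology').unique())

# One fixed colour per technology so the shared legend is valid for every panel.
cmap = plt.get_cmap('tab10')  # assumed palette; any colormap with enough colours works
colors = {tech: cmap(i) for i, tech in enumerate(technologies)}

fig, axes = plt.subplots(2, 2, sharex=True, figsize=(8, 6))
for ax, scenario in zip(axes.flat, scenarios):
    data = dfg[scenario]  # columns are the technologies used in this scenario
    data.plot(kind='bar', stacked=True, ax=ax, legend=False, title=scenario,
              color=[colors[tech] for tech in data.columns])

# Build one figure-level legend from proxy patches instead of per-axes legends.
handles = [Patch(color=colors[tech], label=tech) for tech in technologies]
fig.legend(handles=handles, title='technology', loc='center right')
fig.tight_layout(rect=(0, 0, 0.85, 1))  # leave room on the right for the legend
```

Call plt.show() or fig.savefig(...) afterwards to display or save the figure. Because the colours come from an explicit mapping rather than from plotting order, a technology keeps the same colour in every scenario even though each scenario uses a different subset of technologies.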

answered Oct 23 '22 07:10

Ted Petrou
