pandas plot with different variable for subplots and colour?

Currently this code:

count_df = (df[['rank', 'name', 'variable', 'value']]
    .groupby(['rank', 'variable', 'name'])
    .agg('count')
    .unstack())
count_df.head()
#               value                          
# name           1lin STH_km27_lin ST_lin S_lin
# rank variable                                
# 1.0  NEE         24          115     33    28
#      Qg          23           54     14     9
#      Qh          37          124     11    28
# ...
count_df.plot(kind='bar')

gets me this plot:

bar plot with too much shit on it

using subplots=True in the .plot() call gets me this:

useless subplots

which is pretty useless, because the colours are mapped to the same variable as the subplot faceting. Is there a way to choose which column/index is used for the sub-plotting? I would like to keep colours per name (the count_df column header) but get one sub-plot per variable, so that each subplot has a bar per name/rank, grouped by rank and coloured by name.
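To make the behaviour concrete, here is a minimal sketch on made-up data (the `toy` frame and its numbers are hypothetical, only shaped like count_df). With subplots=True, pandas appears to draw one subplot per column, which is why the facets end up following name rather than variable:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import pandas as pd

# Hypothetical counts shaped like count_df:
# rows are (rank, variable) pairs, columns are names.
index = pd.MultiIndex.from_product(
    [[1.0, 2.0], ["NEE", "Qg"]], names=["rank", "variable"]
)
toy = pd.DataFrame(
    {"1lin": [24, 23, 10, 12], "S_lin": [28, 9, 15, 7]}, index=index
)
toy.columns.name = "name"

# subplots=True returns an array of Axes, one per column of the frame.
axes = toy.plot(kind="bar", subplots=True)
print(len(axes), "subplots for", len(toy.columns), "columns")
```

So the faceting is tied to the columns, and the row index (rank, variable) only ends up on the x axis.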

naught101 asked Oct 29 '22 20:10

1 Answer

Hrm. I suspect this isn't doable in pandas by itself, but I found a way to do it in Seaborn:

import seaborn as sns

cdf = (df[['rank', 'name', 'variable', 'value']]
           .groupby(['rank', 'variable', 'name'])
           .agg('count'))
# factorplot was renamed to catplot in seaborn 0.9 and later removed
sns.catplot(x="rank", y="value", row="variable", hue="name",
            data=cdf.reset_index(), kind='bar')

which results in:

barplot by rank, variable, and name

which is close enough for my purposes.
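For completeness, something similar looks achievable without Seaborn by creating the axes with matplotlib and looping over the variables by hand. This is only a sketch under assumptions: the `df` below is a hypothetical stand-in for the original long-format data, and the layout (one row of axes per variable) is my choice, not something pandas does for you.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the original long-format data.
rows = [
    (1, "1lin", "NEE"), (1, "1lin", "NEE"), (1, "S_lin", "NEE"),
    (2, "1lin", "NEE"), (2, "S_lin", "NEE"), (2, "S_lin", "NEE"),
    (1, "1lin", "Qh"), (1, "S_lin", "Qh"), (1, "S_lin", "Qh"),
    (2, "1lin", "Qh"), (2, "1lin", "Qh"), (2, "S_lin", "Qh"),
]
df = pd.DataFrame(rows, columns=["rank", "name", "variable"])
df["value"] = 1.0

# Count rows per (variable, rank, name), then move only `name` into the
# columns, so that colours follow name and rows stay (variable, rank).
counts = (df.groupby(["variable", "rank", "name"])["value"]
            .count()
            .unstack("name", fill_value=0))

variables = counts.index.get_level_values("variable").unique()
fig, axes = plt.subplots(nrows=len(variables), squeeze=False)

# One subplot per variable; within each, bars are grouped by rank
# and coloured by name.
for i, (ax, var) in enumerate(zip(axes[:, 0], variables)):
    counts.loc[var].plot(kind="bar", ax=ax, title=var, legend=(i == 0))
    ax.set_ylabel("count")

fig.tight_layout()
```

Passing ax= to .plot() is what lets the subplot choice be decoupled from the column-to-colour mapping.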

naught101 answered Nov 15 '22 07:11
