Let's say I have a DataFrame whose columns are a MultiIndex. For example:
import numpy as np
import pandas as pd

a = pd.DataFrame(index=range(10),
                 columns=pd.MultiIndex.from_product(
                     iterables=[['2000', '2010'], ['a', 'b']],
                     names=['Year', 'Text']),
                 data=np.random.randn(10, 4))
I'd like to make a boxplot that groups by Year, like the hue argument of seaborn boxplots.
Is there an easy way to achieve that in pandas, seaborn, or matplotlib?
I suspect unstacking could do the trick, but I can't get it to work.
Use stack to reshape, then plot with DataFrame.boxplot:
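import numpy as np
import pandas as pd

np.random.seed(45)
a = pd.DataFrame(index=range(10),
                 columns=pd.MultiIndex.from_product(
                     iterables=[['2000', '2010'], ['a', 'b']],
                     names=['Year', 'Text']),
                 data=np.random.randn(10, 4))
# Move the 'Year' column level into the rows, drop the original row labels,
# then turn 'Year' into an ordinary column.
b = a.stack(level=0).reset_index(level=0, drop=True).reset_index()
print(b)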
Text  Year         a         b
0     2000  0.026375  0.260322
1     2010 -0.395146 -0.204301
2     2000 -1.271633 -2.596879
3     2010  0.289681 -0.873305
4     2000  0.394073  0.935106
5     2010 -0.015685  0.259596
6     2000 -1.473314  0.801927
7     2010 -1.750752 -0.495052
8     2000 -1.008601  0.025244
9     2010 -0.121507 -1.546873
10    2000 -0.606944 -1.393813
11    2010 -0.627695  0.332632
12    2000 -1.541367  1.670300
13    2010 -0.499546  0.673129
14    2000  2.248090 -1.654263
15    2010 -0.474397 -0.301915
16    2000 -0.931026  1.110986
17    2010 -0.189683  1.278410
18    2000 -0.554077  0.354303
19    2010 -0.440276 -0.424449
b.boxplot(by='Year')
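As a quick sanity check on the stack reshape above, here is a minimal sketch that rebuilds the same frame and counts the rows per year before plotting. The name rows_per_year is only illustrative; with 10 original rows and two years, each year should contribute 10 rows.

```python
import numpy as np
import pandas as pd

np.random.seed(45)
a = pd.DataFrame(index=range(10),
                 columns=pd.MultiIndex.from_product(
                     iterables=[['2000', '2010'], ['a', 'b']],
                     names=['Year', 'Text']),
                 data=np.random.randn(10, 4))

# Same reshape as above: 'Year' moves from the columns into a regular column.
b = a.stack(level=0).reset_index(level=0, drop=True).reset_index()

# Illustrative check: one row per (original row, year) pair.
rows_per_year = b.groupby('Year').size()
print(b.shape)
print(rows_per_year)
```

If the counts are uneven, the boxes drawn by b.boxplot(by='Year') are based on different sample sizes, which is worth knowing before comparing them.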
Solution for a seaborn boxplot, using unstack:
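# unstack on a frame with a plain row index returns a Series indexed by
# (Year, Text, row label); drop the row label and name the values 'data'.
b = a.unstack(level=0).reset_index(level=2, drop=True).reset_index(name='data')
print(b.head(15))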
    Year Text      data
0   2000    a  0.026375
1   2000    a -1.271633
2   2000    a  0.394073
3   2000    a -1.473314
4   2000    a -1.008601
5   2000    a -0.606944
6   2000    a -1.541367
7   2000    a  2.248090
8   2000    a -0.931026
9   2000    a -0.554077
10  2000    b  0.260322
11  2000    b -2.596879
12  2000    b  0.935106
13  2000    b  0.801927
14  2000    b  0.025244
import seaborn as sns

ax = sns.boxplot(x='Text', y='data', hue='Year',
                 data=b, palette='Set3')
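The same long format can probably also be reached with DataFrame.melt instead of unstack. This is a sketch rather than part of the solution above: it assumes melt names the new columns after the column levels ('Year' and 'Text'), which is what I would expect when the levels are named. The helper plot_by_year is a hypothetical name used only for illustration.

```python
import numpy as np
import pandas as pd

np.random.seed(45)
a = pd.DataFrame(index=range(10),
                 columns=pd.MultiIndex.from_product(
                     iterables=[['2000', '2010'], ['a', 'b']],
                     names=['Year', 'Text']),
                 data=np.random.randn(10, 4))

# One row per value, with the column levels spread into 'Year' and 'Text'.
long_df = a.melt(value_name='data')
print(long_df.head())


def plot_by_year(frame):
    """Hypothetical helper: boxplot per 'Text', split by 'Year' via hue."""
    # Imported here so the reshape above runs even without seaborn installed.
    import seaborn as sns
    return sns.boxplot(x='Text', y='data', hue='Year',
                       data=frame, palette='Set3')
```

Calling plot_by_year(long_df) should then draw the same kind of grouped boxplot as the unstack version.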