pandas - multi index plotting

Tags:

I have some data where I've manipulated the dataframe using the following code:

import pandas as pd
import numpy as np

data = pd.DataFrame([[0,0,0,3,6,5,6,1],[1,1,1,3,4,5,2,0],[2,1,0,3,6,5,6,1],[3,0,0,2,9,4,2,1],[4,0,1,3,4,8,1,1],[5,1,1,3,3,5,9,1],[6,1,0,3,3,5,6,1],[7,0,1,3,4,8,9,1]], columns=["id", "sex", "split", "group0Low", "group0High", "group1Low", "group1High", "trim"])
data

#remove all where trim == 0
trimmed = data[(data.trim == 1)]
trimmed

#create df with columns to be split
columns = ['group0Low', 'group0High', 'group1Low', 'group1High']
to_split = trimmed[columns]
to_split

level_group = np.where(to_split.columns.str.contains('0'), 0, 1)
# output: array([0, 0, 1, 1])
level_low_high = np.where(to_split.columns.str.contains('Low'), 'low', 'high')
# output: array(['low', 'high', 'low', 'high'], dtype='<U4')

multi_level_columns = pd.MultiIndex.from_arrays([level_group, level_low_high], names=['group', 'val'])
to_split.columns = multi_level_columns
to_split.stack(level='group')

sex = trimmed['sex']
split = trimmed['split']
horizontalStack = pd.concat([sex, split, to_split], axis=1)
horizontalStack

finalData = horizontalStack.groupby(['split', 'sex', 'group'])
finalData.mean()

My question is, how do I plot the mean data using ggplot or seaborn such that for each "split" level I get a graph that looks like this:

enter image description here

At the bottom of the code you can see I've tried to split up the group factor so I can separate the bars, but that resulted in an error (KeyError: 'group') and I think that is related to the way I used multi indexing

671

asked Aug 06 '15 01:08

Simon

1 Answers

I would use a factor plot from seaborn.

Say you have data like this:

import numpy as np
import pandas

import seaborn
seaborn.set(style='ticks') 
np.random.seed(0)

groups = ('Group 1', 'Group 2')
sexes = ('Male', 'Female')
means = ('Low', 'High')
index = pandas.MultiIndex.from_product(
    [groups, sexes, means], 
   names=['Group', 'Sex', 'Mean']
)

values = np.random.randint(low=20, high=100, size=len(index))
data = pandas.DataFrame(data={'val': values}, index=index).reset_index()
print(data)

     Group     Sex  Mean  val
0  Group 1    Male   Low   64
1  Group 1    Male  High   67
2  Group 1  Female   Low   84
3  Group 1  Female  High   87
4  Group 2    Male   Low   87
5  Group 2    Male  High   29
6  Group 2  Female   Low   41
7  Group 2  Female  High   56

You can then create the factor plot with one command + plus an extra line to remove some redundant (for your data) x-labels:

fg = seaborn.factorplot(x='Group', y='val', hue='Mean', 
                        col='Sex', data=data, kind='bar')
fg.set_xlabels('')

Which gives me:

enter image description here

165

answered Sep 21 '22 02:09

Paul H

Related questions
                            
                                Python request in AWS Lambda timing out
                            
                                Print list of lists in separate lines
                            
                                How do you use python-decouple to load a .env file outside the expected paths?
                            
                                Fastest way to create strictly increasing lists in Python
                            
                                Django widget override template
                            
                                protoc object_detection/protos/*.proto: No such file or directory
                            
                                'tensorboard' is not recognized as an internal or external command,
                            
                                Using OR comparisons with IF statements [duplicate]
                            
                                How do I write a single-file Django application?
                            
                                Script to remove Python comments/docstrings
                            
                                Is it possible to declare a function without arguments but then pass some arguments to that function without raising exception?
                            
                                How to traverse through the files in a directory?
                            
                                Getting output with IPython Notebook
                            
                                How to write multiple try statements in one block in python?
                            
                                How to drop extra copy of duplicate index of Pandas Series?
                            
                                NameError from Python input() function [duplicate]
                            
                                Python sqlite3 cursor has no attribute commit
                            
                                How to write output of terminal to file
                            
                                Computing and drawing vector fields
                            
                                Mixin common fields between serializers in Django Rest Framework

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

pandas - multi index plotting

Tags:

python

pandas

seaborn

python-ggplot

Simon

People also ask

1 Answers

Paul H

Recent Activity

Donate For Us