
Stacked histogram of grouped values in Pandas

I am trying to create a stacked histogram of grouped values using this code:

titanic.groupby('Survived').Age.hist(stacked=True)

But I am getting this histogram without stacked bars.


How can I get the histogram's bars stacked without using matplotlib directly or iterating over the groups?

Dataset used: https://www.udacity.com/api/nodes/5454512672/supplemental_media/titanic-datacsv/download

leokury asked Jan 12 '17 20:01


4 Answers

Improving on the earlier answer, a better way could be:

titanic.pivot(columns='Survived').Age.plot(kind = 'hist', stacked=True)
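To see why this stacks, it can help to look at the table that pivot produces before plotting. The sketch below is only an illustration: `toy` is a hypothetical six-row stand-in for the Titanic data, and `values='Age'` is passed explicitly instead of selecting `.Age` afterwards.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Titanic data: six passengers, one with a missing age.
toy = pd.DataFrame({
    'Survived': [0, 1, 1, 0, 1, 0],
    'Age': [22.0, 38.0, 26.0, 35.0, np.nan, 54.0],
})

# One Age column per value of Survived. The original row index is kept,
# so each row holds a value in at most one column and NaN elsewhere.
wide = toy.pivot(columns='Survived', values='Age')
print(wide)

# Each column is then drawn as one series of the stacked histogram:
# wide.plot(kind='hist', stacked=True)  # requires matplotlib
```

Because the result is a single DataFrame with one column per group, the plot call receives all groups at once, which is what lets `stacked=True` take effect.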


Raymond W answered Sep 22 '22 18:09


The best way I have found so far is to build a new DataFrame with one column per group:

groups = titanic.groupby('Survived')
pd.DataFrame({'Non-Survivors': groups.get_group(0).Age,
              'Survivors': groups.get_group(1).Age}).plot.hist(stacked=True)


leokury answered Sep 23 '22 18:09


This solution uses a bar plot instead of a histogram, but I think it gives you what you are looking for.

titanic.groupby(['Survived', pd.cut(titanic['Age'], np.arange(0,100,10))])\
       .size()\
       .unstack(0)\
       .plot.bar(stacked=True)
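As a rough illustration of what the chain above builds before plotting, here is a sketch on a hypothetical six-row frame named `toy`. The `observed=False` argument is an addition of mine that spells out the behaviour the original call relies on by default; it is not part of the answer.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Titanic data.
toy = pd.DataFrame({
    'Survived': [0, 1, 1, 0, 1, 0],
    'Age': [22.0, 38.0, 26.0, 35.0, 4.0, 54.0],
})

# pd.cut assigns each age to a right-closed interval: (0, 10], (10, 20], ...
age_bins = pd.cut(toy['Age'], np.arange(0, 100, 10))

# Count passengers per (Survived, age interval), then move Survived
# into the columns so that each interval is one row of the table.
counts = (toy.groupby(['Survived', age_bins], observed=False)
             .size()
             .unstack(0))
print(counts)

# Each row becomes one bar, split by the Survived columns:
# counts.plot.bar(stacked=True)  # requires matplotlib
```

The bars come from precomputed counts rather than from a histogram call, so the bin edges are whatever `pd.cut` was given.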


Ted Petrou answered Sep 23 '22 18:09


I defined a custom function that leverages np.histogram.
Note that the histogram counts are calculated separately within each 'Survived' group.

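def hist(x):
    # Ten equal-width bins over ages 0 to 80, ignoring missing values
    h, e = np.histogram(x.dropna(), range=(0, 80))
    e = e.astype(int)
    # Label each count with its (left, right) bin edges
    return pd.Series(h, list(zip(e[:-1], e[1:])))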

kw = dict(stacked=True, width=1, rot=45)
titanic.groupby('Survived').Age.apply(hist).unstack(0).plot.bar(**kw)
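For reference, a small sketch of what `np.histogram` returns with these arguments. The `ages` array is made up for illustration; the call itself mirrors the one inside `hist`.

```python
import numpy as np

# Hypothetical ages, including one on the upper edge of the range.
ages = np.array([22.0, 38.0, 26.0, 35.0, 4.0, 54.0, 80.0])

# With the default bins=10, range=(0, 80) gives ten bins of width 8.
# Every bin is half-open [left, right) except the last, which includes 80.
counts, edges = np.histogram(ages, range=(0, 80))
print(edges)
print(counts)
```

Fixing `range` means both 'Survived' groups share the same edges, which is why their counts line up row by row after `unstack(0)`.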


piRSquared answered Sep 23 '22 18:09