
Using pandas crosstab to create a bar plot

I am trying to create a stacked barplot in seaborn with my dataframe.

I have first generated a crosstab table in pandas like so:

pd.crosstab(df['Period'], df['Mark'])

which returns:

Mark      False  True
Period
BASELINE    583   132
WEEK 12     721     0
WEEK 24     589   132
WEEK 4      721     0

I would like to use seaborn to create a stacked barplot for consistency, as this is what I have used for the rest of my graphs. However, I have struggled to do this because I am unable to index the crosstab.

I have been able to make the plot I want in pandas using .plot.barh(stacked=True), but have had no luck with seaborn. Any ideas how I can do this?

asked Apr 21 '17 14:04 by JB1



1 Answer

  • As you said, you can use pandas to create the stacked bar plot. Wanting a "seaborn plot" specifically does not matter here: every seaborn plot and every pandas plot is, in the end, simply a matplotlib object, because the plotting tools of both libraries are matplotlib wrappers.
  • Here's a complete solution (using the data creation from @andrew_reece's answer).
  • Tested in Python 3.8.11, pandas 1.3.2, matplotlib 3.4.3, seaborn 0.11.2
import numpy as np 
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

n = 500
np.random.seed(365)
mark = np.random.choice([True, False], n)
periods = np.random.choice(['BASELINE', 'WEEK 12', 'WEEK 24', 'WEEK 4'], n)

df = pd.DataFrame({'mark': mark, 'period': periods})
ct = pd.crosstab(df.period, df.mark)
    
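# stacked bar plot drawn by pandas; the return value is a matplotlib Axes
ax = ct.plot(kind='bar', stacked=True, rot=0)
ax.legend(title='mark', bbox_to_anchor=(1, 1.02), loc='upper left')

# add annotations if desired (Axes.bar_label requires matplotlib >= 3.4)
for c in ax.containers:
    # label each bar segment with its count, centered in the segment
    ax.bar_label(c, label_type='center')

plt.show()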
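If the goal is mainly for the chart to look like the rest of your seaborn figures, one option is to apply the seaborn theme first and then draw with pandas. The following is only a sketch under assumptions: the data is synthetic (the real df from the question is not shown), the column names Period and Mark are taken from the question, and the horizontal layout mirrors the .plot.barh(stacked=True) call mentioned there.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in data (an assumption, not the asker's actual dataframe)
n = 500
rng = np.random.default_rng(365)
df = pd.DataFrame({
    "Period": rng.choice(["BASELINE", "WEEK 4", "WEEK 12", "WEEK 24"], n),
    "Mark": rng.choice([True, False], n),
})

# Frequency table: one row per Period, one column per Mark value
ct = pd.crosstab(df["Period"], df["Mark"])

# Apply seaborn's default theme; it changes matplotlib's rcParams,
# so the pandas plot drawn afterwards picks up the seaborn look
sns.set_theme()

# Horizontal stacked bars drawn by pandas; the result is a plain matplotlib Axes
ax = ct.plot.barh(stacked=True)
ax.set_xlabel("count")
ax.legend(title="Mark", bbox_to_anchor=(1, 1.02), loc="upper left")
plt.tight_layout()
```

Because ax is an ordinary matplotlib Axes, anything you would normally do to a seaborn plot afterwards (titles, labels, saving with ax.figure.savefig) works the same way here.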


answered Sep 23 '22 02:09 by ImportanceOfBeingErnest