Create a Diverging Stacked Bar Chart in matplotlib

Tags: python, matplotlib, plot

I have lists of data holding responses to Likert questions on a scale from one (very unhappy) to five (very happy). I would like to create a page of plots showing these lists as skewed stacked horizontal bar charts. The lists of responses can differ in length (e.g. when someone has opted out of answering a particular question). Here is a minimal example of the data:

likert1 = [1.0, 2.0, 1.0, 2.0, 1.0, 3.0, 3.0, 4.0, 4.0, 1.0, 1.0]
likert2 = [5.0, 4.0, 5.0, 4.0, 5.0, 3.0]

I would like to be able to plot this with something like:

plot_many_likerts(likert1, likert2)

At the moment I've written a function to iterate over the lists, and plot each one as its own subplot on a shared figure in matplotlib:

from collections import Counter

import matplotlib.pyplot as plt

def plot_many_likerts(*lsts):
    # get the figure and the list of axes for this plot
    fig, axlst = plt.subplots(len(lsts), sharex=True)
    for i in range(len(lsts)):
        likert_horizontal_bar_list(lsts[i], axlst[i], xaxis=[1.0, 2.0, 3.0, 4.0, 5.0])
        axlst[i].axis('off')
    fig.show()

def likert_horizontal_bar_list(lst, ax, xaxis):
    # count how often each response value occurs
    cnt = Counter(lst)
    #del (cnt[None])
    i = 0
    colour_float = 0.00001
    previous_right = 0
    for key in sorted(xaxis):
        # current matplotlib names the vertical position 'y' (formerly 'bottom')
        ax.barh(y=0, width=cnt[key], height=0.4, left=previous_right, color=plt.cm.jet(colour_float), label=str(key))
        i += 1
        # the next segment starts where this one ends
        previous_right = previous_right + cnt[key]
        colour_float = float(i) / float(len(xaxis))

This works reasonably well and creates stacked bar charts that all share the same scale (i.e. the widths share a common x-axis). Here is a screenshot:

What is currently Produced http://s7.postimg.org/vh0j816gn/figure_1.jpg

What I would like is to have these two plots centered on midpoints of the mode of the datasets (the datasets will have the same range). For instance:

What I would like to see http://s29.postimg.org/z0qwv4ryr/figure_2.jpg

Suggestions on how I might do this?

asked Apr 17 '14 19:04

Christopher

2 Answers

I needed to make a divergent bar chart for some Likert data. I was using pandas, but the approach would probably be similar without it. The key mechanism is to add an invisible buffer at the start of each bar.

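import math

import matplotlib.pyplot as plt
import pandas as pd

# the first colour belongs to the invisible buffer column
likert_colors = ['white', 'firebrick','lightcoral','gainsboro','cornflowerblue', 'darkblue']
dummy = pd.DataFrame([[1,2,3,4, 5], [5,6,7,8, 5], [10, 4, 2, 10, 5]],
                     columns=["SD", "D", "N", "A", "SA"],
                     index=["Key 1", "Key B", "Key III"])
# width left of centre: "SD" + "D" + half of "N"
middles = dummy[["SD", "D"]].sum(axis=1)+dummy["N"]*.5
longest = middles.max()
# buffer that pushes each row's midpoint to x = longest
dummy.insert(0, '', (middles - longest).abs())
# widest row including its buffer, rounded up so that no bar is clipped
complete_longest = math.ceil(dummy.sum(axis=1).max())

dummy.plot.barh(stacked=True, color=likert_colors, edgecolor='none', legend=False)
z = plt.axvline(longest, linestyle='--', color='black', alpha=.5)
z.set_zorder(-1)

plt.xlim(0, complete_longest)
xvalues = range(0,complete_longest,10)
# relabel the ticks so that the centre line reads as zero
xlabels = [str(x-longest) for x in xvalues]
plt.xticks(xvalues, xlabels)
plt.show()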

This approach has a couple of limitations. First, the bars no longer get a black outline. Second, the legend gets an extra blank entry, so I just hid the legend (I figure there's probably a way to hide just that one entry). I'm not sure of a convenient way to give the bars an outline without also adding the outline to the buffer element.
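One possible way around both limitations is sketched below. It swaps DataFrame.plot for plain matplotlib barh calls, which is not what the code above does: the buffer is drawn as its own transparent bar whose label starts with an underscore (matplotlib's automatic legend skips such labels), and only the real segments get an edge colour. The function name "diverging_barh" and its arguments are hypothetical, and an odd number of categories is assumed.

```python
import matplotlib.pyplot as plt


def diverging_barh(ax, row_labels, categories, counts, colors):
    """Draw diverging stacked bars with outlined segments and a clean legend.

    counts holds one list per row, ordered like categories.
    Returns the x position of the shared centre line.
    """
    mid = len(categories) // 2
    # Width left of centre: the lower categories plus half of the middle one.
    middles = [sum(row[:mid]) + row[mid] / 2 for row in counts]
    longest = max(middles)
    buffers = [longest - m for m in middles]

    # Invisible buffer: no fill, no edge, and an underscore label so that
    # the automatic legend ignores it.
    ax.barh(row_labels, buffers, color="none", edgecolor="none",
            label="_buffer")

    lefts = list(buffers)
    for i, (name, color) in enumerate(zip(categories, colors)):
        widths = [row[i] for row in counts]
        ax.barh(row_labels, widths, left=lefts, color=color,
                edgecolor="black", label=name)
        # The next category starts where this one ends.
        lefts = [left + width for left, width in zip(lefts, widths)]

    ax.axvline(longest, linestyle="--", color="black", alpha=0.5, zorder=-1)
    ax.legend(ncol=len(categories), loc="lower left",
              bbox_to_anchor=(0, 1), fontsize="small")
    return longest
```

To try it, create an axes with plt.subplots(), pass in the index labels, column names and rows of the dummy data together with the five non-white colours, and finish with plt.show().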

First, we establish some colors and dummy data. Then we calculate the combined width of the two left-most columns plus half of the middle column (which I know to be "SD", "D", and "N", respectively). I find the longest of these widths and use it to calculate the buffer that each of the other rows needs. Next, I insert this buffer as a new first column with a blank title (which felt gross, lemme tell you). For good measure, I also added a vertical line (axvline) behind the middle of the middle bar, based on the advice of [2]. Finally, I adjust the x-axis to have the proper scale by offsetting its labels.

If you want more horizontal space on the left, you can get it by adding to "longest".
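For the raw response lists in the question, the same buffer arithmetic can be sketched without pandas. The helper name "likert_buffers", its "scale" argument and its "pad" argument (the extra left-hand space just mentioned) are illustrative assumptions, not part of the original code.

```python
from collections import Counter


def likert_buffers(response_lists, scale=(1.0, 2.0, 3.0, 4.0, 5.0), pad=0.0):
    """Return (counts, buffers, centre) for diverging stacked bars.

    counts[i] holds the per-category totals of response_lists[i],
    buffers[i] is the invisible left offset of row i, and centre is the
    x position shared by the midpoint of every middle category.
    An odd-length scale is assumed.
    """
    mid = len(scale) // 2
    counts = []
    for responses in response_lists:
        tally = Counter(responses)  # missing categories count as 0
        counts.append([tally[key] for key in scale])
    # Width left of centre: the lower categories plus half of the middle one.
    middles = [sum(row[:mid]) + row[mid] / 2 for row in counts]
    centre = max(middles) + pad
    buffers = [centre - m for m in middles]
    return counts, buffers, centre


likert1 = [1.0, 2.0, 1.0, 2.0, 1.0, 3.0, 3.0, 4.0, 4.0, 1.0, 1.0]
likert2 = [5.0, 4.0, 5.0, 4.0, 5.0, 3.0]
counts, buffers, centre = likert_buffers([likert1, likert2])
print(counts)   # [[5, 2, 2, 2, 0], [0, 0, 1, 2, 3]]
print(buffers)  # [0.0, 7.5]
print(centre)   # 8.0
```

Each row can then be drawn by starting its first barh segment at left=buffers[i], which puts the middle of every "3" segment at x = centre.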

The aligned likert data

[2] Heiberger, Richard M., and Naomi B. Robbins. "Design of diverging stacked bar charts for Likert scales and other applications." Journal of Statistical Software 57.5 (2014): 1-32.

answered Oct 28 '22 13:10

Austin Cory Bart

I too recently needed to make a divergent bar chart for some Likert data. I took a slightly different approach than @austin-cory-bart.

Instead, I modified an example from the matplotlib gallery and created this:

import numpy as np
import matplotlib.pyplot as plt


category_names = ['Strongly disagree', 'Disagree',
                  'Neither agree nor disagree', 'Agree', 'Strongly agree']
results = {
    'Question 1': [10, 15, 17, 32, 26],
    'Question 2': [26, 22, 29, 10, 13],
    'Question 3': [35, 37, 7, 2, 19],
    'Question 4': [32, 11, 9, 15, 33],
    'Question 5': [21, 29, 5, 5, 40],
    'Question 6': [8, 19, 5, 30, 38]
}


def survey(results, category_names):
    """
    Parameters
    ----------
    results : dict
        A mapping from question labels to a list of answers per category.
        It is assumed all lists contain the same number of entries and that
        it matches the length of *category_names*. The order is assumed
        to be from 'Strongly disagree' to 'Strongly agree'.
    category_names : list of str
        The category labels.
    """
    
    labels = list(results.keys())
    data = np.array(list(results.values()))
    data_cum = data.cumsum(axis=1)
    middle_index = data.shape[1]//2
    offsets = data[:, range(middle_index)].sum(axis=1) + data[:, middle_index]/2
    
    # Color Mapping
    category_colors = plt.get_cmap('coolwarm_r')(
        np.linspace(0.15, 0.85, data.shape[1]))
    
    fig, ax = plt.subplots(figsize=(10, 5))
    
    # Plot Bars
    for i, (colname, color) in enumerate(zip(category_names, category_colors)):
        widths = data[:, i]
        starts = data_cum[:, i] - widths - offsets
        rects = ax.barh(labels, widths, left=starts, height=0.5,
                        label=colname, color=color)
    
    # Add Zero Reference Line
    ax.axvline(0, linestyle='--', color='black', alpha=.25)
    
    # X Axis
    ax.set_xlim(-90, 90)
    ax.set_xticks(np.arange(-90, 91, 10))
    ax.xaxis.set_major_formatter(lambda x, pos: str(abs(int(x))))
    
    # Y Axis
    ax.invert_yaxis()
    
    # Remove spines
    ax.spines['right'].set_visible(False)
    ax.spines['top'].set_visible(False)
    ax.spines['left'].set_visible(False)
    
    # Legend
    ax.legend(ncol=len(category_names), bbox_to_anchor=(0, 1),
              loc='lower left', fontsize='small')
    
    # Set Background Color
    fig.set_facecolor('#FFFFFF')

    return fig, ax


fig, ax = survey(results, category_names)
plt.show()
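To see what the "offsets" and "starts" arithmetic in survey() does, here is the same calculation for single rows in plain Python, together with one possible way to derive symmetric x-limits from the data instead of hard-coding -90 and 90. The helper names "segment_starts" and "symmetric_limit" are made up for this sketch, and rounding up to the next multiple of 10 is just one choice.

```python
import math
from itertools import accumulate


def segment_starts(row):
    """Left edge of each segment so the middle category straddles x = 0."""
    mid = len(row) // 2
    # Everything below the middle category plus half of the middle one.
    offset = sum(row[:mid]) + row[mid] / 2
    cumulative = list(accumulate(row))
    return [c - w - offset for c, w in zip(cumulative, row)]


def symmetric_limit(rows, step=10):
    """Smallest multiple of step that covers every bar on both sides of 0."""
    extent = 0.0
    for row in rows:
        starts = segment_starts(row)
        left = -starts[0]               # how far the bar reaches left of 0
        right = starts[-1] + row[-1]    # how far it reaches right of 0
        extent = max(extent, left, right)
    return math.ceil(extent / step) * step


# Two rows copied from the results dict above.
sample = {
    'Question 1': [10, 15, 17, 32, 26],
    'Question 3': [35, 37, 7, 2, 19],
}
print(segment_starts(sample['Question 1']))  # [-33.5, -23.5, -8.5, 8.5, 40.5]
print(symmetric_limit(sample.values()))      # 80
```

With a limit computed this way, ax.set_xlim(-limit, limit) and np.arange(-limit, limit + 1, 10) could stand in for the fixed values in survey().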

answered Oct 28 '22 13:10

eitanlees
