
stacked bar plot using matplotlib

I am generating stacked bar plots with matplotlib, and the result looks wrong. Each vertical stack should sum to 100, but for the x-axis ticks 65, 70, 75 and 80 the stacks have seemingly arbitrary heights. I do not understand what the problem is. Please find the MWE below.

    import numpy as np
    import matplotlib.pyplot as plt
    import matplotlib

    header = ['a', 'b', 'c', 'd']
    dataset = [
        ('60.0', '65.0', '70.0', '75.0', '80.0', '85.0', '90.0', '95.0',
         '100.0', '105.0', '110.0', '115.0', '120.0', '125.0', '130.0',
         '135.0', '140.0', '145.0', '150.0', '155.0', '160.0', '165.0',
         '170.0', '175.0', '180.0', '185.0', '190.0', '195.0', '200.0'),
        (0.0, 25.0, 48.93617021276596, 83.01886792452831, 66.66666666666666,
         66.66666666666666, 70.96774193548387, 84.61538461538461,
         93.33333333333333, 85.0, 92.85714285714286, 93.75, 95.0,
         100.0, 100.0, 100.0, 100.0, 80.0,
         100.0, 100.0, 100.0, 100.0, 100.0, 100.0,
         100.0, 100.0, 100.0, 100.0, 100.0),
        (0.0, 50.0, 36.17021276595745, 11.320754716981133, 26.666666666666668,
         33.33333333333333, 29.03225806451613, 15.384615384615385,
         6.666666666666667, 15.0, 7.142857142857142, 6.25, 5.0,
         0.0, 0.0, 0.0, 0.0, 20.0,
         0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
         0.0, 0.0, 0.0, 0.0, 0.0),
        (0.0, 12.5, 10.638297872340425, 3.7735849056603774, 4.444444444444445,
         0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
         0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
         0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0),
        (100.0, 12.5, 4.25531914893617, 1.8867924528301887, 2.2222222222222223,
         0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
         0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
         0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0),
    ]
    X_AXIS = dataset[0]

    matplotlib.rc('font', serif='Helvetica Neue')
    matplotlib.rc('text', usetex='false')
    matplotlib.rcParams.update({'font.size': 40})

    fig = matplotlib.pyplot.gcf()
    fig.set_size_inches(18.5, 10.5)

    configs = dataset[0]
    N = len(configs)
    ind = np.arange(N)
    width = 0.4

    p1 = plt.bar(ind, dataset[1], width, color='r')
    p2 = plt.bar(ind, dataset[2], width, bottom=dataset[1], color='b')
    p3 = plt.bar(ind, dataset[3], width, bottom=dataset[2], color='g')
    p4 = plt.bar(ind, dataset[4], width, bottom=dataset[3], color='c')

    plt.ylim([0, 120])
    plt.yticks(fontsize=12)
    plt.ylabel('output', fontsize=12)
    plt.xticks(ind, X_AXIS, fontsize=12, rotation=90)
    plt.xlabel('test', fontsize=12)
    plt.legend((p1[0], p2[0], p3[0], p4[0]),
               (header[0], header[1], header[2], header[3]),
               fontsize=12, ncol=4, framealpha=0, fancybox=True)
    plt.show()


asked Jun 01 '17 13:06 by tandem



2 Answers

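You need the bottom of each dataset to be the sum of all the datasets that came before. In your code each bar starts at the height of the previous dataset alone, so the third and fourth series overlap the earlier ones. You may also need to convert the datasets to numpy arrays to add them together, because adding tuples with + concatenates them rather than summing them element by element.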

    p1 = plt.bar(ind, dataset[1], width, color='r')
    p2 = plt.bar(ind, dataset[2], width, bottom=dataset[1], color='b')
    p3 = plt.bar(ind, dataset[3], width,
                 bottom=np.array(dataset[1])+np.array(dataset[2]), color='g')
    p4 = plt.bar(ind, dataset[4], width,
                 bottom=np.array(dataset[1])+np.array(dataset[2])+np.array(dataset[3]),
                 color='c')
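To see why the conversion matters, here is a small sketch using only the first two entries of two series (the variable names `first` and `second` are just for illustration). It compares what `+` does on plain tuples with what it does on numpy arrays:

```python
import numpy as np

# Illustrative values: the first two entries of two series.
first = (0.0, 25.0)
second = (0.0, 50.0)

# `+` on tuples concatenates them, giving four values instead of two.
concatenated = first + second
print(concatenated)  # (0.0, 25.0, 0.0, 50.0)

# NumPy arrays add element by element, giving one bottom offset per bar.
elementwise = np.array(first) + np.array(second)
print(elementwise.tolist())  # [0.0, 75.0]
```

Only the element-wise result has one value per bar, which is the shape `bottom` expects.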


Alternatively, you could convert them to numpy arrays before you start plotting.

    dataset1 = np.array(dataset[1])
    dataset2 = np.array(dataset[2])
    dataset3 = np.array(dataset[3])
    dataset4 = np.array(dataset[4])

    p1 = plt.bar(ind, dataset1, width, color='r')
    p2 = plt.bar(ind, dataset2, width, bottom=dataset1, color='b')
    p3 = plt.bar(ind, dataset3, width, bottom=dataset1+dataset2, color='g')
    p4 = plt.bar(ind, dataset4, width, bottom=dataset1+dataset2+dataset3,
                 color='c')

Finally, if you want to avoid converting to numpy arrays, you could use a list comprehension:

    p1 = plt.bar(ind, dataset[1], width, color='r')
    p2 = plt.bar(ind, dataset[2], width, bottom=dataset[1], color='b')
    p3 = plt.bar(ind, dataset[3], width,
                 bottom=[sum(x) for x in zip(dataset[1],dataset[2])], color='g')
    p4 = plt.bar(ind, dataset[4], width,
                 bottom=[sum(x) for x in zip(dataset[1],dataset[2],dataset[3])],
                 color='c')
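With more than a few series, the repeated sums above could be replaced by a running total. The following is only a sketch, assuming all series have the same length; `stacked_bottoms` is a hypothetical helper name, not part of matplotlib, and the sample rows are the entries for ticks 60 and 65 from the question:

```python
import numpy as np

def stacked_bottoms(rows):
    """Return (bottoms, totals) for series stacked in the order given.

    `stacked_bottoms` is a hypothetical helper, not a matplotlib function.
    bottoms[i] is the element-wise sum of all rows before row i;
    totals is the final height of each stack.
    """
    data = np.asarray(rows, dtype=float)
    tops = np.cumsum(data, axis=0)  # top edge of every bar
    # The first series starts at zero; each later one starts at the previous top.
    bottoms = np.vstack([np.zeros(data.shape[1]), tops[:-1]])
    return bottoms, tops[-1]

# Entries for ticks 60 and 65 of the four series in the question.
rows = [(0.0, 25.0), (0.0, 50.0), (0.0, 12.5), (100.0, 12.5)]
bottoms, totals = stacked_bottoms(rows)
print(bottoms.tolist())  # [[0.0, 0.0], [0.0, 25.0], [0.0, 75.0], [0.0, 87.5]]
print(totals.tolist())   # [100.0, 100.0]
```

Each series `i` could then be drawn with `plt.bar(ind, rows[i], width, bottom=bottoms[i])`, and `totals` gives a quick check that every stack reaches 100.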
answered Sep 28 '22 16:09 by tmdavison


I found this such a pain that I wrote a function to do it. I'm sharing it in the hope that others find it useful:

    import numpy as np
    import matplotlib.pyplot as plt

    def plot_stacked_bar(data, series_labels, category_labels=None,
                         show_values=False, value_format="{}", y_label=None,
                         colors=None, grid=True, reverse=False):
        """Plots a stacked bar chart with the data and labels provided.

        Keyword arguments:
        data            -- 2-dimensional numpy array or nested list
                           containing data for each series in rows
        series_labels   -- list of series labels (these appear in
                           the legend)
        category_labels -- list of category labels (these appear
                           on the x-axis)
        show_values     -- If True then numeric value labels will
                           be shown on each bar
        value_format    -- Format string for numeric value labels
                           (default is "{}")
        y_label         -- Label for y-axis (str)
        colors          -- List of color labels
        grid            -- If True display grid
        reverse         -- If True reverse the order in which the
                           categories are displayed (left-to-right
                           becomes right-to-left)
        """

        ny = len(data[0])
        ind = list(range(ny))

        axes = []
        cum_size = np.zeros(ny)

        data = np.array(data)

        if reverse:
            data = np.flip(data, axis=1)
            # Reverse the labels too, as a list; skip if none were given
            if category_labels is not None:
                category_labels = list(reversed(category_labels))

        for i, row_data in enumerate(data):
            color = colors[i] if colors is not None else None
            # Each series sits on top of the running total of earlier series
            axes.append(plt.bar(ind, row_data, bottom=cum_size,
                                label=series_labels[i], color=color))
            cum_size += row_data

        if category_labels:
            plt.xticks(ind, category_labels)

        if y_label:
            plt.ylabel(y_label)

        plt.legend()

        if grid:
            plt.grid()

        if show_values:
            for axis in axes:
                for bar in axis:
                    w, h = bar.get_width(), bar.get_height()
                    # Write the value at the centre of each bar segment
                    plt.text(bar.get_x() + w/2, bar.get_y() + h/2,
                             value_format.format(h), ha="center",
                             va="center")

Example:

    plt.figure(figsize=(6, 4))

    series_labels = ['Series 1', 'Series 2']

    data = [
        [0.2, 0.3, 0.35, 0.3],
        [0.8, 0.7, 0.6, 0.5]
    ]

    category_labels = ['Cat A', 'Cat B', 'Cat C', 'Cat D']

    plot_stacked_bar(
        data,
        series_labels,
        category_labels=category_labels,
        show_values=True,
        value_format="{:.1f}",
        colors=['tab:orange', 'tab:green'],
        y_label="Quantity (units)"
    )

    plt.savefig('bar.png')
    plt.show()

[Image: stacked bar plot example]

answered Sep 28 '22 15:09 by Bill