I am generating bar plots using matplotlib and it looks like there is a bug with the stacked bar plot. The sum for each vertical stack should be 100. However, for X-AXIS ticks 65, 70, 75 and 80 we get completely arbitrary results which do not make any sense. I do not understand what the problem is. Please find the MWE below.
    import numpy as np
    import matplotlib.pyplot as plt
    import matplotlib

    header = ['a','b','c','d']
    dataset= [
        ('60.0', '65.0', '70.0', '75.0', '80.0', '85.0', '90.0', '95.0', '100.0', '105.0', '110.0', '115.0', '120.0', '125.0', '130.0', '135.0', '140.0', '145.0', '150.0', '155.0', '160.0', '165.0', '170.0', '175.0', '180.0', '185.0', '190.0', '195.0', '200.0'),
        (0.0, 25.0, 48.93617021276596, 83.01886792452831, 66.66666666666666, 66.66666666666666, 70.96774193548387, 84.61538461538461, 93.33333333333333, 85.0, 92.85714285714286, 93.75, 95.0, 100.0, 100.0, 100.0, 100.0, 80.0, 100.0, 100.0, 100.0, 100.0, 100.0, 100.0, 100.0, 100.0, 100.0, 100.0, 100.0),
        (0.0, 50.0, 36.17021276595745, 11.320754716981133, 26.666666666666668, 33.33333333333333, 29.03225806451613, 15.384615384615385, 6.666666666666667, 15.0, 7.142857142857142, 6.25, 5.0, 0.0, 0.0, 0.0, 0.0, 20.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0),
        (0.0, 12.5, 10.638297872340425, 3.7735849056603774, 4.444444444444445, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0),
        (100.0, 12.5, 4.25531914893617, 1.8867924528301887, 2.2222222222222223, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)]
    X_AXIS = dataset[0]

    matplotlib.rc('font', serif='Helvetica Neue')
    matplotlib.rc('text', usetex='false')
    matplotlib.rcParams.update({'font.size': 40})

    fig = matplotlib.pyplot.gcf()
    fig.set_size_inches(18.5, 10.5)

    configs = dataset[0]
    N = len(configs)
    ind = np.arange(N)
    width = 0.4

    p1 = plt.bar(ind, dataset[1], width, color='r')
    p2 = plt.bar(ind, dataset[2], width, bottom=dataset[1], color='b')
    p3 = plt.bar(ind, dataset[3], width, bottom=dataset[2], color='g')
    p4 = plt.bar(ind, dataset[4], width, bottom=dataset[3], color='c')

    plt.ylim([0,120])
    plt.yticks(fontsize=12)
    plt.ylabel(output, fontsize=12)
    plt.xticks(ind, X_AXIS, fontsize=12, rotation=90)
    plt.xlabel('test', fontsize=12)
    plt.legend((p1[0], p2[0], p3[0], p4[0]), (header[0], header[1], header[2], header[3]),
               fontsize=12, ncol=4, framealpha=0, fancybox=True)
    plt.show()
DataFrame.plot(kind='bar', stacked=True) is the easiest way to plot a stacked bar plot from a pandas DataFrame. This method returns a matplotlib Axes object.
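As a rough illustration of the pandas route, here is a minimal sketch. The DataFrame contents, the index labels and the 'percent' axis label are made-up sample values, not the data from the question; each row is one x-axis category and each column is one series.

```python
import pandas as pd

# Hypothetical sample data: one column per series, one row per category.
# Every row sums to 100, like the percentages in the question.
df = pd.DataFrame(
    {'a': [25.0, 50.0, 80.0],
     'b': [50.0, 30.0, 20.0],
     'c': [25.0, 20.0, 0.0]},
    index=['65.0', '70.0', '75.0'],
)

# pandas works out the bottom of every stacked segment itself.
ax = df.plot(kind='bar', stacked=True, color=['r', 'b', 'g'])
ax.set_ylim(0, 120)
ax.set_ylabel('percent')  # assumed label, purely for illustration
```

Calling plt.show() from matplotlib.pyplot afterwards displays the figure. The returned Axes can be customised further in the usual matplotlib way.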
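You need the bottom of each dataset to be the sum of all the datasets that came before. In the question, p3 and p4 use only the immediately preceding dataset as their bottom, so they are drawn over the lower bars instead of on top of them. You may also need to convert the datasets to numpy arrays to add them together, because adding two tuples with + concatenates them instead of summing them element-wise.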
    p1 = plt.bar(ind, dataset[1], width, color='r')
    p2 = plt.bar(ind, dataset[2], width, bottom=dataset[1], color='b')
    p3 = plt.bar(ind, dataset[3], width,
                 bottom=np.array(dataset[1])+np.array(dataset[2]), color='g')
    p4 = plt.bar(ind, dataset[4], width,
                 bottom=np.array(dataset[1])+np.array(dataset[2])+np.array(dataset[3]),
                 color='c')
Alternatively, you could convert them to numpy arrays before you start plotting.
    dataset1 = np.array(dataset[1])
    dataset2 = np.array(dataset[2])
    dataset3 = np.array(dataset[3])
    dataset4 = np.array(dataset[4])

    p1 = plt.bar(ind, dataset1, width, color='r')
    p2 = plt.bar(ind, dataset2, width, bottom=dataset1, color='b')
    p3 = plt.bar(ind, dataset3, width, bottom=dataset1+dataset2, color='g')
    p4 = plt.bar(ind, dataset4, width, bottom=dataset1+dataset2+dataset3,
                 color='c')
Or finally if you want to avoid converting to numpy arrays, you could use a list comprehension:
    p1 = plt.bar(ind, dataset[1], width, color='r')
    p2 = plt.bar(ind, dataset[2], width, bottom=dataset[1], color='b')
    p3 = plt.bar(ind, dataset[3], width,
                 bottom=[sum(x) for x in zip(dataset[1],dataset[2])], color='g')
    p4 = plt.bar(ind, dataset[4], width,
                 bottom=[sum(x) for x in zip(dataset[1],dataset[2],dataset[3])],
                 color='c')
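The variants above all write out the sum for each bottom by hand, which gets tedious as the number of series grows. One possible generalisation is to let np.cumsum build the running totals. This is only a sketch: the series array and the colors list are made-up sample values, not the data from the question.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical sample data: one row per series, one column per category.
# Every column sums to 100, like the percentages in the question.
series = np.array([
    [25.0, 50.0, 80.0],
    [50.0, 30.0, 20.0],
    [12.5, 15.0, 0.0],
    [12.5, 5.0, 0.0],
])
colors = ['r', 'b', 'g', 'c']

# Running total down the rows: tops[i] is the top edge of series i.
tops = np.cumsum(series, axis=0)
# Each series starts where the previous one ended; the first starts at 0.
bottoms = np.vstack([np.zeros(series.shape[1]), tops[:-1]])

ind = np.arange(series.shape[1])
width = 0.4
fig, ax = plt.subplots()
for row, bottom, color in zip(series, bottoms, colors):
    ax.bar(ind, row, width, bottom=bottom, color=color)
```

The last row of tops holds the total height of each stack, so checking that it equals 100 everywhere is a quick way to confirm the input data before blaming the plot.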
I found this such a pain that I wrote a function to do it. I'm sharing it in the hope that others find it useful:
    import numpy as np
    import matplotlib.pyplot as plt

    def plot_stacked_bar(data, series_labels, category_labels=None,
                         show_values=False, value_format="{}", y_label=None,
                         colors=None, grid=True, reverse=False):
        """Plots a stacked bar chart with the data and labels provided.

        Keyword arguments:
        data            -- 2-dimensional numpy array or nested list
                           containing data for each series in rows
        series_labels   -- list of series labels (these appear in
                           the legend)
        category_labels -- list of category labels (these appear
                           on the x-axis)
        show_values     -- If True then numeric value labels will
                           be shown on each bar
        value_format    -- Format string for numeric value labels
                           (default is "{}")
        y_label         -- Label for y-axis (str)
        colors          -- List of color labels
        grid            -- If True display grid
        reverse         -- If True reverse the order that the
                           categories are displayed (left-to-right
                           or right-to-left)
        """

        ny = len(data[0])
        ind = list(range(ny))

        axes = []  # one bar container per series
        cum_size = np.zeros(ny)  # running top of the stack per category

        data = np.array(data)

        if reverse:
            # Flip the columns, i.e. the order of the categories.
            data = np.flip(data, axis=1)
            if category_labels is not None:
                # Materialise as a list: a bare reversed() iterator is
                # always truthy and cannot be measured by plt.xticks.
                category_labels = list(reversed(category_labels))

        for i, row_data in enumerate(data):
            color = colors[i] if colors is not None else None
            axes.append(plt.bar(ind, row_data, bottom=cum_size,
                                label=series_labels[i], color=color))
            cum_size += row_data

        if category_labels is not None:
            plt.xticks(ind, category_labels)

        if y_label:
            plt.ylabel(y_label)

        plt.legend()

        if grid:
            plt.grid()

        if show_values:
            # Write each value in the middle of its bar segment.
            for axis in axes:
                for bar in axis:
                    w, h = bar.get_width(), bar.get_height()
                    plt.text(bar.get_x() + w/2, bar.get_y() + h/2,
                             value_format.format(h), ha="center",
                             va="center")
Example:
    plt.figure(figsize=(6, 4))

    series_labels = ['Series 1', 'Series 2']

    data = [
        [0.2, 0.3, 0.35, 0.3],
        [0.8, 0.7, 0.6, 0.5]
    ]

    category_labels = ['Cat A', 'Cat B', 'Cat C', 'Cat D']

    plot_stacked_bar(
        data,
        series_labels,
        category_labels=category_labels,
        show_values=True,
        value_format="{:.1f}",
        colors=['tab:orange', 'tab:green'],
        y_label="Quantity (units)"
    )

    plt.savefig('bar.png')
    plt.show()
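If only the value labels are needed, newer Matplotlib versions can place them without manual plt.text calls. The sketch below reuses the example data and assumes Matplotlib 3.4 or later, where Axes.bar_label was introduced. It is an alternative to the show_values loop, not part of the function above, and value_labels is a name introduced here for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

data = np.array([[0.2, 0.3, 0.35, 0.3],
                 [0.8, 0.7, 0.6, 0.5]])
series_labels = ['Series 1', 'Series 2']
category_labels = ['Cat A', 'Cat B', 'Cat C', 'Cat D']

fig, ax = plt.subplots(figsize=(6, 4))
bottom = np.zeros(data.shape[1])  # running top of each stack
value_labels = []                 # collected only for inspection
for row, label in zip(data, series_labels):
    container = ax.bar(category_labels, row, bottom=bottom, label=label)
    # Assumes Matplotlib >= 3.4; 'center' puts each segment's own
    # height in the middle of that segment.
    value_labels.extend(ax.bar_label(container, label_type='center',
                                     fmt='%.1f'))
    bottom += row
ax.legend()
```

Because the loop keeps a running bottom, the same pattern works for any number of series.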