
Plot multiple bars for categorical data

I'm looking for a way to plot multiple bars per value in matplotlib. For numerical data, this can be achieved by adding an offset to the X data, as described, for example, here:

import numpy as np
import matplotlib.pyplot as plt

X = np.array([1,3,5])
Y = [1,2,3]
Z = [2,3,4]

plt.bar(X - 0.4, Y) # offset of -0.4
plt.bar(X + 0.4, Z) # offset of  0.4
plt.show()

[Figure: Multiple bars for numerical data]

plt.bar() (and ax.bar()) also handle categorical data automatically:

X = ['A','B','C']
Y = [1,2,3]

plt.bar(X, Y)
plt.show()

[Figure: Category handling]

Here, it is obviously not possible to add an offset, as the categories are not directly associated with a value on the axis. I can manually assign numerical values to the categories and set labels on the x axis with plt.xticks():

X = ['A','B','C']
Y = [1,2,3]
Z = [2,3,4]
_X = np.arange(len(X))

plt.bar(_X - 0.2, Y, 0.4)
plt.bar(_X + 0.2, Z, 0.4)
plt.xticks(_X, X) # set labels manually
plt.show()

[Figure: Manually setting category labels]

However, I'm wondering if there is a more elegant way that makes use of the automatic category handling of bar(), especially if the number of categories and bars per category is not known in advance (this causes some fiddling with the bar widths to avoid overlaps).

tsabsch asked Jan 08 '18 20:01


1 Answer

There is no automatic support for subcategories in matplotlib.

Placing bars with matplotlib

You can place the bars numerically, as you propose yourself in the question. The code can of course handle an unknown number of subcategories for you.

import numpy as np
import matplotlib.pyplot as plt

X = ['A','B','C']
Y = [1,2,3]
Z = [2,3,4]

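def subcategorybar(X, vals, width=0.8):
    # vals holds one list of values per subcategory
    n = len(vals)
    _X = np.arange(len(X))  # numeric position of each category
    for i in range(n):
        # left edge of the i-th bar within each group of total width `width`
        plt.bar(_X - width/2. + i/float(n)*width, vals[i],
                width=width/float(n), align="edge")
    plt.xticks(_X, X)  # label the positions with the category names

subcategorybar(X, [Y,Z,Y])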

plt.show()
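
If you also want a legend and prefer the object-oriented interface, the same idea can be written against an Axes. This is only a sketch: the function name grouped_bar and the dict-of-lists input are my own choices for illustration, not anything matplotlib provides. The positions 0, 1, 2, ... are the same ones matplotlib assigns to string categories.

```python
import matplotlib.pyplot as plt
import numpy as np


def grouped_bar(ax, categories, series, width=0.8):
    """Draw one bar per entry of `series` at each category position.

    `series` maps a legend label to a list of values (one per category).
    Returns the list of BarContainer objects, one per label.
    (Hypothetical helper, not part of matplotlib.)
    """
    n = len(series)
    positions = np.arange(len(categories))  # category i sits at x = i
    bar_width = width / n                   # each group spans `width` in total
    containers = []
    for i, (label, values) in enumerate(series.items()):
        # Left edge of the i-th bar inside each group.
        left = positions - width / 2 + i * bar_width
        containers.append(
            ax.bar(left, values, width=bar_width, align="edge", label=label)
        )
    ax.set_xticks(positions)
    ax.set_xticklabels(categories)
    ax.legend()
    return containers


fig, ax = plt.subplots()
containers = grouped_bar(ax, ["A", "B", "C"], {"Y": [1, 2, 3], "Z": [2, 3, 4]})
```

Call plt.show() or fig.savefig(...) afterwards to display or store the figure. Because the bar width is derived from the number of series, adding a third entry to the dict needs no further changes.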


Using pandas

You may also use the pandas plotting wrapper, which does the work of figuring out the number of subcategories. It will plot one group of bars per row of a dataframe, with one bar per column in each group.

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

X = ['A','B','C']
Y = [1,2,3]
Z = [2,3,4]

df = pd.DataFrame(np.c_[Y,Z,Y], index=X)
df.plot.bar()

plt.show()
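
If the legend should show meaningful names instead of 0, 1, 2, one option is to build the dataframe from a dict so that the columns are named. A minimal sketch, assuming the column names "Y" and "Z" and the axis label "value", which are chosen here purely for illustration:

```python
import matplotlib.pyplot as plt
import pandas as pd

# Column names become legend entries; the index supplies the category labels.
df = pd.DataFrame({"Y": [1, 2, 3], "Z": [2, 3, 4]}, index=["A", "B", "C"])

ax = df.plot.bar(rot=0)  # rot=0 keeps the category labels horizontal
ax.set_ylabel("value")   # illustrative label, not from the original data
```

df.plot.bar() returns the matplotlib Axes, so further styling can be done with the usual Axes methods before calling plt.show().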


ImportanceOfBeingErnest answered Oct 12 '22 12:10