I thought for sure this would already have an answer, but I can't find it anywhere. I'm running into an issue when trying to use matplotlib to make bar charts. Under most conditions, the plot comes out correctly. However, when I take some values out of the data before plotting, the bars become much wider than I want. Consider the following minimal reproducible example:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots()
ex1 = pd.DataFrame({'x':[330,342,344,352,354,371,388,394,401,412,414,448,462,502,504,522,622],
'y':[2,9,0,2,2,1,0,4,7,6,8,4,2,6,3,5,7],
'ind':[0,0,0,0,0,1,1,1,1,1,1,1,1,0,0,0,0]})
ax.bar(ex1.x,ex1.y,width=0.9)
fig.savefig('some/path')
When I open up this plot I get the following:

This looks great. No issues. But now, suppose I only want to create a bar chart for part of the data. Essentially, all of the leading 0's in the "ind" column of my DF contain rows I don't care to plot. So I get rid of those and try again:
fig, ax = plt.subplots()
firstrow = ex1[ex1.ind==np.max(ex1.ind)].index.to_list()[0]
ex1 = ex1[firstrow:]
ax.bar(ex1.x,ex1.y,width=0.9)
fig.savefig('some/other/location')
When I open that one up, I expect a truncated version of the original plot, i.e. with thin bars of the correct height, just without the few bars that I cut out of the DF. Instead, I get this:

Huh? It starts in the right place, but that's about all the good I can say for it. It appears as if it's just ignoring the width parameter and running all of the bars together. I've played with several things and done some searches and couldn't figure out either what's going wrong or how to fix it. Any suggestions on how to make the second figure look like the first but without the data I don't want would be much appreciated!
Edited to answer any questions: the results of print(ex1.x); print(ex1.y) are:
5 371
6 388
7 394
8 401
9 412
10 414
11 448
12 462
13 502
14 504
15 522
16 622
Name: x, dtype: int64
5 1
6 0
7 4
8 7
9 6
10 8
11 4
12 2
13 6
14 3
15 5
16 7
Name: y, dtype: int64
While matplotlib tries to support direct plotting of pandas objects, this can sometimes break when pandas changes some of its internals. The solution to such problems is always to fall back to plotting numpy arrays, for which all functionality is well tested.
Here, the problem is that with some combinations of pandas and matplotlib versions, plotting dataframes or series whose index does not start at zero can cause hiccups.
Hence you would want to plot the numpy arrays ex1.x.values and ex1.y.values instead of the pandas series ex1.x and ex1.y:
ax.bar(ex1.x.values, ex1.y.values, width=0.9)
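For completeness, here is a minimal end-to-end sketch of that workaround using the data from the question. The `Agg` backend, the `subset` and `widths` names, and the output filename `truncated_bars.png` are my own choices for illustration and are not part of the original post. Reading the widths back from the bar patches is just one way to check that the `width` argument was honoured.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

ex1 = pd.DataFrame({'x': [330, 342, 344, 352, 354, 371, 388, 394, 401,
                          412, 414, 448, 462, 502, 504, 522, 622],
                    'y': [2, 9, 0, 2, 2, 1, 0, 4, 7, 6, 8, 4, 2, 6, 3, 5, 7],
                    'ind': [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0]})

# Index label of the first row where ind reaches its maximum.
firstrow = ex1[ex1.ind == np.max(ex1.ind)].index.values[0]
# Drop the leading rows; the remaining index now starts at firstrow, not 0.
subset = ex1[firstrow:]

fig, ax = plt.subplots()
# Pass plain numpy arrays so the non-zero-based pandas index plays no role.
bars = ax.bar(subset.x.values, subset.y.values, width=0.9)

# Read the widths back from the drawn rectangles as a sanity check.
widths = [patch.get_width() for patch in bars]
fig.savefig('truncated_bars.png')  # hypothetical output path
```

If the workaround behaves as described, every entry in `widths` should equal 0.9 and there should be one bar per remaining row.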
I'm not completely sure what
ex1[ex1.ind==np.max(ex1.ind)].index.to_list()[0]
is doing since it throws an error for me, but using
ex1[ex1.ind==np.max(ex1.ind)].index.values[0]
instead gives

Tested with Python 2.7 in Jupyter Notebook, Python 2.7 and Python 3.6 on Ubuntu - all gave the same output
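As for the line that threw an error: my guess, and it is only a guess, is that the pandas version used for testing predates `Index.to_list()`, which I believe is a newer alias of `Index.tolist()`. The sketch below compares three ways of finding the first row to keep; `mask` and the `first_via_*` names are introduced here for illustration, and `Series.idxmax()` is an alternative I am swapping in, not something from the question.

```python
import pandas as pd

ex1 = pd.DataFrame({'x': [330, 342, 344, 352, 354, 371, 388, 394, 401,
                          412, 414, 448, 462, 502, 504, 522, 622],
                    'y': [2, 9, 0, 2, 2, 1, 0, 4, 7, 6, 8, 4, 2, 6, 3, 5, 7],
                    'ind': [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0]})

# Boolean mask of the rows where ind is at its maximum value.
mask = ex1.ind == ex1.ind.max()

# Variant used in this answer: first label from the underlying numpy array.
first_via_values = ex1[mask].index.values[0]
# Same idea through tolist(), the long-standing spelling of to_list().
first_via_tolist = ex1[mask].index.tolist()[0]
# idxmax() returns the label of the first occurrence of the maximum.
first_via_idxmax = ex1.ind.idxmax()

# Label-based slice from that row to the end of the frame.
subset = ex1.loc[first_via_values:]
print(first_via_values, first_via_tolist, first_via_idxmax)
```

All three should give index label 5, which matches the first row of the printed output in the question.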