I want to plot a pandas histogram onto a specific axis, but the behavior is really strange, and I can't tell what's wrong.
fig1, ax1 = plt.subplots(figsize=(4,3))
fig2, ax2 = plt.subplots(figsize=(4,3))
fig3, ax3 = plt.subplots(figsize=(4,3))
# 1. This works
df['speed'].hist()
# 2. This doesn't work
df['speed'].hist(ax=ax2)
# 3. This works
data = [1,2,3,5,6,2,3,4]
temp_df = pd.DataFrame(data)
temp_df.hist(ax=ax2)
The error the Jupyter notebook returns is:
AssertionError Traceback (most recent call last)
<ipython-input-46-d629de832772> in <module>()
7
8 # This doens't work
----> 9 df['speed'].hist(ax=ax2)
10
11 # # This works
D:\Anaconda2\lib\site-packages\pandas\tools\plotting.pyc in hist_series(self, by, ax, grid, xlabelsize, xrot, ylabelsize, yrot, figsize, bins, **kwds)
2953 ax = fig.gca()
2954 elif ax.get_figure() != fig:
-> 2955 raise AssertionError('passed axis not bound to passed figure')
2956 values = self.dropna().values
2957
AssertionError: passed axis not bound to passed figure
The pandas source code is here:
https://github.com/pydata/pandas/blob/d38ee272f3060cb884f21f9f7d212efc5f7656a8/pandas/tools/plotting.py#L2913
I have no idea what's wrong with my code.
The problem is that pandas determines which is the active figure by using gcf() to get the "current figure". When you create several figures in a row, the "current figure" is the last one created. But you are trying to plot to an earlier one, which causes a mismatch.
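To see the mismatch without involving pandas at all, here is a minimal sketch using only matplotlib. It recreates the three figures from the question and compares the current figure with the figure that ax2 belongs to, which is the same comparison the traceback shows failing. The Agg backend and the variable names is_fig3_current and ax2_matches_current are my own additions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt

fig1, ax1 = plt.subplots(figsize=(4, 3))
fig2, ax2 = plt.subplots(figsize=(4, 3))
fig3, ax3 = plt.subplots(figsize=(4, 3))

# The most recently created figure is the "current" one.
current = plt.gcf()
is_fig3_current = current is fig3

# ax2 belongs to fig2, not to the current figure, so the
# "ax.get_figure() != fig" check from the traceback is triggered.
ax2_matches_current = ax2.get_figure() is current

print(is_fig3_current, ax2_matches_current)
```

Here the first value should come out true and the second false: fig3 is current, while ax2 is bound to fig2.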
However, as you can see on line 2954 of the source you linked to, pandas will look for an (undocumented) figure argument. So you can make it work by doing df['speed'].hist(ax=ax2, figure=fig2). A comment in the pandas source notes that this is a "hack until the plotting interface is a bit more unified", so I wouldn't rely on it for anything too critical.
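A minimal sketch of that workaround, assuming a pandas version whose Series.hist still accepts the undocumented figure keyword (since it is undocumented, this may not hold in every release). The speed series is made-up stand-in data for df['speed'], and the Agg backend is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for df['speed'] from the question.
speed = pd.Series([1, 2, 3, 5, 6, 2, 3, 4], name="speed")

fig1, ax1 = plt.subplots(figsize=(4, 3))
fig2, ax2 = plt.subplots(figsize=(4, 3))
fig3, ax3 = plt.subplots(figsize=(4, 3))

# fig3 is the current figure here. Passing figure=fig2 (undocumented)
# makes pandas compare ax2 against fig2 instead of against gcf().
speed.hist(ax=ax2, figure=fig2)
```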
The other solution is to not create a new figure until you're ready to use it. In your example above, you only use figure 2, so there's no need to create the others. Of course, that is a contrived example, but in a real-life situation, if you have code like this:
fig1, ax1 = plt.subplots(figsize=(4,3))
fig2, ax2 = plt.subplots(figsize=(4,3))
fig3, ax3 = plt.subplots(figsize=(4,3))
something.hist(ax=ax1)
something.hist(ax=ax2)
something.hist(ax=ax3)
You can change it to this:
fig1, ax1 = plt.subplots(figsize=(4,3))
something.hist(ax=ax1)
fig2, ax2 = plt.subplots(figsize=(4,3))
something.hist(ax=ax2)
fig3, ax3 = plt.subplots(figsize=(4,3))
something.hist(ax=ax3)
That is, put each section of plotting code right after the code that creates the figure for that plot.
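Here is a self-contained version of that pattern as a rough sketch. The something series is placeholder data I made up, and the loop is just a compact way of writing the three create-then-plot pairs; each figure is still the current one at the moment its histogram is drawn.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical placeholder for whatever data is being plotted.
something = pd.Series([1, 2, 3, 5, 6, 2, 3, 4], name="speed")

axes = []
for _ in range(3):
    # Create the figure, then plot to it right away while it is current.
    fig, ax = plt.subplots(figsize=(4, 3))
    something.hist(ax=ax)
    axes.append(ax)
```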