
Python & Pandas: Strange behavior when pandas plots a histogram to a specific ax

I want to plot a pandas histogram to a specific axis, but the behavior is really strange, and I don't know what's wrong.

fig1, ax1 = plt.subplots(figsize=(4,3))
fig2, ax2 = plt.subplots(figsize=(4,3))
fig3, ax3 = plt.subplots(figsize=(4,3))

# 1. This works
df['speed'].hist()

# 2. This doesn't work
df['speed'].hist(ax=ax2)

# 3. This works
data = [1,2,3,5,6,2,3,4]
temp_df = pd.DataFrame(data)
temp_df.hist(ax=ax2)

The error that Jupyter Notebook returns is:


AssertionError                            Traceback (most recent call last)
<ipython-input-46-d629de832772> in <module>()
      7 
      8 # This doens't work
----> 9 df['speed'].hist(ax=ax2)
     10 
     11 # # This works

D:\Anaconda2\lib\site-packages\pandas\tools\plotting.pyc in hist_series(self, by, ax, grid, xlabelsize, xrot, ylabelsize, yrot, figsize, bins, **kwds)
   2953             ax = fig.gca()
   2954         elif ax.get_figure() != fig:
-> 2955             raise AssertionError('passed axis not bound to passed figure')
   2956         values = self.dropna().values
   2957 

AssertionError: passed axis not bound to passed figure

The pandas source code is here:

https://github.com/pydata/pandas/blob/d38ee272f3060cb884f21f9f7d212efc5f7656a8/pandas/tools/plotting.py#L2913

I have no idea what's wrong with my code.

Asked Jul 13 '16 at 04:07 by cqcn1991




1 Answer

The problem is that pandas uses plt.gcf() to get the "current figure" and then checks that the axis you passed belongs to that figure. When you create several figures in a row, the current figure is the last one created (fig3 here). But ax2 belongs to the earlier fig2, so the check fails with the AssertionError you see.
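To see the mismatch without involving pandas at all, here is a minimal sketch using only matplotlib. The Agg backend is my assumption so that it runs without a display; in a notebook you can drop that line. It also shows that calling plt.figure() with an existing figure's number makes that figure current again, which is another way to line things up before plotting.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no display is needed
import matplotlib.pyplot as plt

fig1, ax1 = plt.subplots(figsize=(4, 3))
fig2, ax2 = plt.subplots(figsize=(4, 3))
fig3, ax3 = plt.subplots(figsize=(4, 3))

# The most recently created figure is the "current figure".
current_is_fig3 = plt.gcf() is fig3

# ax2 lives on fig2, so it is not on the current figure.
ax2_on_current = ax2.figure is plt.gcf()

# Passing an existing figure number to plt.figure() makes that figure current.
plt.figure(fig2.number)
ax2_on_current_after_switch = ax2.figure is plt.gcf()

print(current_is_fig3, ax2_on_current, ax2_on_current_after_switch)
```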

However, as you can see on line 2954 of the source you linked to, pandas will look for an (undocumented) figure argument. So you can make it work by doing df['speed'].hist(ax=ax2, figure=fig2). A comment in the pandas source notes that this is a "hack until the plotting interface is a bit more unified", so I wouldn't rely on it for anything too critical.
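Here is a rough sketch of that workaround. The df below is a made-up stand-in for the one in the question, and the Agg backend line is only there so the snippet runs without a display. Since the figure keyword is undocumented, treat this as something that may behave differently in other pandas versions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data standing in for the question's df; the values are made up.
df = pd.DataFrame({"speed": [1, 2, 3, 5, 6, 2, 3, 4]})

fig1, ax1 = plt.subplots(figsize=(4, 3))
fig2, ax2 = plt.subplots(figsize=(4, 3))
fig3, ax3 = plt.subplots(figsize=(4, 3))

# fig3 is the current figure here, so pass fig2 explicitly alongside ax2.
# "figure" is the undocumented keyword mentioned above.
df["speed"].hist(ax=ax2, figure=fig2)

# The histogram bars should land on ax2 and nowhere else.
bars_on_ax2 = len(ax2.patches)
bars_on_ax3 = len(ax3.patches)
```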

The other solution is to not create a new figure until you're ready to use it. In your example above, you only use figure 2, so there's no need to create the others. Of course, that is a contrived example, but in a real-life situation, if you have code like this:

fig1, ax1 = plt.subplots(figsize=(4,3))
fig2, ax2 = plt.subplots(figsize=(4,3))
fig3, ax3 = plt.subplots(figsize=(4,3))

something.hist(ax=ax1)
something.hist(ax=ax2)
something.hist(ax=ax3)

You can change it to this:

fig1, ax1 = plt.subplots(figsize=(4,3))
something.hist(ax=ax1)

fig2, ax2 = plt.subplots(figsize=(4,3))
something.hist(ax=ax2)

fig3, ax3 = plt.subplots(figsize=(4,3))
something.hist(ax=ax3)

That is, put each section of plotting code right after the code that creates the figure for that plot.
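If the plots are similar, the same "create, then plot immediately" pattern can be written as a loop. This is only a sketch: the df and the labels are made up for illustration, and the Agg backend line is an assumption so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data standing in for the question's df; the values are made up.
df = pd.DataFrame({"speed": [1, 2, 3, 5, 6, 2, 3, 4]})

axes_by_label = {}
for label in ("first", "second", "third"):  # hypothetical labels
    # Create the figure and plot right away, while it is still the current figure.
    fig, ax = plt.subplots(figsize=(4, 3))
    df["speed"].hist(ax=ax)
    ax.set_title(label)
    axes_by_label[label] = ax

# Number of histogram bars drawn on each axis.
bar_counts = {label: len(ax.patches) for label, ax in axes_by_label.items()}
```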

Answered Sep 28 '22 at 08:09 by BrenBarn