First of all, this question is not the same as this one.
The problem I'm having is that when I try to plot a DataFrame which contains a numpy NaN in one cell, I get an error:
C:\>\Python33x86\python.exe
Python 3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:03:43) [MSC v.1600 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>>
>>> dates = pd.date_range('20131201', periods=5, freq='H')
>>> data = [[1, 2], [4, 5], [9, np.nan], [16, 17], [25, 26]]
>>> df = pd.DataFrame(data, index=dates,
... columns=list('AB'))
>>>
>>> print(df.to_string())
A B
2013-12-01 00:00:00 1 2
2013-12-01 01:00:00 4 5
2013-12-01 02:00:00 9 NaN
2013-12-01 03:00:00 16 17
2013-12-01 04:00:00 25 26
>>> df.plot()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1636, in plot_frame
plot_obj.generate()
File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 856, in generate
self._make_plot()
File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1240, in _make_plot
self._make_ts_plot(data, **self.kwds)
File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1321, in _make_ts_plot
_plot(data[col], i, ax, label, style, **kwds)
File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1295, in _plot
style=style, **kwds)
File "C:\Python33x86\lib\site-packages\pandas\tseries\plotting.py", line 77, in tsplot
lines = plotf(ax, *args, **kwargs)
File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 4139, in plot
for line in self._get_lines(*args, **kwargs):
File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 319, in _grab_next_args
for seg in self._plot_args(remaining, kwargs):
File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 297, in _plot_args
x, y = self._xy_from_xy(x, y)
File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 216, in _xy_from_xy
by = self.axes.yaxis.update_units(y)
File "C:\Python33x86\lib\site-packages\matplotlib\axis.py", line 1337, in update_units
converter = munits.registry.get_converter(data)
File "C:\Python33x86\lib\site-packages\matplotlib\units.py", line 137, in get_converter
xravel = x.ravel()
File "C:\Python33x86\lib\site-packages\numpy\ma\core.py", line 3969, in ravel
r._mask = ndarray.ravel(self._mask).reshape(r.shape)
File "C:\Python33x86\lib\site-packages\pandas\core\series.py", line 981, in reshape
return ndarray.reshape(self, newshape, order)
TypeError: an integer is required
The above code works if I replace the np.NaN with a number, such as "2.3".
Plotting as two separate Series does not work either (it fails when I add the Series containing the NaN to the plot):
C:\>\Python33x86\python.exe
Python 3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:03:43) [MSC v.1600 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>>
>>> dates = pd.date_range('20131201', periods=5, freq='H')
>>> data = [[1, 2], [4, 5], [9, np.nan], [16, 17], [25, 26]]
>>> df = pd.DataFrame(data, index=dates,
... columns=list('AB'))
>>>
>>> print(df.to_string())
A B
2013-12-01 00:00:00 1 2
2013-12-01 01:00:00 4 5
2013-12-01 02:00:00 9 NaN
2013-12-01 03:00:00 16 17
2013-12-01 04:00:00 25 26
>>> df['A'].plot(label='This is A', style='k')
<matplotlib.axes.AxesSubplot object at 0x02ACFF90>
>>> df['B'].plot(label='This is B', style='g')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1730, in plot_series
plot_obj.generate()
File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 856, in generate
self._make_plot()
File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1240, in _make_plot
self._make_ts_plot(data, **self.kwds)
File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1311, in _make_ts_plot
_plot(data, 0, ax, label, self.style, **kwds)
File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1295, in _plot
style=style, **kwds)
File "C:\Python33x86\lib\site-packages\pandas\tseries\plotting.py", line 77, in tsplot
lines = plotf(ax, *args, **kwargs)
File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 4139, in plot
for line in self._get_lines(*args, **kwargs):
File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 319, in _grab_next_args
for seg in self._plot_args(remaining, kwargs):
File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 297, in _plot_args
x, y = self._xy_from_xy(x, y)
File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 216, in _xy_from_xy
by = self.axes.yaxis.update_units(y)
File "C:\Python33x86\lib\site-packages\matplotlib\axis.py", line 1337, in update_units
converter = munits.registry.get_converter(data)
File "C:\Python33x86\lib\site-packages\matplotlib\units.py", line 137, in get_converter
xravel = x.ravel()
File "C:\Python33x86\lib\site-packages\numpy\ma\core.py", line 3969, in ravel
r._mask = ndarray.ravel(self._mask).reshape(r.shape)
File "C:\Python33x86\lib\site-packages\pandas\core\series.py", line 981, in reshape
return ndarray.reshape(self, newshape, order)
TypeError: an integer is required
However, if I do this directly with Matplotlib's Pyplot plot(), instead of using Pandas' plot() function, it works:
C:\>\Python33x86\python.exe
Python 3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:03:43) [MSC v.1600 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> dates = pd.date_range('20131201', periods=5, freq='H')
>>> plt.plot(dates, [1, 4, 9, 16, 25], 'k', dates, [2, 5, np.NAN, 17, 26], 'g')
[<matplotlib.lines.Line2D object at 0x03E98650>, <matplotlib.lines.Line2D object at 0x040929B0>]
>>> plt.show()
>>>
So it seems that I have a workaround, but as I plot large DataFrames, I would prefer to use Pandas' plot() method, which is more convenient. I've tried to follow the stack trace, but after a while it gets complicated (I'm not familiar with Pandas, Numpy and Matplotlib source code). Am I doing something wrong, or is this a possible bug in Pandas' plot()?
Thank you for your help!
I tried both on Windows x86 and on Linux AMD64 with the same results with these versions:
This appears to be an integration bug between matplotlib 1.3.1 and pandas 0.12.
The workaround is to downgrade to matplotlib 1.3.0. (Note, however, that this version of matplotlib contains a bug on systems which have fonts with non-ASCII font names, so you may need to pick your problem!) This downgrade will trigger a downgrade to numpy 1.7.1, so you should then (again) upgrade to numpy 1.8.0. This error should be fixed in the upcoming pandas 0.13. However, pandas 0.13 may break some existing code (because pandas.Series is no longer a subclass of numpy.ndarray), so again, some hard choices may be required, at least in the short term.
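To check whether an environment matches the combination described above, something like the following sketch may help. Note that parse_version and is_affected_combo are hypothetical helper names, and the rule they encode (matplotlib exactly 1.3.1 together with pandas 0.12.x) is only what this answer reports, not a full compatibility matrix.

```python
def parse_version(text):
    """Return the leading numeric components of a version string as a tuple of ints."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as "rc1"
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_affected_combo(matplotlib_version, pandas_version):
    """Assumption: only matplotlib 1.3.1 paired with pandas 0.12.x is treated as affected."""
    mpl = parse_version(matplotlib_version)
    pnd = parse_version(pandas_version)
    return mpl[:3] == (1, 3, 1) and pnd[:2] == (0, 12)
```

In a live session the arguments would be matplotlib.__version__ and pandas.__version__.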
Just checked: the code works fine with matplotlib 1.3.0:
>>> import matplotlib
>>> matplotlib.__version__
'1.3.0'
>>> df.plot()
<matplotlib.axes.AxesSubplot object at 0x04E8B4F0>
>>> plt.show()
I worked around the problem with the following:
fig, ax = plt.subplots()
ax.plot(df)
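A slightly fuller sketch of the same idea, in case it helps: it plots each column against the DataFrame's index and keeps the column names for the legend. plot_columns is a hypothetical helper, not part of pandas or matplotlib, and the Agg backend is selected only so that the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for on-screen plots
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Same data as in the question, including the NaN in column B.
dates = pd.date_range("2013-12-01", periods=5, freq=pd.Timedelta(hours=1))
df = pd.DataFrame(
    [[1, 2], [4, 5], [9, np.nan], [16, 17], [25, 26]],
    index=dates,
    columns=list("AB"),
)


def plot_columns(frame, ax):
    """Hypothetical helper: draw one line per column without calling DataFrame.plot()."""
    x = frame.index.to_pydatetime()
    for name in frame.columns:
        # Hand matplotlib plain NumPy arrays; a NaN simply leaves a gap in the line.
        ax.plot(x, np.asarray(frame[name], dtype=float), label=str(name))
    ax.legend(loc="best")
    return ax


fig, ax = plt.subplots()
plot_columns(df, ax)
fig.autofmt_xdate()  # tilt the date labels so they do not overlap
```

Call plt.show() or fig.savefig(...) afterwards as usual.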