I create a simple pandas dataframe with some random values and a DatetimeIndex like so:
import pandas as pd
from numpy.random import randint
import datetime as dt
import matplotlib.pyplot as plt
# create a random dataframe with datetimeindex
dateRange = pd.date_range('1/1/2011', '3/30/2011', freq='D')
randomInts = randint(1, 50, len(dateRange))
df = pd.DataFrame({'RandomValues' : randomInts}, index=dateRange)
Then I plot it in two different ways:
# plot with pandas own matplotlib wrapper
df.plot()
# plot directly with matplotlib pyplot
plt.plot(df.index, df.RandomValues)
plt.show()
(Do not use both statements at the same time as they plot on the same figure.)
I use Python 3.4 (64-bit) and matplotlib 1.4. With pandas 0.14, both statements give me the expected plot (the x-axis formatting differs slightly, which is okay; note that the data is random, so the plots do not look the same):
However, when using pandas 0.15, the pandas plot looks alright but the matplotlib plot has some strange tick format on the x-axis:
Is there a good reason for this behaviour, and why did it change from pandas 0.14 to 0.15?
DatetimeIndex: Immutable ndarray of datetime64 data, represented internally as int64, and which can be boxed to Timestamp objects that are subclasses of datetime and carry metadata such as frequency information.
Note that this bug was fixed in pandas 0.15.1 (https://github.com/pandas-dev/pandas/pull/8693), and plt.plot(df.index, df.RandomValues) now just works again.
The reason for this change in behaviour is that starting from 0.15, the pandas Index object is no longer a numpy ndarray subclass. But the real reason is that matplotlib does not support the datetime64 dtype.
As a workaround, if you want to use the matplotlib plot function, you can convert the index to Python datetime objects using to_pydatetime:
plt.plot(df.index.to_pydatetime(), df.RandomValues)
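For reference, here is a minimal self-contained sketch of this workaround. The Agg backend, the fixed seed, and the variable names are assumptions made here so that the script runs without a display; they are not part of the original question.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# same kind of frame as in the question, with a fixed seed for reproducibility
date_range = pd.date_range('1/1/2011', '3/30/2011', freq='D')
rng = np.random.RandomState(0)
df = pd.DataFrame({'RandomValues': rng.randint(1, 50, len(date_range))},
                  index=date_range)

fig, ax = plt.subplots()
# hand matplotlib plain datetime.datetime objects instead of the pandas index
line, = ax.plot(df.index.to_pydatetime(), df['RandomValues'])
fig.autofmt_xdate()  # rotate the date labels so they do not overlap
```

In an interactive session you would drop the matplotlib.use("Agg") line and finish with plt.show().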
A more detailed explanation:
Because Index is no longer an ndarray subclass, matplotlib converts the index to a numpy array with datetime64 dtype (previously it retained the Index object, whose scalars are returned as Timestamp values, a subclass of datetime.datetime, which matplotlib can handle). The plot function calls np.atleast_1d() on its input, which now returns a datetime64 array that matplotlib handles as integers.
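The conversion described above can be inspected directly. This is only an illustrative sketch with a short, made-up date range; it shows what np.atleast_1d() and to_pydatetime return for a DatetimeIndex, not what matplotlib does internally in every version.

```python
import datetime

import numpy as np
import pandas as pd

index = pd.date_range('1/1/2011', '1/5/2011', freq='D')

# np.atleast_1d turns the index into a plain numpy array of datetime64 values
as_array = np.atleast_1d(index)
print(type(as_array).__name__, as_array.dtype)

# indexing the pandas index itself yields Timestamp, a datetime.datetime subclass
print(isinstance(index[0], datetime.datetime))

# to_pydatetime yields an object array of datetime.datetime values
as_py = index.to_pydatetime()
print(as_py.dtype, type(as_py[0]).__name__)
```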
I opened an issue about this (as this possibly gets a lot of use): https://github.com/pydata/pandas/issues/8614
With matplotlib 1.5.0 this 'just works':
import pandas as pd
from numpy.random import randint
import datetime as dt
import matplotlib.pyplot as plt
# create a random dataframe with datetimeindex
dateRange = pd.date_range('1/1/2011', '3/30/2011', freq='D')
randomInts = randint(1, 50, len(dateRange))
df = pd.DataFrame({'RandomValues' : randomInts}, index=dateRange)
fig, ax = plt.subplots()
ax.plot('RandomValues', data=df)