
Can pandas plot a time-series without trying to convert the index to Periods?

When plotting a time series, I observe unusual behavior that ultimately prevents me from formatting the xticks of the plot. It seems that pandas internally tries to convert the index into a PeriodIndex, but only succeeds if the timestamps are equally spaced. If they are unevenly spaced (or, strangely, if they are evenly spaced but timezone-aware), the index remains a DatetimeIndex. The latter case works as expected: I can set a DateFormatter and Locators. If, however, the index is internally converted to a PeriodIndex before plotting, the x-axis of the resulting plot is messed up.

Here is an example to reproduce the problem:

from pandas import Series, DataFrame
import pandas as pd
from datetime import datetime
import pytz
import matplotlib.pyplot as plt
import matplotlib as mpl
import numpy as np

idx1 = np.array([datetime(2014, 1, 16, 0),
                 datetime(2014, 1, 16, 5),
                 datetime(2014, 1, 16, 10),
                 datetime(2014, 1, 16, 15), 
                 datetime(2014, 1, 16, 20), 
                 datetime(2014, 1, 17, 1)])
idx2 = np.array([datetime(2014, 1, 16, 0),
                 datetime(2014, 1, 16, 5),
                 datetime(2014, 1, 16, 10),
                 datetime(2014, 1, 16, 15),
                 datetime(2014, 1, 16, 20),
                 datetime(2014, 1, 16, 23)])
y = [0, 2, np.nan, 5, 2, 1]
tz = pytz.timezone('Europe/Berlin')

fig, (ax1, ax2, ax3) = plt.subplots(1,3, figsize=(15,4))

# index convertible to period index
s1 = Series(y, index=idx1)
s1.plot(ax=ax1)
print(ax1.get_xticks())
print(ax1.xaxis.get_major_locator())
print(ax1.xaxis.get_major_formatter())
#ax1.xaxis.set_major_formatter(mpl.dates.DateFormatter('%H'))
#ax1.xaxis.set_major_locator(mpl.ticker.MultipleLocator(0.25))

# index not convertible to period index
s2 = Series(y, index=idx2)
s2.plot(ax=ax2)
print(ax2.get_xticks())
#ax2.xaxis.set_major_formatter(mpl.dates.DateFormatter('%H'))
#ax2.xaxis.set_major_locator(mpl.ticker.MultipleLocator(0.25))

# index convertible to period index but tz-aware
s3 = Series(y, index=idx1)
s3 = s3.tz_localize(tz)
s3.plot(ax=ax3)
print(ax3.get_xticks())
#ax3.xaxis.set_major_formatter(mpl.dates.DateFormatter('%H'))
#ax3.xaxis.set_major_locator(mpl.ticker.MultipleLocator(0.25))

fig.autofmt_xdate()  # just temporarily

plt.tight_layout()
plt.show(block=False)

Is there a way to tell pandas to keep the index in its original format and not to convert it to Periods? Any ideas how to deal with this are greatly appreciated!

I use pandas 0.13 and matplotlib 1.3.1

As a sidenote:
It would of course be great if the timezones were not all converted to UTC. However, I realize this problem may persist for a while. If anyone has a hint for a workaround, I'd be glad to hear it (I tried passing a tz directly to the DateFormatter. That works, but the Locators don't seem to like it much).

user2689410 asked Jan 17 '14 15:01



1 Answer

One way around this is not to use the pandas plot method, but to call matplotlib's plot function directly. s1.plot(ax=ax1) would then become:

ax1.plot(s1.index, s1)

If you then print ax1.get_xticks(), you get the same kind of values as with the irregular time series, because the datetime values are not converted to Periods. One disadvantage is that you lose the smarter date axis formatting of pandas (but since you want to customize it anyway, I suppose that is not a problem).
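A minimal sketch of this approach with custom tick formatting follows. It assumes a reasonably recent pandas and matplotlib rather than the 0.13 / 1.3.1 from the question. The Agg backend, the six-hour tick spacing and the variable names are choices made for this example only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

from datetime import datetime

import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Evenly spaced index, the case pandas would otherwise convert to Periods
idx1 = [datetime(2014, 1, 16, 0), datetime(2014, 1, 16, 5),
        datetime(2014, 1, 16, 10), datetime(2014, 1, 16, 15),
        datetime(2014, 1, 16, 20), datetime(2014, 1, 17, 1)]
s1 = pd.Series([0, 2, np.nan, 5, 2, 1], index=idx1)

fig, ax1 = plt.subplots(figsize=(5, 4))

# Plot through matplotlib directly, so the x values stay matplotlib dates
ax1.plot(s1.index.to_pydatetime(), s1.values)

# Date locators and formatters now apply as usual
ax1.xaxis.set_major_locator(mdates.HourLocator(byhour=range(0, 24, 6)))
ax1.xaxis.set_major_formatter(mdates.DateFormatter("%H"))

fig.canvas.draw()  # force tick labels to be generated
labels = [t.get_text() for t in ax1.get_xticklabels()]
print(labels)
```

Both HourLocator and DateFormatter accept a tz argument, which may be a place to start for the timezone side note in the question, although that is not tested here.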

As far as I know, you cannot specify this in the pandas public API (apart from ugly hacks such as deliberately making your time series irregular or adding a time zone).
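One option that may still be worth trying is the x_compat keyword, which the pandas visualization documentation describes as a way to suppress tick resolution adjustment. The sketch below assumes a pandas version in which Series.plot accepts x_compat=True; it was not checked against pandas 0.13, and the data and tick spacing are only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

from datetime import datetime, timedelta

import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Evenly spaced timestamps, five hours apart
start = datetime(2014, 1, 16)
idx = [start + timedelta(hours=5 * i) for i in range(6)]
s = pd.Series([0, 2, np.nan, 5, 2, 1], index=idx)

fig, ax = plt.subplots()

# x_compat=True asks pandas to skip its own time-series axis handling
s.plot(ax=ax, x_compat=True)

# If the assumption holds, the axis is in matplotlib date units,
# so matplotlib's date locators and formatters can be applied
ax.xaxis.set_major_locator(mdates.HourLocator(byhour=range(0, 24, 6)))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))

xmin, xmax = ax.get_xlim()
print(xmin, xmax)
```

If the printed limits bracket mdates.date2num(idx[0]) and mdates.date2num(idx[-1]), the index was not converted to Periods.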

joris answered Oct 31 '22 13:10
