
Compare multiple year data on a single plot python

I have two time series from different years stored in pandas dataframes. For example:

import pandas as pd

# 'M' is the month-end frequency alias (spelled 'ME' in pandas 2.2 and later)
data15 = pd.DataFrame(
    [1,2,3,4,5,6,7,8,9,10,11,12],
    index=pd.date_range(start='2015-01',end='2016-01',freq='M'),
    columns=['2015']
)
data16 = pd.DataFrame(
    [5,4,3,2,1],
    index=pd.date_range(start='2016-01',end='2016-06',freq='M'),
    columns=['2016']
)

I'm actually working with daily data but if this question is answered sufficiently I can figure out the rest.

What I'm trying to do is overlay the plots of these different data sets onto a single plot from January through December to compare the differences between the years. I can do this by creating a "false" index for one of the datasets so they have a common year:

data16.index = data15.index[:len(data16)]
ax = data15.plot()
data16.plot(ax=ax)

But I would like to avoid messing with the index if possible. Another problem with this method is that the year (2015) appears in the x-axis tick labels, which I don't want. Does anyone know of a better way to do this?

Taylor asked Jun 02 '16 15:06



2 Answers

One way to do this would be to overlay a second, transparent axes on top of the first and plot the second dataframe in it, but then you would need to keep the x-limits of both axes in sync (similar to twinx). I think that is far more work and has more downsides: for example, you can no longer zoom interactively into a specific region unless both axes are linked via their x-limits. Really, the easiest approach is to account for the one-year offset by "messing with the index".
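If the concern is mostly about modifying the original dataframe, the offset can be applied to a copy instead. The following is only a sketch under a few assumptions: `shifted_back` and `data16_on_2015` are hypothetical names, the sample data is rebuilt with an explicit `MonthEnd` offset object rather than a frequency alias, and the years are exactly one year apart.

```python
import pandas as pd

# Rebuild the question's sample data. An offset object is passed as the
# frequency so the sketch does not depend on the 'M' / 'ME' alias.
month_end = pd.offsets.MonthEnd()
data15 = pd.DataFrame(
    {'2015': range(1, 13)},
    index=pd.date_range('2015-01-31', periods=12, freq=month_end),
)
data16 = pd.DataFrame(
    {'2016': [5, 4, 3, 2, 1]},
    index=pd.date_range('2016-01-31', periods=5, freq=month_end),
)

def shifted_back(df, years=1):
    """Return a copy of df with its dates moved back by `years` (hypothetical helper)."""
    out = df.copy()
    out.index = df.index - pd.DateOffset(years=years)
    return out

# data16 itself keeps its 2016 dates; only the copy is moved to 2015.
data16_on_2015 = shifted_back(data16)

# Plotting then works as in the question:
# ax = data15.plot()
# data16_on_2015.plot(ax=ax)
```

One caveat for daily data: Feb 29 of a leap year is mapped onto Feb 28 of the previous year, so that date would appear twice in the shifted index.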

As for the tick labels, you can easily change the format so that they don't show the year by specifying the x-tick format:

import matplotlib.dates as mdates
# '%b %d': abbreviated month name plus day of the month, e.g. "Jan 31"
month_day_fmt = mdates.DateFormatter('%b %d')
# ax is the axes returned by data15.plot() in the question
ax.xaxis.set_major_formatter(month_day_fmt)

Have a look at the matplotlib API example for specifying the date format.
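For reference, here is a self-contained sketch of the formatter on a monthly axis. It makes some assumptions that are not in the original answer: the data is plotted through matplotlib's `ax.plot` rather than the dataframe's `.plot()`, a `MonthLocator` is added to get one tick per month, the format is `'%b'` (month only), and the `Agg` backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use('Agg')  # headless backend for this sketch; omit for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# The question's 2015 sample data, with an explicit month-end offset.
data15 = pd.DataFrame(
    {'2015': range(1, 13)},
    index=pd.date_range('2015-01-31', periods=12, freq=pd.offsets.MonthEnd()),
)

fig, ax = plt.subplots()
# Plot through matplotlib so the x-axis holds matplotlib date numbers,
# which is what the mdates locators and formatters operate on.
ax.plot(data15.index, data15['2015'], label='2015')

ax.xaxis.set_major_locator(mdates.MonthLocator())  # one major tick per month
month_fmt = mdates.DateFormatter('%b')             # month abbreviation, no year
ax.xaxis.set_major_formatter(month_fmt)
ax.legend()

# Render once so the tick labels are populated, then collect them.
fig.canvas.draw()
tick_labels = [label.get_text() for label in ax.get_xticklabels()]
```

With this setup the tick labels are month abbreviations only, so the "false" year used for alignment no longer shows on the axis.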

Oliver W. answered Sep 22 '22 12:09



I see two options.

Option 1: add a month column to your dataframes

data15['month'] = data15.index.strftime('%b')
data16['month'] = data16.index.strftime('%b')

ax = data16.plot(x='month', y='2016')
ax = data15.plot(x='month', y='2015', ax=ax)

Option 2: if you don't want to do that, you can use matplotlib directly

import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot(data15['2015'].values)
ax.plot(data16['2016'].values)
plt.xticks(range(len(data15)), data15.index.to_series().dt.strftime('%b'), size='small')

Needless to say, I would recommend the first option.
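For the daily data mentioned in the question, a month-name column no longer identifies each point uniquely. One possible variation on the first option, sketched here with made-up data, is to key each row by day of year and pivot so that every year becomes its own column. The names `daily` and `by_year` and the sample values are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical daily series spanning one and a half years (made-up values).
days = pd.date_range('2015-01-01', '2016-06-30', freq='D')
daily = pd.Series(np.arange(len(days), dtype=float), index=days, name='value')

# Rows: day of year; columns: calendar year.
by_year = (
    daily.to_frame()
    .assign(year=daily.index.year, doy=daily.index.dayofyear)
    .pivot(index='doy', columns='year', values='value')
)

# by_year.plot() would then draw one line per year on a shared day-of-year axis.
```

Note that in a leap year every date after Feb 29 has a day-of-year value one higher than the same calendar date in a non-leap year, so the two lines are offset by one day from March onward.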

IanS answered Sep 19 '22 12:09
