Plot pandas data frame with year over year data

Tags:

pandas

I have a data frame in the format

              value
2000-01-01    1
2000-03-01    2
2000-06-01    15
2000-09-01    3
2000-12-01    7
2001-01-01    1
2001-03-01    3
2001-06-01    8
2001-09-01    5
2001-12-01    3
2002-01-01    1
2002-03-01    1
2002-06-01    8
2002-09-01    5
2002-12-01    19

(index is datetime) and I need to plot all results year over year to compare the results each 3 months (The data can be monthly, too), plus the average of all years.

I can easily plot they separately, but because of the index, it will shift the plots according with the index:

fig, axes = plt.subplots()
df['2000'].plot(ax=axes, label='2000')
df['2001'].plot(ax=axes, label='2001')
df['2002'].plot(ax=axes, label='2002')
axes.plot(df["2000":'2002'].groupby(df["2000":'2002'].index.month).mean())

So it's not the desired result. I've seem some answers here, but you have to concat, create a multiindex and plot. If one of the data frames has NaNs or missing values, it can be very cumbersome. Is there a pandas way to do it?

410

asked May 21 '15 16:05

Ivan

2 Answers

Is this what you want? You can add means after transformation.

df = pd.DataFrame({'value': [1, 2, 15, 3, 7, 1, 3, 8, 5, 3, 1, 1, 8, 5, 19]},
              index=pd.DatetimeIndex(['2000-01-01', '2000-03-01', '2000-06-01', '2000-09-01',
                                      '2000-12-01', '2001-01-01', '2001-03-01', '2001-06-01',
                                      '2001-09-01', '2001-12-01', '2002-01-01', '2002-03-01',
                                      '2002-06-01', '2002-09-01', '2002-12-01']))


pv = pd.pivot_table(df, index=df.index.month, columns=df.index.year,
                    values='value', aggfunc='sum')
pv
#     2000  2001  2002
# 1      1     1     1
# 3      2     3     1
# 6     15     8     8
# 9      3     5     5
# 12     7     3    19

pv.plot()

enter image description here

127

answered Oct 12 '22 18:10

sinhrks

One possibility is to use the 'day of the year' as x-axis. Using the x kwarg to override the index of the dataframe as x-axis:

fig, axes = plt.subplots()
df['2000'].plot(ax=axes, label='2000', x=df['2000'].index.dayofyear)
df['2001'].plot(ax=axes, label='2001', x=df['2001'].index.dayofyear)

Alternatively, you can also add this as a column, and then refer to the column name.

If it are monthly data, then you an of course use the month attribute of the index as well.

The disadvantage of the above approach is that you don't have the nice datetime formatting of the x-axis.

answered Oct 12 '22 18:10

joris

Related questions
                            
                                Pytest init setup for few modules
                            
                                Reverse diagonal on numpy python
                            
                                Concise Ruby hash equivalent of Python dict.get()
                            
                                Python local vs global variables
                            
                                Get a header with Python and convert in JSON (requests - urllib2 - json)
                            
                                How to create a modal window in pyqt?
                            
                                How can I check whether a URL is valid using `urlparse`?
                            
                                Sorting XML in python etree
                            
                                Rotate a 2D image around specified origin in Python
                            
                                Python Multiprocessing: Only one process is running
                            
                                What's the Pythonic way to report nonfatal errors in a parser?
                            
                                count occurrences of number by column in pandas data frame
                            
                                mat is not a numerical tuple : openCV error
                            
                                Masking user input in python with asterisks
                            
                                get_bucket() gives 'Bad Request' for S3 buckets I didn't create via Boto
                            
                                Adding colors to a 3d quiver plot in matplotlib
                            
                                Traceback when updating status on twitter via Tweepy
                            
                                Pandas selecting discontinuous columns from a dataframe
                            
                                Getting all instances of child node using xml.etree.ElementTree
                            
                                How are the "error bands" in Seaborn tsplot calculated?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With