How to plot kernel density plot of dates in Pandas?

I have a pandas DataFrame in which each observation has a date (stored in a datetime64 column). The dates are spread over a period of about 5 years. I would like to draw a kernel density plot of the dates of all the observations, with the years labelled on the x-axis.

I have figured out how to create a timedelta relative to some reference date and then plot the density of the number of hours/days/years between each observation and the reference date:

df['relativeDate'].astype('timedelta64[D]').plot(kind='kde')

But this isn't exactly what I want. If I convert to year-deltas, the x-axis is right but I lose the within-year variation; if I use a smaller unit of time such as hours or days, the x-axis labels are much harder to interpret.
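As a point of reference, the relative-date approach described above might look like the sketch below. The column name `date`, the simulated data, and the choice of the earliest date as the reference are all assumptions made for illustration. It uses `.dt.days` rather than the `astype('timedelta64[D]')` cast, which newer pandas versions may reject.

```python
import numpy as np
import pandas as pd

# hypothetical data: 200 observation dates spread over about 5 years
rng = np.random.default_rng(0)
start = pd.Timestamp('2010-01-01')
offsets = rng.integers(0, 5 * 365, size=200)
df = pd.DataFrame({'date': start + pd.to_timedelta(offsets, unit='D')})

# whole days elapsed since the reference date (here: the earliest observation)
reference = df['date'].min()
df['relativeDays'] = (df['date'] - reference).dt.days

# density over "days since the reference date"; needs scipy installed
# df['relativeDays'].plot(kind='kde')
```

The x-axis of such a plot is in days since the reference date, which is exactly the labelling problem described here.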

What's the simplest way to make this work in Pandas?


asked Jul 10 '15 19:07

bhackinen

1 Answer

Inspired by @JohnE's answer, an alternative way to convert the dates to numeric values is to use .toordinal().

import datetime  # needed below for datetime.datetime.fromordinal
import pandas as pd
import numpy as np

# simulate some artificial data
# ===============================
np.random.seed(0)
dates = pd.date_range('2010-01-01', periods=31, freq='D')
df = pd.DataFrame(np.random.choice(dates,100), columns=['dates'])
# use toordinal() to get datenum
df['ordinal'] = [x.toordinal() for x in df.dates]

print(df)

        dates  ordinal
0  2010-01-13   733785
1  2010-01-16   733788
2  2010-01-22   733794
3  2010-01-01   733773
4  2010-01-04   733776
5  2010-01-28   733800
6  2010-01-04   733776
7  2010-01-08   733780
8  2010-01-10   733782
9  2010-01-20   733792
..        ...      ...
90 2010-01-19   733791
91 2010-01-28   733800
92 2010-01-01   733773
93 2010-01-15   733787
94 2010-01-04   733776
95 2010-01-22   733794
96 2010-01-13   733785
97 2010-01-26   733798
98 2010-01-11   733783
99 2010-01-21   733793

[100 rows x 2 columns]    

# plot non-parametric kde on numeric datenum
ax = df['ordinal'].plot(kind='kde')
# rename the xticks with labels
x_ticks = ax.get_xticks()
ax.set_xticks(x_ticks[::2])
xlabels = [datetime.datetime.fromordinal(int(x)).strftime('%Y-%m-%d') for x in x_ticks[::2]]
ax.set_xticklabels(xlabels)

[Figure: kernel density plot of the ordinal dates, with x-axis ticks labelled as dates]
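To get one tick per year, as the question asks, a variation on the same idea is to convert the dates with matplotlib's `date2num` and let a date locator and formatter label the axis. This is only a sketch under assumptions: the simulated five-year data and the `datenum` column name are made up for illustration, and the KDE itself is still pandas' default (which needs scipy).

```python
import numpy as np
import pandas as pd
import matplotlib.dates as mdates

# hypothetical data: 300 observation dates spread over about 5 years
rng = np.random.default_rng(0)
start = pd.Timestamp('2010-01-01')
offsets = rng.integers(0, 5 * 365, size=300)
df = pd.DataFrame({'dates': start + pd.to_timedelta(offsets, unit='D')})

# convert each date to matplotlib's float date number (measured in days)
df['datenum'] = mdates.date2num(df['dates'].to_numpy())

# KDE on the numeric values, same idea as the ordinal approach above
ax = df['datenum'].plot(kind='kde')

# place one major tick per year and print it as a four-digit year
ax.xaxis.set_major_locator(mdates.YearLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y'))
ax.set_xlabel('date')
```

Because the values are in days, the within-year variation is kept while the labels show years. In a script, call `matplotlib.pyplot.show()` afterwards to display the figure.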


answered Sep 20 '22 15:09

Jianxun Li

Tags: python, pandas, matplotlib, time-series, kernel-density