Difference a pandas.DatetimeIndex without a frequency

Tags: python, pandas, time-series, data-science

An irregular time series, data, is stored in a pandas.DataFrame. A DatetimeIndex has been set. I need the time difference between consecutive entries in the index.

I thought it would be as simple as

data.index.diff()

but got

AttributeError: 'DatetimeIndex' object has no attribute 'diff'

I tried

data.index - data.index.shift(1)

but got

ValueError: Cannot shift with no freq

I do not want to infer or enforce a frequency before doing this operation. There are large gaps in the time series that would be expanded into long runs of NaN. The point is to find these gaps first.

So, what is a clean way to do this seemingly simple operation?


asked Mar 14 '18 12:03

clstaudt

1 Answer

Index objects do not implement a diff method yet.

However, you can convert the index to a Series first and call diff on that. Use Index.to_series if the result should keep the original DatetimeIndex as its index. Pass the index to the Series constructor, without an index parameter, if a default integer index is enough.

Code example:

import pandas as pd

rng = pd.to_datetime(['2015-01-10', '2015-01-12', '2015-01-13'])
data = pd.DataFrame({'a': range(3)}, index=rng)
print(data)
#             a
# 2015-01-10  0
# 2015-01-12  1
# 2015-01-13  2

# Keep the original DatetimeIndex as the index of the result
a = data.index.to_series().diff()
print(a)
# 2015-01-10      NaT
# 2015-01-12   2 days
# 2015-01-13   1 days
# dtype: timedelta64[ns]

# Use a default integer index instead
a = pd.Series(data.index).diff()
print(a)
# 0      NaT
# 1   2 days
# 2   1 days
# dtype: timedelta64[ns]
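
Since the stated goal is to find the gaps, here is a minimal sketch of how the differences might be used for that. The sample timestamps and the one-hour threshold are assumptions made up for illustration; they are not from the question. The last lines show a possible alternative that stays on the index level by subtracting two offset slices.

```python
import pandas as pd

# Hypothetical irregular timestamps with one large gap (illustrative data only).
rng = pd.to_datetime([
    '2015-01-10 00:00', '2015-01-10 00:05', '2015-01-10 00:10',
    '2015-01-13 00:00', '2015-01-13 00:05',
])
data = pd.DataFrame({'a': range(5)}, index=rng)

# Time elapsed since the previous row; the first entry is NaT.
deltas = data.index.to_series().diff()

# Assumed threshold: anything longer than one hour counts as a gap.
threshold = pd.Timedelta(hours=1)
is_gap = deltas > threshold  # the leading NaT compares as False

# Each remaining entry is labelled with the timestamp at which a gap ends,
# and its value is the length of that gap.
gaps = deltas[is_gap]
print(gaps)

# Alternative without a Series: subtract two offset slices of the index.
# The result is a TimedeltaIndex with one element fewer than the index.
raw = data.index[1:] - data.index[:-1]
print(raw)
```

The Series variant keeps each difference aligned with the row where the gap ends, which is convenient for filtering the DataFrame. The slice variant avoids the leading NaT but loses that alignment.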


answered Oct 04 '22 07:10

jezrael
