
How to round a Pandas `DatetimeIndex`?

I have a pandas.DatetimeIndex, e.g.:

pd.date_range('2012-1-1 02:03:04.000',periods=3,freq='1ms')
>>> [2012-01-01 02:03:04, ..., 2012-01-01 02:03:04.002000]

I would like to round the dates (Timestamps) to the nearest second. How do I do that? The expected result is similar to:

[2012-01-01 02:03:04.000000, ..., 2012-01-01 02:03:04.000000]

Can this be done by rounding a NumPy datetime64[ns] value to whole seconds while keeping the [ns] dtype?

np.array(['2012-01-02 00:00:00.001'],dtype='datetime64[ns]')
Yariv asked Dec 09 '12 08:12



2 Answers

Update: if you're doing this to a DatetimeIndex or a datetime64 column, a better way is to use np.round directly rather than going through apply/map:

np.round(dtindex_or_datetime_col.astype(np.int64), -9).astype('datetime64[ns]')
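As a minimal sketch of the np.round approach on a small index: the sample values and the names idx, ns and rounded are illustrative, and the explicit cast to datetime64[ns] is an assumption added so that the integers are nanoseconds however the index happens to be stored.

```python
import numpy as np
import pandas as pd

# Illustrative sample data (not from the original post).
idx = pd.DatetimeIndex([
    "2012-01-01 02:03:04.000",
    "2012-01-01 02:03:04.002",
    "2012-01-01 02:03:04.501",
])

# View the timestamps as int64 nanoseconds since the Unix epoch.
ns = idx.values.astype("datetime64[ns]").astype(np.int64)

# decimals=-9 rounds to the nearest 10**9 ns, i.e. the nearest second.
rounded = np.round(ns, -9).astype("datetime64[ns]")

# Wrap the NumPy array back into an index if one is needed.
rounded_idx = pd.DatetimeIndex(rounded)
print(rounded_idx)
```

The result keeps the datetime64[ns] dtype, which is what the question asked for.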

Old answer (with some more explanation):

Whilst @Matti's answer is clearly the correct way to deal with your situation, I thought I would add an answer showing how you might round a single Timestamp to the nearest second:

from pandas import Timestamp

t1 = Timestamp('2012-1-1 00:00:00')
t2 = Timestamp('2012-1-1 00:00:00.000333')

In [4]: t1
Out[4]: Timestamp('2012-01-01 00:00:00')

In [5]: t2
Out[5]: Timestamp('2012-01-01 00:00:00.000333')

In [6]: t2.microsecond
Out[6]: 333

In [7]: t1.value  # nanoseconds since the Unix epoch
Out[7]: 1325376000000000000

In [8]: t2.value
Out[8]: 1325376000000333000

# Rounding to -9 decimal places clears the milli-, micro- and nanoseconds.
# t2.value - t2.value % 1000000000 would truncate (round down) instead.
In [9]: round(t2.value, -9)
Out[9]: 1325376000000000000

In [10]: Timestamp(round(t2.value, -9))
Out[10]: Timestamp('2012-01-01 00:00:00')

Hence you can apply this to the entire index:

def to_the_second(ts):
    # ts.value is an integer count of nanoseconds
    return Timestamp(round(ts.value, -9))

dtindex.map(to_the_second)
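The two integer expressions used above are not interchangeable: subtracting the remainder always rounds down, whereas rounding picks the nearest second. A rough sketch in plain Python, assuming the values are nanoseconds since the epoch; the helper names floor_to_second and round_to_second and the second sample value are hypothetical:

```python
NS_PER_SECOND = 1_000_000_000


def floor_to_second(ns):
    """Drop the sub-second part (always rounds down)."""
    return ns - ns % NS_PER_SECOND


def round_to_second(ns):
    """Round to the nearest second; exact halves round up."""
    return (ns + NS_PER_SECOND // 2) // NS_PER_SECOND * NS_PER_SECOND


early = 1325376000000333000  # 2012-01-01 00:00:00.000333
late = 1325376000700000000   # 2012-01-01 00:00:00.7 (made-up value)

print(floor_to_second(early), round_to_second(early))
print(floor_to_second(late), round_to_second(late))
```

For the early value both helpers agree; for the late value only round_to_second moves forward to the next second. Note that this helper rounds exact halves up, while Python 3's built-in round() on integers rounds halves to the nearest even multiple.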
Andy Hayden answered Sep 24 '22 07:09


The round() method was added for DatetimeIndex, Timestamp, TimedeltaIndex and Timedelta in pandas 0.18.0, so this can now be done directly:

In[114]: index = pd.DatetimeIndex([pd.Timestamp('2012-01-01 02:03:04.000'), pd.Timestamp('2012-01-01 02:03:04.002'), pd.Timestamp('20130712 02:03:04.500'), pd.Timestamp('2012-01-01 02:03:04.501')])

In[115]: index.values
Out[115]: 
array(['2012-01-01T02:03:04.000000000', '2012-01-01T02:03:04.002000000',
       '2013-07-12T02:03:04.500000000', '2012-01-01T02:03:04.501000000'], dtype='datetime64[ns]')

In[116]: index.round('S')
Out[116]: 
DatetimeIndex(['2012-01-01 02:03:04', '2012-01-01 02:03:04',
               '2013-07-12 02:03:04', '2012-01-01 02:03:05'],
              dtype='datetime64[ns]', freq=None)

round() accepts a frequency parameter. The valid string aliases are listed under "Offset aliases" in the pandas time series documentation.
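A short sketch of how the frequency argument changes the result, alongside the related floor() and ceil() methods. The sample timestamps and variable names are illustrative, and the lowercase "s" alias is an assumption about the installed version: newer pandas releases prefer it over "S".

```python
import pandas as pd

# Illustrative sample data (not from the original post).
index = pd.DatetimeIndex([
    "2012-01-01 02:03:04.002",
    "2012-01-01 02:03:04.501",
    "2012-01-01 02:03:59.900",
])

nearest_second = index.round("s")    # nearest whole second
second_floor = index.floor("s")      # always rounds down
second_ceil = index.ceil("s")        # always rounds up
nearest_minute = index.round("min")  # nearest whole minute

for name, result in [
    ("round('s')", nearest_second),
    ("floor('s')", second_floor),
    ("ceil('s')", second_ceil),
    ("round('min')", nearest_minute),
]:
    print(name, [str(ts) for ts in result])
```

floor() and ceil() are useful when timestamps should be bucketed in one direction only, for example when assigning events to the second in which they started.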

wombatonfire answered Sep 20 '22 07:09