 

Pandas Timestamp rounds 30 seconds inconsistently

I'm trying to round a pandas DatetimeIndex (or Timestamp) to the nearest minute, but I'm having a problem with Timestamps of 30 seconds - some rounding up, some rounding down (this seems to alternate).

Any suggestions to fix this so that 30s always rounds up?

>>> pd.Timestamp(2019,6,1,6,57,30).round('1T')
Timestamp('2019-06-01 06:58:00')

>>> pd.Timestamp(2019,6,1,6,58,30).round('1T')
Timestamp('2019-06-01 06:58:00')

The top result looks fine, with 57m 30s rounding up to 58m, but I'd expect the bottom result to round up to 59m - not down to 58m.

Peter asked Jun 17 '19 14:06



2 Answers

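If you always want to round up, use ceil instead of round. Note that ceil rounds any non-zero seconds up to the next minute, not only 30s and above.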

pd.Timestamp(2019,6,1,6,57,30).ceil('1T')
Out[344]: Timestamp('2019-06-01 06:58:00')
pd.Timestamp(2019,6,1,6,58,30).ceil('1T')
Out[345]: Timestamp('2019-06-01 06:59:00')
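
To show where ceil and round differ, here is a minimal sketch using a timestamp that is not in the question (06:57:10, chosen only for illustration). It uses the 'min' alias, which means the same as '1T'.

```python
import pandas as pd

# Hypothetical sample value: 10 seconds past the minute
ts = pd.Timestamp(2019, 6, 1, 6, 57, 10)

nearest = ts.round('min')  # nearest minute
ceiled = ts.ceil('min')    # always the next minute unless already on one

print(nearest)  # 2019-06-01 06:57:00
print(ceiled)   # 2019-06-01 06:58:00
```

So ceil only matches "round half up" when the seconds are 30 or more.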

Update: this is a rounding-mode issue. You can get half-up rounding with Decimal and ROUND_HALF_UP:

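import pandas as pd
from decimal import Decimal, ROUND_HALF_UP

# .value is nanoseconds since the epoch: convert to minutes, round half up, convert back to ns
s = Decimal((pd.Timestamp(2019,6,1,6,58,30).value // 60) / 1e9).quantize(Decimal('1'), rounding=ROUND_HALF_UP)
pd.to_datetime(int(s) * 60 * 10**9)
Out[28]: Timestamp('2019-06-01 06:59:00')
s = Decimal((pd.Timestamp(2019,6,1,6,57,30).value // 60) / 1e9).quantize(Decimal('1'), rounding=ROUND_HALF_UP)
pd.to_datetime(int(s) * 60 * 10**9)
Out[30]: Timestamp('2019-06-01 06:58:00')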
BENY answered Nov 15 '22 08:11



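The rounding is consistent: pandas rounds half to even (banker's rounding), i.e. "when halfway between two integers the even integer is chosen." 06:57:30 is halfway between minutes 57 and 58 and goes to the even one, 58; 06:58:30 is halfway between 58 and 59 and also goes to 58. You want half-up rounding, which you will need to implement yourself.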
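
To make the half-to-even rule concrete, here is a small sketch (an illustration, not part of the helpers in this answer) that converts the two timestamps from the question to fractional minutes since the epoch and compares Python's built-in round, which also rounds half to even, with Timestamp.round. The names stamps and rows are made up for the example.

```python
import pandas as pd

stamps = pd.to_datetime(['2019-06-01 06:57:30', '2019-06-01 06:58:30'])

rows = []
for ts in stamps:
    # .value is nanoseconds since the epoch; divide to get fractional minutes
    minutes = ts.value / (60 * 10**9)
    # Built-in round() picks the even neighbour on an exact .5 tie
    rows.append((minutes, round(minutes), ts.round('min')))
    print(rows[-1])
```

Both timestamps sit exactly on a .5 tie and both land on the same even minute, 06:58. The half-up helpers follow.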

import numpy as np
import pandas as pd

def half_up_minute(x):
    m = (x - x.dt.floor('1T')).dt.total_seconds() < 30   # Round True Down, False Up
    return x.where(m).dt.floor('1T').fillna(x.dt.ceil('1T'))

# For indices:
def half_up_minute_idx(idx):
    m = (idx - idx.floor('1T')).total_seconds() < 30   # Round True Down, False Up
    return pd.Index(np.select([m], [idx.floor('1T')], default=idx.ceil('1T')))

# Sample Data
df = pd.DataFrame({'date': pd.date_range('2019-01-01', freq='15S', periods=10)})
df['rounded'] = half_up_minute(df.date)

Output:

                 date             rounded
0 2019-01-01 00:00:00 2019-01-01 00:00:00
1 2019-01-01 00:00:15 2019-01-01 00:00:00
2 2019-01-01 00:00:30 2019-01-01 00:01:00
3 2019-01-01 00:00:45 2019-01-01 00:01:00
4 2019-01-01 00:01:00 2019-01-01 00:01:00
5 2019-01-01 00:01:15 2019-01-01 00:01:00
6 2019-01-01 00:01:30 2019-01-01 00:02:00
7 2019-01-01 00:01:45 2019-01-01 00:02:00
8 2019-01-01 00:02:00 2019-01-01 00:02:00
9 2019-01-01 00:02:15 2019-01-01 00:02:00
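
A shorter alternative, offered as a sketch rather than as the method above: shift everything forward by 30 seconds and then floor to the minute, which sends exact 30-second ties up. The function name round_half_up_minute and the constant HALF_MINUTE are hypothetical, and the 'min' alias is used in place of '1T'.

```python
import pandas as pd

HALF_MINUTE = pd.Timedelta(seconds=30)

def round_half_up_minute(x):
    """Round a Timestamp, DatetimeIndex, or datetime Series to the minute, ties up."""
    shifted = x + HALF_MINUTE
    # A Series exposes datetime methods through the .dt accessor
    if isinstance(shifted, pd.Series):
        return shifted.dt.floor('min')
    return shifted.floor('min')

print(round_half_up_minute(pd.Timestamp(2019, 6, 1, 6, 57, 30)))  # 2019-06-01 06:58:00
print(round_half_up_minute(pd.Timestamp(2019, 6, 1, 6, 58, 30)))  # 2019-06-01 06:59:00

# Same sample data as above, built with an explicit 15-second Timedelta step
dates = pd.Series(pd.date_range('2019-01-01', freq=pd.Timedelta(seconds=15), periods=10))
rounded = round_half_up_minute(dates)
```

On the sample data this should give the same 'rounded' column as half_up_minute.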
ALollz answered Nov 15 '22 08:11
