
pandas to_datetime conversion with timezone-aware index

I have a dataframe with a timezone-aware index:

>>> dfn.index
Out[1]: 
DatetimeIndex(['2004-01-02 01:00:00+11:00', '2004-01-02 02:00:00+11:00',
               '2004-01-02 03:00:00+11:00', '2004-01-02 04:00:00+11:00',
               '2004-01-02 21:00:00+11:00', '2004-01-02 22:00:00+11:00'],
              dtype='datetime64[ns]', freq='H', tz='Australia/Sydney')

I save it to CSV, then read it back as follows:

>>> dfn.to_csv('temp.csv')
>>> df= pd.read_csv('temp.csv', index_col=0 ,header=None )
>>> df.head()
Out[1]: 
                                1
0                                
NaN                        0.0000
2004-01-02 01:00:00+11:00  0.7519
2004-01-02 02:00:00+11:00  0.7520
2004-01-02 03:00:00+11:00  0.7515
2004-01-02 04:00:00+11:00  0.7502

The index is read back as strings:

>>> df.index[1]
Out[3]: '2004-01-02 01:00:00+11:00'

Converting with to_datetime changes the time: the +11:00 offset is dropped and the value is shifted by 11 hours, to UTC:

>>> df.index = pd.to_datetime(df.index)
>>> df.index[1]
Out[6]: Timestamp('2004-01-01 14:00:00')

I could shift the index back by 11 hours to fix it, but is there a better way to handle this?

I tried the solution in the answer here, but it slows the code down a lot.

dayum asked Nov 09 '17 06:11




1 Answer

I think the problem is that the header has to be written and read the same way. To parse the dates, you also need the parse_dates parameter.

# the header row is written to the file
dfn.to_csv('temp.csv')
# but the header row is not read back
df = pd.read_csv('temp.csv', index_col=0, header=None)
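
The effect of that mismatch can be reproduced without touching the disk. This is only a sketch: the two-row frame is a hypothetical stand-in for dfn, and an in-memory buffer replaces temp.csv.

```python
import io

import pandas as pd

# Hypothetical two-row frame standing in for dfn.
idx = pd.to_datetime(
    ["2004-01-02 01:00:00", "2004-01-02 02:00:00"]
).tz_localize("Australia/Sydney")
dfn = pd.DataFrame({"col": [0.7519, 0.7520]}, index=idx)

# With no path argument, to_csv returns the CSV text; a header row is written by default.
csv_text = dfn.to_csv()

# Mismatch: header=None makes pandas treat the header line as data,
# so its empty index cell shows up as a NaN label in the first row.
mismatched = pd.read_csv(io.StringIO(csv_text), index_col=0, header=None)

# Match: the first line is consumed as the header, leaving only data rows.
matched = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(len(mismatched), len(matched))
print(mismatched.index[0])
```

The extra leading row with a NaN label is the same symptom that appears in the df.head() output in the question.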

Solution 1:

# write without a header row
dfn.to_csv('temp.csv', header=False)
# read without a header row, parsing the first column as dates
df = pd.read_csv('temp.csv', index_col=0, header=None, parse_dates=[0])

Solution 2:

# write the header row
dfn.to_csv('temp.csv')
# read the header row, parsing the first column as dates
df = pd.read_csv('temp.csv', index_col=0, parse_dates=[0])

Unfortunately, parse_dates converts the dates to UTC and, in the pandas version used here, returns them without timezone information, so the timezone has to be restored afterwards:

df.index = df.index.tz_localize('UTC').tz_convert('Australia/Sydney')
print (df.index)
DatetimeIndex(['2004-01-02 01:00:00+11:00', '2004-01-02 02:00:00+11:00',
               '2004-01-02 03:00:00+11:00', '2004-01-02 04:00:00+11:00',
               '2004-01-02 05:00:00+11:00', '2004-01-02 06:00:00+11:00',
               '2004-01-02 07:00:00+11:00', '2004-01-02 08:00:00+11:00',
               '2004-01-02 09:00:00+11:00', '2004-01-02 10:00:00+11:00'],
              dtype='datetime64[ns, Australia/Sydney]', name=0, freq=None)
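
The tz_localize('UTC') call above assumes the parsed index is timezone-naive; on an index that is already timezone-aware, tz_localize raises a TypeError. A minimal sketch that handles both cases follows. The helper name to_zone is hypothetical, not part of pandas, and it assumes a naive index holds UTC values.

```python
import pandas as pd


def to_zone(index, zone="Australia/Sydney"):
    """Return `index` expressed in `zone` (hypothetical helper).

    Assumption: a timezone-naive index holds UTC wall-clock values.
    """
    if index.tz is None:
        # Naive values: declare them to be UTC before converting.
        index = index.tz_localize("UTC")
    # Aware values: only the displayed zone changes, not the instants.
    return index.tz_convert(zone)


naive_utc = pd.DatetimeIndex(["2004-01-01 14:00:00", "2004-01-01 15:00:00"])
already_aware = naive_utc.tz_localize("UTC")

from_naive = to_zone(naive_utc)
from_aware = to_zone(already_aware)
print(from_naive)
print(from_aware)
```

Both calls should yield the same Sydney-local index, so the helper can be applied regardless of which form the parser returned.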

Sample data for testing:

import pandas as pd

idx = pd.date_range('2004-01-02 01:00:00', periods=10, freq='H', tz='Australia/Sydney')
dfn = pd.DataFrame({'col':range(len(idx))}, index=idx)
print(dfn)
                           col
2004-01-02 01:00:00+11:00    0
2004-01-02 02:00:00+11:00    1
2004-01-02 03:00:00+11:00    2
2004-01-02 04:00:00+11:00    3
2004-01-02 05:00:00+11:00    4
2004-01-02 06:00:00+11:00    5
2004-01-02 07:00:00+11:00    6
2004-01-02 08:00:00+11:00    7
2004-01-02 09:00:00+11:00    8
2004-01-02 10:00:00+11:00    9
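
A round trip on a frame like this can be checked end to end. The sketch below is one possible variant, not the method used above: it reads the index as plain strings and converts it with pd.to_datetime(..., utc=True) followed by tz_convert, and it uses an in-memory buffer and a shortened three-row frame as stand-ins for temp.csv and dfn.

```python
import io

import pandas as pd

zone = "Australia/Sydney"

# Shortened stand-in for the sample frame above.
idx = pd.to_datetime(
    ["2004-01-02 01:00:00", "2004-01-02 02:00:00", "2004-01-02 03:00:00"]
).tz_localize(zone)
dfn = pd.DataFrame({"col": range(len(idx))}, index=idx)

# Write with the header row and read it back the same way.
buffer = io.StringIO(dfn.to_csv())
df = pd.read_csv(buffer, index_col=0)  # the index still holds strings here

# utc=True applies each "+11:00" offset and returns a UTC-aware index;
# tz_convert then expresses the same instants in the original zone.
df.index = pd.to_datetime(df.index, utc=True).tz_convert(zone)

round_trip_ok = bool((df.index == dfn.index).all())
print(df.index[0], round_trip_ok)
```

Because the offsets are parsed rather than discarded, no manual hour arithmetic is needed to recover the original local times.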
jezrael answered Oct 16 '22 18:10