
Pandas DataFrame datetime index doesn't survive JSON conversion and reconversion

I have the following snippet of Python code:

import pandas as pd

# print normal index
print(data.index)

# convert from df to JSON and back
data_json = data.to_json()
df = pd.read_json(data_json)
df.index = pd.to_datetime(df.index)
print(df.index)

For some reason, running this prints:

<class 'pandas.tseries.index.DatetimeIndex'>
[1950-01-03 00:00:00, ..., 2014-08-21 00:00:00]
Length: 16264, Freq: None, Timezone: None
<class 'pandas.tseries.index.DatetimeIndex'>
[1966-10-31 00:00:00, ..., 2001-09-07 00:00:00]
Length: 16264, Freq: None, Timezone: None

Can someone explain to me what is going on and how I can have the index persist through the transformations?

asked Aug 22 '14 20:08 by L1meta


1 Answer

The problem here is that to_json saves dates as epoch values with millisecond resolution by default, while to_datetime interprets integer values as nanoseconds by default. To fix this, either of the following (but not both!) would work.

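# tell to_datetime that the values are in milliseconds
df.index = pd.to_datetime(df.index, unit='ms')
# OR
# write the dates with nanosecond resolution in the first place
data_json = data.to_json(date_unit='ns')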
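
To see the unit mismatch in isolation, here is a minimal sketch using a single epoch value instead of a whole index. The variable names and the sample value (midnight UTC on 2014-08-21, expressed in milliseconds) are chosen for illustration and are not from the question's data.

```python
import pandas as pd

# Epoch value for 2014-08-21 00:00:00 UTC in milliseconds,
# the unit to_json writes by default (illustrative value).
epoch_ms = 1408579200000

# With no unit given, to_datetime treats the integer as nanoseconds,
# so the result lands shortly after 1970-01-01.
misread = pd.to_datetime(epoch_ms)

# Passing the real unit recovers the intended date.
correct = pd.to_datetime(epoch_ms, unit="ms")

print(misread)
print(correct)
```

The same reasoning applies to a whole index: the values are fine, they are just read with the wrong unit.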

As noted in the comments, you can also just save the JSON with the dates in ISO format (data.to_json(date_format='iso')).
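
A rough sketch of the ISO round trip, assuming a small stand-in frame (the column name `close` and the three dates are made up for the example). Reading with convert_axes=False and parsing the labels explicitly is just one way to do it; it keeps the date handling visible rather than relying on read_json's automatic axis conversion.

```python
import io

import pandas as pd

# Hypothetical stand-in for the question's `data` frame.
data = pd.DataFrame(
    {"close": [1.0, 2.0, 3.0]},
    index=pd.to_datetime(["1950-01-03", "1980-06-15", "2014-08-21"]),
)

# Write the dates as ISO 8601 strings rather than epoch integers,
# so no unit has to be guessed when reading them back.
data_json = data.to_json(date_format="iso")

# Keep the index labels as plain strings, then parse them explicitly.
restored = pd.read_json(io.StringIO(data_json), convert_axes=False)

# utc=True followed by tz_localize(None) gives a timezone-naive index
# whether or not the stored strings carry a UTC suffix.
restored.index = pd.to_datetime(restored.index, utc=True).tz_localize(None)

print(restored.index)
```

The JSON is larger than with epoch values, but the dates are self-describing.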

answered Sep 29 '22 20:09 by chrisb