How do I keep the timezone of my index when serializing/deserializing a Pandas DataFrame using JSON?

Tags:

python

pandas

I need to serialize a Pandas DataFrame to JSON using the to_json method. Here is an example of how I am doing that:

import pandas
import numpy as np
dr = pandas.date_range('2016-01-01T12:30:00Z', '2016-02-01T12:30:00Z')
data = np.random.rand(len(dr), 2)
df = pandas.DataFrame(data, index=dr, columns=['a', 'b'])

# NOTE: The index for df has the following properties in pandas 0.19.2
#       dtype='datetime64[ns, UTC]', freq='D'

# Save to JSON
df.to_json('/tmp/test_data_01.json', date_unit='s', date_format='iso')

Using the code above I see that my DataFrame has been saved to disk and that the indices look like: [2016-01-01T12:30:00Z, 2016-01-02T12:30:00Z, ...] in the file /tmp/test_data_01.json.

The problem is that when I do the following:

df2 = pandas.read_json('/tmp/test_data_01.json')

the index for df2 has no timezone.

df2.index.tz
# Returns None

Is there any way to keep the timezone of a DataFrame's index when it is serialized to JSON and deserialized back?


asked Jan 05 '17 19:01

3 Answers

Pandas will convert everything to UTC when using to_json.

See this example, where I convert the index to Europe/Paris (UTC+1 in January):

In [1]:
import numpy as np
import pandas as pd

dr = pd.date_range('2016-01-01T12:30:00Z', '2016-02-01T12:30:00Z')
dr = dr.tz_convert('Europe/Paris')
data = np.random.rand(len(dr), 2)
df = pd.DataFrame(data, index=dr, columns=['a', 'b'])

In [2]: df.index[0]
Out[2]: Timestamp('2016-01-01 13:30:00+0100', tz='Europe/Paris', freq='D')

In [3]: df.to_json('test_data_01.json', date_unit='s', date_format='iso')

If I open test_data_01.json, the first index entry is "2016-01-01T12:30:00Z".

So when you load the JSON, localize the index to UTC. There is no way to know which timezone was used beforehand, though:

In [4]:
df2 = pd.read_json('test_data_01.json')
df2.index = df2.index.tz_localize('UTC')
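Since the JSON itself only carries UTC timestamps, one way around the "no way to know" problem is to store the timezone name next to the data. The following is a minimal sketch, not a pandas feature: the helper names to_json_with_tz and from_json_with_tz and the "tz"/"frame" envelope keys are made up for illustration, and it assumes the index uses a named timezone such as 'Europe/Paris' whose str() can be passed back to tz_convert. It parses the index with pd.to_datetime(..., utc=True) instead of tz_localize, so it does not depend on whether the parsed index comes back naive or tz-aware.

```python
import json

import numpy as np
import pandas as pd


def to_json_with_tz(df):
    """Return a JSON string holding the frame plus the name of its index timezone.

    The "tz" and "frame" keys are a hypothetical envelope, not a pandas format.
    """
    tz = df.index.tz
    return json.dumps({
        "tz": None if tz is None else str(tz),
        # to_json writes the index as ISO strings normalized to UTC
        "frame": json.loads(df.to_json(date_format="iso", date_unit="s")),
    })


def from_json_with_tz(text):
    """Rebuild the frame and convert its index back to the stored timezone."""
    envelope = json.loads(text)
    df = pd.DataFrame(envelope["frame"])
    # utc=True turns the ISO strings into a UTC-aware DatetimeIndex
    index = pd.to_datetime(df.index, utc=True)
    if envelope["tz"] is None:
        index = index.tz_localize(None)  # the original index was naive
    else:
        index = index.tz_convert(envelope["tz"])
    df.index = index
    return df


# Round trip entirely in memory, mirroring the example above
dr = pd.date_range("2016-01-01T12:30:00Z", "2016-01-05T12:30:00Z")
dr = dr.tz_convert("Europe/Paris")
df = pd.DataFrame(np.random.rand(len(dr), 2), index=dr, columns=["a", "b"])
df2 = from_json_with_tz(to_json_with_tz(df))
print(df2.index.tz)
```

Note that date_unit='s' drops sub-second precision, and the index frequency (freq='D') is not restored by this sketch.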


answered Oct 21 '22 09:10

I don't agree with @julien-marrec's solution, because it forces the timezone to UTC, whereas the data passed to read_json could be in any other timezone. I implemented the following workaround, which parses the dates while taking the timezone into account.

import pandas as pd
import pandas._libs.json as json  # private pandas module; its arguments may differ or be missing in other versions

loads = json.loads
# As used below, result[0] holds the values and result[1] the labels (the date strings)
result = loads('{"2019-01-01T13:00:00.000Z":15,"2019-01-01T11:00:00.000Z":88.352985054,"2019-01-01T12:00:00.000Z":90.091719896}',
          dtype=None, numpy=True, labelled=True)
pd.Series(result[0], pd.DatetimeIndex(result[1])).index

I also filed a bug report about this: https://github.com/pandas-dev/pandas/issues/25546
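Because the workaround above depends on a private module, here is a hedged sketch of a similar idea using only public functions: the standard-library json module and pd.to_datetime. The helper name series_from_iso_json is hypothetical. With utc=True every label is parsed with its offset and normalized to UTC, so the result is tz-aware and keeps the correct instants, but the original offset or zone name is not preserved.

```python
import json

import pandas as pd

raw = ('{"2019-01-01T13:00:00.000Z":15,'
       '"2019-01-01T11:00:00.000Z":88.352985054,'
       '"2019-01-01T12:00:00.000Z":90.091719896}')


def series_from_iso_json(text):
    """Build a Series with a UTC-aware DatetimeIndex from {iso_timestamp: value} JSON."""
    mapping = json.loads(text)
    # utc=True honours the "Z" (or any +hh:mm offset) and returns a UTC-aware index
    index = pd.to_datetime(list(mapping.keys()), utc=True)
    return pd.Series(list(mapping.values()), index=index, dtype="float64")


s = series_from_iso_json(raw)
print(s.index)
```

If the original zone is known, s.index.tz_convert(...) can be applied afterwards to restore it.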

answered Oct 21 '22 07:10

Jérôme B

Asked by aquil.abdullah

Answers by Julien Marrec, Attack68, Jérôme B