When I use pandas read_csv to read a column with timezone-aware datetimes (and specify this column as the index), pandas converts it to a timezone-naive UTC DatetimeIndex.
Data in Test.csv:
DateTime,Temperature
2016-07-01T11:05:07+02:00,21.125
2016-07-01T11:05:09+02:00,21.138
2016-07-01T11:05:10+02:00,21.156
2016-07-01T11:05:11+02:00,21.179
2016-07-01T11:05:12+02:00,21.198
2016-07-01T11:05:13+02:00,21.206
2016-07-01T11:05:14+02:00,21.225
2016-07-01T11:05:15+02:00,21.233
Code to read from csv:
In [1]: import pandas as pd
In [2]: df = pd.read_csv('Test.csv', index_col=0, parse_dates=True)
This results in an index that represents the timezone-naive UTC time:
In [3]: df.index
Out[3]: DatetimeIndex(['2016-07-01 09:05:07', '2016-07-01 09:05:09',
           '2016-07-01 09:05:10', '2016-07-01 09:05:11',
           '2016-07-01 09:05:12', '2016-07-01 09:05:13',
           '2016-07-01 09:05:14', '2016-07-01 09:05:15'],
          dtype='datetime64[ns]', name='DateTime', freq=None)
I tried to use a date_parser function:
In [4]: date_parser = lambda x: pd.to_datetime(x).tz_localize(None)
In [5]: df = pd.read_csv('Test.csv', index_col=0, parse_dates=True, date_parser=date_parser)
This gave the same result.
How can I make read_csv create a DatetimeIndex that is timezone-naive and represents the local time instead of the UTC time?
I'm using pandas 0.18.1.
According to the docs, the default date_parser uses dateutil.parser.parser. According to the dateutil docs, its parse function keeps the UTC offset found in the string by default (ignoretz defaults to False) and does not convert the time to UTC. So if you supply dateutil.parser.parse as the date_parser kwarg, timezones are not converted.
import dateutil.parser  # import the submodule explicitly
import pandas as pd
df = pd.read_csv('Test.csv', index_col=0, parse_dates=True, date_parser=dateutil.parser.parse)
print(df)
outputs
                           Temperature
DateTime                              
2016-07-01 11:05:07+02:00       21.125
2016-07-01 11:05:09+02:00       21.138
2016-07-01 11:05:10+02:00       21.156
2016-07-01 11:05:11+02:00       21.179
2016-07-01 11:05:12+02:00       21.198
2016-07-01 11:05:13+02:00       21.206
2016-07-01 11:05:14+02:00       21.225
2016-07-01 11:05:15+02:00       21.233
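Once the index is timezone-aware, as in the output above, another option is to drop the offset afterwards with tz_localize(None), which keeps the local wall-clock time. The following is only a minimal sketch: it assumes a recent pandas in which pd.to_datetime keeps a uniform +02:00 offset, uses two timestamps from Test.csv instead of the file itself, and the variable names are illustrative.

```python
import pandas as pd

# Two timestamps from Test.csv, used here in place of the real file.
stamps = ['2016-07-01T11:05:07+02:00', '2016-07-01T11:05:09+02:00']

# Assumption: this pandas version returns a tz-aware index (UTC+02:00).
aware = pd.to_datetime(stamps)

# tz_localize(None) removes the timezone but keeps the local time.
local_naive = aware.tz_localize(None)

# For comparison: converting to UTC first gives the naive UTC time instead.
utc_naive = aware.tz_convert('UTC').tz_localize(None)

print(local_naive[0])
print(utc_naive[0])
```

The difference between the two results is the same two-hour shift seen in the question: tz_localize(None) on its own preserves 11:05, while converting to UTC first yields 09:05.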
Alex's answer leads to a timezone-aware DatetimeIndex. To get a timezone-naive local DatetimeIndex, as the OP asked, tell dateutil.parser.parse to ignore the timezone information by setting ignoretz=True:
import dateutil.parser  # import the submodule explicitly
import pandas as pd
date_parser = lambda x: dateutil.parser.parse(x, ignoretz=True)
df = pd.read_csv('Test.csv', index_col=0, parse_dates=True, date_parser=date_parser)
print(df)
outputs
                     Temperature
DateTime                        
2016-07-01 11:05:07       21.125
2016-07-01 11:05:09       21.138
2016-07-01 11:05:10       21.156
2016-07-01 11:05:11       21.179
2016-07-01 11:05:12       21.198
2016-07-01 11:05:13       21.206
2016-07-01 11:05:14       21.225
2016-07-01 11:05:15       21.233
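As far as I know, the date_parser keyword is deprecated in newer pandas releases, so the same idea can also be applied after reading the file. This is a rough sketch, not the only way to do it: it reads the index as plain strings, parses each one with dateutil using ignoretz=True, and assigns the result as the new index. The inline csv_text stands in for Test.csv and is shortened to three rows.

```python
import io

import dateutil.parser
import pandas as pd

# Stand-in for Test.csv (first three rows only).
csv_text = """DateTime,Temperature
2016-07-01T11:05:07+02:00,21.125
2016-07-01T11:05:09+02:00,21.138
2016-07-01T11:05:10+02:00,21.156
"""

# Read the timestamps as plain strings first (no parse_dates).
df = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Parse each string with dateutil and drop the UTC offset.
naive = [dateutil.parser.parse(s, ignoretz=True) for s in df.index]
df.index = pd.DatetimeIndex(naive, name=df.index.name)

print(df)
```

With a real file, io.StringIO(csv_text) would simply be replaced by the path 'Test.csv'.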