
Pandas to_datetime loses timezone

My raw data has a column with timestamps in ISO8601 format like this:

'2017-07-25T06:00:02+02:00'

Since the data is in CSV format, it will be read as object/string. Therefore I'm converting it to datetime like this.

import pandas as pd
df['time'] = pd.to_datetime(df['time'], utc=False)

#df['time'][0]
df['time'][0].isoformat()

Unfortunately this results in UTC timestamps and the timezone is lost. For instance df['time'][0].tzinfo is not set.

Timestamp('2017-07-25 04:00:02')

'2017-07-25T04:00:02'

I'm looking for a way to keep the timezone info in each of the timestamp objects, but without re-setting it to CEST (Central European Summer Time) afterwards, since this information is already included in the ISO8601 timezone offset in the raw data. Any idea how to do this?

Matthias asked Nov 15 '17 15:11



1 Answer

So here's how I solved it.

There's a great article about timezones and Python, which helped me come to a solution. It relies on the iso8601 Python package.

import iso8601
import pandas as pd

times = ['2017-07-25 06:00:02+02:00',
         '2017-07-25 08:15:08+02:00',
         '2017-07-25 12:08:00+02:00',
         '2017-07-25 13:10:12+02:00',
         '2017-07-25 15:11:55+02:00',
         '2017-07-25 16:00:00+02:00'
        ]

df = pd.DataFrame(times, columns=['time'])
df['time'] = df['time'].apply(iso8601.parse_date)
df['time'][0]

Which produces the following output and keeps the timezone information.

Timestamp('2017-07-25 06:00:02+0200', tz='+02:00')
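
If installing an extra package is not an option, the same idea can be sketched with the standard library alone. This is only an illustration, assuming Python 3.7 or newer, where datetime.fromisoformat accepts an offset such as +02:00; the names raw, parsed, first and offset_hours are made up for the example and are not part of the original solution.

```python
from datetime import datetime, timedelta

# Two of the sample timestamps, in the ISO8601 form used in the question.
raw = ['2017-07-25T06:00:02+02:00',
       '2017-07-25T16:00:00+02:00']

# fromisoformat keeps the +02:00 offset as a fixed-offset tzinfo
# instead of converting the value to UTC.
parsed = [datetime.fromisoformat(s) for s in raw]

first = parsed[0]
print(first.isoformat())   # 2017-07-25T06:00:02+02:00
print(first.utcoffset())   # 2:00:00

# The offset expressed in hours, handy for a quick sanity check.
offset_hours = first.utcoffset() / timedelta(hours=1)
```

In principle datetime.fromisoformat could be passed to df['time'].apply(...) in place of iso8601.parse_date, but I have not verified how every pandas version displays the result.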

Matthias answered Sep 25 '22 14:09