
Pandas converting row with unix timestamp (in milliseconds) to datetime

I need to process a large number of CSV files in which the timestamp is always a string representing the Unix timestamp in milliseconds. I have not yet found a way to convert these columns efficiently.

This is what I came up with. However, it only produces a converted copy of the column, which I would then have to put back into the original dataset. I'm sure this can be done when creating the DataFrame?

    import datetime
    import sys

    if sys.version_info[0] < 3:
        from StringIO import StringIO
    else:
        from io import StringIO
    import pandas as pd

    data = 'RUN,UNIXTIME,VALUE\n1,1447160702320,10\n2,1447160702364,20\n3,1447160722364,42'

    df = pd.read_csv(StringIO(data))

    # Convert milliseconds since the epoch to a datetime (local time)
    convert = lambda x: datetime.datetime.fromtimestamp(x / 1e3)
    converted_df = df['UNIXTIME'].apply(convert)

This will pick the column 'UNIXTIME' and change it from

    0    1447160702320
    1    1447160702364
    2    1447160722364
    Name: UNIXTIME, dtype: int64

into this

    0   2015-11-10 14:05:02.320
    1   2015-11-10 14:05:02.364
    2   2015-11-10 14:05:22.364
    Name: UNIXTIME, dtype: datetime64[ns]

However, I would like to use something like apply() to get the whole dataset returned with the converted column or, as I already wrote, simply create the datetimes when generating the DataFrame from the CSV.

tamasgal asked Jan 19 '16 17:01




2 Answers

You can do this as a post-processing step using to_datetime and passing the argument unit='ms':

    In [5]: df['UNIXTIME'] = pd.to_datetime(df['UNIXTIME'], unit='ms')
            df

    Out[5]:
       RUN                UNIXTIME  VALUE
    0    1 2015-11-10 13:05:02.320     10
    1    2 2015-11-10 13:05:02.364     20
    2    3 2015-11-10 13:05:22.364     42
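
For the "do it while loading" part of the question, one possible approach is to wrap the read and the conversion in a small function, so each CSV comes back with the column already converted. This is only a minimal sketch: `read_with_datetimes` and its `column` parameter are hypothetical names made up for illustration, not part of pandas, and it assumes the column contains integer milliseconds as in the sample data.

```python
from io import StringIO

import pandas as pd

data = 'RUN,UNIXTIME,VALUE\n1,1447160702320,10\n2,1447160702364,20\n3,1447160722364,42'


def read_with_datetimes(source, column='UNIXTIME'):
    """Hypothetical helper: read a CSV and convert one millisecond column.

    Assumes `column` holds Unix timestamps in milliseconds.
    """
    df = pd.read_csv(source)
    # assign() returns a new DataFrame with the column replaced,
    # so the whole dataset comes back with the converted column.
    return df.assign(**{column: pd.to_datetime(df[column], unit='ms')})


df = read_with_datetimes(StringIO(data))
print(df.dtypes)
```

Note that to_datetime interprets the integers as UTC, whereas datetime.fromtimestamp in the question converts to the machine's local time zone. That is presumably why the two outputs differ by one hour.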
EdChum answered Sep 21 '22 03:09


I use @EdChum's solution, but I add time zone handling:

    df['UNIXTIME'] = pd.DatetimeIndex(pd.to_datetime(df['UNIXTIME'], unit='ms'))\
                     .tz_localize('UTC')\
                     .tz_convert('America/New_York')

tz_localize declares that the timestamps should be interpreted as UTC; tz_convert then shifts the date/time to the desired time zone (in this case 'America/New_York').

Note that the result is converted to a DatetimeIndex because the tz_ methods work only on the index of a Series. Since pandas 0.15 one can use .dt:

    df['UNIXTIME'] = pd.to_datetime(df['UNIXTIME'], unit='ms')\
                     .dt.tz_localize('UTC')\
                     .dt.tz_convert('America/New_York')
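
To make the two steps easier to follow, here is a small self-contained sketch that splits the chain into intermediate variables, using the three sample timestamps from the question. The variable names are only illustrative. As an alternative to tz_localize, it also shows the `utc=True` argument of to_datetime, which should produce the same UTC-aware values in one call.

```python
import pandas as pd

stamps = pd.Series([1447160702320, 1447160702364, 1447160722364], name='UNIXTIME')

# Step 1: integers -> naive datetimes (no time zone attached yet)
naive = pd.to_datetime(stamps, unit='ms')

# Step 2: declare that the naive values are UTC (wall-clock time unchanged)
utc = naive.dt.tz_localize('UTC')

# Step 3: express the same instants in another time zone
new_york = utc.dt.tz_convert('America/New_York')

# Alternative to steps 1 and 2: let to_datetime attach UTC directly
utc_direct = pd.to_datetime(stamps, unit='ms', utc=True)

print(naive.iloc[0])     # 13:05:02.320, naive
print(new_york.iloc[0])  # 08:05:02.320 in New York (UTC-5 in November)
```

Only tz_convert changes the displayed wall-clock time; tz_localize merely attaches the zone to values that had none.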
Teudimundo answered Sep 19 '22 03:09