Pandas converting row with unix timestamp (in milliseconds) to datetime

I need to process a huge number of CSV files in which the timestamp is always a string representing the Unix timestamp in milliseconds. I have not yet found a way to convert these columns efficiently.

This is what I came up with. However, it only produces a converted copy of the column, which I then have to put back into the original dataset somehow. Surely this can be done while creating the DataFrame?

import datetime
import sys

if sys.version_info[0] < 3:
    from StringIO import StringIO
else:
    from io import StringIO

import pandas as pd

data = 'RUN,UNIXTIME,VALUE\n1,1447160702320,10\n2,1447160702364,20\n3,1447160722364,42'

df = pd.read_csv(StringIO(data))

# Convert milliseconds since the epoch to a datetime in local time
convert = lambda x: datetime.datetime.fromtimestamp(x / 1e3)
converted_df = df['UNIXTIME'].apply(convert)

This will pick the column 'UNIXTIME' and change it from

0    1447160702320
1    1447160702364
2    1447160722364
Name: UNIXTIME, dtype: int64

into this

0   2015-11-10 14:05:02.320
1   2015-11-10 14:05:02.364
2   2015-11-10 14:05:22.364
Name: UNIXTIME, dtype: datetime64[ns]

However, I would like to use something like DataFrame.apply() to get the whole dataset returned with the converted column or, as I already wrote, simply create the datetimes when generating the DataFrame from the CSV.

asked Jan 19 '16 17:01

tamasgal

2 Answers

You can do this as a post-processing step using pd.to_datetime and passing the argument unit='ms':

In [5]: df['UNIXTIME'] = pd.to_datetime(df['UNIXTIME'], unit='ms')
        df

Out[5]:
   RUN                UNIXTIME  VALUE
0    1 2015-11-10 13:05:02.320     10
1    2 2015-11-10 13:05:02.364     20
2    3 2015-11-10 13:05:22.364     42
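Since the question asks about getting the datetimes at load time, here is a minimal sketch of one way to wrap the read and the conversion in a single step. The helper name load_with_datetimes and its column parameter are hypothetical, not part of pandas. Note also that to_datetime with unit='ms' treats the values as UTC, whereas datetime.fromtimestamp in the question converts to the machine's local time, which is why the hours differ between the two outputs.

```python
from io import StringIO

import pandas as pd


def load_with_datetimes(source, column="UNIXTIME"):
    """Read a CSV and convert a millisecond-epoch column to datetimes.

    `load_with_datetimes` is a hypothetical helper, not a pandas function.
    """
    frame = pd.read_csv(source)
    # unit="ms" tells pandas the integers are milliseconds since the epoch (UTC)
    frame[column] = pd.to_datetime(frame[column], unit="ms")
    return frame


data = "RUN,UNIXTIME,VALUE\n1,1447160702320,10\n2,1447160702364,20\n3,1447160722364,42"
df = load_with_datetimes(StringIO(data))
print(df)
```

The conversion is vectorized over the whole column, so it should scale better to many large files than calling a Python function per row with apply.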

answered Sep 21 '22 03:09

EdChum

I use @EdChum's solution, but I add timezone handling:

df['UNIXTIME'] = pd.DatetimeIndex(pd.to_datetime(df['UNIXTIME'], unit='ms'))\
                 .tz_localize('UTC')\
                 .tz_convert('America/New_York')

tz_localize marks the naive timestamps as being in UTC; tz_convert then shifts the date/time to the target timezone (in this case 'America/New_York').

Note that the result is wrapped in a DatetimeIndex because the tz_ methods of a Series operate on its index, not on its values. Since pandas 0.15 one can use the .dt accessor instead:

df['UNIXTIME'] = pd.to_datetime(df['UNIXTIME'], unit='ms')\
                 .dt.tz_localize('UTC')\
                 .dt.tz_convert('America/New_York')
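As a rough sketch of the same idea on a standalone Series, the example below compares the two-step tz_localize form with passing utc=True to to_datetime, which should yield timezone-aware UTC values directly. The variable names (ms, localized, aware, new_york) are made up for illustration, and the sample values are the ones from the question.

```python
import pandas as pd

ms = pd.Series([1447160702320, 1447160702364, 1447160722364], name="UNIXTIME")

# Two-step form from the answer: parse as naive datetimes, then label them as UTC
localized = pd.to_datetime(ms, unit="ms").dt.tz_localize("UTC")

# One-step form: utc=True returns timezone-aware UTC values directly
aware = pd.to_datetime(ms, unit="ms", utc=True)

# Either result can then be shifted to the target timezone
new_york = aware.dt.tz_convert("America/New_York")
print(new_york)
```

Both forms describe the same instants; only the displayed wall-clock time changes after tz_convert.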

answered Sep 19 '22 03:09

Teudimundo

Tags: python, datetime, pandas