
Convert timedelta64[ns] column to seconds in Python Pandas DataFrame

A pandas DataFrame column named duration has dtype timedelta64[ns], as shown below. How can I convert its values to seconds?

0   00:20:32
1   00:23:10
2   00:24:55
3   00:13:17
4   00:18:52
Name: duration, dtype: timedelta64[ns]

I tried the following

print df[:5]['duration'] / np.timedelta64(1, 's') 

but got the error

Traceback (most recent call last):
  File "test.py", line 16, in <module>
    print df[0:5]['duration'] / np.timedelta64(1, 's')
  File "C:\Python27\lib\site-packages\pandas\core\series.py", line 130, in wrapper
    "addition and subtraction, but the operator [%s] was passed" % name)
TypeError: can only operate on a timedeltas for addition and subtraction, but the operator [__div__] was passed

Also tried

print df[:5]['duration'].astype('timedelta64[s]') 

but received the error

Traceback (most recent call last):
  File "test.py", line 17, in <module>
    print df[:5]['duration'].astype('timedelta64[s]')
  File "C:\Python27\lib\site-packages\pandas\core\series.py", line 934, in astype
    values = com._astype_nansafe(self.values, dtype)
  File "C:\Python27\lib\site-packages\pandas\core\common.py", line 1653, in _astype_nansafe
    raise TypeError("cannot astype a timedelta from [%s] to [%s]" % (arr.dtype,dtype))
TypeError: cannot astype a timedelta from [timedelta64[ns]] to [timedelta64[s]]
Nyxynyx asked Oct 20 '14 00:10


2 Answers

Dividing by np.timedelta64(1, 's') works properly in Pandas 0.14, the current version at the time of writing:

In [132]: df[:5]['duration'] / np.timedelta64(1, 's')
Out[132]:
0    1232
1    1390
2    1495
3     797
4    1132
Name: duration, dtype: float64
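For readers who want to try the division themselves, here is a minimal Python 3 sketch. The DataFrame is a hypothetical reconstruction of the question's data built with pd.to_timedelta; the original df is not shown in the post.

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction of the question's "duration" column.
df = pd.DataFrame({
    "duration": pd.to_timedelta(
        ["00:20:32", "00:23:10", "00:24:55", "00:13:17", "00:18:52"]
    )
})

# Dividing a timedelta column by a one-second timedelta gives
# the number of seconds as floats.
seconds = df["duration"] / np.timedelta64(1, "s")
print(seconds.tolist())  # [1232.0, 1390.0, 1495.0, 797.0, 1132.0]
```

The divisor sets the unit of the result, so np.timedelta64(1, 'm') would give minutes instead.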

Here is a workaround for older versions of Pandas/NumPy:

In [131]: df[:5]['duration'].values.view('<i8')/10**9
Out[131]: array([1232, 1390, 1495,  797, 1132], dtype=int64)

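timedelta64 and datetime64 data are stored internally as 8-byte integers (dtype '<i8'). The expression above views the timedelta64[ns] values as 8-byte integers and then uses integer division by 10**9 to convert nanoseconds to seconds. On Python 3, write // instead of /, because / performs true division and returns floats.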

Note that you need NumPy version 1.7 or newer to work with datetime64/timedelta64s.
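A Python 3 version of the integer-view workaround might look like the sketch below. The sample values are taken from the question; the explicit cast to timedelta64[ns] is an added precaution (an assumption, not part of the original answer) so that the 10**9 divisor is valid whatever resolution the data was stored in.

```python
import numpy as np
import pandas as pd

# Two sample durations from the question.
durations = pd.Series(pd.to_timedelta(["00:20:32", "00:13:17"]))

# Ensure nanosecond resolution, then reinterpret the 8-byte
# values as int64 nanosecond counts.
ns_values = durations.to_numpy().astype("timedelta64[ns]").view("i8")

# Floor division keeps the result as integers on Python 3.
whole_seconds = ns_values // 10**9
print(whole_seconds.tolist())  # [1232, 797]
```

Unlike the division by np.timedelta64(1, 's'), this drops any fractional seconds.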

unutbu answered Sep 18 '22 17:09


Use the Series dt accessor to reach the methods and attributes of a datetime-like Series. For a timedelta Series, dt.total_seconds() returns each duration as a float number of seconds.

>>> s
0   -1 days +23:45:14.304000
1   -1 days +23:46:57.132000
2   -1 days +23:49:25.913000
3   -1 days +23:59:48.913000
4            00:00:00.820000
dtype: timedelta64[ns]
>>>
>>> s.dt.total_seconds()
0   -885.696
1   -782.868
2   -634.087
3    -11.087
4      0.820
dtype: float64

Pandas Series also has accessors for string (str), categorical (cat), and sparse (sparse) data.
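Applied to the question's column, the dt accessor approach could be sketched as follows. The DataFrame is a hypothetical reconstruction using three of the question's values, and the column name duration_s is made up for illustration.

```python
import pandas as pd

# Hypothetical reconstruction of part of the question's data.
df = pd.DataFrame({
    "duration": pd.to_timedelta(["00:20:32", "00:23:10", "00:24:55"])
})

# Store the seconds in a new column (name chosen for illustration).
df["duration_s"] = df["duration"].dt.total_seconds()
print(df["duration_s"].tolist())  # [1232.0, 1390.0, 1495.0]
```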

wwii answered Sep 20 '22 17:09