
Accessing total_seconds() in pandas data column

I want to create a new column in a pandas data frame that holds the elapsed time since the start of the data frame. I am importing a log file into a data frame that has datetime info, but calling total_seconds() on the column s_df['delta_t'] does not work. It does work on an individual element of the column (s_df['delta_t'].iloc[8].total_seconds()), but I want a new column with total_seconds() for every row, and my attempts are failing.

s_df['t'] = s_df.index  # s_df['t'] is a column of datetimes
s_df['delta_t'] = s_df['t'] - s_df['t'].iloc[0]  # time since start of data frame
# s_df['elapsed_seconds'] = ...  # want a column equal to s_df['delta_t'].total_seconds()
user2994013 asked Mar 22 '16 14:03



1 Answer

Use the .dt accessor, which exposes the timedelta methods for a whole column:

s_df['elapsed_seconds'] = s_df['delta_t'].dt.total_seconds()
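To see why the direct call fails, here is a minimal sketch, assuming a small made-up Series of timedeltas (the name deltas and its values are hypothetical, not from the question). A Series itself has no total_seconds() method, so the call raises AttributeError, while .dt forwards it to every element. Dividing by a one-second Timedelta is shown as an equivalent alternative.

```python
import pandas as pd

# Hypothetical timedelta column standing in for s_df['delta_t'].
deltas = pd.Series(pd.to_timedelta([0, 90, 3600], unit="s"))

# Calling total_seconds() on the Series itself fails:
# the method lives on individual Timedelta values, not on Series.
try:
    deltas.total_seconds()
    direct_call_failed = False
except AttributeError:
    direct_call_failed = True

# The .dt accessor applies the method element-wise and returns a float Series.
seconds_via_dt = deltas.dt.total_seconds()

# Alternative: divide by a one-second Timedelta to get the same floats.
seconds_via_division = deltas / pd.Timedelta(seconds=1)
```

Both approaches give a float64 column, so either one can be assigned to s_df['elapsed_seconds'].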

Example:

In [82]:
import datetime as dt
import pandas as pd

df = pd.DataFrame({'date': pd.date_range(dt.datetime(2010,1,1), dt.datetime(2010,2,1))})
df['delta'] = df['date'] - df['date'].iloc[0]  # timedelta since the first row
df

Out[82]:
         date   delta
0  2010-01-01  0 days
1  2010-01-02  1 days
2  2010-01-03  2 days
3  2010-01-04  3 days
4  2010-01-05  4 days
5  2010-01-06  5 days
6  2010-01-07  6 days
7  2010-01-08  7 days
8  2010-01-09  8 days
9  2010-01-10  9 days
10 2010-01-11 10 days
11 2010-01-12 11 days
12 2010-01-13 12 days
13 2010-01-14 13 days
14 2010-01-15 14 days
15 2010-01-16 15 days
16 2010-01-17 16 days
17 2010-01-18 17 days
18 2010-01-19 18 days
19 2010-01-20 19 days
20 2010-01-21 20 days
21 2010-01-22 21 days
22 2010-01-23 22 days
23 2010-01-24 23 days
24 2010-01-25 24 days
25 2010-01-26 25 days
26 2010-01-27 26 days
27 2010-01-28 27 days
28 2010-01-29 28 days
29 2010-01-30 29 days
30 2010-01-31 30 days
31 2010-02-01 31 days

In [83]:
df['total_seconds'] = df['delta'].dt.total_seconds()
df

Out[83]:
         date   delta  total_seconds
0  2010-01-01  0 days              0
1  2010-01-02  1 days          86400
2  2010-01-03  2 days         172800
3  2010-01-04  3 days         259200
4  2010-01-05  4 days         345600
5  2010-01-06  5 days         432000
6  2010-01-07  6 days         518400
7  2010-01-08  7 days         604800
8  2010-01-09  8 days         691200
9  2010-01-10  9 days         777600
10 2010-01-11 10 days         864000
11 2010-01-12 11 days         950400
12 2010-01-13 12 days        1036800
13 2010-01-14 13 days        1123200
14 2010-01-15 14 days        1209600
15 2010-01-16 15 days        1296000
16 2010-01-17 16 days        1382400
17 2010-01-18 17 days        1468800
18 2010-01-19 18 days        1555200
19 2010-01-20 19 days        1641600
20 2010-01-21 20 days        1728000
21 2010-01-22 21 days        1814400
22 2010-01-23 22 days        1900800
23 2010-01-24 23 days        1987200
24 2010-01-25 24 days        2073600
25 2010-01-26 25 days        2160000
26 2010-01-27 26 days        2246400
27 2010-01-28 27 days        2332800
28 2010-01-29 28 days        2419200
29 2010-01-30 29 days        2505600
30 2010-01-31 30 days        2592000
31 2010-02-01 31 days        2678400
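Since the question starts from a datetime index, the intermediate columns can probably be skipped. The following is a sketch under that assumption, with a hypothetical three-row stand-in for the log data (the timestamps and the value column are invented for illustration). Subtracting a timestamp from a DatetimeIndex yields a TimedeltaIndex, which has total_seconds() directly, so .dt is only needed when the timedeltas live in a column.

```python
import pandas as pd

# Hypothetical stand-in for the imported log: a frame indexed by timestamps.
s_df = pd.DataFrame(
    {"value": [10, 20, 30]},
    index=pd.to_datetime(
        ["2016-03-22 14:00:00", "2016-03-22 14:00:30", "2016-03-22 14:02:00"]
    ),
)

# Index minus its first timestamp gives a TimedeltaIndex;
# total_seconds() is available on it without the .dt accessor.
s_df["elapsed_seconds"] = (s_df.index - s_df.index[0]).total_seconds()

# The column-based route from the question gives the same numbers via .dt.
s_df["t"] = s_df.index
s_df["delta_t"] = s_df["t"] - s_df["t"].iloc[0]
elapsed_via_dt = s_df["delta_t"].dt.total_seconds()
```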
EdChum answered Oct 02 '22 06:10