I want to create a new column in a pandas DataFrame that holds the elapsed time since the start of the DataFrame. I am importing a log file with datetime info into a DataFrame, but calling total_seconds() on the column s_df['delta_t'] is not working. It works if I access an individual element of the column (s_df['delta_t'].iloc[8].total_seconds()), but I want to create a new column from total_seconds() and my attempts are failing.
s_df['t'] = s_df.index  # s_df['t'] is a column of datetimes
s_df['delta_t'] = s_df['t'] - s_df['t'].iloc[0]  # time since start of the DataFrame
# s_df['elapsed_seconds'] = ...  # wanted: total_seconds() of every value in s_df['delta_t']
You can use the loc and iloc indexers to access columns in a pandas DataFrame. For example, to access a column named Grades, pass its name to loc: df.loc[:, 'Grades'].
To access a single element of a Series, use the index operator [ ] with the element's index label, or use .iloc[ ] with its integer position. To access multiple elements of a Series, use a slice.
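The column and element access described above can be sketched as follows. The DataFrame contents, the Name column, and the index labels are made up for illustration; only the Grades column name comes from the text.

```python
import pandas as pd

# Hypothetical data; only the "Grades" column name comes from the text above.
df = pd.DataFrame(
    {"Name": ["Ann", "Ben", "Cy"], "Grades": [90, 85, 77]},
    index=["a", "b", "c"],
)

# Three ways to select the same column.
by_label = df.loc[:, "Grades"]   # label-based: all rows, column "Grades"
by_position = df.iloc[:, 1]      # position-based: all rows, second column
by_bracket = df["Grades"]        # plain column lookup

grades = df["Grades"]
first_by_label = grades.loc["a"]   # element by index label
first_by_position = grades.iloc[0] # element by integer position
last_two = grades.iloc[1:]         # slice selects multiple elements

print(by_label.equals(by_position), first_by_label, last_two.tolist())
```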
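total_seconds() is a method of individual Timedelta values, not of a Series, which is why it works on a single element but not on the column. Use the .dt accessor to apply it to every element of a timedelta column: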
s_df['elapsed_seconds'] = s_df['delta_t'].dt.total_seconds()
Example:
In [82]:
import datetime as dt
import pandas as pd
df = pd.DataFrame({'date': pd.date_range(dt.datetime(2010,1,1), dt.datetime(2010,2,1))})
df['delta'] = df['date'] - df['date'].iloc[0]
df
Out[82]:
         date   delta
0  2010-01-01  0 days
1  2010-01-02  1 days
2  2010-01-03  2 days
3  2010-01-04  3 days
4  2010-01-05  4 days
5  2010-01-06  5 days
6  2010-01-07  6 days
7  2010-01-08  7 days
8  2010-01-09  8 days
9  2010-01-10  9 days
10 2010-01-11 10 days
11 2010-01-12 11 days
12 2010-01-13 12 days
13 2010-01-14 13 days
14 2010-01-15 14 days
15 2010-01-16 15 days
16 2010-01-17 16 days
17 2010-01-18 17 days
18 2010-01-19 18 days
19 2010-01-20 19 days
20 2010-01-21 20 days
21 2010-01-22 21 days
22 2010-01-23 22 days
23 2010-01-24 23 days
24 2010-01-25 24 days
25 2010-01-26 25 days
26 2010-01-27 26 days
27 2010-01-28 27 days
28 2010-01-29 28 days
29 2010-01-30 29 days
30 2010-01-31 30 days
31 2010-02-01 31 days
In [83]:
df['total_seconds'] = df['delta'].dt.total_seconds()
df
Out[83]:
         date   delta  total_seconds
0  2010-01-01  0 days              0
1  2010-01-02  1 days          86400
2  2010-01-03  2 days         172800
3  2010-01-04  3 days         259200
4  2010-01-05  4 days         345600
5  2010-01-06  5 days         432000
6  2010-01-07  6 days         518400
7  2010-01-08  7 days         604800
8  2010-01-09  8 days         691200
9  2010-01-10  9 days         777600
10 2010-01-11 10 days         864000
11 2010-01-12 11 days         950400
12 2010-01-13 12 days        1036800
13 2010-01-14 13 days        1123200
14 2010-01-15 14 days        1209600
15 2010-01-16 15 days        1296000
16 2010-01-17 16 days        1382400
17 2010-01-18 17 days        1468800
18 2010-01-19 18 days        1555200
19 2010-01-20 19 days        1641600
20 2010-01-21 20 days        1728000
21 2010-01-22 21 days        1814400
22 2010-01-23 22 days        1900800
23 2010-01-24 23 days        1987200
24 2010-01-25 24 days        2073600
25 2010-01-26 25 days        2160000
26 2010-01-27 26 days        2246400
27 2010-01-28 27 days        2332800
28 2010-01-29 28 days        2419200
29 2010-01-30 29 days        2505600
30 2010-01-31 30 days        2592000
31 2010-02-01 31 days        2678400
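For a frame indexed by timestamps, as in the question, the same idea might look like the sketch below. The value column and the timestamps are invented for illustration; the names s_df, t, delta_t and elapsed_seconds follow the question, while elapsed_seconds_direct is a hypothetical extra column. It also shows a possible shortcut: subtracting from a DatetimeIndex yields a TimedeltaIndex, which has total_seconds() itself, so no .dt accessor is needed there.

```python
import pandas as pd

# Hypothetical log data: three readings at irregular timestamps.
s_df = pd.DataFrame(
    {"value": [1.0, 2.5, 4.0]},
    index=pd.to_datetime(
        ["2010-01-01 00:00:00", "2010-01-01 00:00:30", "2010-01-01 00:02:15"]
    ),
)

# Route from the question: copy the index into a column, subtract the first
# timestamp, then call total_seconds() through the .dt accessor.
s_df["t"] = s_df.index
s_df["delta_t"] = s_df["t"] - s_df["t"].iloc[0]
s_df["elapsed_seconds"] = s_df["delta_t"].dt.total_seconds()

# Shortcut: index arithmetic gives a TimedeltaIndex, which exposes
# total_seconds() directly.
s_df["elapsed_seconds_direct"] = (s_df.index - s_df.index[0]).total_seconds()

print(s_df[["elapsed_seconds", "elapsed_seconds_direct"]])
```

Both columns hold floats, so sub-second resolution in the log timestamps is preserved.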