I have a column of timedelta in pandas. It is in the format x days 00:00:00. I want to filter out and flag the rows which have a value >=30 minutes. I have no clue how to do that using pandas. I tried booleans and if statements but it didn't work. Any help would be appreciated.
You can use df[df["Courses"] == 'Spark'] to filter rows by a condition in a pandas DataFrame. Note that this expression returns a new DataFrame with the selected rows. You can also write the above statement with a variable.
To filter rows based on dates, first convert the date column in the DataFrame to the datetime64 type. Then use DataFrame.loc[] or DataFrame.query() from the pandas package to specify a filter condition.
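As a rough illustration of the date filtering described above, here is a minimal sketch. The column names "date" and "value" and the cutoff date are made up for the example; they are not from the original question.

```python
import pandas as pd

# Hypothetical data; column names and dates are illustrative only.
df = pd.DataFrame({
    "date": ["2023-01-05", "2023-02-10", "2023-03-15"],
    "value": [1, 2, 3],
})

# Convert the string column to datetime64 so comparisons are chronological.
df["date"] = pd.to_datetime(df["date"])

# Filter with .loc and a boolean mask.
after_jan = df.loc[df["date"] >= "2023-02-01"]

# The same filter expressed with .query().
after_jan_q = df.query("date >= '2023-02-01'")

print(after_jan)
```

Both forms should select the same rows; which one to use is mostly a matter of taste.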
You can access various components of the Timedelta or TimedeltaIndex directly using the attributes days, seconds, microseconds, nanoseconds. These are identical to the values returned by datetime.timedelta, in that, for example, the .seconds attribute represents the number of seconds >= 0 and < 1 day.
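The difference between the .seconds component and the full duration is easy to trip over when filtering, so here is a small sketch. The values are chosen just for illustration.

```python
import pandas as pd

td = pd.Timedelta(days=1, hours=1, minutes=10, seconds=1)

print(td.days)             # 1
print(td.seconds)          # 4201 (only the part below one full day)
print(td.total_seconds())  # 90601.0 (the whole duration in seconds)

# The same attributes are available on a Series through the .dt accessor.
s = pd.Series(pd.to_timedelta(["25:10:01", "00:01:20"]))
print(s.dt.seconds.tolist())          # [4201, 80]
print(s.dt.total_seconds().tolist())  # [90601.0, 80.0]
```

For threshold comparisons, total_seconds() is usually what you want, since .seconds drops whole days.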
You can convert timedeltas to seconds with total_seconds and compare with a scalar:
df = df[df['col'].dt.total_seconds() < 30]
Or compare with a Timedelta:
df = df[df['col'] < pd.Timedelta(30, unit='s')]
Sample:
import pandas as pd
df = pd.DataFrame({'col': pd.to_timedelta(['25:10:01', '00:01:20', '00:00:20'])})
print (df)
col
0 1 days 01:10:01
1 0 days 00:01:20
2 0 days 00:00:20
df = df[df['col'].dt.total_seconds() < 30]
print (df)
col
2 00:00:20
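The sample above keeps rows under 30 seconds, while the question asks to flag rows of 30 minutes or more. A possible adaptation is sketched below; the sample values and the flag column name "over_30_min" are assumptions for illustration, not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({
    "col": pd.to_timedelta(["25:10:01", "00:45:00", "00:30:00", "00:01:20"])
})

threshold = pd.Timedelta(minutes=30)

# Flag rows whose duration is at least 30 minutes.
# "over_30_min" is a made-up column name.
df["over_30_min"] = df["col"] >= threshold

# Keep only the flagged rows.
flagged = df[df["over_30_min"]]

print(df)
print(flagged)
```

Comparing against a pd.Timedelta keeps the unit explicit; df["col"].dt.total_seconds() >= 30 * 60 should give the same mask.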