I am trying to find the mean duration in a pandas DataFrame. I tried the following code and received this error:
TypeError: Could not convert 1:10:4200:38:5800:42:142:30:4100:19:22 to numeric
Code:
import pandas as pd
duration=['1:10:42','38:58','42:14','2:30:41','19:22']
dist=[8,5,6,17,3]
dd=list(zip(duration,dist))
df=pd.DataFrame(dd,columns=['duration','dist'])
print(df)
print('')
max_dist=df['dist'].max()
mean_dist=df['dist'].mean()
df['duration'] = df['duration'].apply(lambda x: x if len(str(x)) ==7 else '00:'+str(x))
print(df['duration'])
pd.to_datetime(df['duration'],format='%H:%M:%S').dt.time
max_duration=df['duration'].max()
mean_duration=df['duration'].mean()
print('')
print('max dist =',max_dist,'ave dist =',mean_dist)
print('max duration =',max_duration,'ave duration =',mean_duration)
The max duration returns the correct value. Does the error message mean that the datetime format cannot be used for the mean or is there another way that I'm missing? Any help is appreciated.
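The result of pd.to_datetime is never assigned back, so df['duration'] still holds strings. max() works on strings (it compares them lexicographically), but mean() has to sum them, which just concatenates the text, as the error message shows. Durations are better represented as timedeltas than as times of day, so convert with pd.to_timedelta, assign the result back to the column, and then take the mean, i.e.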
df['duration'] = pd.to_timedelta(df['duration'])
print('max duration =',max_duration,'ave duration =',df['duration'].mean())
Output:
max duration = 02:30:41 ave duration = 0 days 01:04:23.400000
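For reference, here is a minimal self-contained sketch of the same idea, using the data from the question. It makes one assumption that is not in the original code: a value with a single colon is treated as MM:SS and padded with '00:' by counting colons rather than checking the string length. The name `padded` is only illustrative.

```python
import pandas as pd

durations = ['1:10:42', '38:58', '42:14', '2:30:41', '19:22']
df = pd.DataFrame({'duration': durations, 'dist': [8, 5, 6, 17, 3]})

# Pad MM:SS values to HH:MM:SS (assumption: one colon means MM:SS).
padded = df['duration'].map(lambda s: s if s.count(':') == 2 else '00:' + s)

# Assign the converted values back so the column dtype becomes timedelta64.
df['duration'] = pd.to_timedelta(padded)

# max() and mean() now operate on real durations, not strings.
max_duration = df['duration'].max()
mean_duration = df['duration'].mean()
print('max duration =', max_duration, 'ave duration =', mean_duration)
```

Because the maximum is computed after the conversion here, it is a Timedelta as well, so it is compared by length of time rather than by string order.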