I have the following DataFrame:
            f_1    f_2    f_3
00:00:00  False  False  False
00:05:22   True  False  False
00:06:40   True  False  False
00:06:41  False  False  False
00:06:42  False  False  False
00:06:43  False  False  False
00:06:44  False  False  False
00:06:46  False  False  False
00:06:58  False  False  False
and I want to compute the total duration for which the Series were True. In this example, the only Series that was True for a while is f_1. Currently, I use the following code:
import pandas

result = pandas.Timedelta(0)
for _, series in falsePositives.items():
    previousTime = None
    previousValue = None
    # Add each interval that starts while the value is True
    for currentTime, currentValue in series.items():
        if previousValue:
            result += (currentTime - previousTime)
        previousTime = currentTime
        previousValue = currentValue
print(result.total_seconds())
Is there a better solution? I reckon there is already a method in Pandas which is doing either this or something similar to this.
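I think you can create a Series from the index with to_series, take the difference between consecutive rows with diff, move it up one row with shift(-1) so that each row holds the time until the next row, and convert it to seconds with dt.total_seconds.
Then multiply the boolean DataFrame by this Series with mul and finally take the sum: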
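import pandas as pd

# if necessary, convert the index to Timedelta
df.index = pd.to_timedelta(df.index)
# seconds from each row to the next one (NaN for the last row)
s = df.index.to_series().diff().shift(-1).dt.total_seconds()
# True counts as 1 and False as 0, so only intervals starting on True remain
df1 = df.mul(s, axis=0)
print(df1)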
           f_1  f_2  f_3
00:00:00   0.0  0.0  0.0
00:05:22  78.0  0.0  0.0
00:06:40   1.0  0.0  0.0
00:06:41   0.0  0.0  0.0
00:06:42   0.0  0.0  0.0
00:06:43   0.0  0.0  0.0
00:06:44   0.0  0.0  0.0
00:06:46   0.0  0.0  0.0
00:06:58   NaN  NaN  NaN
print(df1.sum())
f_1    79.0
f_2     0.0
f_3     0.0
dtype: float64
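For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea. It rebuilds the sample data from the question and wraps the steps in a helper; the name true_durations is only an illustration and is not a pandas function. It assumes the index is already a TimedeltaIndex (a DatetimeIndex should work the same way).

```python
import pandas as pd


def true_durations(df):
    """Seconds each boolean column spent True (hypothetical helper name)."""
    # Time from each row to the next one. The last row has no successor,
    # so its interval is NaT and turns into NaN seconds.
    step = df.index.to_series().diff().shift(-1).dt.total_seconds()
    # True counts as 1 and False as 0, so the product keeps only the
    # intervals that start while the column is True. sum() skips NaN.
    return df.mul(step, axis=0).sum()


# Sample data copied from the question
times = ["00:00:00", "00:05:22", "00:06:40", "00:06:41", "00:06:42",
         "00:06:43", "00:06:44", "00:06:46", "00:06:58"]
df = pd.DataFrame(
    {
        "f_1": [False, True, True] + [False] * 6,
        "f_2": [False] * 9,
        "f_3": [False] * 9,
    },
    index=pd.to_timedelta(times),
)

totals = true_durations(df)
print(totals)
```

For the sample data this gives 79 seconds for f_1 (78 s from 00:05:22 to 00:06:40 plus 1 s from 00:06:40 to 00:06:41) and 0 for the other columns, which matches the loop in the question. If you need a single number over all columns, totals.sum() should do it.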