I would like to analyze statistics per car: which ones were repaired and which are new. A data sample is:
Name  IsItNew  ControlDate
Car1  True     31/01/2018
Car2  True     28/02/2018
Car1  False    15/03/2018
Car2  True     16/04/2018
Car3  True     30/04/2018
Car2  False    25/05/2018
Car1  False    30/05/2018
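For reproducibility, here is a minimal sketch that rebuilds this sample (ControlDate is kept as day-first strings exactly as shown; apply pd.to_datetime(..., dayfirst=True) if real date ordering is needed):

import pandas as pd

df = pd.DataFrame({
    'Name':        ['Car1', 'Car2', 'Car1', 'Car2', 'Car3', 'Car2', 'Car1'],
    'IsItNew':     [True, True, False, True, True, False, False],
    'ControlDate': ['31/01/2018', '28/02/2018', '15/03/2018', '16/04/2018',
                    '30/04/2018', '25/05/2018', '30/05/2018'],
})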
So, I should group by Name, and if there is a False in the IsItNew column, set False and take the first date on which the False occurred.
I tried groupby with nunique():
df = df.groupby(['Name','IsItNew', 'ControlDate' ])['Name'].nunique()
But it returns the count of unique items in each group. How can I get only the grouped unique items, without any count?
Actual result is:
Name  IsItNew  ControlDate
Car1  True     31/01/2018    1
      False    15/03/2018    1
               30/05/2018    1
Car2  True     28/02/2018    1
               16/04/2018    1
      False    25/05/2018    1
Car3  True     30/04/2018    1
Expected Result is:
Name  IsItNew  ControlDate
Car1  False    15/03/2018
Car2  False    25/05/2018
Car3  True     30/04/2018
I'd appreciate any ideas. Thanks!
One way to do it would be to GroupBy the Name, and aggregate on IsItNew with two functions: a custom one using any to check whether there are any False values, and idxmin, to find the index of the first False, which you can later use to index the dataframe on ControlDate:
df_ = (df.groupby('Name')
         .agg(IsItNew=('IsItNew', lambda x: ~(~x).any()),   # False if the car has any repair row
              ControlDate=('IsItNew', 'idxmin'))            # row label of the first False (or first row if all True)
         .reset_index())

# use the stored row labels to pull the matching dates from the original frame
df_['ControlDate'] = df.loc[df_['ControlDate'].values, 'ControlDate'].reset_index(drop=True)
   Name  IsItNew ControlDate
0  Car1    False  15/03/2018
1  Car2    False  25/05/2018
2  Car3     True  30/04/2018
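As a side note, ~(~x).any() is just x.all(). And if ControlDate has already been parsed to datetimes, here is a sketch of an alternative under that assumption, which avoids the idxmin round-trip:

# Alternative sketch; assumes ControlDate is a datetime column.
# False sorts before True and earlier dates sort first, so after sorting
# the first row per car is its earliest repair row, or its earliest row
# if it was never repaired.
alt = (df.sort_values(['Name', 'IsItNew', 'ControlDate'])
         .drop_duplicates('Name')
         .reset_index(drop=True))

For cars that were never repaired this keeps the earliest date rather than the first row encountered, which coincides on the sample above.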