How do I drop duplicates and keep the last timestamp in pandas?

Name: How to Drop Duplicates using drop_duplicates() function in Python Pandas
Uploaded: 2022-09-14 09:15:30

Question

I want to drop duplicates and keep the row with the last timestamp. Rows count as duplicates when they share the same customer_id and var_name. Here's my data:

    customer_id  value   var_name     timestamp
    1            1       apple        2018-03-22 00:00:00.000        
    2            3       apple        2018-03-23 08:00:00.000
    2            4       apple        2018-03-24 08:00:00.000
    1            1       orange       2018-03-22 08:00:00.000
    2            3       orange       2018-03-24 08:00:00.000
    2            5       orange       2018-03-23 08:00:00.000

So the result will be

    customer_id  value   var_name     timestamp
    1            1       apple        2018-03-22 00:00:00.000        
    2            4       apple        2018-03-24 08:00:00.000
    1            1       orange       2018-03-22 08:00:00.000
    2            3       orange       2018-03-24 08:00:00.000

jezrael · Accepted Answer

I think you need sort_values with drop_duplicates:

df = df.sort_values('timestamp').drop_duplicates(['customer_id','var_name'], keep='last')
print (df)
   customer_id  value var_name                timestamp
0            1      1    apple  2018-03-22 00:00:00.000
3            1      1   orange  2018-03-22 08:00:00.000
2            2      4    apple  2018-03-24 08:00:00.000
4            2      3   orange  2018-03-24 08:00:00.000
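
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the question's data by hand and converts timestamp to datetime first; the conversion and the kind="stable" argument are my additions, not part of the one-liner above. A stable sort should only matter when two rows of the same group share a timestamp, in which case the row that came later in the original frame is the one kept.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "customer_id": [1, 2, 2, 1, 2, 2],
    "value": [1, 3, 4, 1, 3, 5],
    "var_name": ["apple", "apple", "apple", "orange", "orange", "orange"],
    "timestamp": [
        "2018-03-22 00:00:00.000",
        "2018-03-23 08:00:00.000",
        "2018-03-24 08:00:00.000",
        "2018-03-22 08:00:00.000",
        "2018-03-24 08:00:00.000",
        "2018-03-23 08:00:00.000",
    ],
})

# Assumption: parse the strings so the sort is chronological, not lexical.
df["timestamp"] = pd.to_datetime(df["timestamp"])

# Sort by time, then keep the last row of each (customer_id, var_name) pair.
# kind="stable" (an addition) keeps tied timestamps in their original order.
result = (
    df.sort_values("timestamp", kind="stable")
      .drop_duplicates(["customer_id", "var_name"], keep="last")
)
print(result)
```

The rows come back ordered by timestamp, not in their original order; the original index labels are preserved, so calling .sort_index() afterwards is one way to restore the original order.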

If you don't want to sort because the original row order is important (this assumes timestamp is a datetime column, as in the output below):

df = df.loc[df.groupby(['customer_id','var_name'], sort=False)['timestamp'].idxmax()]
print (df)
   customer_id  value var_name           timestamp
0            1      1    apple 2018-03-22 00:00:00
2            2      4    apple 2018-03-24 08:00:00
3            1      1   orange 2018-03-22 08:00:00
4            2      3   orange 2018-03-24 08:00:00
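
A runnable sketch of the groupby approach, again using hand-built data from the question. The pd.to_datetime conversion is an assumption on my part: the output above shows datetime values, and idxmax may raise on a column of plain strings.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "customer_id": [1, 2, 2, 1, 2, 2],
    "value": [1, 3, 4, 1, 3, 5],
    "var_name": ["apple", "apple", "apple", "orange", "orange", "orange"],
    "timestamp": [
        "2018-03-22 00:00:00.000",
        "2018-03-23 08:00:00.000",
        "2018-03-24 08:00:00.000",
        "2018-03-22 08:00:00.000",
        "2018-03-24 08:00:00.000",
        "2018-03-23 08:00:00.000",
    ],
})

# Assumption: idxmax needs a real datetime column.
df["timestamp"] = pd.to_datetime(df["timestamp"])

# For each (customer_id, var_name) group, find the index label of the row
# with the latest timestamp. sort=False keeps groups in first-seen order.
latest_idx = df.groupby(["customer_id", "var_name"], sort=False)["timestamp"].idxmax()

# Select those rows by label.
latest = df.loc[latest_idx]
print(latest)
```

Note that sort=False orders the result by where each group first appears, which is not always the same as the original row order of the kept rows. If strict original order matters, appending .sort_index() to the last step should handle it for a default integer index.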

Tags: python, timestamp, pandas, dataframe

Nabih Bawazir
