Order Pandas DataFrame by groups and Timestamp

Question

I have the sample DataFrame below:

             Timestamp Item Char  Value
4  1/7/2020 1:22:22 AM    B  C.B    3.2
0  1/7/2020 1:23:23 AM    A  C.A    1.0
2  1/7/2020 1:23:23 AM    A  C.B    1.3
1  1/7/2020 1:23:24 AM    A  C.A    2.0
5  1/7/2020 1:23:29 AM    B  C.B    3.0
3  1/7/2020 1:25:23 AM    B  C.B    2.0

I would like to add a new column that gives the order in which an Item appears within the same Char, based on the Timestamp. In particular, I would like to assign 1 to the most recent row, 2 to the second most recent, and so on.

The result should look as follows:

             Timestamp Item Char  Value   Order
0  1/7/2020 1:23:23 AM    A  C.A    1.0   2
1  1/7/2020 1:23:24 AM    A  C.A    2.0   1
2  1/7/2020 1:23:23 AM    A  C.B    1.3   1 
3  1/7/2020 1:22:22 AM    B  C.B    3.2   3
4  1/7/2020 1:23:29 AM    B  C.B    3.0   2
5  1/7/2020 1:25:23 AM    B  C.B    2.0   1

As you can see, item B appears several times in Char C.B. I would like to assign 1 to its most recent row, based on the Timestamp.

My idea is to group the DataFrame by Item and by Char, then sort the rows of each group by Timestamp in descending order, and finally assign 1 to the first row, 2 to the second, and so on. But I don't actually know how to do this.

Can you help me out?

Thank you very much!

Shubham Sharma · Accepted Answer

Let's group the Timestamp column by Char and Item and compute a descending rank using method='first', then use sort_values to sort the DataFrame by Char and Item:

df['Order'] = pd.to_datetime(df['Timestamp'])\
              .groupby([df['Char'], df['Item']])\
              .rank(method='first', ascending=False)

df = df.sort_values(['Char', 'Item'], ignore_index=True)

             Timestamp Item Char  Value  Order
0  1/7/2020 1:23:23 AM    A  C.A    1.0    2.0
1  1/7/2020 1:23:24 AM    A  C.A    2.0    1.0
2  1/7/2020 1:23:23 AM    A  C.B    1.3    1.0
3  1/7/2020 1:22:22 AM    B  C.B    3.2    3.0
4  1/7/2020 1:23:29 AM    B  C.B    3.0    2.0
5  1/7/2020 1:25:23 AM    B  C.B    2.0    1.0
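
The rank comes back as floats (2.0, 1.0, ...), while the question asks for whole numbers. A minimal self-contained sketch of the same approach, assuming the sample data from the question and that no Timestamp is missing (so the cast to int is safe), could look like this. The explicit `format` string and the `ts` variable name are my own additions, not part of the answer above:

```python
import pandas as pd

# Sample data copied from the question (default 0..5 index assumed).
df = pd.DataFrame({
    "Timestamp": ["1/7/2020 1:22:22 AM", "1/7/2020 1:23:23 AM",
                  "1/7/2020 1:23:23 AM", "1/7/2020 1:23:24 AM",
                  "1/7/2020 1:23:29 AM", "1/7/2020 1:25:23 AM"],
    "Item": ["B", "A", "A", "A", "B", "B"],
    "Char": ["C.B", "C.A", "C.B", "C.A", "C.B", "C.B"],
    "Value": [3.2, 1.0, 1.3, 2.0, 3.0, 2.0],
})

# Parse the strings so that ranking is chronological, not alphabetical.
ts = pd.to_datetime(df["Timestamp"], format="%m/%d/%Y %I:%M:%S %p")

# Rank newest-first within each (Char, Item) group, then cast to int.
df["Order"] = (
    ts.groupby([df["Char"], df["Item"]])
      .rank(method="first", ascending=False)
      .astype(int)
)

df = df.sort_values(["Char", "Item"], ignore_index=True)
print(df)
```

If the Timestamp column can contain missing values, the rank will contain NaN and the plain `astype(int)` cast would fail, so that case needs separate handling.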

mujjiga · Answer

Sort and Transform

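import numpy as np

# Timestamp should already be a datetime column (see the sample below),
# otherwise the sort is done on strings.
df = df.sort_values(['Timestamp'], ascending=False)
df['Order'] = df.groupby(['Item', 'Char'])['Value'].transform(
    lambda x: np.arange(1, len(x) + 1))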

Sample:

import numpy as np
import pandas as pd
from io import StringIO
data = StringIO("""
,Timestamp,Item,Char,Value
0,1/7/2020 1:22:22 AM,B,C.B,3.2
1,1/7/2020 1:23:23 AM,A,C.A,1.0
2,1/7/2020 1:23:23 AM,A,C.B,1.3
3,1/7/2020 1:23:24 AM,A,C.A,2.0
4,1/7/2020 1:23:29 AM,B,C.B,3.0
5,1/7/2020 1:25:23 AM,B,C.B,2.0
""" )
df = pd.read_csv(data, index_col=0)
df['Timestamp'] = pd.to_datetime(df['Timestamp'])

# Newest rows first, then number the rows of each (Item, Char) group 1, 2, 3, ...
df = df.sort_values(['Timestamp'], ascending=False)
df['Order'] = df.groupby(['Item', 'Char'])['Value'].transform(
    lambda x: np.arange(1, len(x) + 1))

print(df.sort_values(['Item', 'Timestamp']))

Output:

            Timestamp Item Char  Value  Order
1 2020-01-07 01:23:23    A  C.A    1.0    2.0
2 2020-01-07 01:23:23    A  C.B    1.3    1.0
3 2020-01-07 01:23:24    A  C.A    2.0    1.0
0 2020-01-07 01:22:22    B  C.B    3.2    3.0
4 2020-01-07 01:23:29    B  C.B    3.0    2.0
5 2020-01-07 01:25:23    B  C.B    2.0    1.0
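
A possible variation on the sort-then-number idea, not from the answer above: `GroupBy.cumcount` numbers the rows of each group in their current order, so it can stand in for the `transform` with a lambda and yields integers directly. This is a sketch under the assumption that the data matches the sample; the `result` name and the explicit `format` string are illustrative choices:

```python
import pandas as pd

# Same sample data as above, built directly instead of via read_csv.
df = pd.DataFrame({
    "Timestamp": ["1/7/2020 1:22:22 AM", "1/7/2020 1:23:23 AM",
                  "1/7/2020 1:23:23 AM", "1/7/2020 1:23:24 AM",
                  "1/7/2020 1:23:29 AM", "1/7/2020 1:25:23 AM"],
    "Item": ["B", "A", "A", "A", "B", "B"],
    "Char": ["C.B", "C.A", "C.B", "C.A", "C.B", "C.B"],
    "Value": [3.2, 1.0, 1.3, 2.0, 3.0, 2.0],
})
df["Timestamp"] = pd.to_datetime(df["Timestamp"],
                                 format="%m/%d/%Y %I:%M:%S %p")

# Newest rows first; mergesort is stable, so equal timestamps keep
# their original relative order.
df = df.sort_values("Timestamp", ascending=False, kind="mergesort")

# cumcount() is 0-based within each (Item, Char) group, so add 1.
df["Order"] = df.groupby(["Item", "Char"]).cumcount() + 1

result = df.sort_values(["Item", "Timestamp"], kind="mergesort")
print(result)
```

Here `Order` is an integer column, so no cast is needed afterwards.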

Tags:

python

pandas

Daniel Zito
