I have the below sample DataFrame
             Timestamp Item Char  Value
4  1/7/2020 1:22:22 AM    B  C.B    3.2
0  1/7/2020 1:23:23 AM    A  C.A    1.0
2  1/7/2020 1:23:23 AM    A  C.B    1.3
1  1/7/2020 1:23:24 AM    A  C.A    2.0
5  1/7/2020 1:23:29 AM    B  C.B    3.0
3  1/7/2020 1:25:23 AM    B  C.B    2.0
I would like to add a new column that tells the order an Item appears in the same Char, based on the Timestamp. In particular, I would like to assign 1 to the last value, 2 to the second-last value and so on.
The result should look like as follows
             Timestamp Item Char  Value   Order
0  1/7/2020 1:23:23 AM    A  C.A    1.0   2
1  1/7/2020 1:23:24 AM    A  C.A    2.0   1
2  1/7/2020 1:23:23 AM    A  C.B    1.3   1 
3  1/7/2020 1:22:22 AM    B  C.B    3.2   3
4  1/7/2020 1:23:29 AM    B  C.B    3.0   2
5  1/7/2020 1:25:23 AM    B  C.B    2.0   1
As you see the B item appears several times in the Char C.B. I would assign 1 to the most recent value based on the Timestamp.
My idea is to group the DataFrame by Item and by Char, then order the rows of each group by the Timestamp in descending order, finally assign 1 to the first row, 2 to the second and so on. But I don´t actually know how to do this.
Can you help me out?
Thank you very much!
Let's groupby the column Timestamp on Char and Item and compute the rank using method=first, then use sort_values to sort the dataframe based on Char and Item:
df['Order'] = pd.to_datetime(df['Timestamp'])\
              .groupby([df['Char'], df['Item']])\
              .rank(method='first', ascending=False)
df = df.sort_values(['Char', 'Item'], ignore_index=True)
             Timestamp Item Char  Value  Order
0  1/7/2020 1:23:23 AM    A  C.A    1.0    2.0
1  1/7/2020 1:23:24 AM    A  C.A    2.0    1.0
2  1/7/2020 1:23:23 AM    A  C.B    1.3    1.0
3  1/7/2020 1:22:22 AM    B  C.B    3.2    3.0
4  1/7/2020 1:23:29 AM    B  C.B    3.0    2.0
5  1/7/2020 1:25:23 AM    B  C.B    2.0    1.0
                        Sort and Transform
 df = df.sort_values(['Timestamp'],ascending=False)
 df['Order'] = df.groupby(['Item', 'Char'])['Value'].transform(
     lambda x: np.arange(1, len(x)+1))
Sammple:
import pandas as pd
from io import StringIO 
data = StringIO("""
,Timestamp,Item,Char,Value
0,1/7/2020 1:22:22 AM,B,C.B,3.2
1,1/7/2020 1:23:23 AM,A,C.A,1.0
2,1/7/2020 1:23:23 AM,A,C.B,1.3
3,1/7/2020 1:23:24 AM,A,C.A,2.0
4,1/7/2020 1:23:29 AM,B,C.B,3.0
5,1/7/2020 1:25:23 AM,B,C.B,2.0
""" )
df = pd.read_csv(data, index_col=0)
df['Timestamp'] = pd.to_datetime(df['Timestamp'])
df = df.sort_values(['Timestamp'],ascending=False)
df['Order'] = df.groupby(['Item', 'Char'])['Value'].transform(
     lambda x: np.arange(1, len(x)+1))
 
print (df.sort_values(['Item', 'Timestamp']))
output:
            Timestamp Item Char  Value  Order
1 2020-01-07 01:23:23    A  C.A    1.0    2.0
2 2020-01-07 01:23:23    A  C.B    1.3    1.0
3 2020-01-07 01:23:24    A  C.A    2.0    1.0
0 2020-01-07 01:22:22    B  C.B    3.2    3.0
4 2020-01-07 01:23:29    B  C.B    3.0    2.0
5 2020-01-07 01:25:23    B  C.B    2.0    1.0
                        If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With