I have the below sample DataFrame
Timestamp Item Char Value
4 1/7/2020 1:22:22 AM B C.B 3.2
0 1/7/2020 1:23:23 AM A C.A 1.0
2 1/7/2020 1:23:23 AM A C.B 1.3
1 1/7/2020 1:23:24 AM A C.A 2.0
5 1/7/2020 1:23:29 AM B C.B 3.0
3 1/7/2020 1:25:23 AM B C.B 2.0
I would like to add a new column that tells the order an Item appears in the same Char, based on the Timestamp. In particular, I would like to assign 1 to the last value, 2 to the second-last value and so on.
The result should look like as follows
Timestamp Item Char Value Order
0 1/7/2020 1:23:23 AM A C.A 1.0 2
1 1/7/2020 1:23:24 AM A C.A 2.0 1
2 1/7/2020 1:23:23 AM A C.B 1.3 1
3 1/7/2020 1:22:22 AM B C.B 3.2 3
4 1/7/2020 1:23:29 AM B C.B 3.0 2
5 1/7/2020 1:25:23 AM B C.B 2.0 1
As you see the B item appears several times in the Char C.B. I would assign 1 to the most recent value based on the Timestamp.
My idea is to group the DataFrame by Item and by Char, then order the rows of each group by the Timestamp in descending order, finally assign 1 to the first row, 2 to the second and so on. But I don´t actually know how to do this.
Can you help me out?
Thank you very much!
Let's groupby
the column Timestamp
on Char
and Item
and compute the rank
using method=first
, then use sort_values
to sort the dataframe based on Char
and Item
:
df['Order'] = pd.to_datetime(df['Timestamp'])\
.groupby([df['Char'], df['Item']])\
.rank(method='first', ascending=False)
df = df.sort_values(['Char', 'Item'], ignore_index=True)
Timestamp Item Char Value Order
0 1/7/2020 1:23:23 AM A C.A 1.0 2.0
1 1/7/2020 1:23:24 AM A C.A 2.0 1.0
2 1/7/2020 1:23:23 AM A C.B 1.3 1.0
3 1/7/2020 1:22:22 AM B C.B 3.2 3.0
4 1/7/2020 1:23:29 AM B C.B 3.0 2.0
5 1/7/2020 1:25:23 AM B C.B 2.0 1.0
Sort
and Transform
df = df.sort_values(['Timestamp'],ascending=False)
df['Order'] = df.groupby(['Item', 'Char'])['Value'].transform(
lambda x: np.arange(1, len(x)+1))
Sammple:
import pandas as pd
from io import StringIO
data = StringIO("""
,Timestamp,Item,Char,Value
0,1/7/2020 1:22:22 AM,B,C.B,3.2
1,1/7/2020 1:23:23 AM,A,C.A,1.0
2,1/7/2020 1:23:23 AM,A,C.B,1.3
3,1/7/2020 1:23:24 AM,A,C.A,2.0
4,1/7/2020 1:23:29 AM,B,C.B,3.0
5,1/7/2020 1:25:23 AM,B,C.B,2.0
""" )
df = pd.read_csv(data, index_col=0)
df['Timestamp'] = pd.to_datetime(df['Timestamp'])
df = df.sort_values(['Timestamp'],ascending=False)
df['Order'] = df.groupby(['Item', 'Char'])['Value'].transform(
lambda x: np.arange(1, len(x)+1))
print (df.sort_values(['Item', 'Timestamp']))
output:
Timestamp Item Char Value Order
1 2020-01-07 01:23:23 A C.A 1.0 2.0
2 2020-01-07 01:23:23 A C.B 1.3 1.0
3 2020-01-07 01:23:24 A C.A 2.0 1.0
0 2020-01-07 01:22:22 B C.B 3.2 3.0
4 2020-01-07 01:23:29 B C.B 3.0 2.0
5 2020-01-07 01:25:23 B C.B 2.0 1.0
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With