Python Partition By

Question

I have the following dataframe df:

| Staff_ID | Join_Date | Time_Stamp |
|----------|-----------|------------|
| 1        | 3/29/2016 | 4/23/2016  |
| 1        | 3/29/2016 | 3/29/2016  |
| 1        | 3/29/2016 | 6/21/2016  |
| 2        | 5/15/2016 | 4/1/2016   |
| 2        | 5/15/2016 | 5/25/2016  |
| 3        | 7/24/2016 | 6/21/2016  |
| 3        | 7/24/2016 | 6/10/2016  |
| 3        | 7/24/2016 | 4/21/2016  |

I want to get the min and max "Time_Stamp", partitioned by "Staff_ID", so that the resulting dataframe is as follows:

| Staff_ID | Join_Date | Time_Stamp | Min_Time_Stamp | Max_Time_Stamp |
|----------|-----------|------------|----------------|----------------|
| 1        | 3/29/2016 | 4/23/2016  | 3/29/2016      | 6/21/2016      |
| 1        | 3/29/2016 | 3/29/2016  | 3/29/2016      | 6/21/2016      |
| 1        | 3/29/2016 | 6/21/2016  | 3/29/2016      | 6/21/2016      |
| 2        | 5/15/2016 | 4/1/2016   | 4/1/2016       | 5/25/2016      |
| 2        | 5/15/2016 | 5/25/2016  | 4/1/2016       | 5/25/2016      |
| 3        | 7/24/2016 | 6/21/2016  | 4/21/2016      | 6/21/2016      |
| 3        | 7/24/2016 | 6/10/2016  | 4/21/2016      | 6/21/2016      |
| 3        | 7/24/2016 | 4/21/2016  | 4/21/2016      | 6/21/2016      |

How can I do this in Python?

Scott Boston · Accepted Answer

Let's use groupby with transform and assign:

g = df.groupby('Staff_ID')['Time_Stamp']
# transform returns one value per original row, so the result aligns with df
df.assign(Min_Time_Stamp=g.transform('min'), Max_Time_Stamp=g.transform('max'))

Output:

   Staff_ID  Join_Date Time_Stamp Max_Time_Stamp Min_Time_Stamp
1         1  3/29/2016  4/23/2016      6/21/2016      3/29/2016
2         1  3/29/2016  3/29/2016      6/21/2016      3/29/2016
3         1  3/29/2016  6/21/2016      6/21/2016      3/29/2016
4         2  5/15/2016   4/1/2016      5/25/2016       4/1/2016
5         2  5/15/2016  5/25/2016      5/25/2016       4/1/2016
6         3  7/24/2016  6/21/2016      6/21/2016      4/21/2016
7         3  7/24/2016  6/10/2016      6/21/2016      4/21/2016
8         3  7/24/2016  4/21/2016      6/21/2016      4/21/2016
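One caveat: if Time_Stamp is stored as strings, min and max compare text rather than dates, which only happens to give the right answer for this sample. Here is a minimal self-contained sketch that parses the column first. The dataframe is rebuilt from the question's table, and the `%m/%d/%Y` format and the name `result` are my assumptions:

```python
import pandas as pd

# Sample data copied from the question's table
df = pd.DataFrame({
    "Staff_ID": [1, 1, 1, 2, 2, 3, 3, 3],
    "Join_Date": ["3/29/2016"] * 3 + ["5/15/2016"] * 2 + ["7/24/2016"] * 3,
    "Time_Stamp": ["4/23/2016", "3/29/2016", "6/21/2016", "4/1/2016",
                   "5/25/2016", "6/21/2016", "6/10/2016", "4/21/2016"],
})

# Parse to real datetimes so min/max are chronological, not lexicographic
# (assumes month/day/year input)
df["Time_Stamp"] = pd.to_datetime(df["Time_Stamp"], format="%m/%d/%Y")

g = df.groupby("Staff_ID")["Time_Stamp"]

# transform broadcasts each group's aggregate back to every row of that group,
# which is the pandas counterpart of MIN(...) OVER (PARTITION BY Staff_ID)
result = df.assign(
    Min_Time_Stamp=g.transform("min"),
    Max_Time_Stamp=g.transform("max"),
)
print(result)
```

Passing the aggregation by name ("min", "max") rather than the Python builtin avoids the deprecation warning that recent pandas versions emit for builtins.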

Timings:

@CarlesMitjans method:

10 loops, best of 3: 33.3 ms per loop

@ScottBoston method:

100 loops, best of 3: 5.52 ms per loop
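To reproduce a comparison like this yourself, something along these lines might work. It is only a sketch: `with_agg_join` is a hypothetical aggregate-then-join alternative that I made up for illustration, not necessarily the other answer's method, and the numbers will depend on your data size and pandas version.

```python
import timeit

import pandas as pd

# Sample data from the question (Join_Date omitted for brevity)
df = pd.DataFrame({
    "Staff_ID": [1, 1, 1, 2, 2, 3, 3, 3],
    "Time_Stamp": pd.to_datetime(
        ["4/23/2016", "3/29/2016", "6/21/2016", "4/1/2016",
         "5/25/2016", "6/21/2016", "6/10/2016", "4/21/2016"],
        format="%m/%d/%Y",
    ),
})


def with_transform(frame):
    """The groupby + transform approach from this answer."""
    g = frame.groupby("Staff_ID")["Time_Stamp"]
    return frame.assign(
        Min_Time_Stamp=g.transform("min"),
        Max_Time_Stamp=g.transform("max"),
    )


def with_agg_join(frame):
    """Hypothetical alternative: aggregate once per group, then join back."""
    stats = frame.groupby("Staff_ID")["Time_Stamp"].agg(
        Min_Time_Stamp="min", Max_Time_Stamp="max"
    )
    return frame.join(stats, on="Staff_ID")


runs = 100
for fn in (with_transform, with_agg_join):
    seconds = timeit.timeit(lambda: fn(df), number=runs)
    print(f"{fn.__name__}: {seconds / runs * 1000:.2f} ms per call")
```

On an eight-row frame the timings mostly reflect fixed overhead, so measure on data of a realistic size before drawing conclusions.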
