
How to select the last 5 rows for each unique value in pandas

Using Python 3, I am trying to get, for each unique value in the column 'Name', the last 5 records from the column 'Number'. How exactly can this be done in Python? My df looks like this:

Name    Number
a   5
a   6
b   7
b   8
a   9
a   10
b   11
b   12
a   9
b   8

I saw some examples in SQL (like this one: Get sum of last 5 rows for each unique id), but that is time-consuming and I would like to learn how to do it in Python.

My expected output df would be like this:

Name    1   2   3   4   5
a   5   6   9   10  9
b   7   8   11  12  8
Alexandru Costinel asked Jun 27 '19 12:06



2 Answers

I think you need something like this:

# Keep the last 5 rows of each Name, then number them 1..5 within each group
df_out = df.groupby('Name').tail(5)
df_out.set_index(['Name', df_out.groupby('Name').cumcount() + 1])['Number'].unstack()

Output:

      1  2   3   4  5
Name                 
a     5  6   9  10  9
b     7  8  11  12  8
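
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same groupby/tail/unstack approach. The DataFrame is rebuilt from the sample in the question, and `last_n_wide` is a hypothetical helper name introduced only for illustration; it is not a pandas function.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name": list("aabbaabbab"),
    "Number": [5, 6, 7, 8, 9, 10, 11, 12, 9, 8],
})


def last_n_wide(frame, key, value, n):
    """Hypothetical helper: last n `value` entries per `key`, one row per key."""
    # tail(n) keeps the last n rows of every group, in their original order.
    tail = frame.groupby(key).tail(n)
    # cumcount() numbers the rows within each group from 0, so add 1.
    position = tail.groupby(key).cumcount() + 1
    # Move the per-group position into the columns.
    return tail.set_index([key, position])[value].unstack()


wide = last_n_wide(df, "Name", "Number", 5)
print(wide)
```

Wrapping the two steps in a function just makes `n` and the column names easy to change; the logic is the same as in the two lines above.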
Scott Boston answered Nov 15 '22 05:11



Looks like you need a pivot after groupby.cumcount():

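# Last 5 rows per Name
df1 = df.groupby('Name').tail(5)
# k is the 1-based position of each row within its group; pivot it into columns
final = (df1.assign(k=df1.groupby('Name').cumcount() + 1)
            .pivot(index='Name', columns='k', values='Number')
            .reset_index().rename_axis(None, axis=1))
print(final)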

  Name  1  2   3   4  5
0    a  5  6   9  10  9
1    b  7  8  11  12  8
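
One thing worth knowing: in the question's sample every name has exactly 5 rows, so `tail(5)` returns everything. Below is a small sketch of the same cumcount/pivot idea on made-up data with uneven group sizes (the values are an assumption for illustration, not from the question). Groups with fewer than 5 rows are padded with NaN on the right, which also turns the integer columns into floats.

```python
import pandas as pd

# Hypothetical data: 'a' has 7 rows, 'b' has only 2.
df = pd.DataFrame({
    "Name": ["a"] * 7 + ["b"] * 2,
    "Number": [1, 2, 3, 4, 5, 6, 7, 10, 20],
})

# Last 5 rows per Name (all rows for groups shorter than 5).
last5 = df.groupby("Name").tail(5)

# k is the 1-based position within each group; pivot spreads it across columns.
wide = (last5.assign(k=last5.groupby("Name").cumcount() + 1)
             .pivot(index="Name", columns="k", values="Number"))
print(wide)
```

If you need integers back, something like `wide.fillna(0).astype(int)` would work, assuming 0 is an acceptable filler for your data.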
anky answered Nov 15 '22 04:11
