
Drop last n rows within pandas dataframe groupby

I have a dataframe df from which I want to drop the last n rows within each group, where the groups are defined by a set of columns. For example, say df is defined as below and the groups are formed by columns a and b:

>>> import pandas as pd
>>> df = pd.DataFrame({'a':['abd']*4 + ['pqr']*5 + ['xyz']*7, 'b':['john']*7 + ['doe']*9, 'c':range(16), 'd':range(1000,1016)})
>>> df
      a     b   c     d
0   abd  john   0  1000
1   abd  john   1  1001
2   abd  john   2  1002
3   abd  john   3  1003
4   pqr  john   4  1004
5   pqr  john   5  1005
6   pqr  john   6  1006
7   pqr   doe   7  1007
8   pqr   doe   8  1008
9   xyz   doe   9  1009
10  xyz   doe  10  1010
11  xyz   doe  11  1011
12  xyz   doe  12  1012
13  xyz   doe  13  1013
14  xyz   doe  14  1014
15  xyz   doe  15  1015
>>> 

Desired output for n=2 is as follows:

>>> df
      a     b   c     d
0   abd  john   0  1000
1   abd  john   1  1001
4   pqr  john   4  1004
9   xyz   doe   9  1009
10  xyz   doe  10  1010
11  xyz   doe  11  1011
12  xyz   doe  12  1012
13  xyz   doe  13  1013
>>>

Desired output for n=3 is as follows:

>>> df
      a     b   c     d
0   abd  john   0  1000
9   xyz   doe   9  1009
10  xyz   doe  10  1010
11  xyz   doe  11  1011
12  xyz   doe  12  1012
>>> 
Gerry asked Jul 13 '20 18:07




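1 Answer

You can use groupby with tail to pick the last n rows of each group, then drop those rows by index label (this returns a new dataframe and leaves df unchanged):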

n = 2
df.drop(df.groupby(['a','b']).tail(n).index, axis=0)
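
To check the one-liner above against the desired outputs in the question, here is a minimal, self-contained sketch. The helper names drop_last_n and drop_last_n_mask are hypothetical, introduced only for illustration. The second helper is an alternative that was not part of the original answer: it builds a boolean mask with cumcount instead of dropping by label, which may be preferable if the index contains duplicate labels, since drop removes every row that shares a dropped label.

```python
import pandas as pd

# Same frame as in the question.
df = pd.DataFrame({
    'a': ['abd'] * 4 + ['pqr'] * 5 + ['xyz'] * 7,
    'b': ['john'] * 7 + ['doe'] * 9,
    'c': range(16),
    'd': range(1000, 1016),
})


def drop_last_n(frame, keys, n):
    """Hypothetical helper: the answer's approach, tail() + drop().

    Assumes the index labels are unique, because drop() works by label.
    """
    last_rows = frame.groupby(keys).tail(n)  # last n rows of every group
    return frame.drop(last_rows.index)


def drop_last_n_mask(frame, keys, n):
    """Hypothetical alternative: keep rows that are at least n from the group's end."""
    # cumcount(ascending=False) numbers rows from the end of each group,
    # so the last row of a group gets 0, the one before it 1, and so on.
    from_end = frame.groupby(keys).cumcount(ascending=False)
    return frame[from_end >= n]


result_2 = drop_last_n(df, ['a', 'b'], 2)
result_3 = drop_last_n(df, ['a', 'b'], 3)
print(result_2.index.tolist())  # [0, 1, 4, 9, 10, 11, 12, 13]
print(result_3.index.tolist())  # [0, 9, 10, 11, 12]
```

Groups with n or fewer rows disappear entirely, which matches the question: the (pqr, doe) group has only two rows, so it is absent from both desired outputs.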
nimbous answered Oct 12 '22 11:10