I have a dataframe df from which I want to drop the last n rows within each group defined by a set of columns. For example, say df is defined as below and the groups are formed by columns a and b:
>>> import pandas as pd
>>> df = pd.DataFrame({'a':['abd']*4 + ['pqr']*5 + ['xyz']*7, 'b':['john']*7 + ['doe']*9, 'c':range(16), 'd':range(1000,1016)})
>>> df
      a     b   c     d
0   abd  john   0  1000
1   abd  john   1  1001
2   abd  john   2  1002
3   abd  john   3  1003
4   pqr  john   4  1004
5   pqr  john   5  1005
6   pqr  john   6  1006
7   pqr   doe   7  1007
8   pqr   doe   8  1008
9   xyz   doe   9  1009
10  xyz   doe  10  1010
11  xyz   doe  11  1011
12  xyz   doe  12  1012
13  xyz   doe  13  1013
14  xyz   doe  14  1014
15  xyz   doe  15  1015
>>>
Desired output for n=2 is as follows:
>>> df
      a     b   c     d
0   abd  john   0  1000
1   abd  john   1  1001
4   pqr  john   4  1004
9   xyz   doe   9  1009
10  xyz   doe  10  1010
11  xyz   doe  11  1011
12  xyz   doe  12  1012
13  xyz   doe  13  1013
>>>
Desired output for n=3 is as follows:
>>> df
      a     b   c     d
0   abd  john   0  1000
9   xyz   doe   9  1009
10  xyz   doe  10  1010
11  xyz   doe  11  1011
12  xyz   doe  12  1012
>>>
We can remove the last n rows of a whole dataframe with the drop() method by passing it the index labels of those rows. drop() accepts an inplace argument that takes a boolean value. If inplace is set to True, the dataframe itself is modified (the last n rows are removed from it) and drop() returns None; otherwise the original is left untouched and a new dataframe is returned.
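As a rough illustration of the drop() behaviour described above, here is a minimal sketch on a small made-up frame (the column name c and the variable names are only for illustration). It ignores groups entirely and simply trims the end of the whole dataframe, with and without inplace:

```python
import pandas as pd

df = pd.DataFrame({"c": range(5)})
n = 2

# Without inplace: df is left unchanged and a new frame is returned.
trimmed = df.drop(df.tail(n).index)

# With inplace=True: the frame is modified directly and drop() returns None.
df_copy = df.copy()
returned = df_copy.drop(df_copy.tail(n).index, inplace=True)
```

Note that drop() removes rows by index label, so this relies on the index labels being unique.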
Drop the last row using head(): you can also use df.head(df.shape[0] - 1) to remove the last row of a pandas DataFrame. More generally, df.head(df.shape[0] - n) keeps everything except the last n rows.
Drop the first n rows using iloc[]: use DataFrame.iloc[] with the slice [n:], where n is an integer, to skip the first n rows of a pandas DataFrame. For example, df.iloc[n:] returns every row after the first n; replace n with the number of rows you want to delete.
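The head() and iloc[] approaches can be sketched side by side as follows. The small frame and the variable names are made up for illustration, and none of this is group-aware:

```python
import pandas as pd

df = pd.DataFrame({"c": range(6)})
n = 2

# Keep everything except the last n rows, counting from the front.
without_last_n = df.head(df.shape[0] - n)

# Same rows via a negative positional slice (only valid for n > 0).
also_without_last_n = df.iloc[:-n]

# Skip the first n rows instead.
without_first_n = df.iloc[n:]
```

The df.head(df.shape[0] - n) form also behaves sensibly for n = 0, whereas df.iloc[:-n] with n = 0 is the slice [:0] and returns an empty frame.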
We can drop one or more columns from the dataframe by passing the column names to drop() and setting axis=1.
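For completeness, a short sketch of dropping columns, using a made-up three-column frame:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

# A single column name, or a list of names, with axis=1.
one_dropped = df.drop("c", axis=1)
two_dropped = df.drop(["b", "c"], axis=1)

# The columns= keyword is an equivalent spelling of axis=1.
same_result = df.drop(columns=["b", "c"])
```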
You can use groupby and drop as below:
n = 2
df.drop(df.groupby(['a','b']).tail(n).index, axis=0)
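To check this against the desired outputs in the question, here is a self-contained sketch that rebuilds df and also shows one possible alternative based on groupby(...).cumcount(ascending=False), which numbers the rows of each group from the end. The helper name drop_last_n_per_group is hypothetical and not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({
    "a": ["abd"] * 4 + ["pqr"] * 5 + ["xyz"] * 7,
    "b": ["john"] * 7 + ["doe"] * 9,
    "c": range(16),
    "d": range(1000, 1016),
})

# The groupby + drop approach from the answer, for n = 2.
n = 2
via_drop = df.drop(df.groupby(["a", "b"]).tail(n).index, axis=0)


def drop_last_n_per_group(frame, keys, n):
    """Hypothetical helper: keep rows that are at least n positions from the end of their group."""
    # cumcount(ascending=False) gives 0 for the last row of each group,
    # 1 for the second to last, and so on.
    from_end = frame.groupby(keys).cumcount(ascending=False)
    return frame[from_end >= n]


via_mask_2 = drop_last_n_per_group(df, ["a", "b"], 2)
via_mask_3 = drop_last_n_per_group(df, ["a", "b"], 3)

print(via_drop.index.tolist())
print(via_mask_3.index.tolist())
```

Both variants remove a group entirely when it has n rows or fewer, as with (pqr, doe) for n=2. The drop version works by index label, so it assumes the index is unique; the boolean mask does not depend on the labels.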