Use drop() or head() to remove the last row of a pandas DataFrame.
To delete a row from a DataFrame, use the drop() method and pass the row's index label as the argument.
To remove the last N rows, pass drop() a sequence of the row labels to delete; it returns the DataFrame without those rows.
Use the pandas.DataFrame.drop() method to delete rows that match one or more conditions.
To drop last n rows:
df.drop(df.tail(n).index, inplace=True)  # drop last n rows
In the same vein, you can drop the first n rows:
df.drop(df.head(n).index, inplace=True)  # drop first n rows
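A small sketch of the drop-by-index approach on a toy frame (the frames and the names df, dup and n are made up for illustration). One thing worth knowing: drop() works on index labels rather than positions, so on a frame with duplicate labels it removes every row that carries a matching label.

```python
import pandas as pd

# Hypothetical toy frame with a default RangeIndex 0..4.
df = pd.DataFrame({"A": [10, 20, 30, 40, 50]})

n = 2
last_dropped = df.drop(df.tail(n).index)   # labels 3 and 4 removed
first_dropped = df.drop(df.head(n).index)  # labels 0 and 1 removed

# drop() matches labels, so duplicate labels are all removed together.
dup = pd.DataFrame({"A": [1, 2, 3]}, index=["x", "y", "x"])
dup_dropped = dup.drop(dup.tail(1).index)  # drops both rows labelled "x"
```

If the index may contain duplicates, a positional approach such as slicing or iloc is probably the safer choice.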
DF[:-n]
where n is the number of rows to drop from the end.
To drop the last row:
DF = DF[:-1]
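One edge case with negative slicing: when n is 0, DF[:-n] becomes DF[:0], which is empty rather than the full frame. A minimal sketch of a guard, assuming a hypothetical helper named drop_last (not part of pandas):

```python
import pandas as pd

df = pd.DataFrame({"A": [10, 20, 30, 40]})

def drop_last(frame, n):
    """Return frame without its last n rows (hypothetical helper)."""
    # frame[:-0] is frame[:0], i.e. empty, so compute the stop position explicitly.
    return frame.iloc[: max(len(frame) - n, 0)]

trimmed = drop_last(df, 1)    # first three rows
unchanged = drop_last(df, 0)  # all four rows
naive_zero = df[:-0]          # empty frame, because -0 == 0
```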
Since index positioning in Python is 0-based, there won't actually be an element in index at the location corresponding to len(DF). You need that to be last_row = len(DF) - 1:
In [49]: dfrm
Out[49]:
A B C
0 0.120064 0.785538 0.465853
1 0.431655 0.436866 0.640136
2 0.445904 0.311565 0.934073
3 0.981609 0.695210 0.911697
4 0.008632 0.629269 0.226454
5 0.577577 0.467475 0.510031
6 0.580909 0.232846 0.271254
7 0.696596 0.362825 0.556433
8 0.738912 0.932779 0.029723
9 0.834706 0.002989 0.333436
[10 rows x 3 columns]
In [50]: dfrm.drop(dfrm.index[len(dfrm)-1])
Out[50]:
A B C
0 0.120064 0.785538 0.465853
1 0.431655 0.436866 0.640136
2 0.445904 0.311565 0.934073
3 0.981609 0.695210 0.911697
4 0.008632 0.629269 0.226454
5 0.577577 0.467475 0.510031
6 0.580909 0.232846 0.271254
7 0.696596 0.362825 0.556433
8 0.738912 0.932779 0.029723
[9 rows x 3 columns]
However, it's much simpler to just write DF[:-1].
Surprised nobody brought this one up:
# To remove last n rows
df.head(-n)
# To remove first n rows
df.tail(-n)
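A quick illustration on a made-up five-row frame. Both calls return a new object and keep the original index labels, so a reset_index(drop=True) may be wanted afterwards if a fresh 0-based index matters:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4, 5]})
n = 2

without_last = df.head(-n)    # all rows except the last n
without_first = df.tail(-n)   # all rows except the first n

# The original index labels are kept; reset them if a fresh 0..k index is wanted.
renumbered = without_first.reset_index(drop=True)
```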
Running a speed test on a DataFrame of 1000 rows shows that slicing and head/tail are ~6 times faster than using drop:
>>> %timeit df[:-1]
125 µs ± 132 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> %timeit df.head(-1)
129 µs ± 1.18 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> %timeit df.drop(df.tail(1).index)
751 µs ± 20.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
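For anyone without IPython's %timeit, a rough way to repeat the comparison with the standard timeit module is sketched below. The random 1000-row frame and the repetition count are assumptions, and the absolute numbers will differ by machine and pandas version:

```python
import timeit

import numpy as np
import pandas as pd

# Assumed test data: 1000 rows of random floats in three columns.
df = pd.DataFrame(np.random.rand(1000, 3), columns=["A", "B", "C"])

candidates = {
    "slice": lambda: df[:-1],
    "head": lambda: df.head(-1),
    "drop": lambda: df.drop(df.tail(1).index),
}

repeats = 200
# Average seconds per call for each approach.
timings = {
    name: timeit.timeit(fn, number=repeats) / repeats
    for name, fn in candidates.items()
}
for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e6:.1f} µs per call")
```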
Just use indexing:
df.iloc[:-1,:]
That's why iloc exists. You can also use head or tail.
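A short sketch of the iloc version on a toy frame with non-default labels (the data is made up). Because iloc is purely positional, the labels do not matter; the .copy() is an optional extra that makes sure later writes to the result cannot touch the original:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]}, index=[10, 20, 30])

trimmed = df.iloc[:-1, :].copy()  # positional: drops the row labelled 30
trimmed.loc[10, "A"] = 99         # modifies the copy only, df is unchanged
```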
stats = pd.read_csv("C:\\py\\programs\\second pandas\\ex.csv")
The output of stats:
A B C
0 0.120064 0.785538 0.465853
1 0.431655 0.436866 0.640136
2 0.445904 0.311565 0.934073
3 0.981609 0.695210 0.911697
4 0.008632 0.629269 0.226454
5 0.577577 0.467475 0.510031
6 0.580909 0.232846 0.271254
7 0.696596 0.362825 0.556433
8 0.738912 0.932779 0.029723
9 0.834706 0.002989 0.333436
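If the data comes from a file, just use skipfooter=1 when reading it (this needs engine='python', since the C parser does not support skipfooter):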
skipfooter : int, default 0
Number of lines at bottom of file to skip
stats_2 = pd.read_csv("C:\\py\\programs\\second pandas\\ex.csv", skipfooter=1, engine='python')
Output of stats_2:
A B C
0 0.120064 0.785538 0.465853
1 0.431655 0.436866 0.640136
2 0.445904 0.311565 0.934073
3 0.981609 0.695210 0.911697
4 0.008632 0.629269 0.226454
5 0.577577 0.467475 0.510031
6 0.580909 0.232846 0.271254
7 0.696596 0.362825 0.556433
8 0.738912 0.932779 0.029723
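The same idea can be tried without a file on disk by feeding read_csv an in-memory buffer. The CSV text below, including its TOTAL footer line, is invented for illustration:

```python
import io

import pandas as pd

# Hypothetical CSV content whose last line is a footer rather than data.
csv_text = "A,B\n1,2\n3,4\n5,6\nTOTAL,12"

# skipfooter is only supported by the Python parsing engine.
stats_2 = pd.read_csv(io.StringIO(csv_text), skipfooter=1, engine="python")
```

This is mostly useful when the last line is a summary or trailer, since dropping it at read time also keeps it from affecting how the column types are inferred.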
The nicest solution I've found that doesn't (necessarily?) do a full copy is
df.drop(df.index[-1], inplace=True)
Of course, you can simply omit inplace=True to create a new dataframe, and you can also easily delete the last N rows by simply taking slices of df.index (df.index[-N:] to drop the last N rows). So this approach is not only concise but also very flexible.
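Both variants side by side on a made-up frame, as a sketch:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4, 5]})
N = 2

new_df = df.drop(df.index[-N:])      # returns a new frame; df is untouched
df.drop(df.index[-1], inplace=True)  # mutates df, removing only the last row
```

Be careful with N = 0 here: df.index[-0:] is the whole index, so the drop would remove every row rather than none.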