I have a data frame and I want to remove the last N rows from it. To remove 5 rows, I currently use the following command, which in my opinion is rather convoluted:
df<- df[-seq(nrow(df),nrow(df)-4),]
How would you accomplish this task? Is there a convenient function that I can use in R?
In Unix, I would use:
tac file | sed '1,5d' | tac
In pandas (Python), we can remove the last n rows using the drop() method. drop() accepts an inplace argument, which takes a boolean value. If inplace is set to True, the existing DataFrame is modified directly (its last n rows are removed) instead of a new DataFrame being returned.
Using iloc[] to drop the first n rows of a DataFrame: use DataFrame.iloc[] with the slice [n:], where n is an integer, to skip the first n rows and select the rest. For example, in df.iloc[n:], substitute n with the number of rows you want to delete.
You can also use the DataFrame.drop() method to delete the last n columns. Use axis=1 to specify columns and inplace=True to apply the change to the existing DataFrame.
head with a negative index is convenient for this...
df <- data.frame( a = 1:10 )
head(df,-5)
#  a
#1 1
#2 2
#3 3
#4 4
#5 5
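One thing to watch if the number of rows is a variable: head(df, -n) with n equal to 0 is the same as head(df, 0), which returns zero rows rather than the whole data frame. A minimal sketch of a wrapper that handles this case is below; the name drop_last is made up for illustration and is not a base R function.

```r
# Hypothetical helper (not part of base R): drop the last n rows of a data frame.
drop_last <- function(df, n) {
  stopifnot(is.data.frame(df), n >= 0)
  if (n == 0) return(df)  # head(df, 0) would return zero rows, so handle 0 explicitly
  head(df, -n)            # keeps all but the last n rows; zero rows if n >= nrow(df)
}

df <- data.frame(a = 1:10)
drop_last(df, 5)   # rows 1 to 5
drop_last(df, 0)   # all 10 rows
drop_last(df, 20)  # a data frame with zero rows
```

If n is always a positive constant, plain head(df, -n) is enough and the wrapper is unnecessary.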
p.s. your seq() example may be written slightly less(?) awkwardly using the named arguments by and length.out (shortened to len) like this: -seq(nrow(df),by=-1,len=5).
This one takes one more line, but is far more readable:
n<-dim(df)[1]
df<-df[1:(n-5),]
Of course, you can do it in one line by sticking the dim command directly into the re-assignment statement. I assume this is part of a reproducible script, and you can retrace your steps... Otherwise, I strongly recommend in such cases saving to a different variable (e.g., df2) and then removing the redundant copy only after you're sure you got what you wanted.
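Two caveats with the indexing approach, in case they matter for your data: 1:(n-5) counts backwards when the data frame has 5 rows or fewer, and df[rows, ] on a single-column data frame returns a plain vector unless drop = FALSE is given. A possible variant that guards against both and follows the "save to df2 first" advice is sketched here; n_drop and keep are illustrative names.

```r
df <- data.frame(a = 1:10)
n_drop <- 5  # number of trailing rows to remove (illustrative value)

# seq_len() yields an empty index when nothing should be kept,
# whereas 1:(n - 5) would count backwards (e.g. 1:0 is c(1, 0)).
keep <- seq_len(max(nrow(df) - n_drop, 0))

# drop = FALSE keeps the result a data frame even with a single column.
df2 <- df[keep, , drop = FALSE]

# Once df2 looks right, overwrite the original and discard the copy.
df <- df2
rm(df2)
```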