With the dat below, how can I make a new data frame that includes all values except the first five rows for each IndID? Said differently, I want a new data frame with the first 5 rows for each IndID excluded.
set.seed(123)
dat <- data.frame(IndID = rep(c("AAA", "BBB", "CCC", "DDD"), each = 10),
                  Number = sample(1:100, 40))
I have seen a number of SO posts that select data, but I am not sure how to remove rows as described above.
To remove the first few rows from each group in R, we can use the slice() function from the dplyr package after grouping with group_by().
To remove rows with NA values in R, we can use the na.omit() and drop_na() (tidyr) functions. For example, na.omit(YourDataframe) will drop all rows that contain an NA.
We can use dplyr's slice() functionality:
library(dplyr)

dat %>%
  group_by(IndID) %>%
  slice(6:n())  # keep rows 6 through the last row of each group
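One caveat with `6:n()`: if a group has fewer than six rows, the sequence counts downwards (for example `6:5`), so `slice()` can return rows you meant to drop. A minimal sketch of an alternative, assuming dplyr is installed and using `filter()` with `row_number()` in place of `slice()` (this is a swapped-in technique, not the original answer's code):

```r
library(dplyr)

# Same example data as in the question
set.seed(123)
dat <- data.frame(IndID = rep(c("AAA", "BBB", "CCC", "DDD"), each = 10),
                  Number = sample(1:100, 40))

# Keep only rows whose within-group position is greater than 5.
# Groups with five or fewer rows simply contribute no rows.
res <- dat %>%
  group_by(IndID) %>%
  filter(row_number() > 5) %>%
  ungroup()
```

With the example data every group has 10 rows, so this should give the same rows as `slice(6:n())`; the two differ only for short groups.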
In base R, tapply() is handy when used on a sequence of row numbers with tail().
idx <- unlist(tapply(1:nrow(dat), dat$IndID, tail, -5))
dat[idx, ]
Note that this will be more efficient with use.names=FALSE in unlist().
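The `use.names = FALSE` variant mentioned above can be sketched as follows. It uses only base R; `seq_len(nrow(dat))` stands in for `1:nrow(dat)` as a small defensive choice of mine, not part of the original answer.

```r
# Same example data as in the question
set.seed(123)
dat <- data.frame(IndID = rep(c("AAA", "BBB", "CCC", "DDD"), each = 10),
                  Number = sample(1:100, 40))

# For each IndID, take the row numbers and drop the first five with tail(-5).
# use.names = FALSE skips building names for the resulting index vector.
idx <- unlist(tapply(seq_len(nrow(dat)), dat$IndID, tail, -5),
              use.names = FALSE)

res <- dat[idx, ]
```

Because `tapply()` returns its results in the order of the grouping levels, the rows of `res` come out grouped by `IndID`. For data that is already sorted by `IndID`, as here, that matches the original row order.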
With data.table, you can do the following with tail().
library(data.table)
setDT(dat)[dat[, tail(.I, -5), by=IndID]$V1]
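Note that `setDT()` converts `dat` to a data.table by reference, so the original data frame itself is changed. If that is not wanted, a minimal sketch of the same idea on a copy, assuming data.table is installed (the names `dt`, `keep` and `res` are only illustrative):

```r
library(data.table)

# Same example data as in the question
set.seed(123)
dat <- data.frame(IndID = rep(c("AAA", "BBB", "CCC", "DDD"), each = 10),
                  Number = sample(1:100, 40))

# as.data.table() makes a copy, so dat stays a plain data.frame
dt <- as.data.table(dat)

# .I holds the row numbers within dt; tail(.I, -5) drops the first five per IndID.
# The unnamed result column is called V1 by default.
keep <- dt[, tail(.I, -5), by = IndID]$V1

res <- dt[keep]
```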