When calling slice(df, i) in the dplyr package for R, if the row index I ask for doesn't exist (nrows < i), it appears to return all the rows but the first of the group, as if I had called slice(df, -1).
For example:
library(dplyr)
c1 <- c("a","b","c")
c2 <- 1:3
df <- data.frame(c1,c2)
slice(df,2)
The result will be as expected:
b 2
But if I call
slice(df, 5)
the result is every row but the first row:
b 2
c 3
This is especially irksome when using group_by() and then calling slice() on the groups. Is there a logical reason why slice() is doing this?
It seems like returning row(s) filled with NAs for row indices larger than 'nrows' in groups not "tall enough" to produce the requested slice could be a useful result.
This came up as I was trying to extract a ranked result from each group, but some groups did not have enough data while others did. e.g. "List the 10th highest sales-producing salesperson from each region." But in one of the regions there are only 8 salespersons.
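One way to get an NA for groups that are not "tall enough" is summarise() with nth(), which returns a missing value when the requested position is past the end of the group. This is only a sketch: the sales data frame and its columns (region, person, amount) are made up for illustration, and it asks for the 3rd highest rather than the 10th to keep the data small.

```r
library(dplyr)

# Hypothetical data: region "east" has 3 salespersons, "west" has only 2
sales <- data.frame(
  region = c("east", "east", "east", "west", "west"),
  person = c("a", "b", "c", "d", "e"),
  amount = c(10, 30, 20, 5, 15),
  stringsAsFactors = FALSE
)

third_best <- sales %>%
  arrange(region, desc(amount)) %>%   # highest sales first within each region
  group_by(region) %>%
  summarise(
    person = nth(person, 3),          # NA when the region has fewer than 3 rows
    amount = nth(amount, 3)
  )

third_best
# east: person "a", amount 10; west: NA, NA
```

Every region keeps a row in the result, so the short ones are easy to spot afterwards with is.na().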
slice() does not work with relational databases because they have no intrinsic notion of row order. If you want to perform the equivalent operation, use filter() and row_number().
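A minimal sketch of the filter() and row_number() form on an ordinary in-memory data frame, reusing the df from the question. The names second and fifth are just for illustration. Here a position past the last row matches nothing, so the result has zero rows.

```r
library(dplyr)

df <- data.frame(c1 = c("a", "b", "c"), c2 = 1:3)

second <- filter(df, row_number() == 2)  # keeps the row b 2
fifth  <- filter(df, row_number() == 5)  # no row satisfies the condition

nrow(second)  # 1
nrow(fifth)   # 0
```

Because the condition is just a logical test on each row, this also behaves predictably after group_by(): groups with too few rows simply contribute nothing.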
You can use the slice() function from the dplyr package in R to subset rows based on their integer locations.
I'm kinda late to this party, but here goes. There is a really simple solution to the error message "Error: incompatible types, expecting a character vector": just insert ungroup() prior to your mutate() call and you should be OK.
But I think it's a bug of some type in slice(). I will file a bug report.