Weird behavior in dplyr slice for R

Tags: r, dplyr

When calling slice(df, i) in the dplyr package for R, if the row index I ask for doesn't exist (nrows < i), it appears to return every row except the first of the group, as if I had called slice(df, -1).

For example:

library(dplyr)

c1 <- c("a","b","c")
c2 <- 1:3
df <- data.frame(c1,c2)

slice(df,2)

The result will be as expected:

b  2

But if I call

slice(df, 5)

the result is every row but the first row:

b  2
c  3

This is especially irksome when using group_by() and THEN calling slice() on the groups. Is there a logical reason why slice() is doing this?
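One way to sidestep the question of what slice() does with an out-of-range index is to select by position with filter() and row_number() instead. The following is only a sketch: the data frame, the column names g and v, and the position k are made up for illustration, and it assumes that silently dropping groups that are too short is acceptable.

```r
library(dplyr)

# Toy data (hypothetical): group "x" has three rows, group "y" has only one.
df <- data.frame(
  g = c("x", "x", "x", "y"),
  v = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)

k <- 2  # hypothetical row position wanted within each group

# Keep only the k-th row of each group. A group with fewer than k rows
# has no row where row_number() == k, so it contributes nothing.
kth_rows <- df %>%
  group_by(g) %>%
  filter(row_number() == k) %>%
  ungroup()

# The same idea without grouping: asking for a position past the end
# of the data frame yields zero rows.
none <- filter(df, row_number() == 5)
```

Here kth_rows holds the single row for group "x", and none is empty, because no row satisfies the comparison.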

It seems that returning row(s) filled with NAs would be a more useful result when a group is not "tall enough" to contain the requested row index.

This came up as I was trying to extract a ranked result from each group when some groups did not have enough data. For example: "List the 10th highest sales-producing salesperson from each region," where one of the regions has only 8 salespersons.
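For the ranked-result case, one possible approach that does produce NAs for short groups is to sort first and then index each column inside summarise(), since base R vector indexing past the end of a vector returns NA. This is a sketch, not necessarily how slice() is meant to be used: the sales data frame, its column names, and the rank k are all invented for illustration.

```r
library(dplyr)

# Hypothetical sales data: "east" has three salespersons, "west" only two.
sales <- data.frame(
  region = c("east", "east", "east", "west", "west"),
  person = c("a", "b", "c", "d", "e"),
  amount = c(30, 10, 20, 50, 40),
  stringsAsFactors = FALSE
)

k <- 3  # hypothetical rank to extract from each region

# Sort so the highest amount comes first within each region, then take
# the k-th element of each column. Indexing beyond a vector's length
# gives NA, so regions with fewer than k rows get an NA-filled row.
kth <- sales %>%
  arrange(region, desc(amount)) %>%
  group_by(region) %>%
  summarise(person = person[k], amount = amount[k])
```

With this toy data, "east" yields its third-ranked salesperson ("b", 10), while "west" yields NA in both columns rather than disappearing from the result.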

huff asked May 27 '15 19:05



1 Answer

I'm kinda late to this party, but here goes. There is a really simple solution to the error message "Error: incompatible types, expecting a character vector":

just insert ungroup() prior to your mutate() call and you should be OK.

But I think it's a bug of some type in slice(). I will file a bug report.
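A rough sketch of what that looks like in a pipeline, reusing the question's df. The mutate() step and the c2_doubled column are hypothetical, since the original pipeline that produced the error isn't shown:

```r
library(dplyr)

# Same data frame as in the question.
df <- data.frame(
  c1 = c("a", "b", "c"),
  c2 = 1:3,
  stringsAsFactors = FALSE
)

result <- df %>%
  group_by(c1) %>%
  slice(1) %>%                   # first row of each group
  ungroup() %>%                  # drop the grouping before mutate()
  mutate(c2_doubled = c2 * 2)    # hypothetical follow-up step
```

After ungroup(), result is a plain (ungrouped) tibble, so mutate() operates on the whole table rather than group by group.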

hackR answered Sep 20 '22 05:09