I'm wondering if anyone can explain the behavior of dplyr::slice_min() / dplyr::slice_max() with regard to the with_ties argument. For grouped data, why does the function exclude NA values when with_ties = TRUE but include NA values when with_ties = FALSE? Reprex below:
library(tidyverse)
tbl <- tibble(ID = rep(c("a", "b", "c", "d"), each = 3),
              measure = c(NA, NA, NA, NA, 1, 1, 2, 3, 4, NA, NA, NA))
tbl |>
  group_by(ID) |>
  slice_max(measure, with_ties = TRUE)
# A tibble: 3 × 2
# Groups:   ID [2]
  ID    measure
  <chr>   <dbl>
1 b           1
2 b           1
3 c           4
tbl |>
  group_by(ID) |>
  slice_max(measure, with_ties = FALSE)
# A tibble: 4 × 2
# Groups:   ID [4]
  ID    measure
  <chr>   <dbl>
1 a          NA
2 b           1
3 c           4
4 d          NA
This inconsistency seems to have been acknowledged very recently (23rd March 2022) in this GitHub pull request, but the change has not been made yet.
"When the with_ties argument was set to FALSE NAs w[h]ere not ignored anymore. This PR fixes that.
The default behavior should be to ignore NAs."
In the meantime, you can drop the NA rows afterwards with tidyr::drop_na():
tbl |>
  group_by(ID) |>
  slice_max(measure, with_ties = FALSE) |>
  drop_na()
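An alternative that may be worth considering is to remove the NA rows before slicing instead of after. This is only a sketch, not something taken from the pull request: it assumes you are happy to lose groups that contain nothing but NA (here "a" and "d"), and the names res_no_ties and res_ties are made up for illustration. Because the NAs are gone before slice_max() runs, both settings of with_ties should then differ only in how they handle real ties.

```r
library(dplyr)

tbl <- tibble(
  ID = rep(c("a", "b", "c", "d"), each = 3),
  measure = c(NA, NA, NA, NA, 1, 1, 2, 3, 4, NA, NA, NA)
)

# Drop NA measures first, so slice_max() never sees them.
res_no_ties <- tbl |>
  filter(!is.na(measure)) |>
  group_by(ID) |>
  slice_max(measure, with_ties = FALSE) |>
  ungroup()

# Same pipeline, but keeping tied rows within each group.
res_ties <- tbl |>
  filter(!is.na(measure)) |>
  group_by(ID) |>
  slice_max(measure, with_ties = TRUE) |>
  ungroup()
```

Unlike a bare drop_na() at the end, filter(!is.na(measure)) only looks at the measure column, so rows with NA in other, unrelated columns would be kept.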