I have started to receive a warning when using selection functions from tidyverse
packages.
Example:
library(dplyr)
set.seed(123)
df = data.frame(
"id" = c(rep("G1", 3), rep("G2", 4), rep("G3", 3)),
"total" = sample.int(n = 10),
"C1" = sample.int(n=10),
"C2" = sample.int(n=10),
"C3" = sample.int(n=10))
cols.to.sum = c("C1", "C2")
df.selected = df %>%
dplyr::select(total, cols.to.sum)
Giving:
Note: Using an external vector in selections is ambiguous.
i Use `all_of(cols.to.sum)` instead of `cols.to.sum` to silence this message.
i See <https://tidyselect.r-lib.org/reference/faq-external-vector.html>.
This message is displayed once per session.
It does not warn if I refactor to:
df.selected = df %>%
dplyr::select(total, all_of(cols.to.sum))
This behaviour changed between tidyselect_0.2.5 and tidyselect_1.0.0. There was no warning until now.
The documentation about this change (https://tidyselect.r-lib.org/reference/faq-external-vector.html) states that this is just a warning for now, but that it will turn into an error in the future.
My question is how to deal with such a change in existing code.
Should I refactor every single line of code that uses this selection method to wrap the external vector in all_of()? That sounds hard to accomplish when there might be hundreds of places in the code where a selection has been made this way (it also affects other functions, such as summarise_at).
Would the only alternative be to stick to tidyselect_0.2.5 to keep the existing code working?
What is the recommended way to handle a package change like this in existing code?
Thanks
If "should" is the operative word in your first question, then it might just be a matter of ensuring that none of the columns in your data frame is named cols.to.sum. So long as that is the case, the ambiguity that all_of() resolves is not relevant to your use case, and you can keep selecting as usual.
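To make the ambiguity concrete, here is a minimal sketch of what happens when a column shares its name with the external vector. The small data frame and the extra column in df2 are made up for illustration; only select() and all_of() come from the question.

```r
library(dplyr)

# Hypothetical toy data, smaller than the data frame in the question
df <- data.frame(total = 1:3, C1 = 4:6, C2 = 7:9)
cols.to.sum <- c("C1", "C2")

# all_of() always reads cols.to.sum from the calling environment
names(select(df, total, all_of(cols.to.sum)))

# Add a column that has the same name as the external vector
df2 <- df
df2$cols.to.sum <- 0

# A bare name now refers to the column in the data, not to the vector
names(select(df2, total, cols.to.sum))

# all_of() still uses the vector, so the result does not depend on the data
names(select(df2, total, all_of(cols.to.sum)))
```

The bare-name selection on df2 returns total and cols.to.sum, while both all_of() selections return total, C1 and C2. If no column can ever collide with the vector name, the two forms select the same columns.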
If you don't want to stick to an older version of tidyselect, the suppress library might be helpful.
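As a rough sketch of the suppression idea, the following uses base R's suppressMessages() rather than any particular package; that substitution is my assumption, not something named above. It also assumes the note is emitted as an R message, as the output quoted in the question suggests. Other tidyselect versions may signal it as a warning instead, in which case suppressWarnings() would be the base R counterpart.

```r
library(dplyr)

# Hypothetical toy data for illustration
df <- data.frame(total = 1:3, C1 = 4:6, C2 = 7:9)
cols.to.sum <- c("C1", "C2")

# suppressMessages() muffles message conditions raised while
# evaluating the expression; the selection itself is unchanged
df.selected <- suppressMessages(
  df %>% dplyr::select(total, cols.to.sum)
)

names(df.selected)
```

This only hides the note. Since the documentation linked in the question says the behaviour will become an error in the future, wrapping the vector in all_of() remains the durable fix.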