Is there any exception handeling mechanism in dplyr's mutate()
? What I mean is a way to catch exceptions and handle them.
Let us suppose that I have a function that throws an error in some cases (in the example if the input is negative), for the sake of simplicity I define the function, but in real life it will be a function in some R package. Let us suppose this function is vectorized:
# function throwing an error
my_func <- function(x){
if(x > 0) return(sqrt(x))
stop('x must be positive')
}
my_func_vect <- Vectorize(my_func)
Now, let's suppose I want to use this function inside mutate()
.
If this function is used inside a mutate()
, it stops at the first error and no result is returned:
library(dplyr)
# dummy data
data <- data.frame(x = c(1, -1, 4, 9))
data %>% mutate(y = my_func_vect(x))
# Error in mutate_impl(.data, dots) : Evaluation error: x must be positive.
Is there a way to catch the error, and do something (e.g. return an NA
) in this case, while getting results for the other elements?
The result I expect is what would be achieved using a loop with tryCatch()
, i.e. something along the lines of:
y <- rep(NA_real_, length(data$x))
for(i in seq_along(data$x)) {
tryCatch({
y[i] <- my_func_vect(data$x[i])
}, error = function(err){})
}
y
# Result is: 1 NA 2 4
We can also make use of purrr
's safely()
or possibly()
functions.
From the purrr
help:
safely: wrapped function instead returns a list with components result and error. One value is always NULL.
quietly: wrapped function instead returns a list with components result, output, messages and warnings.
possibly: wrapped function uses a default value (otherwise) whenever an error occurs.
It doesn't change the fact that you have to apply the function to each row separately.
library(dplyr)
library(purrr)
# function throwing an error
my_func <- function(x){
if(x > 0) return(sqrt(x))
stop('x must be positive')
}
my_func_vect <- Vectorize(my_func)
# dummy data
data <- data.frame(x = c(1, -1, 4, 9))
data %>%
mutate(y = map_dbl(x, ~possibly(my_func_vect, otherwise = NA_real_)(.x)))
#> x y
#> 1 1 1
#> 2 -1 NA
#> 3 4 2
#> 4 9 3
rowwise()
:data %>%
rowwise() %>%
mutate(y = possibly(my_func_vect, otherwise = NA_real_)(x))
#> Source: local data frame [4 x 2]
#> Groups: <by row>
#>
#> # A tibble: 4 x 2
#> x y
#> <dbl> <dbl>
#> 1 1 1
#> 2 -1 NA
#> 3 4 2
#> 4 9 3
The others functions are somewhat more difficult to use and apply in a 'data-frame environment', as they are more suited to work with lists, and returns such.
Created on 2018-05-15 by the reprex package (v0.2.0).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With