tryCatch inside dplyr's mutate?

Is there any exception-handling mechanism in dplyr's mutate()? What I mean is a way to catch errors and handle them.

Suppose I have a function that throws an error in some cases (in this example, when the input is not positive). For the sake of simplicity I define the function myself, but in real life it would come from some R package. Suppose this function is also vectorized:

# function throwing an error
my_func <- function(x){
  if(x > 0) return(sqrt(x))
  stop('x must be positive')
}

my_func_vect <- Vectorize(my_func)

When this function is used inside mutate(), evaluation stops at the first error and no result is returned:

library(dplyr)
# dummy data
data <- data.frame(x = c(1, -1, 4, 9))
data %>% mutate(y = my_func_vect(x))
# Error in mutate_impl(.data, dots) : Evaluation error: x must be positive.

Is there a way to catch the error, and do something (e.g. return an NA) in this case, while getting results for the other elements?

The result I expect is what would be achieved using a loop with tryCatch(), i.e. something along the lines of:

y <- rep(NA_real_, length(data$x))
for(i in seq_along(data$x)) {
  tryCatch({
    y[i] <- my_func_vect(data$x[i])
  }, error = function(err){})
}
y
# Result is: 1 NA 2 3
byouness asked May 14 '18 16:05



1 Answer

We can make use of purrr's safely() or possibly() functions.

From the purrr help:

safely: wrapped function instead returns a list with components result and error. One value is always NULL.

quietly: wrapped function instead returns a list with components result, output, messages and warnings.

possibly: wrapped function uses a default value (otherwise) whenever an error occurs.

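These wrappers don't change the fact that you have to apply the function to each row separately: if the wrapped call receives the whole column at once, a single error replaces the entire result.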
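To illustrate why the per-element application matters, here is a minimal sketch using the toy function from the question. The variable names whole and each are only illustrative. It compares wrapping the vectorized call as a whole with mapping the wrapped function over the elements:

```r
library(purrr)

# Same toy function as in the question: errors for non-positive input
my_func <- function(x) {
  if (x > 0) return(sqrt(x))
  stop("x must be positive")
}
my_func_vect <- Vectorize(my_func)

x <- c(1, -1, 4, 9)

# Wrapping the whole vectorized call: one failing element makes the
# entire call fail, so only the single default value comes back
whole <- possibly(my_func_vect, otherwise = NA_real_)(x)

# Wrapping per element: only the failing element gets the default
each <- map_dbl(x, possibly(my_func, otherwise = NA_real_))
```

Here whole is a single NA, while each keeps the results for the valid elements and has NA only in the second position.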

library(dplyr)
library(purrr)

# function throwing an error
my_func <- function(x){
  if(x > 0) return(sqrt(x))
  stop('x must be positive')
}

my_func_vect <- Vectorize(my_func)

# dummy data
data <- data.frame(x = c(1, -1, 4, 9))

With map_dbl():

data %>% 
  mutate(y = map_dbl(x, possibly(my_func_vect, otherwise = NA_real_)))
#>    x  y
#> 1  1  1
#> 2 -1 NA
#> 3  4  2
#> 4  9  3
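For comparison, the same per-element idea can probably be written with base R's tryCatch() directly, without purrr. This is only a sketch: try_each is a hypothetical helper name introduced here, not a function from dplyr or any other package.

```r
library(dplyr)

# Same toy function as in the question: errors for non-positive input
my_func <- function(x) {
  if (x > 0) return(sqrt(x))
  stop("x must be positive")
}

# Hypothetical helper: call f on each element of x and
# return NA_real_ for any element where f throws an error
try_each <- function(x, f) {
  vapply(
    x,
    function(xi) tryCatch(f(xi), error = function(e) NA_real_),
    numeric(1)
  )
}

data <- data.frame(x = c(1, -1, 4, 9))
result <- data %>% mutate(y = try_each(x, my_func))
result$y
# 1 NA 2 3
```

Using vapply() with numeric(1) keeps the column type fixed, so a function that unexpectedly returns something other than a single number fails loudly rather than silently producing a list column.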

Using rowwise():

data %>%
  rowwise() %>% 
  mutate(y = possibly(my_func_vect, otherwise = NA_real_)(x))
#> Source: local data frame [4 x 2]
#> Groups: <by row>
#> 
#> # A tibble: 4 x 2
#>       x     y
#>   <dbl> <dbl>
#> 1     1     1
#> 2    -1    NA
#> 3     4     2
#> 4     9     3

The other functions are somewhat more difficult to use in a 'data-frame environment', as they are better suited to working with lists and return lists themselves.
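If the error message itself is needed, safely() can still be used by storing its output in a list-column and then extracting the pieces. The following is a rough sketch under that assumption; the column names res, y and msg and the name safe_func are chosen here for illustration only.

```r
library(dplyr)
library(purrr)

# Same toy function as in the question: errors for non-positive input
my_func <- function(x) {
  if (x > 0) return(sqrt(x))
  stop("x must be positive")
}

data <- data.frame(x = c(1, -1, 4, 9))

# safely() returns list(result = ..., error = ...) for every call;
# with otherwise = NA_real_, result is NA when an error occurred
safe_func <- safely(my_func, otherwise = NA_real_)

out <- data %>%
  mutate(
    res = map(x, safe_func),       # list-column holding result and error
    y   = map_dbl(res, "result"),  # pull out the numeric result
    msg = map_chr(
      res,
      ~ if (is.null(.x$error)) NA_character_ else conditionMessage(.x$error)
    )
  ) %>%
  select(-res)
```

In this sketch, y holds the value (or NA) for each row and msg holds the error message for the rows that failed, which possibly() alone would discard.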

Created on 2018-05-15 by the reprex package (v0.2.0).

GGamba answered Sep 16 '22 21:09
