
How to locate errors and debug when using purrr

Tags: r, debugging, purrr

I find it difficult to debug my code when using purrr and some of the map() variants. In particular, I have trouble locating where my code fails, because the error messages do not tell me which row (data frame) threw the error.

What is a good approach to locating errors when using purrr?

Consider the following example:

library(tidyverse)

# Prepare some data
set.seed(1)
a <- tibble(
  x = rnorm(2),
  y = rnorm(2))

b <- tibble(
  x = rnorm(2),
  y = rnorm(2))

c <- tibble(
  x = rnorm(2),
  y = letters[1:2])

df <- tibble(
  dataframes = list(a,b,c))

df
#> # A tibble: 3 x 1
#>   dataframes      
#>   <list>          
#> 1 <tibble [2 x 2]>
#> 2 <tibble [2 x 2]>
#> 3 <tibble [2 x 2]>

# A simple function 
add_cols <- function(.data){
  .data %>% 
    mutate(
      z = x + y)
}

# Running the function with map() will return an error
df %>% 
  mutate(
    dataframes = map(.x = dataframes, ~add_cols(.x)))
#> Error in x + y: non-numeric argument to binary operator

map() fails because a number cannot be added to a character string. The error message says what went wrong, but not where. In this example it is obvious that the error comes from the third row of df, but imagine that the function were much more complicated and applied to thousands of rows. How would you locate the error then?

So far, my approach is to use some version of this monstrosity of a loop. I think the disadvantages of this approach are quite obvious. Please help me find a better way to do this.

for(i in 1:nrow(df)){
  print(paste("Testing row number", i))

  df %>% 
    filter(row_number() == i) %>% 
    unnest(cols = c(dataframes)) %>% 
    add_cols()
}
#> [1] "Testing row number 1"
#> [1] "Testing row number 2"
#> [1] "Testing row number 3"
#> Error in x + y: non-numeric argument to binary operator

I am using RStudio, in case that is relevant to your suggestions.

Created on 2019-10-15 by the reprex package (v0.3.0)

Steen Harsted asked Oct 15 '19 19:10


1 Answer

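One option is to wrap the function in possibly() or safely() from purrr. With possibly(), the otherwise argument sets the value returned whenever the wrapped function throws an error, so map() runs to completion and the failing elements are marked: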

library(dplyr)
library(purrr)
out <- df %>%
  mutate(dataframes = map(dataframes,
                          possibly(add_cols, otherwise = 'error here')))

out$dataframes
#[[1]]
# A tibble: 2 x 3
#       x      y     z
#   <dbl>  <dbl> <dbl>
#1 -0.626 -0.836 -1.46
#2  0.184  1.60   1.78

#[[2]]
# A tibble: 2 x 3
#       x     y       z
#   <dbl> <dbl>   <dbl>
#1  0.330 0.487  0.817 
#2 -0.820 0.738 -0.0821

#[[3]]
#[1] "error here"

The failing element can then be located with a simple check:

out$dataframes %in%  'error here'
#[1] FALSE FALSE  TRUE

To get the position of the failing row, wrap the check in which().
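
As a minimal sketch of that last step, the snippet below rebuilds the question's data so that it runs on its own, then applies which() to the check. The second half is an addition that the answer names but does not show: it uses safely() to keep the error object for each element, so the message can be read alongside the position. The names failed_rows, results and error_messages are illustrative, not part of the original answer.

```r
library(dplyr)
library(purrr)
library(tibble)

# Rebuild the question's data so the sketch is self-contained
set.seed(1)
a <- tibble(x = rnorm(2), y = rnorm(2))
b <- tibble(x = rnorm(2), y = rnorm(2))
c <- tibble(x = rnorm(2), y = letters[1:2])
df <- tibble(dataframes = list(a, b, c))

add_cols <- function(.data) {
  .data %>% mutate(z = x + y)
}

# possibly() returns the marker string instead of stopping on an error
out <- df %>%
  mutate(dataframes = map(dataframes,
                          possibly(add_cols, otherwise = "error here")))

# Positions of the elements that failed (illustrative name)
failed_rows <- which(out$dataframes %in% "error here")
failed_rows
#> [1] 3

# safely() returns list(result = ..., error = ...) for every element,
# so the error message can be recovered per row (illustrative names)
results <- map(df$dataframes, safely(add_cols))
error_messages <- results %>%
  map("error") %>%
  map_chr(~ if (is.null(.x)) NA_character_ else conditionMessage(.x))

# NA marks a row that ran cleanly; a non-NA entry holds that row's message
which(!is.na(error_messages))
```

The possibly() route is enough when only the location matters. The safely() route costs an extra step to unpack the results, but it keeps the message for each failing row, which helps when different rows fail for different reasons.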

akrun answered Nov 15 '22 03:11