
How should one implement "is.error()" for R, to identify and parse errors?

I am trying to test if objects are the results of errors. The use case primarily arises via a foreach() loop that produces an error (although, for testing, it seems enough to just assign a simpleError() to a variable), and I'm puzzled about how to identify when that has occurred: how can I test that a given object is, in fact, an error? Once I've determined that it is an error, what else can I extract, besides a message? Perhaps I'm missing something about R's error handling facilities, as it seems necessary to write an error object testing function de novo.

Here are two examples, one using foreach, with the .errorhandling argument set to pass. I have begun to use that as the default for large scale or unattended processing, in the event of an anomaly in a slice of data. Such anomalies are rare, and not worth crashing the entire for loop (especially if that anomaly occurs at the end, which appears to be the default behavior of my murphysListSortingAlgorithm() ;-)). Instead, post hoc detection is desired.

library(foreach)
library(doMC)
registerDoMC(2)
results = foreach(ix = 1:10, .errorhandling = "pass") %dopar%{
    if(ix == 6){
        stop("Perfect")
    } 
    if(ix == 7){
        stop("LuckyPrime")
    } else {
        return(ix)
    }
}

For simplicity, here is a very simple error (by definition):

a = simpleError("SNAFU")

There does not seem to be a command like is.error(), and typeof() and mode() are uninformative here (both just report "list"). The best I've found is class() or attributes(), which return values that are indicative of an error. How can I use these in a manner guaranteed to determine that I've got an error, and to fully process that error? For instance, a$message returns SNAFU, but a$call is NULL. Should I expect to be able to extract anything useful from, say, results[[6]]$call?


Note 1: In case one doesn't have multicore functionality to reproduce the first example, I should point out that results[[6]] isn't the same as simpleError("Perfect"):

> b = simpleError("Perfect")
> identical(results[[6]], b)
[1] FALSE
> results[[6]]
<simpleError in eval(expr, envir, enclos): Perfect>
> b
<simpleError: Perfect>

This demonstrates why I can't (very naively) test if the list element is a vanilla simpleError.
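
A minimal base-R sketch of why that comparison fails and what still works: the two objects differ only in their call component, so identical() is FALSE while a class test with inherits() succeeds on both. The call used here is a stand-in chosen for illustration, not necessarily the one foreach records.

```r
# Two simpleErrors with the same message but different call components.
b <- simpleError("Perfect")
with_call <- simpleError("Perfect", call = quote(eval(expr, envir, enclos)))

identical(b, with_call)             # FALSE: the call components differ
inherits(b, "simpleError")          # TRUE
inherits(with_call, "simpleError")  # TRUE

# The message is the same either way.
same_message <- conditionMessage(b) == conditionMessage(with_call)
same_message                        # TRUE
```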

Note 2. I am aware of try and tryCatch, and use these in some contexts. However, I'm not entirely sure how I can use them to post-process the output of, say, a foreach loop. For instance, the results object in the first example: it does not appear to me to make sense to process its elements with a tryCatch wrapper. For the RHS of the operation, i.e. the foreach() loop, I'm not sure that tryCatch will do what I intend, either. I can use it to catch an error, but I suppose I need to get the message and insert the processing at that point. I see two issues: every loop would need to be wrapped with a tryCatch(), negating part of the .errorhandling argument, and I remain unable to later post-process the results object. If that's the only way to do this processing, then it's the solution, but that implies that errors can't be identified and processed in a similar way to many other R objects, such as matrices, vectors, data frames, etc.
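
One way to see how tryCatch relates to .errorhandling = "pass" is a sequential base-R sketch in which the loop body itself returns the condition object on failure. lapply stands in for foreach here (an assumption, made so the example runs without a parallel backend), and results_seq is a hypothetical name; the resulting list can then be post-processed like any other list.

```r
# Sequential stand-in for the foreach loop: the error handler returns the
# condition object itself, so failures end up as ordinary list elements.
results_seq <- lapply(1:10, function(ix) {
  tryCatch({
    if (ix == 6) stop("Perfect")
    if (ix == 7) stop("LuckyPrime")
    ix
  }, error = function(e) e)
})

# Post hoc detection: which elements are errors, and what do they say?
failed <- vapply(results_seq, inherits, logical(1), what = "error")
which(failed)                                                # 6 7
vapply(results_seq[failed], conditionMessage, character(1))  # "Perfect" "LuckyPrime"
```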


Update 1. I've added an additional stop trigger in the foreach loop, to give two different messages to identify and parse, in case this is helpful.

Update 2. I'm selecting Richie Cotton's answer. It seems to be the most complete explanation of what I should look for, though a complete implementation requires several other bits of code (and a recent version of R). Most importantly, he points out that there are 2 types of errors we need to keep in mind, which is especially important in being thorough. See also the comments and answers by others in order to fully develop your own is.error() test function; the answer I've given can be a useful start when looking for errors in a list of results, and the code by Richie is a good starting point for the test functions.

Iterator asked Feb 12 '12 13:02


4 Answers

The only two types of errors that you are likely to see in the wild are simpleErrors, like you get here, and try-errors, which result from wrapping some exception-throwing code in a call to try. It is possible for someone to create their own error class, though these are rare and should be based upon one of those two classes. In fact, since R 2.14.0, try-errors contain a simpleError:

e <- try(stop("throwing a try-error"))
attr(e, "condition")
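
As a rough illustration of what those two lines produce, the sketch below inspects the pieces of a try-error. silent = TRUE is added here only to keep the console quiet, and cond is a hypothetical variable name.

```r
# silent = TRUE suppresses printing of the error message.
e <- try(stop("throwing a try-error"), silent = TRUE)

class(e)                       # "try-error"
cond <- attr(e, "condition")   # the underlying condition object
inherits(cond, "simpleError")  # TRUE
conditionMessage(cond)         # "throwing a try-error"
```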

To detect a simpleError is straightforward.

is_simple_error <- function(x) inherits(x, "simpleError")

The equivalent for try-errors is

is_try_error <- function(x) inherits(x, "try-error")

So here, you can inspect the results for problems by applying this to your list of results.

the_fails <- sapply(results, is_simple_error)

Likewise, returning the message and call are one-liners. For convenience, I've converted the call to a character string, but you might not want that.

get_simple_error_message <- function(e) e$message
get_simple_error_call <- function(e) deparse(e$call)

sapply(results[the_fails], get_simple_error_message)
sapply(results[the_fails], get_simple_error_call)
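
If a list may contain both kinds of error, the two tests can be combined. The sketch below is one possible way to do that, not part of the code above: is_error and as_condition are hypothetical helper names, and the mixed list is made up for illustration.

```r
# Hypothetical helper: TRUE for error conditions and for try-errors.
is_error <- function(x) inherits(x, "error") || inherits(x, "try-error")

# Hypothetical helper: return the underlying condition object,
# or NULL if x is not an error of either kind.
as_condition <- function(x) {
  if (inherits(x, "try-error")) return(attr(x, "condition"))
  if (inherits(x, "error")) return(x)
  NULL
}

mixed <- list(
  1,
  simpleError("SNAFU"),
  try(stop("from try"), silent = TRUE)
)

flags <- vapply(mixed, is_error, logical(1))
conds <- Filter(Negate(is.null), lapply(mixed, as_condition))
msgs <- vapply(conds, conditionMessage, character(1))
```

Here flags marks the second and third elements, and msgs holds their messages.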

Richie Cotton answered Sep 22 '22 12:09


From ?simpleError:

Conditions are objects inheriting from the abstract class condition. Errors and warnings are objects inheriting from the abstract subclasses error and warning. The class simpleError is the class used by stop and all internal error signals. Similarly, simpleWarning is used by warning, and simpleMessage is used by message. The constructors by the same names take a string describing the condition as argument and an optional call. The functions conditionMessage and conditionCall are generic functions that return the message and call of a condition.

So class(a) returns:

[1] "simpleError" "error"       "condition"  

So a simple function:

is.condition <- function(x) {
  require(taRifx)
  last(class(x))=="condition"
}

As @flodel notes, replacing the function body with inherits(x,"condition") is more robust.
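
A base-R sketch of the inherits() variant, together with the accessor generics named in the help text. is_condition and my_err are hypothetical names; the custom class is made up to show that testing for "error" is broader than testing for "simpleError".

```r
# inherits() checks the whole class vector, so no extra package is needed.
is_condition <- function(x) inherits(x, "condition")

a <- simpleError("SNAFU")
is_condition(a)                   # TRUE
is_condition(simpleWarning("w"))  # TRUE: warnings are conditions too
is_condition("not a condition")   # FALSE

# To single out errors, test for the abstract "error" class instead.
inherits(simpleWarning("w"), "error")  # FALSE

# The generic accessors named in the help text.
conditionMessage(a)  # "SNAFU"
conditionCall(a)     # NULL: no call was passed to the constructor

# A hypothetical custom error class: not a simpleError, but still an "error".
my_err <- structure(
  list(message = "custom failure", call = NULL),
  class = c("myError", "error", "condition")
)
inherits(my_err, "simpleError")  # FALSE
inherits(my_err, "error")        # TRUE
```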

Ari B. Friedman answered Sep 20 '22 12:09


Using @flodel's suggestion about inherits(), which gets at the abstract class inheritance mentioned by @gsk3, here's my current solution:

is.error.element <- function(x){
    testError   <- inherits(x, "error")
    if(testError == TRUE){
        testSimple  <- inherits(x, "simpleError")
        errMsg      <- x$message
    } else {
        testSimple  <- FALSE
        errMsg      <- NA
    }
    return(data.frame(testError, testSimple, errMsg, stringsAsFactors = FALSE))
}

is.error <- function(testObject){
    quickTest <- is.error.element(testObject)
    if(quickTest$testError == TRUE){
        return(quickTest)
    } else {
        return(lapply(testObject, is.error.element))
    }
}

Here are the results, made pretty via ldply (from the plyr package) for the results list:

> ldply(is.error(results))
   testError testSimple     errMsg
1      FALSE      FALSE       <NA>
2      FALSE      FALSE       <NA>
3      FALSE      FALSE       <NA>
4      FALSE      FALSE       <NA>
5      FALSE      FALSE       <NA>
6       TRUE       TRUE    Perfect
7       TRUE       TRUE LuckyPrime
8      FALSE      FALSE       <NA>
9      FALSE      FALSE       <NA>
10     FALSE      FALSE       <NA>

> is.error(a)
  testError testSimple errMsg
1      TRUE       TRUE  SNAFU

This still seems rough to me, not least because I haven't extracted a meaningful call value, and the outer function, is.error(), might not do well on other structures. I suspect that this could be improved with sapply or another member of the *apply or *ply (plyr) families.
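
One possible tightening along those lines, sketched in base R with vapply instead of plyr. summarise_errors and the demo list are hypothetical; the call is deparsed to a single string, so an error without a recorded call shows up as "NULL".

```r
# Hypothetical helper: one row per list element, with message and call text.
summarise_errors <- function(x) {
  is_err <- vapply(x, inherits, logical(1), what = "error")
  msg <- rep(NA_character_, length(x))
  call_txt <- rep(NA_character_, length(x))
  msg[is_err] <- vapply(x[is_err], conditionMessage, character(1))
  call_txt[is_err] <- vapply(
    x[is_err],
    # deparse() may return several strings for a long call; collapse to one.
    function(e) paste(deparse(conditionCall(e)), collapse = " "),
    character(1)
  )
  data.frame(is_error = is_err, message = msg, call = call_txt,
             stringsAsFactors = FALSE)
}

demo <- list(1, simpleError("SNAFU"), simpleError("Perfect", call = quote(f(x))))
out <- summarise_errors(demo)
out
```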

Iterator answered Sep 20 '22 12:09


I use tryCatch as described in this question: How do I save warnings and errors as output from a function? The catchToList() helper used below comes from there.

The idea is that each item in the loop returns a list with three elements: the return value, any warnings, and any errors. The result is a list of lists that can then be queried to find out not only the values from each item in the loop, but which items in the loop had warnings or errors.
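
For readers without the linked question to hand, a helper of this kind could look roughly like the sketch below. It is written from the description above (value, warnings, error) and may differ in detail from the definition in that question.

```r
# Sketch of a catchToList-style helper: evaluate expr and return its value
# together with any warning messages and the error message (if any).
catchToList <- function(expr) {
  warnings <- NULL
  error <- NULL
  value <- tryCatch(
    withCallingHandlers(
      expr,
      warning = function(w) {
        # Record the warning, then resume evaluation without printing it.
        warnings <<- c(warnings, conditionMessage(w))
        invokeRestart("muffleWarning")
      }
    ),
    error = function(e) {
      # Record the error; the value of the failed expression becomes NULL.
      error <<- conditionMessage(e)
      NULL
    }
  )
  list(value = value, warnings = warnings, error = error)
}

ok <- catchToList({ warning("careful"); 42 })
bad <- catchToList(stop("Perfect"))
```

Here ok carries the value 42 plus the recorded warning, while bad has a NULL value and the error message "Perfect".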

In this example, I would do something like this:

library(foreach)
library(doMC)
registerDoMC(2)
results = foreach(ix = 1:10, .errorhandling = "pass") %dopar%{
  catchToList({
    if(ix == 6){
        stop("Perfect")
    } 
    if(ix == 7){
        stop("LuckyPrime")
    } else {
        ix
    }
  })
}

Then I would process the results like this

> ok <- sapply(results, function(x) is.null(x$error))
> which(!ok)
[1] 6 7
> sapply(results[!ok], function(x) x$error)
[1] "Perfect"    "LuckyPrime"
> sapply(results[ok], function(x) x$value)
[1]  1  2  3  4  5  8  9 10

It would be fairly straightforward to give the result from catchToList a class and overload some accessing functions to make the above syntax easier, but I haven't found a real need for that yet.

Aaron left Stack Overflow answered Sep 22 '22 12:09