 

data.table and error handling using try statement

Tags: r, data.table

I am trying to incorporate a bit of error handling in my R code.

Pseudo-code below:


foo = function(X,Y) {
...

return(ret.df);
}

DT = DT[,ret.df := foo(X,Y), by=key(DT)];

The aim is to detect whether foo raises an error for some combination of X and Y. If it does, I want to skip that combination in the final data frame. I have tried the following without much luck:


    DT = DT[ ,  try(ret.df := foo(X,Y)); 
    if (!(class(ret.df) %in% "try-error")) {
        return(ret.df);
    }, by = key(DT) ];

I could always write a wrapper around foo to do the error checking, but I am looking for a way to write this directly in the data.table call. Is this possible?

Thanks for your help in advance!

Manoj asked Jan 13 '14 05:01


1 Answer

Here's a dummy function and data :

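library(data.table)

foo = function(X,Y) {
    # fail on purpose whenever the group contains Y == 2
    if (any(Y==2)) stop("Y contains 2!")
    X*Y
}
DT = data.table(a=1:3, b=1:6)   # a is recycled to length 6
DT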
   a b
1: 1 1
2: 2 2
3: 3 3
4: 1 4
5: 2 5
6: 3 6

Step by step :

> DT[, c := foo(a,b), by=a ]
Error in foo(a, b) : Y contains 2!

Ok, that's by construction. Good.

Aside: notice column c was added, despite the error.

> DT
   a b  c
1: 1 1  1
2: 2 2 NA
3: 3 3 NA
4: 1 4  4
5: 2 5 NA
6: 3 6 NA

Only the first successful group was populated; it stopped at the second group. This is by design. At some point in the future we could add transactions to data.table internally, like SQL, so that if an error happened, any changes could be rolled back. Anyway, just something to be aware of.
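Until something like transactions exists, one way to avoid a half-updated table is to run the assignment on a copy and keep it only if every group succeeds. This is a minimal sketch, assuming the table fits in memory twice; `safe_assign` is a hypothetical helper name, not part of data.table.

```r
library(data.table)

foo <- function(X, Y) {
    if (any(Y == 2)) stop("Y contains 2!")
    X * Y
}

# Hypothetical helper: apply the grouped := to a deep copy, so the
# original table is left untouched if any group raises an error.
safe_assign <- function(dt) {
    tmp <- copy(dt)   # deep copy; := on tmp cannot alter dt
    ok <- !inherits(try(tmp[, c := foo(a, b), by = a], silent = TRUE), "try-error")
    if (ok) tmp else dt   # all-or-nothing result
}

DT <- data.table(a = 1:3, b = 1:6)
res <- safe_assign(DT)
names(res)   # still just "a" and "b": the failed update was discarded
```

The trade-off is the extra memory and time for `copy()`, which may matter on large tables.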

To deal with the error, you can wrap the j expression in {} and handle the failure there.

First attempt :

> DT[, c := {
    if (inherits(try(ans<-foo(a,b)),"try-error"))
        NA
    else
        ans
}, by=a]
Error in foo(a, b) : Y contains 2!
Error in `[.data.table`(DT, , `:=`(c, { : 
  Type of RHS ('logical') must match LHS ('integer'). To check and coerce would
  impact performance too much for the fastest cases. Either change the type of
  the target column, or coerce the RHS of := yourself (e.g. by using 1L instead
  of 1)

The error tells us what to do. Let's coerce the type of the RHS (NA) from logical to integer.
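The mismatch comes from base R: a bare `NA` is logical, whereas `foo` returns an integer vector here because both `a` and `b` are integer columns. A small sketch of the typed `NA` constants, in case the target column has a different type:

```r
typeof(NA)             # "logical"
typeof(NA_integer_)    # "integer"
typeof(NA_real_)       # "double"
typeof(NA_character_)  # "character"

# Multiplying two integer vectors, as foo(a, b) does, gives an integer result
typeof(1:3 * 1:3)      # "integer"
```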

> DT[, c:= {
    if (inherits(try(ans<-foo(a,b)),"try-error"))
        NA_integer_
    else
        ans
}, by=a]
Error in foo(a, b) : Y contains 2!

Better, the long error has gone. But why still the error from foo? Let's look at DT just to check.

> DT
   a b  c
1: 1 1  1
2: 2 2 NA
3: 3 3  9
4: 1 4  4
5: 2 5 NA
6: 3 6 18

Oh, so it has worked. The 3rd group has run and values 9 and 18 appear in rows 3 and 6. Looking at ?try reveals the silent argument.
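By default `try` prints the error message before returning, which is why the message from `foo` still showed up even though the call itself succeeded. A sketch in base R of what `try` hands back; `res` is just an illustrative name:

```r
# silent = TRUE suppresses the printed error message
res <- try(stop("Y contains 2!"), silent = TRUE)

inherits(res, "try-error")                 # TRUE
conditionMessage(attr(res, "condition"))   # "Y contains 2!"
```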

> DT[, c:= {
    if (inherits(try(ans<-foo(a,b),silent=TRUE),"try-error"))
        NA_integer_
    else
        ans
}, by=a]
> # no errors
> DT
   a b  c
1: 1 1  1
2: 2 2 NA
3: 3 3  9
4: 1 4  4
5: 2 5 NA
6: 3 6 18
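The question asked for failing groups to be skipped rather than filled with NA. One way to sketch that, assuming a new result table is acceptable instead of updating by reference with `:=`: when `j` evaluates to NULL for a group, that group is left out of the result.

```r
library(data.table)

foo <- function(X, Y) {
    if (any(Y == 2)) stop("Y contains 2!")
    X * Y
}
DT <- data.table(a = 1:3, b = 1:6)

# An `if` without `else` yields NULL when the condition is FALSE,
# so groups where foo() fails contribute no rows.
res <- DT[, {
    ans <- try(foo(a, b), silent = TRUE)
    if (!inherits(ans, "try-error")) list(b = b, c = ans)
}, by = a]
res   # only the groups a == 1 and a == 3 remain
```

Unlike the `:=` version, this leaves `DT` unchanged and returns the surviving groups as a separate table.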
Matt Dowle answered Oct 14 '22 04:10