This question may or may not be inspired by my losing an entire 3-hour geocoding run because one of the values returned an error. Cue the pity (down)votes.
Basically, there was an error returned inside a function called by sapply. I had options(error=recover) on, but despite browsing through every level available to me, I could not find any place where the results of the (thousands of successful) calls to FUN were stored in memory.
Some of the objects I found while browsing around themselves gave errors when I attempted to examine them, claiming the references were no longer valid. Unfortunately I lost the particular error message.
Here's a quick example which, while it does not replicate the reference error (which I suspect is related to disappearing environments and is probably immaterial), does demonstrate that I cannot see a way to save the data that has already been processed.
Is there such a technique?
Note that I have since realized my error and inserted even more robust error handling than existed before via try, but I am looking for a way to recover the contents ex post rather than ex ante.
Test function
sapply( seq(10), function(x) {
  if(x==5) stop("Error!")
  return( "important data" )
} )
Interactive exploration
> sapply( seq(10), function(x) {
+ if(x==5) stop("Error!")
+ return( "important data" )
+ } )
Error in FUN(1:10[[5L]], ...) : Error!
Enter a frame number, or 0 to exit
1: sapply(seq(10), function(x) {
if (x == 5)
stop("Error!")
return("important data")
})
2: lapply(X = X, FUN = FUN, ...)
3: FUN(1:10[[5]], ...)
Selection: 3
Called from: FUN(1:10[[5L]], ...)
Browse[1]> ls()
[1] "x"
Browse[1]> x
[1] 5
Browse[1]>
Enter a frame number, or 0 to exit
1: sapply(seq(10), function(x) {
if (x == 5)
stop("Error!")
return("important data")
})
2: lapply(X = X, FUN = FUN, ...)
3: FUN(1:10[[5]], ...)
Selection: 2
Called from: lapply(X = X, FUN = FUN, ...)
Browse[1]> ls()
[1] "FUN" "X"
Browse[1]> X
[1] 1 2 3 4 5 6 7 8 9 10
Browse[1]> FUN
function(x) {
if(x==5) stop("Error!")
return( "important data" )
}
Browse[1]>
Enter a frame number, or 0 to exit
1: sapply(seq(10), function(x) {
if (x == 5)
stop("Error!")
return("important data")
})
2: lapply(X = X, FUN = FUN, ...)
3: FUN(1:10[[5]], ...)
Selection: 1
Called from: sapply(seq(10), function(x) {
if (x == 5)
stop("Error!")
return("important data")
})
Browse[1]> ls()
[1] "FUN" "simplify" "USE.NAMES" "X"
Browse[1]> X
[1] 1 2 3 4 5 6 7 8 9 10
Browse[1]> USE.NAMES
[1] TRUE
Browse[1]> simplify
[1] TRUE
Browse[1]> FUN
function(x) {
if(x==5) stop("Error!")
return( "important data" )
}
Browse[1]> Q
To be clear, what I was hoping to find was the vector:
[1] "important data" "important data" "important data" "important data"
In other words, the results of the internal loop that had been completed to this point.
Edit: Update with C code
Inside .Internal(lapply()) is the following code:
PROTECT(ans = allocVector(VECSXP, n));
...
for(i = 0; i < n; i++) {
...
tmp = eval(R_fcall, rho);
...
SET_VECTOR_ELT(ans, i, tmp);
}
I want to get at ans when any call to lapply fails.
I'm struggling to see why a try() here isn't the way to go? If the sapply() fails for whatever reason then you
Why would you want the entire data analysis/processing step to stop just for an error? That is what you seem to be proposing. Rather than trying to recover what has already been done, write your code so that it just carries on, recording that the error took place but also gracefully moving on to the next step in the process.
It is a bit convoluted because the example you give is contrived (if you knew what would cause an error you could handle that without a try()), but bear with me:
foo <- function(x) {
  res <- try({
    if(x==5) {
      stop("Error!")
    } else {
      "important data"
    }
  })
  if(inherits(res, "try-error"))
    res <- "error occurred"
  res
}
> sapply( seq(10), foo)
Error in try({ : Error!
[1] "important data" "important data" "important data" "important data"
[5] "error occurred" "important data" "important data" "important data"
[9] "important data" "important data"
Having run jobs that took weeks to finish on my workstation in the background, I quickly learned to write lots of try() calls around individual statements rather than big blocks of code, so that once an error occurred I could quickly get out of that iteration/step with the least effect on the running job; in other words, if a particular R call failed I returned something that would slot nicely into the object returned by sapply() (or whatever function).
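A variant of the same idea, sketched here with tryCatch() so that the error message is kept alongside the result instead of being replaced by a fixed string. The names safe_call() and risky() are hypothetical helpers introduced only for this illustration; risky() stands in for the real per-item work.

```r
# Stand-in for the real per-item work; fails on the fifth input.
risky <- function(x) {
  if (x == 5) stop("Error!")
  "important data"
}

# Hypothetical wrapper: always returns a list with the same three fields,
# so every element slots into the overall result whether or not f(x) failed.
safe_call <- function(f, x) {
  tryCatch(
    list(ok = TRUE, value = f(x), message = NA_character_),
    error = function(e) {
      list(ok = FALSE, value = NULL, message = conditionMessage(e))
    }
  )
}

results <- lapply(seq(10), function(x) safe_call(risky, x))

# Separate the successes from the failures after the run has finished.
ok     <- vapply(results, function(r) r$ok, logical(1))
values <- vapply(results[ok], function(r) r$value, character(1))
failed <- which(!ok)
```

After the run, failed holds the indices that need to be retried and results[[failed[1]]]$message says why the first of them failed.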
For anything more complex, I would probably use lapply():
foo2 <- function(x) {
  res <- try({
    if(x==5) {
      stop("Error!")
    } else {
      lm(rnorm(10) ~ runif(10))
    }
  })
  if(inherits(res, "try-error"))
    res <- "error occurred"
  res
}
out <- lapply(seq(10), foo2)
str(out, max = 1)
because you are going to want the list rather than try to simplify more complex objects down to something simple:
> out <- lapply(seq(10), foo2)
Error in try({ : Error!
> str(out, max = 1)
List of 10
$ :List of 12
..- attr(*, "class")= chr "lm"
$ :List of 12
..- attr(*, "class")= chr "lm"
$ :List of 12
..- attr(*, "class")= chr "lm"
$ :List of 12
..- attr(*, "class")= chr "lm"
$ : chr "error occurred"
$ :List of 12
..- attr(*, "class")= chr "lm"
$ :List of 12
..- attr(*, "class")= chr "lm"
$ :List of 12
..- attr(*, "class")= chr "lm"
$ :List of 12
..- attr(*, "class")= chr "lm"
$ :List of 12
..- attr(*, "class")= chr "lm"
That said, I'd probably have done this via a for() loop, filling in a preallocated list as I iterated.
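A minimal sketch of what such a loop might look like, assuming the same toy failure at x == 5 (risky() is a hypothetical stand-in for the real work). Because out lives in the calling environment rather than inside lapply's C code, everything stored before the failure is still there afterwards. The outer try() is only there so that a script keeps running past the error; at the interactive prompt the partial results would survive without it.

```r
# Stand-in for the real per-item work; fails on the fifth input.
risky <- function(x) {
  if (x == 5) stop("Error!")
  "important data"
}

n   <- 10
out <- vector("list", n)   # preallocated; unfilled slots stay NULL

# No per-iteration error handling here: the loop aborts at i == 5.
try(
  for (i in seq_len(n)) {
    out[[i]] <- risky(i)
  },
  silent = TRUE
)

done <- !vapply(out, is.null, logical(1))
unlist(out[done])   # the four results computed before the failure
i                   # index at which the loop stopped, i.e. where to resume
```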
You never assigned the intermediate values to anything. I don't understand why you think there should be any entrails to divine. You need to record the values somehow:
res <- sapply( seq(10), function(x) {
  z <- x
  on.exit(res <<- x)
  if(x==5) stop("Error!")
} )
Error in FUN(1:10[[5L]], ...) : Error!
res
#[1] 5
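This on.exit method is illustrated on the ?par page as a way of restoring par settings when plotting has gone wrong. (I was not able to get it to work with on.exit(res <- x), presumably because a plain <- assigns within the function's own frame, which is discarded when the call exits.)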
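The difference between the two assignment operators inside on.exit() can be seen in a small sketch. The names last_seen, f_local and f_super are hypothetical and exist only for this illustration; try() is used so the example runs to completion.

```r
last_seen <- NULL

# Plain `<-` inside on.exit(): creates a variable local to the call's frame,
# which vanishes when the call exits.
f_local <- function(x) {
  on.exit(last_seen <- x)
  if (x == 3) stop("Error!")
  x
}

# `<<-` inside on.exit(): updates the existing last_seen in the enclosing
# environment, so the value survives the error.
f_super <- function(x) {
  on.exit(last_seen <<- x)
  if (x == 3) stop("Error!")
  x
}

try(sapply(1:5, f_local), silent = TRUE)
after_local <- last_seen   # still NULL

try(sapply(1:5, f_super), silent = TRUE)
after_super <- last_seen   # 3, the input that triggered the error
```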
Maybe I'm not understanding, and this will certainly slow you down, but what about a global assignment each time?
safety <- vector()
sapply( seq(10), function(x) {
  if(x==5) stop("Error!")
  assign('safety', c(safety, x), envir = .GlobalEnv)
  return( "important data" )
} )
Yields:
> safety <- vector()
> sapply( seq(10), function(x) {
+ if(x==5) stop("Error!")
+ assign('safety', c(safety, x), envir = .GlobalEnv)
+ return( "important data" )
+ } )
Error in FUN(1:10[[5L]], ...) : Error!
> safety
[1] 1 2 3 4
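A possible refinement of the same idea, sketched under a couple of assumptions: the results are written into a preallocated list held in a dedicated environment (store, a hypothetical name) rather than grown with c() in the global environment, and the computed value is saved rather than the input. risky() is again a hypothetical stand-in for the real work, and try() is only there so the example runs to completion.

```r
# Stand-in for the real per-item work; fails on the fifth input.
risky <- function(x) {
  if (x == 5) stop("Error!")
  "important data"
}

# Environments have reference semantics, so writes made inside the anonymous
# function below remain visible after sapply() has failed.
store <- new.env()
store$results <- vector("list", 10)

try(
  sapply(seq(10), function(x) {
    value <- risky(x)
    store$results[[x]] <- value   # save each result as soon as it exists
    value
  }),
  silent = TRUE
)

saved <- !vapply(store$results, is.null, logical(1))
unlist(store$results[saved])      # results from the calls that completed
```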