I'd like to know if there is a Pythonic way for handling errors in long-running functions that can have errors in part that do not affect the ability of the function to continue.
As an example, consider a function that given a list of URLs, it recursively retrieves the resource and all linked resources under the path of the top level URLs. It stores the retrieved resources in a local filesystem with a directory structure mirroring the URL structure. Essentially this is a basic recursive wget for a list of pages.
There are quite a number of points where this function could fail:
A failure on retrieving or saving any one resource only affects the function's ability to continue to process that resource and any child resources that may be linked from it, but it is possible to continue to retrieve other resources.
A simple model of error handling is that on the first error, an appropriate exception is raised for the caller to handle. The problem with this is that it terminates the function and does not allow it to continue. The error could possibly be fixed and the function restarted from the beginning but this would cause work to be redone, and any permanent errors may mean we never complete.
A couple of alternatives I have in mind are:
In Python discussions, I've often noted certain approaches described as Pythonic or non-Pythonic. I'd like to know if there are any particularly Pythonic approaches to handling the type of scenario described above.
Does Python have any batteries included that model more sophisticated error handling than the terminate model of exception handling, or do the more complex batteries included use a model of error handling that I should copy to stay Pythonic?
Note: Please do not focus on the example. I'm not looking to solve problems in that particular space, but it seemed like a good example that most people here would have an understanding of.
I don't think there's a particularly clear "Pythonic/non-Pythonic" distinction at the level you're talking about here.
One of the big reasons there's no "one-size-fits-all" solution in this domain, is that the exact semantics you want are going to be problem specific.
Even supporting an error handler doesn't necessarily cover all of those desired behaviours - a simple per-failure error handler can't easily provide abort-and-rollback semantics, or generate a single exception at the end. (It's not impossible - you just have to mess around with tricks like passing bound methods or closures as your error handlers)
So the best you can do is take an educated guess at typical usage scenarios and desirable behaviours in the face of errors, and design your API accordingly.
A fully general solution would accept an on-error handler that is given each failure as it happens, and a final "errors occurred" handler that gives the caller a chance to decide how multiple errors are handled (with some protocol to allow data to be passed from the individual error handlers to the final batch error handler).
However, providing such a general solution is likely to be an API design failure. The designer of the API shouldn't be afraid to have an opinion on how their API should be used, and how errors should be handled. The main thing to keep in mind is to not overengineer your solution:
Perhaps the best you're going to get as a general principle is that "Pythonic" error handling will be as simple as possible, but no simpler. But at that point, the word is just being used as a synonym for "good code", which isn't really its intent.
On the other hand, it is slightly easier to talk about what actual forms non-Pythonic error handling might take:
def myFunction(an_arg, error_handler)
# Do stuff
if err_occurred:
if isinstance(err, RuntimeError):
error_handler.handleRuntimeError()
elif isinstance(err, IOError):
error_handler.handleIOError()
The Pythonic idiom is that error handlers, if supported at all, are just simple callables. Give them the information they need to decide how to handle the situation, rather than try to decide too much on their behalf. If you want to make it easier to implement common aspects of the error handling, then provide a separate helper class with a __call__
method that does the dispatch, so people can decide whether or not they want to use it (or how much they want to override when they do use it). This isn't completely Python-specific, but it is something that folks coming from languages that make it annoyingly difficult to pass arbitrary callables around (such as Java, C, C++) may get wrong. So complex error handling protocols would definitely be a way to head into "non-Pythonic error handling" territory.
The other problem in the above non-Pythonic code is that there's no default handler provided. Forcing every API user to make a decision they may not yet be equipped to make is just poor API design. But now we're back in general "good code"/"bad code" territory, so Pythonic/non-Pythonic really shouldn't be used to describe the difference.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With