Cython function pointers and exceptions

I am trying to wrap an existing C library using Cython. The library uses callbacks, which I would like to redirect to execute Python code. Let's say that the corresponding line in the header is the following:

typedef RETCODE (*FUNC_EVAL)(int a, int b, void* func_data);

where the return code is used to signal an error. The API to create a corresponding C struct is as follows:

RETCODE func_create(Func** fstar,
                    FUNC_EVAL func_eval,
                    void* func_data);

I added a Cython header file and an implementation file. The header contains the typedef:

  ctypedef RETCODE (*FUNC_EVAL)(int a,
                                int b, 
                                void* func_data)

The implementation contains a wrapper function:

cdef RETCODE func_eval(int a,
                       int b,
                       void* func_data):
  (<object> func_data).func_eval(a, b)
  return OKAY;

I can pass this function to the Cython wrapper of func_create just fine.

However, I want to make sure that exceptions in the Python code are reported back to the C library by returning ERROR as the return code. So I added the following:

cdef RETCODE func_eval(int a,
                       int b,
                       void* func_data) except ERROR:
  (<object> func_data).func_eval(a, b)
  return OKAY;

However, now Cython terminates with the following error message:

  Cannot assign type 'RETCODE (*)(int, int, void *) except ERROR' to 'FUNC_EVAL'

Am I using the except ... statement wrong?

Asked by hfhc2 on Dec 29 '18 at 20:12


1 Answer

That is Cython trying to prevent you from making subtle mistakes.

First, let's recall how error handling works in CPython: there is a global (per-thread) error state which is set when an error/exception occurs. This state holds information about the type of the exception, the traceback and so on. The convention is that, in addition to setting the global error state, a function signals its failure via a special return value, so the error state doesn't have to be checked after every function call.

Once a failure is detected in a function, one of the following must happen:

  • if the function "knows" how to handle the error (e.g. via an except clause), it has to clear the global error state before continuing.
  • if the function doesn't "know" how to handle the error, it has to abort and return the failure signal.

An important point: if a function doesn't report an error that occurred, it must clear the error state; otherwise the Python interpreter is left in an inconsistent state and subtle errors can happen. For example, Cython cdef functions declared with except? depend on the error state being correct (for how Cython's different except clauses work, see for example this SO answer).

Now, back to your cdef-function.

  • If it is declared without except, Cython takes care of the global state: if an error occurs, the state is cleared (and a warning is written to standard error) before the function returns a default value.
  • If the function is declared with except 1, the caller of the function has to take care of clearing the error state.

So the question is: does the caller of a FUNC_EVAL function pointer clear Python's error state in case of an error?

  • if yes, declare the function pointer type as ctypedef ... (*FUNC_EVAL)(...) except 1 to make it clear to Cython that the caller is able to handle the error.
  • if no (more probable), you will have to take care of Python's error state yourself in the cdef function.

In the "no" case, the most straightforward way is to use try: ... except: ... in the cdef function, i.e.

cdef RETCODE func_eval(int a,
                       int b,
                       void* func_data):
  try:
    (<object> func_data).func_eval(a, b)
  except Exception:
    return ERROR
  return OKAY
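
For readers who want to try this translation pattern without compiling anything, here is a rough pure-Python analogue of the wrapper above using ctypes. It is only a sketch: the numeric values of OKAY and ERROR and the Evaluator class are made up for illustration (the real RETCODE values come from the library's header), and ctypes.py_object stands in for the void* func_data argument.

```python
import ctypes

# Hypothetical return codes; the real values come from the C library's header.
OKAY = 0
ERROR = 1

# ctypes counterpart of:
#   typedef RETCODE (*FUNC_EVAL)(int a, int b, void* func_data);
# py_object stands in for void* so the Python object arrives unchanged.
FUNC_EVAL = ctypes.CFUNCTYPE(
    ctypes.c_int, ctypes.c_int, ctypes.c_int, ctypes.py_object
)


class Evaluator:
    """Hypothetical user object whose method may raise."""

    def __init__(self):
        self.results = []

    def func_eval(self, a, b):
        self.results.append(a // b)  # raises ZeroDivisionError when b == 0


@FUNC_EVAL
def func_eval(a, b, func_data):
    # No exception should leave the callback, so translate it into a return code.
    try:
        func_data.func_eval(a, b)
    except Exception:
        return ERROR
    return OKAY


evaluator = Evaluator()
ok_code = func_eval(6, 3, evaluator)   # called through the C function pointer
err_code = func_eval(1, 0, evaluator)  # exception is caught and reported as ERROR
```

Note that the exception details are lost this way. If they matter, one option is to store the exception on the object passed as func_data inside the except block and re-raise it once the C call has returned.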

One might be concerned that using try ... except ... adds overhead even when no exception is raised. And this is true. However, you are already calling some Python functionality, so this additional overhead will not kill the performance.

My quick experiments have shown that you might lose up to 30% if there are no calculations at all in the called Python functionality (see the experiments in the appendix of the answer). But that is an extreme case; usually you will lose much less, so I wouldn't try to optimize it unless the profiler shows it is really a problem.

If you define ERROR=0 and OKAY=1, you can rely on the implementation detail that Cython sets the result to 0 when it clears the error. However, that seems to be a slippery road to take.


Measurement of the overhead:

%%cython -a
cdef extern from *:
    """
    typedef int (*FUN)(void);
    void call(FUN f){
       f();
    }
    """
    ctypedef int (*FUN)()
    void call(FUN f)

def dummy():
    pass

cdef int cython_handling():
    dummy()
    return 1

cdef int manual_handling():
    try:
        dummy()
    except Exception:
        return 0
    return 1

def check_cython():
    cdef int i
    for i in range(1000):
        call(cython_handling)

def check_manually():
    cdef int i
    for i in range(1000):
        call(manual_handling)

And now:

%timeit check_cython()
# 21.6 µs ± 164 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit check_manually()
# 27 µs ± 493 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Answered by ead on Sep 23 '22 at 14:09