Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why don't cython compile logic or to `||` expression?

Tags:

python

c

cython

For example, here is an or expression:

c  = f1 == 0 or f1 - f0 > th

Here is the compiled C code:

__pyx_t_24 = (__pyx_v_f1 == 0);
if (!__pyx_t_24) {
} else {
  __pyx_t_23 = __pyx_t_24;
  goto __pyx_L5_bool_binop_done;
}
__pyx_t_24 = ((__pyx_v_f1 - __pyx_v_f0) > __pyx_v_th);
__pyx_t_23 = __pyx_t_24;
__pyx_L5_bool_binop_done:;
__pyx_v_c = __pyx_t_23;

Why not output this?

__pyx_v_c = (__pyx_v_f1 == 0) || ((__pyx_v_f1 - __pyx_v_f0) > __pyx_v_th)

is the goto version faster than ||?

like image 218
HYRY Avatar asked Aug 07 '15 01:08

HYRY


2 Answers

Using the following two files:

test.c:

int main(int argc, char** argv) {
    int c, f0, f1, th;
    int hold, hold1;
    f0 = (int) argv[1];
    f1 = (int) argv[2];
    th = (int) argv[3];
    hold1 = (f0 == 0);
    if (!hold1) {

    } else {
        hold = hold1;
        goto done;
    }
    hold1 = (f1 - f0 > th);
    hold = hold1;
    done: c = hold;
    return c;
}

test2.c:

int main(int argc, char** argv) {
    int c, f0, f1, th;
    f0 = (int) argv[1];
    f1 = (int) argv[2];
    th = (int) argv[3];
    c = (f1 == 0) || (f1 - f0 > th);
    return c;
}

I had to assign f0, f1 and th to something so that the compiler would not just return 1 as the C specification states that ints are initialized to 0 and f1 == 0 would yield true, and thus the whole boolean statement would yield true and the assembly would be:

main:
.LFB0:
    .cfi_startproc
.L2:
    movl    $1, %eax
    ret
    .cfi_endproc

Compiling using GCC with -S -O2 flags (optimization enabled), both test.s and test2.s become:

main:
.LFB0:
    .cfi_startproc
    movl    8(%rsi), %edx
    movq    16(%rsi), %rdi
    movl    $1, %eax
    movq    24(%rsi), %rcx
    testl   %edx, %edx
    je  .L2
    subl    %edx, %edi
    xorl    %eax, %eax
    cmpl    %ecx, %edi
    setg    %al
.L2:
    rep
    ret
    .cfi_endproc

So it unless you disable optimizations, in which the one with goto would have about 50% more instructions, the result would be the same.

The reason the output C code is ugly is because of the way the interpreter visits the nodes in the AST. When an or node is visited, the interpreter first evaluates the first parameter and then the second. If the boolean expression was much more complex, it would be much more easier to parse. Imagine calling a lambda function that returns a boolean (I'm not sure if Cython supports this); the interpreter would have to follow the structure:

hold = ... evaluate the lambda expression...
if (hold) {
    result = hold;
    goto done; // short circuit
}
hold = ... evaluate the second boolean expression...
done:
...

It would be a daunting task to optimize during the interpreting phase and thus Cython won't even bother.

like image 188
M. Shaw Avatar answered Oct 26 '22 17:10

M. Shaw


If my understanding of C is correct (and I last used C many years ago , so it may be rusty) the '||' ( OR ) operator in C only return Boolean values ( that is 0 for False or 1 for True ) . If this is correct , then it's not about if goto is faster or slower .

The || would give different results than the goto code . This is because of how 'or' in Python works , let's take an example -

c = a or b

In the above statement first as value is evaluated , If it is a true like value that value is returned from the or expression (not true or 1 , but as value) , if the value is false like ( false like values in Python are 0 , empty string , empty lists, False , etc) then b's value is evaluated and it is returned . Please note 'or' returns the last evaluated value , and not True(1) or False(0).

This is basically useful when you want to set default values like -

s = d or 'default value'
like image 38
Anand S Kumar Avatar answered Oct 26 '22 16:10

Anand S Kumar