What is the fastest integer division supporting division by zero no matter what the result is?

Tags:

Summary:

I'm looking for the fastest way to calculate

(int) x / (int) y

without getting an exception for y==0. Instead I just want an arbitrary result.

Background:

When coding image processing algorithms I often need to divide by an (accumulated) alpha value. The most simple variant is plain C code with integer arithmetic. My problem is that I typically get a division by zero error for result pixels with alpha==0. However this are exactly the pixels where the result doesn't matter at all: I don't care about color values of pixels with alpha==0.

Details:

I'm looking for something like:

result = (y==0)? 0 : x/y;

result = x / MAX( y, 1 );

x and y are positive integers. The code is executed a huge number of times in a nested loop, so I'm looking for a way to get rid of the conditional branching.

When y does not exceed the byte range, I'm happy with the solution

unsigned char kill_zero_table[256] = { 1, 1, 2, 3, 4, 5, 6, 7, [...] 255 }; [...] result = x / kill_zero_table[y];

But this obviously does not work well for bigger ranges.

I guess the final question is: Whats the fastest bit twiddling hack changing 0 to any other integer value, while leaving all other values unchanged?

Clarifications

I'm not 100% sure that branching is too expensive. However, different compilers are used, so I prefer benchmarking with little optimizations (which is indeed questionable).

For sure, compilers are great when it comes to bit twiddling, but I can't express the "don't care" result in C, so the compiler will never be able to use the full range of optimizations.

Code should be fully C compatible, main platforms are Linux 64 Bit with gcc & clang and MacOS.

376

asked May 27 '13 16:05

philipp

1 Answers

Inspired by some of the comments I got rid of the branch on my Pentium and gcc compiler using

int f (int x, int y) {         y += y == 0;         return x/y; }

The compiler basically recognizes that it can use a condition flag of the test in the addition.

As per request the assembly:

.globl f     .type   f, @function f:     pushl   %ebp     xorl    %eax, %eax     movl    %esp, %ebp     movl    12(%ebp), %edx     testl   %edx, %edx     sete    %al     addl    %edx, %eax     movl    8(%ebp), %edx     movl    %eax, %ecx     popl    %ebp     movl    %edx, %eax     sarl    $31, %edx     idivl   %ecx     ret

As this turned out to be such a popular question and answer, I'll elaborate a bit more. The above example is based on programming idiom that a compiler recognizes. In the above case a boolean expression is used in integral arithmetic and the use of condition flags are invented in hardware for this purpose. In general condition flags are only accessible in C through using idiom. That is why it so hard to make a portable multiple precision integer library in C without resorting to (inline) assembly. My guess is that most decent compilers will understand the above idiom.

Another way of avoiding branches, as also remarked in some of the above comments, is predicated execution. I therefore took philipp's first code and my code and ran it through the compiler from ARM and the GCC compiler for the ARM architecture, which features predicated execution. Both compilers avoid the branch in both samples of code:

Philipp's version with the ARM compiler:

f PROC         CMP      r1,#0         BNE      __aeabi_idivmod         MOVEQ    r0,#0         BX       lr

Philipp's version with GCC:

f:         subs    r3, r1, #0         str     lr, [sp, #-4]!         moveq   r0, r3         ldreq   pc, [sp], #4         bl      __divsi3         ldr     pc, [sp], #4

My code with the ARM compiler:

f PROC         RSBS     r2,r1,#1         MOVCC    r2,#0         ADD      r1,r1,r2         B        __aeabi_idivmod

My code with GCC:

f:         str     lr, [sp, #-4]!         cmp     r1, #0         addeq   r1, r1, #1         bl      __divsi3         ldr     pc, [sp], #4

All versions still need a branch to the division routine, because this version of the ARM doesn't have hardware for a division, but the test for y == 0 is fully implemented through predicated execution.

answered Oct 09 '22 02:10

Bryan Olivier

Related questions
                            
                                What makes this usage of pointers unpredictable?
                            
                                How do I check for C++11 support?
                            
                                Splitting templated C++ classes into .hpp/.cpp files--is it possible?
                            
                                How to disallow temporaries
                            
                                How to calculate a time difference in C++
                            
                                Placement of the asterisk in pointer declarations
                            
                                Why does code mutating a shared variable across threads apparently NOT suffer from a race condition?
                            
                                What's the best way to do a backwards loop in C/C#/C++?
                            
                                Why are C character literals ints instead of chars?
                            
                                Dual emission of constructor symbols
                            
                                Recommended way to insert elements into map [duplicate]
                            
                                Does C++20 mandate source code being stored in files?
                            
                                How to specify preference of library path?
                            
                                error::make_unique is not a member of ‘std’
                            
                                How to fix GCC compilation error when compiling >2 GB of code?
                            
                                Function for C++ struct
                            
                                What should go into an .h file?
                            
                                How does virtual inheritance solve the "diamond" (multiple inheritance) ambiguity?
                            
                                Replace substring with another substring C++
                            
                                How is std::function implemented?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

What is the fastest integer division supporting division by zero no matter what the result is?

Tags:

c++

c

optimization

divide-by-zero

philipp

People also ask

1 Answers

Bryan Olivier

Recent Activity

Donate For Us