I'm looking for some C code for signed saturated 64-bit addition that compiles to efficient x86-64 code with the gcc optimizer. Portable code would be ideal, although an asm solution could be used if necessary.
static const int64 kint64max = 0x7fffffffffffffffll;
static const int64 kint64min = 0x8000000000000000ll;
int64 signed_saturated_add(int64 x, int64 y) {
bool x_is_negative = (x & kint64min) != 0;
bool y_is_negative = (y & kint64min) != 0;
int64 sum = x+y;
bool sum_is_negative = (sum & kint64min) != 0;
if (x_is_negative != y_is_negative) return sum; // can't overflow
if (x_is_negative && !sum_is_negative) return kint64min;
if (!x_is_negative && sum_is_negative) return kint64max;
return sum;
}
The function as written produces a fairly lengthy assembly output with several branches. Any tips on optimization? Seems like it ought to be be implementable with just an ADD
with a few CMOV
instructions but I'm a little bit rusty with this stuff.
This may be optimized further but here is a portable solution. It does not invoked undefined behavior and it checks for integer overflow before it could occur.
#include <stdint.h>
int64_t sadd64(int64_t a, int64_t b)
{
if (a > 0) {
if (b > INT64_MAX - a) {
return INT64_MAX;
}
} else if (b < INT64_MIN - a) {
return INT64_MIN;
}
return a + b;
}
This is a solution that continues in the vein that had been given in one of the comments, and has been used in ouah's solution, too. here the generated code should be without conditional jumps
int64_t signed_saturated_add(int64_t x, int64_t y) {
// determine the lower or upper bound of the result
int64_t ret = (x < 0) ? INT64_MIN : INT64_MAX;
// this is always well defined:
// if x < 0 this adds a positive value to INT64_MIN
// if x > 0 this subtracts a positive value from INT64_MAX
int64_t comp = ret - x;
// the condition is equivalent to
// ((x < 0) && (y > comp)) || ((x >=0) && (y <= comp))
if ((x < 0) == (y > comp)) ret = x + y;
return ret;
}
The first looks as if there would be a conditional move to do, but because of the special values my compiler gets off with an addition: in 2's complement INT64_MIN
is INT64_MAX+1
.
There is then only one conditional move for the assignment of the sum, in case anything is fine.
All of this has no UB, because in the abstract state machine the sum is only done if there is no overflow.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With