
Hacks for clamping integer to 0-255 and doubles to 0.0-1.0?

Are there any branch-less or similar hacks for clamping an integer to the interval of 0 to 255, or a double to the interval of 0.0 to 1.0? (Both ranges are meant to be closed, i.e. endpoints are inclusive.)

I'm using the obvious minimum-maximum check:

value = (value < 0) ? 0 : (value > 255) ? 255 : value;

but is there a faster way, similar to the "modulo" clamp value & 255? And can something similar be done for floating-point values?

I'm looking for a portable solution, so preferably no CPU/GPU-specific stuff please.

Franz D. asked Jul 25 '15 21:07



3 Answers

This is a trick I use for clamping an int to a 0 to 255 range:

/**
 * Clamps the input to a 0 to 255 range.
 * @param v any int value
 * @return {@code v < 0 ? 0 : v > 255 ? 255 : v}
 */
public static int clampTo8Bit(int v) {
    // if out of range
    if ((v & ~0xFF) != 0) {
        // invert sign bit, shift to fill, then mask (generates 0 or 255)
        v = ((~v) >> 31) & 0xFF;
    }
    return v;
}

That still has one branch, but a handy thing about it is that you can test whether any of several ints are out of range in one go by ORing them together, which makes things faster in the common case that all of them are in range. For example:

/** Packs four 8-bit values into a 32-bit value, with clamping. */
public static int ARGBclamped(int a, int r, int g, int b) {
    if (((a | r | g | b) & ~0xFF) != 0) {
        a = clampTo8Bit(a);
        r = clampTo8Bit(r);
        g = clampTo8Bit(g);
        b = clampTo8Bit(b);
    }
    return (a << 24) + (r << 16) + (g << 8) + (b << 0);
}
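
For readers working in C rather than Java, the same idea can be sketched roughly as follows. This is only an illustration, not part of the original trick: the names clamp_to_8bit and pack_argb_clamped are made up here, the out-of-range case uses a plain comparison because right-shifting a negative int is implementation-defined in C, and the packing is done in uint32_t because shifting a value into the sign bit of an int is undefined there. Two's complement integers are assumed.

```c
#include <stdint.h>

/* Clamp one value to 0..255. In-range values cost a single test. */
static int clamp_to_8bit(int v) {
    if ((v & ~0xFF) != 0) {
        /* Out of range: negative values go to 0, large values to 255. */
        v = (v < 0) ? 0 : 255;
    }
    return v;
}

/* Pack four channels into one 32-bit ARGB word, clamping only when needed. */
static uint32_t pack_argb_clamped(int a, int r, int g, int b) {
    /* One combined test: any set bit above bit 7 means at least one
       channel is negative or greater than 255. */
    if (((a | r | g | b) & ~0xFF) != 0) {
        a = clamp_to_8bit(a);
        r = clamp_to_8bit(r);
        g = clamp_to_8bit(g);
        b = clamp_to_8bit(b);
    }
    /* Unsigned shifts avoid shifting into the sign bit of an int. */
    return ((uint32_t)a << 24) | ((uint32_t)r << 16)
         | ((uint32_t)g << 8)  |  (uint32_t)b;
}
```

With in-range inputs such as (255, 0, 128, 64) only the combined test runs; the per-channel clamps are reached only when at least one channel is out of range.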
Boann answered Nov 09 '22 07:11



Note that your compiler may already give you what you want if you code value = min (value, 255). This may be translated into a MIN instruction if it exists, or into a comparison followed by conditional move, such as the CMOVcc instruction on x86.

The following code assumes two's complement representation of integers, which is usually a given today. The conversion from Boolean to integer should not involve branching under the hood, as modern architectures either provide instructions that can directly be used to form the mask (e.g. SETcc on x86 and ISETcc on NVIDIA GPUs), or can apply predication or conditional moves. If all of those are lacking, the compiler may emit a branchless instruction sequence based on arithmetic right shift to construct a mask, along the lines of Boann's answer. However, there is some residual risk that the compiler could do the wrong thing, so when in doubt, it would be best to disassemble the generated binary to check.

int value, mask;

mask = 0 - (value > 255);  // mask = all 1s if value > 255, all 0s otherwise
value = (255 & mask) | (value & ~mask);
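
The snippet above handles only the upper bound. A minimal sketch of the same mask technique applied to both ends of the 0 to 255 range might look like this; the function name clamp_u8_mask and the variables hi and lo are chosen here for illustration, and two's complement integers are assumed as before.

```c
/* Clamp to 0..255 using comparison results as bit masks (no ?: or if). */
static int clamp_u8_mask(int value) {
    /* hi is all 1s if value > 255, all 0s otherwise. */
    int hi = 0 - (value > 255);
    value = (255 & hi) | (value & ~hi);

    /* lo is all 1s if value < 0, all 0s otherwise.
       Selecting 0 under the mask reduces to clearing the bits. */
    int lo = 0 - (value < 0);
    value = value & ~lo;

    return value;
}
```

Whether this compiles to branch-free machine code still depends on the compiler and target, so the advice about checking the disassembly applies here as well.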

On many architectures, use of the ternary operator ?: can also result in a branchless instruction sequence. The hardware may support select-type instructions, which are essentially the hardware equivalent of the ternary operator, such as ICMP on NVIDIA GPUs. Or it may provide CMOV (conditional move) as on x86, or predication as on ARM, both of which can be used to implement branchless code for ternary operators. As in the previous case, one would want to examine the disassembled binary code to be absolutely sure the resulting code is without branches.

int value;

value = (value > 255) ? 255 : value;

In case of floating-point operands, modern floating-point units typically provide FMIN and FMAX instructions which map straight to the C/C++ standard math functions fmin() and fmax(). Alternatively fmin() and fmax() may be translated into a comparison followed by a conditional move. Again, it would be prudent to examine the generated code to make sure it is branchless.

double value;

value = fmax (fmin (value, 1.0), 0.0);
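
One detail worth knowing about the fmin()/fmax() form: when exactly one argument is NaN, these functions return the other argument, so a NaN input comes out of the expression above as 1.0. If NaN should instead pass through unchanged, a comparison-based variant is one option. The following is a sketch under that assumption; the name clamp01 is made up for this example, and the generated code should still be inspected to confirm it is branchless on the target in question.

```c
/* Clamp a double to [0.0, 1.0] using plain comparisons.
   Both comparisons are false for NaN, so NaN is returned unchanged. */
static double clamp01(double v) {
    v = (v < 0.0) ? 0.0 : v;   /* lower bound */
    v = (v > 1.0) ? 1.0 : v;   /* upper bound */
    return v;
}
```

Compilers for targets with floating-point min/max or select instructions often turn each of these lines into a single instruction, but that is not guaranteed.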
njuffa answered Nov 09 '22 07:11



I use this; it contains no branches at the source level:

int clampU8(int val)
{
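    val &= (val < 0) - 1;  // mask is 0 if val < 0 (result 0), else all 1s (val unchanged)
    val |= -(val > 255);   // sets all bits if val > 255, else leaves val unchanged
    return val & 0xFF;     // keep the low 8 bits, so "all bits set" becomes 255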
}
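
A quick way to gain confidence in a bit trick like this is to compare it against the obvious comparison-based clamp over a range of inputs. The sketch below does that; clamp_ref and count_mismatches are helper names invented for this example, and two's complement integers are assumed.

```c
/* The branchless clamp from above, repeated so this example is self-contained. */
static int clamp_u8(int val) {
    val &= (val < 0) - 1;
    val |= -(val > 255);
    return val & 0xFF;
}

/* Reference: the plain comparison-based clamp. */
static int clamp_ref(int val) {
    return val < 0 ? 0 : (val > 255 ? 255 : val);
}

/* Counts the inputs in [lo, hi] for which the two versions disagree.
   The loop counter is long long so that hi == INT_MAX does not overflow. */
static long long count_mismatches(int lo, int hi) {
    long long bad = 0;
    for (long long v = lo; v <= hi; ++v) {
        if (clamp_u8((int)v) != clamp_ref((int)v)) {
            ++bad;
        }
    }
    return bad;
}
```

A return value of 0 from count_mismatches means the two functions agreed on every input in the tested range.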
Anonymous answered Nov 09 '22 09:11
