Compact a hex number

Question

Is there a clever (ie: branchless) way to "compact" a hex number. Basically move all the 0s all to one side?

eg:

0x10302040 -> 0x13240000

or

0x10302040 -> 0x00001324

I looked on Bit Twiddling Hacks but didn't see anything.

It's for a SSE numerical pivoting algorithm. I need to remove any pivots that become 0. I can use _mm_cmpgt_ps to find good pivots, _mm_movemask_ps to convert that in to a mask, and then bit hacks to get something like the above. The hex value gets munged in to a mask for a _mm_shuffle_ps instruction to perform a permutation on the SSE 128 bit register.

zch · Accepted Answer

To compute mask for _pext:

mask = arg;
mask |= (mask << 1) & 0xAAAAAAAA | (mask >> 1) & 0x55555555;
mask |= (mask << 2) & 0xCCCCCCCC | (mask >> 2) & 0x33333333;

First do bit-or on pairs of bits, then on quads. Masks prevent shifted values from overflowing to other digits.

After computing mask this way or harold's way (which is probably faster) you don't need the full power of _pext, so if targeted hardware doesn't support it you can replace it with this:

for(int i = 0; i < 7; i++) {
    stay_mask = mask & (~mask - 1);
    arg = arg & stay_mask | (arg >> 4) & ~stay_mask;
    mask = stay_mask | (mask >> 4);
}

Each iteration moves all nibbles one digit to the right if there is some space. stay_mask marks bits that are in their final positions. This uses somewhat less operations than Hacker's Delight solution, but might still benefit from branching.

harold · Answer

Supposing we can use _pext_u32, the issue then is computing a mask that has an F for every nibble that isn't zero. I'm not sure what the best approach is, but you can compute the OR of the 4 bits of the nibble and then "spread" it back out to F's like this:

// calculate horizontal OR of every nibble
x |= x >> 1;
x |= x >> 2;
// clean up junk
x &= 0x11111111;
// spread
x *= 0xF;

Then use that as the mask of _pext_u32.

_pext_u32 can be emulated by this (taken from Hacker's Delight, figure 7.6)

unsigned compress(unsigned x, unsigned m) {
   unsigned mk, mp, mv, t; 
   int i; 

   x = x & m;           // Clear irrelevant bits. 
   mk = ~m << 1;        // We will count 0's to right. 

   for (i = 0; i < 5; i++) {
      mp = mk ^ (mk << 1);             // Parallel prefix. 
      mp = mp ^ (mp << 2); 
      mp = mp ^ (mp << 4); 
      mp = mp ^ (mp << 8); 
      mp = mp ^ (mp << 16); 
      mv = mp & m;                     // Bits to move. 
      m = m ^ mv | (mv >> (1 << i));   // Compress m. 
      t = x & mv; 
      x = x ^ t | (t >> (1 << i));     // Compress x. 
      mk = mk & ~mp; 
   } 
   return x; 
}

But that's a bit of a disaster. It's probably better to just resort to branching code then.

Compact a hex number

Tags:

c++

bit-manipulation

sse

Jay Lemmon

2 Answers

zch

harold

Recent Activity

Donate For Us

Compact a hex number

Tags:

c++

bit-manipulation

sse

Jay Lemmon

2 Answers

zch

harold

Related questions

Recent Activity

Donate For Us