Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What's the best way to represent a short bit string?

I want to represent a string of up to around 120 bits, and speed is critical. I need to be able to build a bitstring by repeated snoc operations, and then to consume it with repeated uncons operations. One idea is to steal the implementation of Word128 from data-dword and use something like this to build:

empty = 1
snoc xs x = (xs `shiftL` 1) .|. x

But the unconsing seems to get a bit ugly, having to first countLeadingZeros and shift left to eliminate them before being able to read off the elements by shifting and masking the high bits.

Is there some more pleasant way that's at least as fast, or some faster way that's not too much more unpleasant?


Context

Phil Ruffwind has proposed a version of lens's at for Data.Map, but all implementations thus far are substantially slower than the naive implementation lens currently uses when key comparison is cheap. If I could produce a very cheap representation of the path to an entry while looking it up, and then consume it very efficiently with a specialized version of insert or delete, then maybe I could make this worthwhile.

like image 957
dfeuer Avatar asked May 01 '16 21:05

dfeuer


1 Answers

I am not sure if this qualifies. I fear that I'm re-implementing countLeadingZeros in some form...

Anyway, the idea is to snoc bits from the left, shifting right. Then, we can "count" the trailing zeros of x using x-1 and a XOR. The result of the "count" is a mask "00..01..11" which, roughly, is a unary representation of the trailing zeros. We do not convert this unary to binary since we have no need to: with some bit-level work, we can uncons.

Untested and unproven code follows.

import Data.Word
import Data.Bits
import Text.Printf

type T = Word64     -- can be adapted to any WordN

-- for pretty printing
pr :: T -> String
pr x = printf "%064b\n" x

empty :: T
empty = shiftL 1 63

snoc :: T -> T -> T
snoc x xs = shiftR xs 1 .|. (shiftL x 63)

-- returns (head, tail)
-- head is not normalized (0 or 1), only (0 or /=0)
uncons :: T -> (T, T)
uncons xs = 
   let -- example
       -- 0101001100000000000   xs  
       y = (xs `xor` (xs - 1))
       -- 0000000111111111111   y
       z = shiftR y 1 + 1
       -- 0000000100000000000   z
       z' = shiftL z 1
       -- 0000001000000000000   z'
   in (xs .&. z' , (xs .&. complement z) .|. z' )
like image 98
chi Avatar answered Sep 23 '22 01:09

chi