Haskell bitfields and bit level protocols

Question

I am working on generating and parsing an existing binary format (Xilinx FPGA bitfiles) using Haskell. At the moment my data structures and operations look like this:

getCode = fromIntegral.fromEnum
getName = toEnum.fromIntegral

--          1 1 1 1 1 1
--          5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
-- MOD_REG  -------------------------------
--          0 0 0 0 0 0 0 0 0 N B B B 1 1 1 
--                            M M M M
--                              2 1 0

data NewMode = NoNewMode | NewMode deriving (Show, Eq, Enum)
data Bootmode = SerialM | SpiM | BpiUp | InternalM | ReservedMode 
                | Jtag  | ParallelS | SerialS 
                deriving (Show, Eq, Enum)
modeCode :: NewMode -> Bootmode -> Word16
modeCode newmode bootmode = (shiftL (getCode newmode) 6) .|. 
                            (shiftL (getCode bootmode) 3) .|. 0x7 
codeMode :: Word16 -> (NewMode, Bootmode)
codeMode w = (getName $ shiftR w 6 .&. 0x0001, 
              getName $ shiftR w 3 .&. 0x0007)

For each different configuration register word present in the device I have written a very similar set of lines (more examples at the bottom). Only the amount of shift and the number of bits in the AND mask change. I have a feeling that it should be somehow possible to eliminate this repetition that is annoying to type out and an easy source for hard to find bugs.

My first intuition was to add a type class "Bitfield" that each individual register word (or rather the datatype representing one) in the bitfile would be an instance of and that would allow me to only type out a representation of the structure of the word and from that I could somehow have default implementations for generating and parsing. I couldn't figure out how to bend the standard class system to do this, but is there some combination of type system extensions/generics/existentials/ghc-extras that would eventually allow me to replace those generating and parsing functions in the code with something like

class Bitfield t where
  representation :: something
  toBits :: t -> Int
  fromBits :: Int -> t
  toBits = something (using representation)
  fromBits = something (using representation)


instance Bitfield ModReg where
  representation = something

and afterwards simply have toBits and fromBits in my use? Somehow this looks almost exactly like the Ghc.Generics tutorial example with serialization of arbitrary datatypes to binary. Still I didn't quite manage to apply it to my case.

Below are some more examples of generate and parse functions for other registers to show the repetition I am talking about. In the actual full code there are many more. Also see how the bit positions and field lengths are embedded in the functions and repeated in each, doubling the number of opportunities to make mistakes that the compiler cannot catch.

--          1 1 1 1 1 1
--          5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
-- CTL_REG  -------------------------------
--          0 0 0 0 0 0 0 0 E 0 S S P I 0 G 
--                          M   B B E C   T
--                              1 0 R P   S

data EnMboot = DisMboot | EnMboot deriving (Show, Eq, Enum)
data Sbits = ReadWrite | IcapOnly | CrcOnly deriving (Show, Eq, Enum)
data Persist = NoPersist | Persist deriving (Show, Eq, Enum)
data Icap = NoIcap | Icap deriving (Show, Eq, Enum)
data GtsUserB = IoHighZ | IoActive deriving (Show, Eq, Enum)
ctlCode :: EnMboot -> Sbits -> Persist -> Icap -> GtsUserB -> Word16
ctlCode enmboot sbits persist icap gtsuserb = 
  (shiftL (getCode enmboot) 7) .|.
  (shiftL (getCode sbits) 4) .|.
  (shiftL (getCode persist) 3) .|.
  (shiftL (getCode icap) 2) .|.
  (getCode gtsuserb)
codeCtl :: Word16 -> (EnMboot,Sbits,Persist,Icap,GtsUserB)
codeCtl w = 
  (getName $ shiftR w 7 .&. 0x0001,
   getName $ shiftR w 4 .&. 0x0003,
   getName $ shiftR w 3 .&. 0x0001,
   getName $ shiftR w 2 .&. 0x0001,
   getName $ w .&. 0x0001)


--          1 1 1 1 1 1
--          5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
-- COR_REG1 -------------------------------
--          D 0 1 1 1 1 1 1 0 0 0 C D D S S 
--          A                     R P D C C      
--                                C     1 0   

data DriveAwake = OpenDrainAwake | DriveAwake deriving (Show, Eq, Enum)
data CrcBypass = CrcEnabled | CrcDisabled deriving (Show, Eq, Enum)
data DonePipe = NoDonePipe | DonePipe deriving (Show, Eq, Enum)
data DriveDone = OpenDrainDone | DriveDone  deriving (Show, Eq, Enum)
data SsClkSrc = Cclk | UserClk | JtagClk deriving (Show, Eq, Enum)
cor1Code :: DriveAwake -> CrcBypass -> DonePipe -> DriveDone -> 
            SsClkSrc -> Word16
cor1Code driveawake crcbypass donepipe drivedone ssclksrc =
  (shiftL (getCode driveawake) 15) .|.
  0x2F00 .|.
  (shiftL (getCode crcbypass) 4) .|.
  (shiftL (getCode donepipe) 3) .|.
  (shiftL (getCode drivedone) 2) .|.
  (getCode ssclksrc)
codeCor1 :: Word16 -> (DriveAwake,CrcBypass,DonePipe,DriveDone,SsClkSrc)
codeCor1 w = 
  (getName $ shiftR w 15 .&. 0x0001,
   getName $ shiftR w 4 .&. 0x0001,
   getName $ shiftR w 3 .&. 0x0001,
   getName $ shiftR w 2 .&. 0x0001,
   getName $ w .&. 0x0003)

Cirdec · Accepted Answer

We're going to make our own library for reading from and writing to things that are Bits. It will be structured much like the binary package or the example code for serializing generics. We won't be taking advantage of generics because there's too much extra information beyond that available from the type that we need in order to know how to read and write the values. We are going to read the data via a monadic reader that we will get for free from the Free monad.

{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE DeriveFunctor #-}

import Data.Word
import Data.Bits

import Control.Monad.Free
import Control.Applicative

import Data.Monoid

Reading

To get the monad for free, we need to define a base functor for reading from bits. GetF is our base functor. The only operation we perform is Geting (reading) a number of bits. This will give us some value that we know will have both Bits and Integral instances, from which we need to determine what to do next.

data GetF next = Get Int (forall b. (Bits b, Integral b) => b -> next)
    deriving (Functor)

type Get = Free GetF

We get all of the instances we need for Get for free.

Before continuing, we are going to adopt the convention of reading from and writing to the least significant bit end of the type. The next bit to read is always bit 0 and the last bit written was always bit 0.

To run a get computation, we need the following little interpreter. If we have a Pure result we return it. When we are instructed to Get bits to read we mask that number of the bits and run the function to figure out what to do next. We then run the resulting Get with that many bits shifted off the right.

runGet :: (Bits b, Integral b) => b -> Get a -> a
runGet bits (Pure a) = a
runGet bits (Free (Get l f)) = runGet (shiftR bits l) $ f (bits .&. oneBits l)

oneBits fills the specified number of least significant bits with 1s.

oneBits :: Bits b => Int -> b
oneBits n | n <= 0 = zeroBits
oneBits n          = let (q, r) = n `quotRem` 2
                         bq     = oneBits q
                     in bit 0 .|. shiftL (bq .|. shiftL bq q) r

Writing

When we put (write) bits, we need to provide the number of bits to write and the bits for any type that has both Bits and Integeral instances.

data Put = Put Int (forall b. (Bits b, Integral b) => b)

When we are paranoid building a Put, we'll mask the bits when we build it to make sure there aren't extra bits being put outside the length.

mkPut :: Int -> (forall b. (Bits b, Integral b) => b) -> Put
mkPut l bits = Put l (bits .&. oneBits l)

The only instance we need for Put is a Monoid so we can write one thing after another.

instance Monoid Put where
    mempty = Put 0 0
    Put l1 bits1 `mappend` Put l2 bits2 = Put (l1 + l2) (bits1 .|. shiftL bits2 l1)

Helpers

We're going to write a few helper functions for building Gets and Puts. Most of the data you are encoding or decoding are Enums of various bit lengths. getEnum will build the Get to go from Integral Bits to an Enum. It's essentially your getName wrapped up along with how many bits to get. putEnum wraps the bits to put up along with their length.

getEnum :: Enum e => Int -> Get e
getEnum l = Free (Get l (Pure . toEnum . fromIntegral))

putEnum :: Enum e => Int -> e -> Put
putEnum l x = mkPut l (fromIntegral . fromEnum $ x)

You also need to skip bits when reading some structures. getSkip skips bits without doing anything with them. putSkip puts the same number of 0 bits; putSkip1 puts the same number of 1 bits.

getSkip :: Int -> Get ()
getSkip l = Free (Get l (const (Pure ())))

putSkip :: Int -> Put
putSkip l = Put l 0

putSkip1 :: Int -> Put
putSkip1 l = Put l (oneBits l)

MOD_REG

At the very beginning, we chose to read from and write to the least significant bit. Due to this choice of convention, we're going to make the data types with the least significant field first. Here's ModReg, which represents a MOD_REG structure. The boot mode, which is stored in less significant bits, is the first field in the structure.

data ModReg = ModReg {bootmode :: Bootmode, newMode :: NewMode} deriving (Show, Eq)
data Bootmode = SerialM | SpiM | BpiUp | InternalM | ReservedMode 
                | Jtag  | ParallelS | SerialS 
                deriving (Show, Eq, Enum)
data NewMode = NoNewMode | NewMode deriving (Show, Eq, Enum)

I'm going to add a type class for things that can be written to or read from Bits, not because we want to use the type class, but simply so I don't need to come up with a name for all of these.

class Encodeable a where
    put :: a -> Put
    get :: Get a

We can now read and write the ModReg structure least significant bits first. The trick with the ModReg constructor in the second line is why I put the fields in least significant bit first order.

instance Encodeable ModReg where
    put mr =           putSkip1 3 <>  putEnum 3 (bootmode mr) <>  putEnum 1 (newMode mr)
    get    = ModReg <$ getSkip  3 <*> getEnum 3               <*> getEnum 1

Running Example

For a complete, running example it will be nice to be able to pretty-print the bits in a Bits. We'll print them with the most significant bit first.

import Data.List (intercalate)

showBitsN :: Bits b => Int -> b -> String
showBitsN n b = "[" ++ intercalate " " (map (\x -> if testBit b x then "1" else "0") [n,n-1..0]) ++ "]"

showBits :: FiniteBits b => b -> String
showBits b = showBitsN (finiteBitSize b) b

Our example will make a ModReg with the Jtag flag 1 0 1 in bits 3 through 5 and the NewMode flag 1 in bit 6. We'll convert it to a Word16 then convert it back again.

main = do 
    let mr = ModReg Jtag NewMode 
    print mr
    let x = runPut (put mr) :: Word16
    putStrLn $ showBits x
    let mr' = runGet x get :: ModReg
    print mr'

This results in the expected output

ModReg {bootmode = Jtag, newMode = NewMode}
111
[0 0 0 0 0 0 0 0 0 0 1 1 0 1 1 1 1]
ModReg {bootmode = Jtag, newMode = NewMode}

If we instead put two ModRegs in a row into a Word32 we'll get a little surprise.

main = do 
    let (mr1, mr2) = (ModReg Jtag NewMode, ModReg BpiUp NoNewMode)
    let x = runPut (put mr1 <> put mr2) :: Word32
    print x
    putStrLn $ showBits x
    let mr' = runGet x (get >>= \a -> get >>= \b -> return (a, b)) :: (ModReg, ModReg)
    print mr'

Instead of two Word16s next to each other, all of the set bits fit into less than half of the space.

3055
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 1 1 1 0 1 1 1 1]
(ModReg {bootmode = Jtag, newMode = NewMode},ModReg {bootmode = BpiUp, newMode = NoNewMode})

If we want to use our get and put for ModReg this way, we need to add the skips for the most significant bits.

instance Encodeable ModReg where
    put mr =           putSkip1 3 <>  putEnum 3 (bootmode mr) <>  putEnum 1 (newMode mr) <> putSkip 9
    get    = ModReg <$ getSkip  3 <*> getEnum 3               <*> getEnum 1              <* getSkip 9

Now a ModReg is written 16 bits wide.

1507439
[0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 1 0 0 0 0 0 0 0 0 0 1 1 0 1 1 1 1]
(ModReg {bootmode = Jtag, newMode = NewMode},ModReg {bootmode = BpiUp, newMode = NoNewMode})

Haskell bitfields and bit level protocols

Tags:

haskell

Antti Siponen

1 Answers

Reading

Writing

Helpers

MOD_REG

Running Example

Cirdec

Recent Activity

Donate For Us

Haskell bitfields and bit level protocols

Tags:

haskell

Antti Siponen

1 Answers

Reading

Writing

Helpers

MOD_REG

Running Example

Cirdec

Related questions

Recent Activity

Donate For Us