I am working on generating and parsing an existing binary format (Xilinx FPGA bitfiles) using Haskell. At the moment my data structures and operations look like this:
getCode = fromIntegral.fromEnum
getName = toEnum.fromIntegral
-- 1 1 1 1 1 1
-- 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
-- MOD_REG -------------------------------
-- 0 0 0 0 0 0 0 0 0 N B B B 1 1 1
-- M M M M
-- 2 1 0
data NewMode = NoNewMode | NewMode deriving (Show, Eq, Enum)
data Bootmode = SerialM | SpiM | BpiUp | InternalM | ReservedMode
| Jtag | ParallelS | SerialS
deriving (Show, Eq, Enum)
modeCode :: NewMode -> Bootmode -> Word16
modeCode newmode bootmode = (shiftL (getCode newmode) 6) .|.
(shiftL (getCode bootmode) 3) .|. 0x7
codeMode :: Word16 -> (NewMode, Bootmode)
codeMode w = (getName $ shiftR w 6 .&. 0x0001,
getName $ shiftR w 3 .&. 0x0007)
For each different configuration register word present in the device I have written a very similar set of lines (more examples at the bottom). Only the amount of shift and the number of bits in the AND mask change. I have a feeling that it should be somehow possible to eliminate this repetition that is annoying to type out and an easy source for hard to find bugs.
My first intuition was to add a type class "Bitfield" that each individual register word (or rather the datatype representing one) in the bitfile would be an instance of and that would allow me to only type out a representation of the structure of the word and from that I could somehow have default implementations for generating and parsing. I couldn't figure out how to bend the standard class system to do this, but is there some combination of type system extensions/generics/existentials/ghc-extras that would eventually allow me to replace those generating and parsing functions in the code with something like
class Bitfield t where
representation :: something
toBits :: t -> Int
fromBits :: Int -> t
toBits = something (using representation)
fromBits = something (using representation)
instance Bitfield ModReg where
representation = something
and afterwards simply have toBits and fromBits in my use? Somehow this looks almost exactly like the Ghc.Generics tutorial example with serialization of arbitrary datatypes to binary. Still I didn't quite manage to apply it to my case.
Below are some more examples of generate and parse functions for other registers to show the repetition I am talking about. In the actual full code there are many more. Also see how the bit positions and field lengths are embedded in the functions and repeated in each, doubling the number of opportunities to make mistakes that the compiler cannot catch.
-- 1 1 1 1 1 1
-- 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
-- CTL_REG -------------------------------
-- 0 0 0 0 0 0 0 0 E 0 S S P I 0 G
-- M B B E C T
-- 1 0 R P S
data EnMboot = DisMboot | EnMboot deriving (Show, Eq, Enum)
data Sbits = ReadWrite | IcapOnly | CrcOnly deriving (Show, Eq, Enum)
data Persist = NoPersist | Persist deriving (Show, Eq, Enum)
data Icap = NoIcap | Icap deriving (Show, Eq, Enum)
data GtsUserB = IoHighZ | IoActive deriving (Show, Eq, Enum)
ctlCode :: EnMboot -> Sbits -> Persist -> Icap -> GtsUserB -> Word16
ctlCode enmboot sbits persist icap gtsuserb =
(shiftL (getCode enmboot) 7) .|.
(shiftL (getCode sbits) 4) .|.
(shiftL (getCode persist) 3) .|.
(shiftL (getCode icap) 2) .|.
(getCode gtsuserb)
codeCtl :: Word16 -> (EnMboot,Sbits,Persist,Icap,GtsUserB)
codeCtl w =
(getName $ shiftR w 7 .&. 0x0001,
getName $ shiftR w 4 .&. 0x0003,
getName $ shiftR w 3 .&. 0x0001,
getName $ shiftR w 2 .&. 0x0001,
getName $ w .&. 0x0001)
-- 1 1 1 1 1 1
-- 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
-- COR_REG1 -------------------------------
-- D 0 1 1 1 1 1 1 0 0 0 C D D S S
-- A R P D C C
-- C 1 0
data DriveAwake = OpenDrainAwake | DriveAwake deriving (Show, Eq, Enum)
data CrcBypass = CrcEnabled | CrcDisabled deriving (Show, Eq, Enum)
data DonePipe = NoDonePipe | DonePipe deriving (Show, Eq, Enum)
data DriveDone = OpenDrainDone | DriveDone deriving (Show, Eq, Enum)
data SsClkSrc = Cclk | UserClk | JtagClk deriving (Show, Eq, Enum)
cor1Code :: DriveAwake -> CrcBypass -> DonePipe -> DriveDone ->
SsClkSrc -> Word16
cor1Code driveawake crcbypass donepipe drivedone ssclksrc =
(shiftL (getCode driveawake) 15) .|.
0x2F00 .|.
(shiftL (getCode crcbypass) 4) .|.
(shiftL (getCode donepipe) 3) .|.
(shiftL (getCode drivedone) 2) .|.
(getCode ssclksrc)
codeCor1 :: Word16 -> (DriveAwake,CrcBypass,DonePipe,DriveDone,SsClkSrc)
codeCor1 w =
(getName $ shiftR w 15 .&. 0x0001,
getName $ shiftR w 4 .&. 0x0001,
getName $ shiftR w 3 .&. 0x0001,
getName $ shiftR w 2 .&. 0x0001,
getName $ w .&. 0x0003)
We're going to make our own library for reading from and writing to things that are Bits
. It will be structured much like the binary package or the example code for serializing generics. We won't be taking advantage of generics because there's too much extra information beyond that available from the type that we need in order to know how to read and write the values. We are going to read the data via a monadic reader that we will get for free from the Free
monad.
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE DeriveFunctor #-}
import Data.Word
import Data.Bits
import Control.Monad.Free
import Control.Applicative
import Data.Monoid
To get the monad for free, we need to define a base functor for reading from bits. GetF
is our base functor. The only operation we perform is Get
ing (reading) a number of bits. This will give us some value that we know will have both Bits
and Integral
instances, from which we need to determine what to do next.
data GetF next = Get Int (forall b. (Bits b, Integral b) => b -> next)
deriving (Functor)
type Get = Free GetF
We get all of the instances we need for Get
for free.
Before continuing, we are going to adopt the convention of reading from and writing to the least significant bit end of the type. The next bit to read is always bit 0
and the last bit written was always bit 0
.
To run a get computation, we need the following little interpreter. If we have a Pure
result we return it. When we are instructed to Get
bits to read we mask that number of the bits and run the function to figure out what to do next. We then run the resulting Get
with that many bits shifted off the right.
runGet :: (Bits b, Integral b) => b -> Get a -> a
runGet bits (Pure a) = a
runGet bits (Free (Get l f)) = runGet (shiftR bits l) $ f (bits .&. oneBits l)
oneBits
fills the specified number of least significant bits with 1
s.
oneBits :: Bits b => Int -> b
oneBits n | n <= 0 = zeroBits
oneBits n = let (q, r) = n `quotRem` 2
bq = oneBits q
in bit 0 .|. shiftL (bq .|. shiftL bq q) r
When we put (write) bits, we need to provide the number of bits to write and the bits for any type that has both Bits
and Integeral
instances.
data Put = Put Int (forall b. (Bits b, Integral b) => b)
When we are paranoid building a Put
, we'll mask the bits when we build it to make sure there aren't extra bits being put outside the length.
mkPut :: Int -> (forall b. (Bits b, Integral b) => b) -> Put
mkPut l bits = Put l (bits .&. oneBits l)
The only instance we need for Put
is a Monoid
so we can write one thing after another.
instance Monoid Put where
mempty = Put 0 0
Put l1 bits1 `mappend` Put l2 bits2 = Put (l1 + l2) (bits1 .|. shiftL bits2 l1)
We're going to write a few helper functions for building Get
s and Put
s. Most of the data you are encoding or decoding are Enum
s of various bit lengths. getEnum
will build the Get
to go from Integral
Bits
to an Enum
. It's essentially your getName
wrapped up along with how many bits to get. putEnum
wraps the bits to put up along with their length.
getEnum :: Enum e => Int -> Get e
getEnum l = Free (Get l (Pure . toEnum . fromIntegral))
putEnum :: Enum e => Int -> e -> Put
putEnum l x = mkPut l (fromIntegral . fromEnum $ x)
You also need to skip bits when reading some structures. getSkip
skips bits without doing anything with them. putSkip
puts the same number of 0
bits; putSkip1
puts the same number of 1
bits.
getSkip :: Int -> Get ()
getSkip l = Free (Get l (const (Pure ())))
putSkip :: Int -> Put
putSkip l = Put l 0
putSkip1 :: Int -> Put
putSkip1 l = Put l (oneBits l)
At the very beginning, we chose to read from and write to the least significant bit. Due to this choice of convention, we're going to make the data types with the least significant field first. Here's ModReg
, which represents a MOD_REG
structure. The boot mode, which is stored in less significant bits, is the first field in the structure.
data ModReg = ModReg {bootmode :: Bootmode, newMode :: NewMode} deriving (Show, Eq)
data Bootmode = SerialM | SpiM | BpiUp | InternalM | ReservedMode
| Jtag | ParallelS | SerialS
deriving (Show, Eq, Enum)
data NewMode = NoNewMode | NewMode deriving (Show, Eq, Enum)
I'm going to add a type class for things that can be written to or read from Bits
, not because we want to use the type class, but simply so I don't need to come up with a name for all of these.
class Encodeable a where
put :: a -> Put
get :: Get a
We can now read and write the ModReg
structure least significant bits first. The trick with the ModReg
constructor in the second line is why I put the fields in least significant bit first order.
instance Encodeable ModReg where
put mr = putSkip1 3 <> putEnum 3 (bootmode mr) <> putEnum 1 (newMode mr)
get = ModReg <$ getSkip 3 <*> getEnum 3 <*> getEnum 1
For a complete, running example it will be nice to be able to pretty-print the bits in a Bits
. We'll print them with the most significant bit first.
import Data.List (intercalate)
showBitsN :: Bits b => Int -> b -> String
showBitsN n b = "[" ++ intercalate " " (map (\x -> if testBit b x then "1" else "0") [n,n-1..0]) ++ "]"
showBits :: FiniteBits b => b -> String
showBits b = showBitsN (finiteBitSize b) b
Our example will make a ModReg
with the Jtag
flag 1 0 1
in bits 3 through 5 and the NewMode
flag 1
in bit 6. We'll convert it to a Word16
then convert it back again.
main = do
let mr = ModReg Jtag NewMode
print mr
let x = runPut (put mr) :: Word16
putStrLn $ showBits x
let mr' = runGet x get :: ModReg
print mr'
This results in the expected output
ModReg {bootmode = Jtag, newMode = NewMode}
111
[0 0 0 0 0 0 0 0 0 0 1 1 0 1 1 1 1]
ModReg {bootmode = Jtag, newMode = NewMode}
If we instead put two ModReg
s in a row into a Word32
we'll get a little surprise.
main = do
let (mr1, mr2) = (ModReg Jtag NewMode, ModReg BpiUp NoNewMode)
let x = runPut (put mr1 <> put mr2) :: Word32
print x
putStrLn $ showBits x
let mr' = runGet x (get >>= \a -> get >>= \b -> return (a, b)) :: (ModReg, ModReg)
print mr'
Instead of two Word16
s next to each other, all of the set bits fit into less than half of the space.
3055
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 1 1 1 0 1 1 1 1]
(ModReg {bootmode = Jtag, newMode = NewMode},ModReg {bootmode = BpiUp, newMode = NoNewMode})
If we want to use our get
and put
for ModReg
this way, we need to add the skip
s for the most significant bits.
instance Encodeable ModReg where
put mr = putSkip1 3 <> putEnum 3 (bootmode mr) <> putEnum 1 (newMode mr) <> putSkip 9
get = ModReg <$ getSkip 3 <*> getEnum 3 <*> getEnum 1 <* getSkip 9
Now a ModReg
is written 16 bits wide.
1507439
[0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 1 0 0 0 0 0 0 0 0 0 1 1 0 1 1 1 1]
(ModReg {bootmode = Jtag, newMode = NewMode},ModReg {bootmode = BpiUp, newMode = NoNewMode})
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With