Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Monadic parsing of binary data in Haskell

Tags:

haskell

monads

I am pretty new to Haskell, and my first project is to parse captured WLAN packets. A common pattern in parsing such packet is that a header field will define the structure of the remaining bytes. As a generic example, a packet can be formatted like this:

header + [payload A | payload B | ..] 

where a flag field (can be a bitmap) in the header specifies what payload(s) are included in the packet. For a specific example of this format, please check out radiotap.

A similar thread suggests to just use a sequence of parse operations like this:

parseAll = do
    hdr <- parseHeader
    pa <- parsePayloadA
    pb <- parsePayloadB

However it seems cannot be applied in my case as the existence of payload A and B is defined by the header. In other word, the control flow of data parsing needs to follow a prior parsing result. I'd like to understand that if there is a generic way to parse binary data with this kind of pattern?

like image 560
liu3tao Avatar asked Mar 21 '16 03:03

liu3tao


2 Answers

Note that parseAll uses some kind of monadic parser library (as can be seen from the usage of the do notation and the binds). The power of monads is exactly that your choice of parsePayloadA and parsePayloadB can depend on hdr: you have the full power of Haskell to inspect hdr.

So basically you can do something like

parseAll = do
    hdr <- parseHeader
    payload <- case somethingInTheHdr hdr of
       ThisIsAnA -> do
         a <- parsePayloadA
         return (PayloadA a)
       ThisIsAB -> do
         b <- parsePayloadB
         return (PayloadB b)
    -- can use body here, e.g.
    return (Packet hdr payload)

You have this power specifically because in the type of monadic bind:

(>>=) :: m a -> (a -> m b) -> m b

the arrow in a -> m b is the real Haskell function arrow, giving you all the power you need.

like image 191
Cactus Avatar answered Nov 15 '22 10:11

Cactus


This is exactly the purpose of the Monad interface: to encode the dependency of a computation on the result of a preceding computation.

In your case it'll be something like the following:

parseAll = do
  shouldThePayloadABeParsed <- parseHeader
  if shouldThePayloadABeParsed
    then do
      pa <- parsePayloadA
      ...
    else do
      pb <- parsePayloadB
      ...

For your purposes I recommend reading the "Taste of State: Parsers are Easy" post to get an understanding of how this kind of parsing works in Haskell, and the "binary-parser" package to do the parsing.

like image 44
Nikita Volkov Avatar answered Nov 15 '22 09:11

Nikita Volkov