ByteStrings in Haskell: should I use Put or Builder?

I'm confused about what the Put monad in Data.Binary offers over using Builder directly. I read the Binary Generation section of Dealing with Binary data, and it seems to assume that you should use Put, but it's pretty short and doesn't explain why.

Data.Binary.Put

The Put monad. A monad for efficiently constructing lazy bytestrings.

type Put = PutM ()

Put merely lifts Builder into a Writer monad, applied to ().

Data.Binary.Builder

Efficient construction of lazy byte strings.


What is the point of a Writer monad applied to ()?

I can see that Put is (a type synonym for) a monad whereas Builder is not, but I don't really get why Put would be needed.

In my case, I'm rendering a 3D scene and writing each pixel as 3 bytes, and then adding the PPM format's header to the beginning (I will use PNG later).

Binary seems to be meant for types that can be serialized to and deserialized from binary data. This isn't exactly what I'm doing, but it felt natural to instantiate Binary for my colour type:

instance (Binary a) => Binary (Colour a) where
    put (Colour r g b) = put r >> put g >> put b
    get = Colour <$> get <*> get <*> get

This makes it easy to put a Colour Word8 into 24 bits. But then I also have to tack on the header, and I'm not sure how I should do that.

Is Builder meant to be hidden behind the scenes, or does it depend? Is the Binary class only for (de)serializing data, or for all binary generation purposes?

mk12 asked Jul 16 '12 20:07


3 Answers

First of all note the conceptual difference. Builders are for efficient building of bytestring streams, while the PutM monad is really for serialization. So the first question you should ask yourself is whether you are actually serializing (to answer that ask yourself whether there is a meaningful and exact opposite operation – deserialization).

In general I would go with Builder for the convenience it provides: not the Builder from the binary package, however, but the one from the blaze-builder package. It's a monoid, it comes with many predefined string generators, and it is very composable. It is also very fast and can be fine-tuned.
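A minimal sketch of the monoidal Builder style applied to the asker's PPM case. It swaps in Data.ByteString.Builder from the bytestring package rather than blaze-builder, since the style is the same. The RGB type and the names pixel, ppmHeader and ppmImage are hypothetical, and the header assumes the binary "P6" variant of PPM with a maximum channel value of 255.

```haskell
import Data.ByteString.Builder (Builder, char7, intDec, string7, toLazyByteString, word8)
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word8)

-- Hypothetical pixel type standing in for the asker's Colour Word8.
data RGB = RGB Word8 Word8 Word8

-- One pixel is three raw bytes, joined with the Monoid/Semigroup operator.
pixel :: RGB -> Builder
pixel (RGB r g b) = word8 r <> word8 g <> word8 b

-- Assumed P6 header: magic number, width, height, maximum channel value.
ppmHeader :: Int -> Int -> Builder
ppmHeader w h =
  string7 "P6\n" <> intDec w <> char7 ' ' <> intDec h <> string7 "\n255\n"

-- Whole image: header followed by every pixel in row-major order.
ppmImage :: Int -> Int -> [RGB] -> BL.ByteString
ppmImage w h pixels = toLazyByteString (ppmHeader w h <> foldMap pixel pixels)
```

Because everything is a Builder, the header is just one more value to append, and foldMap handles the pixel list without any explicit plumbing.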

Last but not least, if you really want speed, convenience and elegant code, you will want to combine this with one of the various stream-processing libraries around, such as conduit, enumerator or pipes.

ertes answered Nov 07 '22 19:11


I can see that Put is a monad whereas Builder is not, but I don't really get why Put would be needed.

To be precise, PutM is the Monad. It's needed for convenience, and to give you fewer opportunities for errors. Writing code in monadic or applicative style is often much more convenient than carrying all the temporaries around explicitly, and with the plumbing done in the Monad instance, you can't accidentally use the wrong Builder in the middle of your function.

You can do everything you do with PutM using only Builder, but usually it's more work to write the code.
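To make that equivalence concrete, here is a small sketch that writes the same bytes both ways. The Chunk type and the function names are hypothetical; runPut, execPut, singleton and the two putWord16be functions come from the binary package.

```haskell
import Data.Binary.Builder (Builder, singleton, toLazyByteString)
import qualified Data.Binary.Builder as B
import Data.Binary.Put (Put, execPut, putWord16be, putWord8, runPut)
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word16, Word8)

-- Hypothetical record: a one-byte tag followed by a big-endian 16-bit length.
data Chunk = Chunk Word8 Word16

-- Monadic style: sequencing in do-notation appends the output.
putChunk :: Chunk -> Put
putChunk (Chunk tag len) = do
  putWord8 tag
  putWord16be len

-- Monoidal style: the same bytes, appended explicitly.
buildChunk :: Chunk -> Builder
buildChunk (Chunk tag len) = singleton tag <> B.putWord16be len

-- Three routes to a lazy ByteString; execPut exposes the Builder inside a Put.
viaPut, viaBuilder, viaExecPut :: Chunk -> BL.ByteString
viaPut = runPut . putChunk
viaBuilder = toLazyByteString . buildChunk
viaExecPut = toLazyByteString . execPut . putChunk
```

All three functions should produce identical output for the same Chunk, which is one way to see that Put adds syntax rather than capability.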

But then I also have to tack on the header, and I'm not sure how I should do that.

I don't know the PPM format, so I have no idea how to construct the header. But after constructing it, you can simply use putByteString or putLazyByteString to tack it on.
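As a hedged sketch of that suggestion, assuming the binary "P6" variant of PPM (a text header with magic number, width, height and maximum channel value, followed by raw RGB bytes). The Colour definition is an assumption, since the question only shows its Binary instance, and putPpmHeader and renderPpm are hypothetical names.

```haskell
import Data.Binary (Binary (..))
import Data.Binary.Put (Put, putByteString, runPut)
import qualified Data.ByteString.Char8 as BC
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word8)

-- Assumed shape of the asker's type; the question does not show its definition.
data Colour a = Colour a a a

-- Instance copied from the question.
instance Binary a => Binary (Colour a) where
  put (Colour r g b) = put r >> put g >> put b
  get = Colour <$> get <*> get <*> get

-- Assumed P6 header as ASCII text, with a maximum channel value of 255.
putPpmHeader :: Int -> Int -> Put
putPpmHeader w h =
  putByteString (BC.pack ("P6\n" ++ show w ++ " " ++ show h ++ "\n255\n"))

-- Header first, then each pixel through the Binary instance.
renderPpm :: Int -> Int -> [Colour Word8] -> BL.ByteString
renderPpm w h pixels = runPut (putPpmHeader w h >> mapM_ put pixels)
```

Note the use of mapM_ put rather than put on the whole list: the Binary instance for lists writes a length prefix first, which would corrupt the image data.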

Daniel Fischer answered Nov 07 '22 18:11


I'm not sure to what extent this is accurate, but my understanding has always been that the presentation of Put as you see it is largely an abuse of do-notation so that you can write code like this:

putThing :: Thing -> Put
putThing (Thing thing1 thing2) = do
  putThing1 thing1
  putThing2 thing2

We are not using the "essence" of Monad (in particular, we never bind the result of anything) but we gain a convenient and clean syntax for concatenation. However, the aesthetic advantages over the purely monoidal alternative:

putThing :: Thing -> Builder
putThing (Thing thing1 thing2) = mconcat [
  putThing1 thing1,
  putThing2 thing2]

are fairly minimal, in my view.

(Note that Get, by contrast, genuinely is a Monad and benefits from being so in clear ways.)
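A short sketch of why Get needs bind: a later read can depend on a value parsed earlier. The length-prefixed format and the names getPayload and decodePayload are hypothetical; getWord8, getByteString and runGet are from Data.Binary.Get.

```haskell
import Data.Binary.Get (Get, getByteString, getWord8, runGet)
import qualified Data.ByteString as BS
import qualified Data.ByteString.Lazy as BL

-- Hypothetical wire format: one length byte, then that many payload bytes.
getPayload :: Get BS.ByteString
getPayload = do
  len <- getWord8                   -- the parsed result is bound here...
  getByteString (fromIntegral len)  -- ...and decides how much to read next

-- Run the parser on a lazy ByteString; unconsumed trailing input is ignored.
decodePayload :: BL.ByteString -> BS.ByteString
decodePayload = runGet getPayload
```

Nothing comparable happens on the Put side, where each action returns () and there is no result for a later step to depend on.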

Ben Millwood answered Nov 07 '22 18:11