Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to serialize little-endian PODs painlessly?

Let's say I have the following type:

data WaveFormatChunk = WaveFormatChunk { 
    compression :: Word16,
    channels :: Word16,
    sampleRate :: Word32,
    averageBps :: Word32,
    blockAlign :: Word16,
    significantBits :: Word16
    } deriving (Show)

Is there a way to just dump all of that into a ByteString (or a similar structure) wholesale (in a manner of ye olde C structs)? If not and I have to write a function that separately puts all of them to a list, are there at least functions to make sticking the value into a, say, Word8 list easy? Something like putWordBBxe, except for byte strings or list (though I'm most likely severely mistaken since I haven't properly read into Monads yet, it seems to me that Get/Put are mostly used with streams).

Data.Binary isn't quite what I'm looking for, it seems more useful for just dumping data on disc than storing it in a specific format with a specific (and "wrong") endianness.

like image 316
jaymmer - Reinstate Monica Avatar asked Feb 17 '13 11:02

jaymmer - Reinstate Monica


2 Answers

Data.Binary will let you serialize the structure to a bytestring, using explicitly little-endian operators.

{-# OPTIONS_GHC -funbox-strict-fields #-}
{-# LANGUAGE RecordWildCards #-}

import Data.Binary
import Data.Binary.Put

import qualified Data.ByteString.Char8 as C
import qualified Data.ByteString.Lazy  as L

data WaveFormatChunk =
        WaveFormatChunk { 
            compression     :: !Word16,
            channels        :: !Word16,
            sampleRate      :: !Word32,
            averageBps      :: !Word32,
            blockAlign      :: !Word16,
            significantBits :: !Word16
        } 

instance Binary WaveFormatChunk where
    put (WaveFormatChunk{..}) = do
        putWord16le compression 
        putWord16le channels
        putWord32le sampleRate
        putWord32le averageBps
        putWord16le blockAlign
        putWord16le significantBits

    get = undefined

main = C.putStr $ C.concat $ L.toChunks $ encode test
  where
    test = WaveFormatChunk {
            compression     = 0xcafe
          , channels        = 0xface
          , sampleRate      = 0xdeadbeef
          , averageBps      = 0xf01dab1e
          , blockAlign      = 0x5566
          , significantBits = 0xb01d
          }

Will yield:

 $ ./A | od -x
 0000000 cafe face beef dead ab1e f01d 5566 b01d

So you have precise byte-level control of the representation. You can also get the same effect from the cereal package, if you're not interested in streaming.

like image 133
Don Stewart Avatar answered Nov 03 '22 06:11

Don Stewart


There is another, completely different approach. Instead of having such a structure you can define a ByteString wrapper:

import Data.ByteString (ByteString)

newtype WaveFormatChunk =
    WaveFormatChunk {
      getWaveFormatChunk :: ByteString
    }

Writing this to a file is straightforward. To modify such a structure you can use lenses:

data Compression = {- ... -}

compression :: Lens' WaveFormatChunk Compression

or if you prefer:

compression :: Lens' WaveFormatChunk Word16

The lenses act like safe interpreters of the individual byte groups. However, there are three problems: First of all you should employ a test framework for that one, because it's easy to get the lenses wrong. Secondly every change requires a new copy of the ByteString. Depending on what you do this can be slower or faster than your original approach.

My personal recommendation is to go with a regular high level Haskell datatype and use proper serialization. As pointed out by others the instances are pretty easy to write.

like image 38
ertes Avatar answered Nov 03 '22 05:11

ertes