Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Poor performance parsing binary file in haskell

I have a set of binary records packed into a file and I am reading them using Data.ByteString.Lazy and Data.Binary.Get. With my current implementation an 8Mb file takes 6 seconds to parse.

import qualified Data.ByteString.Lazy as BL
import Data.Binary.Get

data Trade = Trade { timestamp :: Int, price :: Int ,  qty :: Int } deriving (Show)

getTrades = do
  empty <- isEmpty
  if empty
    then return []
    else do
      timestamp <- getWord32le          
      price <- getWord32le
      qty <- getWord16le          
      rest <- getTrades
      let trade = Trade (fromIntegral timestamp) (fromIntegral price) (fromIntegral qty)
      return (trade : rest)

main :: IO()
main = do
  input <- BL.readFile "trades.bin" 
  let trades = runGet getTrades input
  print $ length trades

What can I do to make this faster?

like image 515
Dave Anderson Avatar asked Mar 05 '12 12:03

Dave Anderson


1 Answers

Refactoring it slightly (basically a left-fold) gives much better performance and lowers GC overhead quite a bit parsing a 8388600 byte file.

{-# LANGUAGE BangPatterns #-}
module Main (main) where

import qualified Data.ByteString.Lazy as BL
import Data.Binary.Get

data Trade = Trade
  { timestamp :: {-# UNPACK #-} !Int
  , price     :: {-# UNPACK #-} !Int 
  , qty       :: {-# UNPACK #-} !Int
  } deriving (Show)

getTrade :: Get Trade
getTrade = do
  timestamp <- getWord32le
  price     <- getWord32le
  qty       <- getWord16le
  return $! Trade (fromIntegral timestamp) (fromIntegral price) (fromIntegral qty)

countTrades :: BL.ByteString -> Int
countTrades input = stepper (0, input) where
  stepper (!count, !buffer)
    | BL.null buffer = count
    | otherwise      =
        let (trade, rest, _) = runGetState getTrade buffer 0
        in stepper (count+1, rest)

main :: IO()
main = do
  input <- BL.readFile "trades.bin"
  let trades = countTrades input
  print trades

And the related runtime stats. Even though the allocation numbers are close, the GC and max heap size are quite a bit different between revisions.

All examples here were built with GHC 7.4.1 -O2.

The original source, run with +RTS -K1G -RTS due to excessive stack space usage:

     426,003,680 bytes allocated in the heap
     443,141,672 bytes copied during GC
      99,305,920 bytes maximum residency (9 sample(s))
             203 MB total memory in use (0 MB lost due to fragmentation)

  Total   time    0.62s  (  0.81s elapsed)

  %GC     time      83.3%  (86.4% elapsed)

Daniel's revision:

     357,851,536 bytes allocated in the heap
     220,009,088 bytes copied during GC
      40,846,168 bytes maximum residency (8 sample(s))
              85 MB total memory in use (0 MB lost due to fragmentation)

  Total   time    0.24s  (  0.28s elapsed)

  %GC     time      69.1%  (71.4% elapsed)

And this post:

     290,725,952 bytes allocated in the heap
         109,592 bytes copied during GC
          78,704 bytes maximum residency (10 sample(s))
               2 MB total memory in use (0 MB lost due to fragmentation)

  Total   time    0.06s  (  0.07s elapsed)

  %GC     time       5.0%  (6.0% elapsed)
like image 74
Nathan Howell Avatar answered Sep 17 '22 12:09

Nathan Howell