What is causing my Wavefront Obj parser to consume 10x more memory than the input?

Question

I'm trying to write a parser for the Wavefront .obj file format, which is a really dumb line based format. Hopefully that wikipedia article should summarise how it works, but essentially you have lines that record entries in vertex, normal and other arrays. Finally, a face definition is a triple (or more) of indices into these separate arrays.

My parser is

{-# LANGUAGE BangPatterns #-}
{-# LANGUAGE OverloadedStrings #-}
module WavefrontObj where

import Control.Applicative
import Data.Attoparsec.Text
import Data.Foldable (toList)
import Data.List (foldl')
import Data.List.Split
import Data.Monoid
import Data.NonEmpty ((!:))
import Data.Sequence (Seq)
import Geometry
import Graphics.GL
import Linear
import Linear.Affine
import qualified Data.Sequence as Seq

data Obj =
  Obj {objVertices :: !(Seq (V3 Double))
      ,objNormals :: !(Seq (V3 Double))
      ,objFaces :: !(Seq (V2 (V3 Int)))}
  deriving (Show)

instance Monoid Obj where
  mempty = Obj mempty mempty mempty
  Obj a b c `mappend` Obj x y z =
    Obj (a <> x)
        (b <> y)
        (c <> z)

parseLine :: Parser Obj
parseLine =
  vertex <|> normal <|> face <|>
  (mempty <$ skipWhile (not . isEndOfLine))
  where vertex =
          do string "v"
             skipSpace
             v <- v3
             return $!
               Obj (Seq.singleton v) mempty mempty
        normal =
          do string "vn"
             skipSpace
             v <- v3
             return $!
               Obj mempty (Seq.singleton v) mempty
        face =
          do string "f"
             skipSpace
             let v =
                   (,) <$> decimal <* char '/' <* decimal <* char '/' <*> decimal
             (v1,n1) <- v
             skipSpace
             (v2,n2) <- v
             skipSpace
             (v3,n3) <- v
             mv4 <-
               optional (do skipSpace
                            v)
             return $!
               Obj mempty
                   mempty
                   (Seq.singleton
                      (V2 (V3 v1 v2 v3)
                          (V3 n1 n2 n3)) <>
                    case mv4 of
                      Just (v4,n4) ->
                        Seq.singleton
                          (V2 (V3 v1 v3 v4)
                              (V3 n1 n3 n4))
                      Nothing -> mempty)
        v3 =
          do x <- double
             skipSpace
             y <- double
             skipSpace
             z <- double
             return $! V3 x y z

parseObj :: Parser Obj
parseObj = go mempty
  where go !acc =
          do !l <- parseLine
             acc' <- return $! acc <> l
             endOfLine *> go acc' <|> acc' <$ endOfInput

Running this with a 36MB obj file parses sucessfully, but

  21,342,866,200 bytes allocated in the heap
   1,263,590,520 bytes copied during GC
     290,617,624 bytes maximum residency (10 sample(s))
      56,958,112 bytes maximum slop
             547 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0     41177 colls,     0 par    0.711s   0.708s     0.0000s    0.0008s
  Gen  1        10 colls,     0 par    0.241s   0.241s     0.0241s    0.1669s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time    5.071s  (  5.077s elapsed)
  GC      time    0.952s  (  0.949s elapsed)
  RP      time    0.000s  (  0.000s elapsed)
  PROF    time    0.000s  (  0.000s elapsed)
  EXIT    time    0.020s  (  0.020s elapsed)
  Total   time    6.055s  (  6.046s elapsed)

  %GC     time      15.7%  (15.7% elapsed)

  Alloc rate    4,208,709,362 bytes per MUT second

  Productivity  84.3% of total user, 84.4% of total elapsed

While the productivity is good, 6 seconds to open an .obj file with 547MB total memory usage seems excessive. I've uploaded a heap profile here. The .prof file is

    Fri Jun  5 22:36 2015 Time and Allocation Profiling Report  (Final)

       Deferred +RTS -p -RTS

    total time  =        5.17 secs   (5173 ticks @ 1000 us, 1 processor)
    total alloc = 13,142,553,520 bytes  (excludes profiling overheads)

COST CENTRE      MODULE                %time %alloc

parseLine.v3     WavefrontObj           69.8   69.5
parseLine.face.v WavefrontObj            8.4   13.7
parseLine.face   WavefrontObj            7.0    6.1
timeLog          Deferred                3.3    1.2
parseLine        WavefrontObj            2.8    1.7
parseLine.normal WavefrontObj            2.5    3.6
readTextDevice   Data.Text.Internal.IO   2.2    0.1
parseLine.vertex WavefrontObj            1.7    2.6
mappend          WavefrontObj            1.4    1.1


                                                                           individual     inherited
COST CENTRE              MODULE                          no.     entries  %time %alloc   %time %alloc

MAIN                     MAIN                             85           0    0.0    0.0   100.0  100.0
 main                    Deferred                        171           0    0.0    0.0   100.0  100.0
  timeLog                Deferred                        173           0    3.3    1.2   100.0  100.0
   parseObj              WavefrontObj                    178           0    0.0    0.0    94.4   98.7
    parseObj.go          WavefrontObj                    179     1080806    0.6    0.5    94.4   98.7
     parseLine           WavefrontObj                    181           0    2.8    1.7    93.8   98.2
      parseLine.v3       WavefrontObj                    190           0   69.7   69.5    70.8   70.3
       mappend           WavefrontObj                    191      667027    1.0    0.8     1.0    0.8
      parseLine.face     WavefrontObj                    187           0    7.0    6.1    16.9   21.3
       parseLine.face.v  WavefrontObj                    195           0    8.4   13.7     9.7   15.2
        parseLine.normal WavefrontObj                    196           0    0.9    1.3     1.2    1.5
         mappend         WavefrontObj                    197      127769    0.3    0.3     0.3    0.3
       parseLine.normal  WavefrontObj                    193           0    0.0    0.0     0.0    0.0
       mappend           WavefrontObj                    188      286011    0.1    0.0     0.1    0.0
      parseLine.normal   WavefrontObj                    185           0    1.5    2.3     1.6    2.3
       parseLine.v3      WavefrontObj                    192           0    0.1    0.0     0.1    0.0
      parseLine.vertex   WavefrontObj                    183           0    1.7    2.6     1.7    2.6
   readTextDevice        Data.Text.Internal.IO           174       18260    2.2    0.1     2.2    0.1
 CAF                     Deferred                        169           0    0.0    0.0     0.0    0.0
  main                   Deferred                        170           1    0.0    0.0     0.0    0.0
   timeLog               Deferred                        172           1    0.0    0.0     0.0    0.0
 CAF                     WavefrontObj                    166           0    0.0    0.0     0.0    0.0
  parseLine              WavefrontObj                    180           1    0.0    0.0     0.0    0.0
   parseLine.v3          WavefrontObj                    189           1    0.0    0.0     0.0    0.0
   parseLine.face        WavefrontObj                    186           1    0.0    0.0     0.0    0.0
    parseLine.face.v     WavefrontObj                    194           1    0.0    0.0     0.0    0.0
   parseLine.normal      WavefrontObj                    184           1    0.0    0.0     0.0    0.0
   parseLine.vertex      WavefrontObj                    182           1    0.0    0.0     0.0    0.0
  mempty                 WavefrontObj                    176           1    0.0    0.0     0.0    0.0
  parseObj               WavefrontObj                    175           1    0.0    0.0     0.0    0.0
   parseObj.go           WavefrontObj                    177           1    0.0    0.0     0.0    0.0
 CAF                     Data.Attoparsec.Text.Internal   153           0    0.0    0.0     0.0    0.0
 CAF                     Data.Scientific                 152           0    0.0    0.0     0.0    0.0
 CAF                     Data.Text.Array                 150           0    0.0    0.0     0.0    0.0
 CAF                     Data.Text.Internal              148           0    0.0    0.0     0.0    0.0
 CAF                     GHC.Err                         135           0    0.0    0.0     0.0    0.0
 CAF                     GHC.IO.Handle.FD                132           0    0.0    0.0     0.0    0.0
 CAF                     GHC.IO.Handle.Internals         131           0    0.0    0.0     0.0    0.0
 CAF                     GHC.Conc.Signal                 125           0    0.0    0.0     0.0    0.0
 CAF                     GHC.IO.Encoding                 121           0    0.0    0.0     0.0    0.0
 CAF                     GHC.IO.FD                       120           0    0.0    0.0     0.0    0.0
 CAF                     GHC.Conc.Sync                   108           0    0.0    0.0     0.0    0.0
 CAF                     GHC.IO.Encoding.Iconv           106           0    0.0    0.0     0.0    0.0
 CAF                     GHC.Integer.Type                 92           0    0.0    0.0     0.0    0.0

I have a guess that seeing as my hotspot is the v3 parser, this might just be something to do with using the double parser.

John F. Miller · Accepted Answer

I believe your problem is in the definition of Obj:

data Obj =
  Obj {objVertices :: !(Seq (V3 Double))
      ,objNormals :: !(Seq (V3 Double))
      ,objFaces :: !(Seq (V2 (V3 Int)))}
  deriving (Show)

Within your parser you only ever use one of the three fields, but you evaluate to WHNF and allocate space for an empty sequence for each Obj This will almost triple the space of an Obj in memory compared to its size in a flat text file. You also always have a Seq.singleton instance (also evaluated to WHNF) for every element. You are paying a price in time and memory for this.

You may say, "But all my time is spent in v3," and you would be correct. However all that time is spent in the allocation of memory, which (I think?) includes the cost of running the garbage collector. Statistically you are most likely to catch GC cycles where you do the most allocation.

My suggestions:

See what happens if you take the strictness out of Obj it may not work as a complete solution, but it will show you if that is the problem for the cost of deleting three !. If I'm wrong you've lost very little.
Convert Obj to a sum type instead of a product type.
Use a more conventional parser. Parsec is awesome when it comes to clean express ability and the complexity of the grammars it can handle, but is just not in the same class perfromance-wise as other more limited parsing techniques.

Possible Sum Type for Obj:

data Obj =
    Empty 
  | Vertex (V3 Double)
  | Normal (V3 Double)
  | Face   (V2 (V3 Int))
  | Obj {objVertices :: !(Seq (V3 Double))
        ,objNormals  :: !(Seq (V3 Double))
        ,objFaces    :: !(Seq (V2 (V3 Int)))}

Instance Monoid Obj where
  mempty = Empty
  mappend Empty x = x
  mappend x Empty = x
  mappend (Face v) (obj@Obj{objFaces = vs}) = obj{objFaces = v<|vs}
  mappend (obj@Obj{objFaces = vs}) (Face v) = obj{objFaces = vs |> v}
  ...

bartavelle · Answer

In my experience, the answer to line based formats parsing is to use BS.lines and BS.words from Data.ByteString.Char8. It isn't really pretty, but no parser combinator approach while be quite as fast or memory efficient. Something like :

parseLine :: BS.ByteString -> [Either Xxx Obj]
parseLine = map prs . BS.lines

prs :: BS.ByteString -> Either Xxx Obj
prs l = case BS.words l of
         ["v", x, y, z] -> do
             v <- V3 <$> parseDouble x <*> parseDouble y <*> parseDouble z
             return $ Obj (Seq.singleton v) mempty mempty
         ...
         _ -> Left "blah"

That's for performance. For memory usage you'll probably want to use primitive vectors, and unpacked data types generally. It doesn't seem to be the case in your example, but you also need to check how the libraries you are using implement their data types. For example UTCTime from the time package uses a lot of memory.

Final tip: I usually parameter my data types with the type of "string" that they use. My parser functions return Foo ByteString, and I convert to Foo Text the subset that I will keep in memory and manipulate.

What is causing my Wavefront Obj parser to consume 10x more memory than the input?

Tags:

performance

memory-leaks

parsing

haskell

ocharles

2 Answers

John F. Miller

bartavelle

Recent Activity

Donate For Us

What is causing my Wavefront Obj parser to consume 10x more memory than the input?

Tags:

performance

memory-leaks

parsing

haskell

ocharles

2 Answers

John F. Miller

bartavelle

Related questions

Recent Activity

Donate For Us