Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do I avoid memory problems when writing to file using the Writer monad?

Tags:

haskell

I am building some moderately large DIMACS files, however with the method used below the memory usage is rather large compared to the size of the files generated, and on some of the larger files I need to generate I run in to out of memory problems.

import Control.Monad.State.Strict
import Control.Monad.Writer.Strict
import qualified Data.ByteString.Lazy.Char8 as B
import Control.Monad
import qualified Text.Show.ByteString as BS
import Data.List

main = printDIMACS "test.cnf" test

test = do
  xs <- freshs 100000
  forM_ (zip xs (tail xs))
    (\(x,y) -> addAll [[negate x, negate y],[x,y]])

type Var = Int
type Clause = [Var]

data DIMACSS = DS{
  nextFresh :: Int,
  numClauses :: Int
} deriving (Show)

type DIMACSM a = StateT DIMACSS (Writer B.ByteString) a

freshs :: Int -> DIMACSM [Var] 
freshs i = do
  next <- gets nextFresh
  let toRet = [next..next+i-1]
  modify (\s -> s{nextFresh = next+i}) 
  return toRet

fresh :: DIMACSM Int
fresh = do
  i <- gets nextFresh
  modify (\s -> s{nextFresh = i+1}) 
  return i

addAll :: [Clause] -> DIMACSM ()
addAll c = do
  tell 
    (B.concat . 
    intersperse (B.pack " 0\n") . 
    map (B.unwords . map BS.show) $ c)
  tell (B.pack " 0\n")
  modify (\s -> s{numClauses = numClauses s + length c})

add h = addAll [h]

printDIMACS :: FilePath -> DIMACSM a -> IO ()
printDIMACS file f = do
  writeFile file ""
  appendFile file (concat ["p cnf ", show i, " ", show j, "\n"])
  B.appendFile file b
   where
     (s,b) = runWriter (execStateT f (DS 1 0))
     i = nextFresh s - 1
     j = numClauses s

I would like to keep the monadic building of clauses since it is very handy, but I need to overcome the memory problem. How do I optimize the above program so that it doesn't use too much memory?

like image 282
HaskellElephant Avatar asked Oct 30 '12 14:10

HaskellElephant


2 Answers

If you want good memory behavior, you need to make sure that you write out the clauses as you generate them, instead of collecting them in memory and dumping them as such, either using lazyness or a more explicit approach such as conduits, enumerators, pipes or the like.

The main obstacle to that approach is that the DIMACS format expects the number of clauses and variables in the header. This prevents the naive implementation from being sufficiently lazy. There are two possibilities:

The pragmatic one is to write the clauses first to a temporary location. After that the numbers are known, so you write them to the real file and append the contents of the temporary file.

The prettier approach is possible if the generation of clauses has no side effects (besides the effects offered by your DIMACSM monad) and is sufficiently fast: Run it twice, first throwing away the clauses and just calculating the numbers, print the header line, run the generator again; now printing the clauses.

(This is from my experience with implementing SAT-Britney, where I took the second approach, because it fitted better with other requirements in that context.)

Also, in your code, addAll is not lazy enough: The list c needs to be retained even after writing (in the MonadWriter sense) the clauses. This is another space leak. I suggest you implement add as the primitive operation and then addAll = mapM_ add.

like image 123
Joachim Breitner Avatar answered Nov 03 '22 13:11

Joachim Breitner


As explained in Joachim Breitner's answer the problem was that DIMACSM was not lazy enough, both because the strict versions of the monads was used and because the number of variables and clauses are needed before the ByteString can be written to the file. The solution is to use the lazy versions of the Monads and execute them twice. It turns out that it is also necessary to have WriterT be the outer monad:

import Control.Monad.State
import Control.Monad.Writer

...

type DIMACSM a = WriterT B.ByteString (State DIMACSS) a

...

printDIMACS :: FilePath -> DIMACSM a -> IO ()
printDIMACS file f = do
  writeFile file ""
  appendFile file (concat ["p cnf ", show i, " ", show j, "\n"])
  B.appendFile file b
   where
     s = execState (execWriterT f) (DS 1 0)
     b = evalState (execWriterT f) (DS 1 0)
     i = nextFresh s - 1
     j = numClauses s
like image 33
HaskellElephant Avatar answered Nov 03 '22 11:11

HaskellElephant