
How to number lines read from a file using conduits?

I'm a Haskell beginner trying to wrap my head around the conduit library.

I've tried something like this, but it does not compile:

import Data.Conduit
import Data.Conduit.Binary as CB
import Data.ByteString.Char8 as BS

numberLine :: Monad m => Conduit BS.ByteString m BS.ByteString
numberLine = conduitState 0 push close
  where
    push lno input = return $ StateProducing (lno + 1) [BS.pack (show lno ++ BS.unpack input)]
    close state = return state

main = do
  runResourceT $ CB.sourceFile "wp.txt" $= CB.lines $= numberLine $$ CB.sinkFile "test.txt"

It seems that the state in conduitState must be of the same type as the conduit's input type. Or at least that's what I understand from the error message:

$ ghc --make exp.hs
[1 of 1] Compiling Main             ( exp.hs, exp.o )

exp.hs:8:27:
    Could not deduce (Num [ByteString]) arising from the literal `0'
    from the context (Monad m)
      bound by the type signature for
                 numberLine :: Monad m => Conduit ByteString m ByteString
      at exp.hs:(8,1)-(11,30)
    Possible fix:
      add (Num [ByteString]) to the context of
        the type signature for
          numberLine :: Monad m => Conduit ByteString m ByteString
      or add an instance declaration for (Num [ByteString])
    In the first argument of `conduitState', namely `0'
    In the expression: conduitState 0 push close
    In an equation for `numberLine':
        numberLine
          = conduitState 0 push close
          where
              push lno input
                = return
                  $ StateProducing (lno + 1) [pack (show lno ++ unpack input)]
              close state = return state

How can this be done using conduits? I want to read lines from a file and prefix each line with its line number.

asked Dec 05 '25 13:12 by donatello


2 Answers

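Yes, it can be done. I prefer the helper functions in Data.Conduit.List, here concatMapAccum, which threads an accumulator (the line counter) through the stream. I also avoid Data.ByteString.Char8 where possible, since it truncates every character to 8 bits and so mangles text outside Latin-1; the version below decodes to Text instead. I'm assuming your file is UTF-8 encoded.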

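import Data.Conduit
import qualified Data.Conduit.Binary as CB
import qualified Data.Conduit.List as Cl
import qualified Data.Conduit.Text as Ct
import Data.Monoid ((<>))
import Data.Text (Text)
import qualified Data.Text as T

-- Thread a line counter through the stream and emit one numbered line per input.
numberLine :: Monad m => Conduit Text m Text
numberLine = Cl.concatMapAccum step 0 where
  -- "<number> <line>\n"; the newline is re-added because Ct.lines strips it.
  format input lno = T.pack (show lno) <> T.pack " " <> input <> T.pack "\n"
  -- Return the next counter value together with the output for this line.
  step input lno = (lno+1, [format input lno])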

main :: IO ()
main =
  runResourceT
     $ CB.sourceFile "wp.txt"
    $$ Ct.decode Ct.utf8
    =$ Ct.lines
    =$ numberLine
    =$ Ct.encode Ct.utf8
    =$ CB.sinkFile "test.txt"
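
The $$ and =$ operators and the Conduit type synonym used above belong to older conduit releases. The following is a rough sketch of how the same pipeline might look with a more recent conduit, assuming version 1.3 or later, the combined Conduit module, the .| operator and runConduitRes. The helper name numberFile and the in-memory sample lines are illustrative and not part of the original answer.

```haskell
{-# LANGUAGE OverloadedStrings #-}
-- Sketch only: assumes conduit >= 1.3 and the text package are installed.
import Conduit
import qualified Data.Conduit.List as CL
import Data.Text (Text)
import qualified Data.Text as T

-- Prefix each incoming line with a running counter, starting at 0
-- as in the answer above.
numberLine :: Monad m => ConduitT Text Text m ()
numberLine = CL.concatMapAccum step (0 :: Int)
  where
    step input lno = (lno + 1, [T.pack (show lno) <> " " <> input <> "\n"])

-- Hypothetical helper: same pipeline shape as the answer's main,
-- written with (.|) and run inside ResourceT.
numberFile :: FilePath -> FilePath -> IO ()
numberFile src dst = runConduitRes
   $ sourceFile src
  .| decodeUtf8C
  .| linesUnboundedC
  .| numberLine
  .| encodeUtf8C
  .| sinkFile dst

main :: IO ()
main =
  -- In-memory run that needs no input file; prints the numbered lines.
  print (runConduitPure (yieldMany ["foo", "bar" :: Text] .| numberLine .| sinkList))
```

Calling numberFile "wp.txt" "test.txt" would reproduce the file-to-file behaviour of the answer, provided wp.txt exists and is valid UTF-8.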

answered Dec 08 '25 17:12 by Nathan Howell


close state = return state

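This line is the source of the type error. According to the docs, your close function should have type (state -> m [output]). Because your close returns the state itself, GHC infers that the state type is the same as [output], that is [ByteString], and the literal 0 then demands a Num [ByteString] instance, which is the error you see. In your case the state should be an Int (a type annotation such as (0 :: Int) makes sure of that) and output = BS.ByteString. By the time the conduit closes you have not buffered any ByteStrings that still need to be produced, so just return the empty list: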

close _ = return []

Especially note from the docs for that argument:

The state need not be returned, since it will not be used again
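
To see why returning the state breaks type checking, here is a minimal pure sketch that models the shape of conduitState on plain lists. It is a stand-in written for illustration, not the library's definition: runStateful and numberLines are hypothetical names, and it uses String instead of ByteString so that it depends only on base.

```haskell
import Data.List (mapAccumL)

-- Hypothetical pure model of a stateful conduit: 'push' handles one
-- input, 'close' flushes any leftover output at end of stream.
runStateful
  :: state
  -> (state -> input -> (state, [output]))  -- push
  -> (state -> [output])                    -- close: yields output, not state
  -> [input]
  -> [output]
runStateful s0 push close inputs =
  let (sEnd, chunks) = mapAccumL push s0 inputs
  in  concat chunks ++ close sEnd

numberLines :: [String] -> [String]
numberLines = runStateful (0 :: Int) push close
  where
    push lno input = (lno + 1, [show lno ++ " " ++ input])
    -- Nothing is buffered, so there is nothing to flush.
    close _ = []
    -- Writing 'close st = st' instead would force the state type to
    -- equal [String], which clashes with the numeric counter.

main :: IO ()
main = mapM_ putStrLn (numberLines ["alpha", "beta", "gamma"])
```

The counter's type (Int) is independent of the element type; only the result of close has to match the output type.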

answered Dec 08 '25 16:12 by Dan Burton


