Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Haskell enumerator: analog to iteratees `enumWith` operator?

Earlier today I wrote a small test app for iteratees that composed an iteratee for writing progress with an iteratee for actually copying data. I wound up with values like these:

-- NOTE: this snippet is with iteratees-0.8.5.0
-- side effect: display progress on stdout
displayProgress :: Iteratee ByteString IO ()

-- side effect: copy the bytestrings of Iteratee to Handle
fileSink :: Handle -> Iteratee ByteString IO ()

writeAndDisplayProgress :: Handle -> Iteratee ByteString IO ()
writeAndDisplayProgress handle = sequence_ [fileSink handle, displayProgress]

In looking at the enumerator library, I don't see an analog of sequence_ or enumWith. All I want to do is compose two iteratees so they act as one. I could discard the result (it's going to be () anyway) or keep it, I don't care. (&&&) from Control.Arrow is what I want, only for iteratees rather than arrows.

I tried these two options:

-- NOTE: this snippet is with enumerator-0.4.10
run_ $ enumFile source $$ sequence_ [iterHandle handle, displayProgress]
run_ $ enumFile source $$ sequence_ [displayProgress, iterHandle handle]

The first one copies the file, but doesn't show progress; the second one shows progress, but doesn't copy the file, so obviously the effect of the built-in sequence_ on enumerator's iteratees is to run the first iteratee until it terminates and then run the other, which is not what I want. I want to be running the iteratees in parallel rather than serially. I feel like I'm missing something obvious, but in reading the wc example for the enumerator library, I see this curious comment:

-- Exactly matching wc's output is too annoying, so this example
-- will just print one line per file, and support counting at most
-- one statistic per run

I wonder if this remark indicates that combining or composing iteratees within the enumerations framework isn't possible out of the box. What's the generally-accepted right way to do this?

Edit:

It seems as though there is no built-in way to do this. There's discussion on the Haskell mailing list about adding combinators like enumSequence and manyToOne but so far, there doesn't seem to be anything actually in the enumerator package that furnishes this capability.

like image 809
Daniel Lyons Avatar asked Jul 14 '11 05:07

Daniel Lyons


1 Answers

It seems to me like rather than trying to have two Iteratees consume the sequence in parallel, it would be better to feed the stream through an identity Enumeratee that simply counts the bytes passing it.

Here's a simple example that copies a file and prints the number of bytes copied after each chunk.

import System.Environment
import System.IO
import Data.Enumerator
import Data.Enumerator.Binary (enumFile, iterHandle)
import Data.Enumerator.List (mapAccumM)
import qualified Data.ByteString as B

printBytes :: Enumeratee B.ByteString B.ByteString IO ()
printBytes = flip mapAccumM 0 $ \total bytes -> do
    let total' = total + B.length bytes
    print total'
    return (total', bytes)

copyFile s t = withBinaryFile t WriteMode $ \h -> do
    run_ $ (enumFile s $= printBytes) $$ iterHandle h

main = do
    [source, target] <- getArgs
    copyFile source target
like image 109
hammar Avatar answered Oct 23 '22 07:10

hammar