
How to use the conduit drop function in a pipeline?

I have a simple task: read a bunch of lines out of a file and do something with each one of them, except the first one, which contains headings to be ignored.

So I thought I'd try out conduits.

printFile src = runResourceT $ CB.sourceFile src =$= 
    CT.decode CT.utf8 =$= CT.lines =$= CL.mapM_ putStrLn

Cool.

So now I just want to drop the first line, and there seems to be a function for that:

printFile src = runResourceT $ CB.sourceFile src =$= 
    CT.decode CT.utf8 =$= CT.lines =$= drop 1 =$= CL.mapM_ putStrLn

Hmm, but now I notice that drop has the type signature Sink a m (). Someone suggested that I could use the Monad instance for pipes and use drop to effectfully drop some elements, so I tried this:

drop' :: Int -> Pipe a a m ()
drop' n = do
  CL.drop n
  x <- await
  case x of 
    Just v -> yield v
    Nothing -> return ()

This doesn't type check, because the Monad instance for pipes only applies to pipes of the same type: Sinks have Void as their output, so I can't use it like this.

I took a quick look at pipes and pipes-core, and I noticed that pipes-core has the function as I expected it to be, whereas pipes is a minimal library whose documentation shows how it would be implemented.

So I'm confused; maybe there's a key concept I'm missing. I saw the function

sequence ::  Sink input m output -> Conduit input m output

But that doesn't seem to be the right idea, as the output value is ():

CL.sequence (CL.drop 1) :: Conduit a m ()    

I'll probably just go back and use lazy I/O, as I don't really need any streaming, but I'd be interested to see the proper way to do it.

Oliver asked May 31 '12 13:05


1 Answer

Firstly, the simple answer:

... =$= CT.lines =$= (CL.drop 1 >> CL.mapM_ putStrLn)
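For reference, here is a rough sketch of how that one-liner might look as a complete program. It assumes the current conduit and conduit-extra packages, where `.|` and runConduitRes take the place of the `=$=` and runResourceT spelling used in the question; the file name example.csv and its contents are made up for illustration.

```haskell
import Control.Monad.IO.Class (liftIO)
import Data.Conduit (runConduitRes, (.|))
import qualified Data.Conduit.Binary as CB
import qualified Data.Conduit.List as CL
import qualified Data.Conduit.Text as CT
import qualified Data.Text.IO as TIO

-- Print every line of a UTF-8 file except the first one (the headings).
-- CL.drop 1 consumes the heading line, then CL.mapM_ handles the rest
-- of the same input stream.
printFile :: FilePath -> IO ()
printFile src =
  runConduitRes $
    CB.sourceFile src
      .| CT.decode CT.utf8
      .| CT.lines
      .| (CL.drop 1 >> CL.mapM_ (liftIO . TIO.putStrLn))

main :: IO ()
main = do
  -- Hypothetical sample file: one heading line followed by two data rows.
  writeFile "example.csv" "name,age\nalice,30\nbob,25\n"
  printFile "example.csv"
```

Run against the sample file, this should skip the heading and print only the two data rows.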

The longer explanation: there are really two different ways you can implement drop. Either way, it will first drop n elements from the input. There are two choices about what it does next:

  • Say it's done
  • Start outputting all of the remaining items from the input stream

The former behavior is what a Sink would perform (and what our drop actually does) while the latter is the behavior of a Conduit. You can in fact generate the latter from the former through monadic composition:

dropConduit n = CL.drop n >> CL.map id

Then you can use dropConduit as you describe at the beginning. This is a good way of demonstrating the difference between monadic composition and fusing; the former allows two functions to operate on the same input stream, while the latter allows one function to feed a stream to the other.
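To make the difference concrete, here is a small sketch that runs both styles over the same in-memory input. It is written against the current conduit API (ConduitT, `.|`, runConduitPure) rather than the operators shown above, and sampleLines, viaFusing and viaMonad are names invented for this example.

```haskell
import Data.Conduit (ConduitT, runConduitPure, (.|))
import qualified Data.Conduit.List as CL

-- Conduit-style drop: skip n inputs, then pass everything else through.
dropConduit :: Monad m => Int -> ConduitT a a m ()
dropConduit n = CL.drop n >> CL.map id

-- Hypothetical sample input standing in for the lines of a file.
sampleLines :: [String]
sampleLines = ["heading", "row 1", "row 2"]

-- Fusing: dropConduit feeds its output stream to the downstream consumer.
viaFusing :: [String]
viaFusing =
  runConduitPure (CL.sourceList sampleLines .| dropConduit 1 .| CL.consume)

-- Monadic composition: CL.drop and CL.consume read the same input stream,
-- one after the other.
viaMonad :: [String]
viaMonad =
  runConduitPure (CL.sourceList sampleLines .| (CL.drop 1 >> CL.consume))

main :: IO ()
main = do
  print viaFusing
  print viaMonad
```

Both pipelines collect the same two data rows; the only difference is whether the dropping happens in a separate fused stage or inside the consumer itself.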

I haven't benchmarked, but I'm fairly certain that monadic composition will be a bit more efficient.

Michael Snoyman answered Sep 20 '22 17:09