
What are the pros and cons of Enumerators vs. Conduits vs. Pipes?

I'd like to hear from someone with a deeper understanding than myself what the fundamental differences are between Enumerators, Conduits, and Pipes as well as the key benefits and drawbacks. Some discussion's already ongoing but it'd be nice to have a high-level overview.

Luke Hoersten asked Apr 02 '12 21:04



2 Answers

Enumerators/Iteratees as an abstraction were invented by Oleg Kiselyov. They provide a clean way of doing IO with predictable (low) resource requirements. The current Enumerators package is pretty close to Oleg's original work.

Conduits were created for the Yesod web framework. My understanding is that they were designed to be blazingly fast. Early versions of the library were highly stateful.

Pipes focus on elegance. They have just one type instead of several, form monad (transformer) and category instances, and are very "functional" in design.

If you like categorical explanations: the Pipe type is just the free monad over the following ungodly simple functor

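{-# LANGUAGE GeneralizedNewtypeDeriving #-}
import Control.Monad (liftM)
import Control.Monad.Free (Free, liftF)  -- from the free package
import Control.Monad.Trans.Class (MonadTrans (..))

-- One layer of a pipe: run an effect in m, await an a, or yield a b.
data PipeF a b m r = M (m r) | Await (a -> r) | Yield b r
instance Monad m => Functor (PipeF a b m) where
   fmap f (M mr) = M $ liftM f mr
   fmap f (Await g) = Await $ f . g
   fmap f (Yield b p) = Yield b (f p)
--Giving:
newtype Pipe a b m r = Pipe {unPipe :: Free (PipeF a b m) r}
  deriving (Functor, Applicative, Monad)

--and
instance MonadTrans (Pipe a b) where
   -- liftF (free package) is assumed here in place of the original's undefined inj
   lift = Pipe . liftF . M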

In the actual pipe definition these are baked in, but the simplicity of this definition is amazing. Pipes form a category under the operation (<+<) :: Monad m => Pipe b c m r -> Pipe a b m r -> Pipe a c m r, which takes whatever the upstream pipe (the right-hand argument) yields and feeds it to the awaiting downstream pipe (the left-hand argument).
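To make the composition concrete, here is a minimal sketch using only base, with the free-monad layers "baked in" as constructors. It is not the Pipes library's actual source: the constructor names follow the functor above, while `toList`, `fromList`, `doubler`, and `example` are hypothetical helpers introduced for illustration.

```haskell
import Control.Monad (ap, forever, liftM)
import Data.Functor.Identity (Identity (..))

-- A pipe awaits values of type a, yields values of type b,
-- runs effects in m, and eventually returns an r.
data Pipe a b m r
  = Pure r
  | M (m (Pipe a b m r))
  | Await (a -> Pipe a b m r)
  | Yield b (Pipe a b m r)

instance Monad m => Functor (Pipe a b m) where
  fmap = liftM

instance Monad m => Applicative (Pipe a b m) where
  pure = Pure
  (<*>) = ap

instance Monad m => Monad (Pipe a b m) where
  Pure r    >>= f = f r
  M m       >>= f = M (liftM (>>= f) m)
  Await k   >>= f = Await (\a -> k a >>= f)
  Yield b p >>= f = Yield b (p >>= f)

await :: Pipe a b m a
await = Await Pure

yield :: b -> Pipe a b m ()
yield b = Yield b (Pure ())

infixr 9 <+<

-- Downstream-driven composition: the left pipe runs until it awaits,
-- then the right (upstream) pipe runs until it yields a value for it.
(<+<) :: Monad m => Pipe b c m r -> Pipe a b m r -> Pipe a c m r
Pure r         <+< _         = Pure r
Yield c p      <+< up        = Yield c (p <+< up)
M m            <+< up        = M (liftM (<+< up) m)
Await k        <+< Yield b u = k b <+< u
Await _        <+< Pure r    = Pure r
down@(Await _) <+< M m       = M (liftM (down <+<) m)
down@(Await _) <+< Await k   = Await (\a -> down <+< k a)

-- Hypothetical helper: run a pipe in Identity, answering every await
-- with (), and collect everything it yields.
toList :: Pipe () b Identity r -> [b]
toList (Pure _)    = []
toList (M m)       = toList (runIdentity m)
toList (Await k)   = toList (k ())
toList (Yield b p) = b : toList p

-- Hypothetical producer: yield each list element in order.
fromList :: Monad m => [b] -> Pipe a b m ()
fromList = mapM_ yield

-- Hypothetical transformer: double every value that flows through.
doubler :: Monad m => Pipe Int Int m r
doubler = forever (await >>= \x -> yield (2 * x))

example :: [Int]
example = toList (doubler <+< fromList [1, 2, 3])
```

Because composition is driven from the downstream end, the whole pipeline stops as soon as the upstream pipe returns, which is why `example` terminates even though `doubler` loops forever.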

It looks like Conduits are becoming more Pipe-like (using CPS instead of state, and switching to a single type), while Pipes are gaining support for better error handling and perhaps the return of separate types for generators and consumers.

This area is moving quickly. I've been hacking on an experimental variant of the Pipe library with these features, and know other people are as well (see the Guarded Pipes package on Hackage), but suspect that Gabriel (the author of Pipes) will figure them out before I do.

My recommendations: if you are using Yesod, use Conduits. If you want a mature library, use Enumerator. If you primarily care about elegance, use Pipes.

Philip JF answered Oct 24 '22 09:10



After writing applications with all three libraries, I think the biggest difference I've seen is in how resource finalization is handled. For example, Pipes breaks resource finalization out into separate types of Frames and Stacks.

There also still seems to be some debate about how to finalize not only the input resource but potentially the output resource as well. For example, if you're reading from a DB and writing to a file, the DB connection needs to be closed and the output file needs to be flushed and closed. Things get hairy when deciding how to handle exceptions and failure cases along the pipeline.
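As a point of comparison, here is a sketch of the guarantee in plain IO, without any of the three libraries: nested `bracket` calls from base close both the input and the output handle, even if an exception is thrown partway through. Two files stand in for the DB and the output file, and `copyLines` is a hypothetical name; the difficulty described above is in getting the same guarantee once the reader and writer are separate, composable pipeline stages.

```haskell
import Control.Exception (bracket)
import System.IO

-- Copy a file line by line. Each bracket releases its handle whether
-- the body finishes normally or throws; hClose also flushes the output.
copyLines :: FilePath -> FilePath -> IO ()
copyLines src dst =
  bracket (openFile src ReadMode) hClose $ \hIn ->
    bracket (openFile dst WriteMode) hClose $ \hOut ->
      pump hIn hOut
  where
    -- Move one line at a time until the input is exhausted.
    pump hIn hOut = do
      eof <- hIsEOF hIn
      if eof
        then pure ()
        else do
          line <- hGetLine hIn
          hPutStrLn hOut line
          pump hIn hOut
```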

Another more subtle difference seems to be how the return value of the enumerator pipeline is handled and computed.

A lot of these differences and potential inconsistencies have been exposed by the use of the Monad and Category implementations for Pipes and now are making their way into Conduits.

Luke Hoersten answered Oct 24 '22 08:10
