I'd like to hear from someone with a deeper understanding than myself what the fundamental differences are between Enumerators, Conduits, and Pipes as well as the key benefits and drawbacks. Some discussion's already ongoing but it'd be nice to have a high-level overview.
Enumerators/Iteratees as an abstraction were invented by Oleg Kiselyov. They provide a clean way of doing IO with predictable (low) resource requirements. The current Enumerators package is pretty close to Oleg's original work.
Conduits were created for the Yesod web framework. My understanding is that they were designed to be blazingly fast. Early versions of the library were highly stateful.
Pipes focus on elegance. They have just one type instead of several, come with Monad (transformer) and Category instances, and are very "functional" in design.
If you like categorical explanations: the Pipe type is just the free monad over the following ungodly simple functor:
data PipeF a b m r = M (m r) | Await (a -> r) | Yield b r
instance Monad m => Functor (PipeF a b m) where
    fmap f (M mr) = M $ liftM f mr
    fmap f (Await g) = Await $ f . g
    fmap f (Yield b p) = Yield b (f p)
--Giving:
newtype Pipe a b m r = Pipe {unPipe :: Free (PipeF a b m) r}
    deriving (Functor, Applicative, Monad)
--and
instance MonadTrans (Pipe a b) where
    -- inj wraps a single functor layer into Free (liftF in the free package)
    lift = Pipe . inj . M
In the actual Pipes definition these are baked in, but the simplicity of this definition is amazing. Pipes form a category under the operation (<+<) :: Monad m => Pipe b c m r -> Pipe a b m r -> Pipe a c m r, which takes whatever the upstream (right-hand) pipe yields and feeds it to the awaiting downstream (left-hand) pipe.
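To make the free-monad reading concrete, here is a minimal sketch that needs only base. It is not the Pipes library's actual code: the hand-rolled Free type, the type synonym in place of the newtype, and the names toList, source, doubler and example are all assumptions made for illustration, and the composition is a simplified, downstream-driven version of (<+<) that stops as soon as either side returns.

```haskell
import Control.Monad (forever)
import Data.Functor.Identity (Identity (..))

-- Hand-rolled free monad (assumption: stands in for Free from the free package).
data Free f r = Pure r | Free (f (Free f r))

instance Functor f => Functor (Free f) where
  fmap f (Pure r)  = Pure (f r)
  fmap f (Free fx) = Free (fmap (fmap f) fx)

instance Functor f => Applicative (Free f) where
  pure = Pure
  Pure f  <*> x = fmap f x
  Free ff <*> x = Free (fmap (<*> x) ff)

instance Functor f => Monad (Free f) where
  Pure r  >>= k = k r
  Free fx >>= k = Free (fmap (>>= k) fx)

-- The functor from the answer: run an effect, await an input, or yield an output.
data PipeF a b m r = M (m r) | Await (a -> r) | Yield b r

instance Functor m => Functor (PipeF a b m) where
  fmap f (M mr)      = M (fmap f mr)
  fmap f (Await g)   = Await (f . g)
  fmap f (Yield b r) = Yield b (f r)

-- A type synonym instead of a newtype, to keep the sketch short.
type Pipe a b m = Free (PipeF a b m)

await :: Pipe a b m a
await = Free (Await Pure)

yield :: b -> Pipe a b m ()
yield b = Free (Yield b (Pure ()))

-- Simplified composition: the left (downstream) pipe drives, and the right
-- (upstream) pipe only runs when the downstream pipe awaits.
(<+<) :: Functor m => Pipe b c m r -> Pipe a b m r -> Pipe a c m r
Pure r           <+< _  = Pure r
Free (M m)       <+< up = Free (M (fmap (<+< up) m))
Free (Yield c k) <+< up = Free (Yield c (k <+< up))
Free (Await f)   <+< up = feed up
  where
    -- Step the upstream pipe until it yields a value for the awaiting side.
    feed (Pure r)           = Pure r
    feed (Free (M m))       = Free (M (fmap feed m))
    feed (Free (Await g))   = Free (Await (feed . g))
    feed (Free (Yield b k)) = f b <+< k

-- Collect everything a pipe yields, answering any await with ().
toList :: Pipe () b Identity r -> [b]
toList (Pure _)           = []
toList (Free (M m))       = toList (runIdentity m)
toList (Free (Await f))   = toList (f ())
toList (Free (Yield b k)) = b : toList k

source :: Pipe () Int Identity ()
source = mapM_ yield [1, 2, 3]

doubler :: Pipe Int Int Identity ()
doubler = forever (await >>= yield . (* 2))

example :: [Int]
example = toList (doubler <+< source)  -- [2,4,6]
```

Note that the composed pipe returns whichever side finishes first (here the source), which is one simple answer to the question of how a pipeline's return value is computed; the real libraries make different and more careful choices.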
It looks like Conduits are moving to be more Pipe-like (using CPS instead of state, and switching to a single type), while Pipes are gaining support for better error handling, and perhaps the return of separate types for generators and consumers.
This area is moving quickly. I've been hacking on an experimental variant of the Pipe library with these features, and know other people are as well (see the Guarded Pipes package on Hackage), but suspect that Gabriel (the author of Pipes) will figure them out before I do.
My recommendations: if you are using Yesod, use Conduits. If you want a mature library, use Enumerator. If you primarily care about elegance, use Pipes.
After writing applications with all three libraries, I think the biggest difference I've seen is in how resource finalization is handled. For example, Pipes breaks resource finalization out into separate types of Frames and Stacks.
There also still seems to be some debate about how to not only finalize the input resource, but also potentially the output resource. For example, if you're reading from a DB and writing to a file, the connection for the DB needs to be closed as well as the output file needing to be flushed and closed. Things get hairy when deciding how to handle exceptions and failure cases along the pipeline.
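As a rough illustration of the requirement itself (not of how any of the three libraries implements it), the sketch below uses plain bracket from Control.Exception to show both ends of a pipeline being released, in reverse order of acquisition, when a step in the middle fails. The helper withResource, the function runPipeline, the resource names and the error message are hypothetical stand-ins; a real DB connection or file handle would replace the log entries.

```haskell
import Control.Exception (ErrorCall (..), bracket, throwIO, try)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical stand-in for opening and closing a real resource: it only
-- records what happened in a shared log (newest entry first).
withResource :: IORef [String] -> String -> IO a -> IO a
withResource logRef name body =
  bracket
    (modifyIORef' logRef (("open " ++ name) :))
    (\_ -> modifyIORef' logRef (("close " ++ name) :))
    (\_ -> body)

-- Simulated pipeline: open the source, open the sink, then fail midway.
runPipeline :: IO [String]
runPipeline = do
  logRef <- newIORef []
  result <- try $
    withResource logRef "db" $
      withResource logRef "file" $
        throwIO (ErrorCall "row 3 is malformed")
  case result of
    Left (ErrorCall msg) -> modifyIORef' logRef (("failed: " ++ msg) :)
    Right ()             -> pure ()
  -- Return the events in the order they happened.
  reverse <$> readIORef logRef
```

Running runPipeline yields the events "open db", "open file", "close file", "close db", "failed: row 3 is malformed". The hard part the streaming libraries wrestle with is getting this same guarantee when acquisition and release happen inside individual pipeline stages rather than around the whole computation.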
Another more subtle difference seems to be how the return value of the enumerator pipeline is handled and computed.
A lot of these differences and potential inconsistencies have been exposed by the use of the Monad and Category implementations for Pipes and now are making their way into Conduits.