Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Do monad transformers, generally speaking, arise out of adjunctions?

In Adjoint functors determine monad transformers, but where's lift?, Simon C has shown us the construction...

newtype Three u f m a = Three { getThree :: u (m (f a)) }

... which, as the answers there discuss, can be given an instance Adjunction f u => MonadTrans (Three u f) (adjunctions provides it as AdjointT). Any Hask/Hask adjunction thus leads to a monad transformer; in particular, StateT s arises in this manner from the currying adjunction between (,) s and (->) s.

My follow-up question is: does this construction generalise to other monad transformers? Is there a way to derive, say, the other transformers from the transformers package from suitable adjunctions?


Meta remarks: my answer here was originally written for Simon C's question. I opted to spin it off into a self-answered question because, upon rereading that question, I noticed my purported answer had more to do with the discussion in the comments over there than with the question body itself. Two other closely related questions, to which this Q&A arguably is also a follow-up, are Is there a monad that doesn't have a corresponding monad transformer (except IO)? and Is the composition of an arbitrary monad with a traversable always a monad?

like image 392
duplode Avatar asked Jun 23 '19 18:06

duplode


1 Answers

The three constructions in this answer are also available, in a reproducible form, in this Gist.

Simon C's construction...

newtype Three u f m a = Three { getThree :: u (m (f a)) }

... relies on f and u being adjoint Hask endofunctors. While that works out in the case of StateT, there are two connected issues we have to deal with if we are to make it more general:

  • Firstly, we need to find suitable adjunctions for the "feature monads" the transformers will be built upon; and

  • Secondly, if such an adjunction takes us away from Hask, we will have to somehow workaround the fact that it won't be possible to use the Hask monad m directly.

There are quite a few interesting adjunctions we might experiment with. In particular, there are two adjunctions that are available for every monad: the Kleisli adjunction, and the Eilenberg-Moore adjunction (for a fine categorical presentation of them, see Emily Riehl, Category Theory In Context, section 5.2). In the categorical excurse that takes up the first half or so of this answer, I will focus on the Kleisli adjunction, simply because it is more comfortable to wiggle about in pseudo-Haskell.

(By pseudo-Haskell, I mean there will be rampant abuse of notation in what follows. To make it easier on the eyes, I will use some ad hoc conventions: |-> means a mapping between things that aren't necessarily types; similarly, : means something that resembles a type signature; ~> means a non-Hask morphism; curly and angled brackets highlight objects in selected non-Hask categories; . also means functor composition; and F -| U means F and U are adjoint functors.)

Kleisli adjunction

If g is a Hask Monad, there is a corresponding Kleisli adjunction FK g -| UK g between FK g, which takes us to the g's Kleisli category...

-- Object and morphism mappings.
FK g : a          |-> {a}
       f : a -> b |-> return . f : {a} ~> {b} ~ a -> g b
-- Identity and composition in Kleisli t are return and (<=<)

... and UK g, which takes us back to Hask:

UK g : {a}            |-> g a
       f : {a} -> {b} |-> join . fmap f : g a -> g b  -- that is, (>>= f)

-- The adjunction isomorphism:
kla : (FK g a ~> {b}) -> (a -> UK g {b})
kra : (a -> UK g {b}) -> (FK g a ~> {b})
-- kla and kra mirror leftAdjunct and rightAdjunct from Data.Functor.Adjunction.
-- The underlying Haskell type is a -> g b on both sides, so we can simply have:
kla = id
kra = id

Along the lines of Simon C's Three, let's have g as the feature monad, on which the transformer will be built upon. The transformer will somehow incorporate the effects of another Hask monad, m, which I will sometimes refer to as the "base monad", following customary Haskell terminology.

If we attempt to squeeze m between FK g and UK g, we run into the second issue mentioned above: we would need a Kleisli-g endofunctor, rather than a Hask one. There is little else to do but make it up. By that, I mean we can define a functor for functors (more specifically, a functor between the two categories of endofunctors) which will hopefully turn m into something we can use. I will call this "higher" functor it HK g. Applying it to m should give something like this:

-- Keep in mind this is a Kleisli-g endofunctor.
HK g m : {a}                |-> {m a}
         f : {a} ~> {b}     |-> kmap f : {m a} ~> {m b} ~ m a -> g (m b)
-- This is the object mapping, taking functors to functors.
-- The morphism mapping maps natural transformations, a la Control.Monad.Morph:
         t : ∀x. m x -> n x |-> kmorph t : ∀x. {m x} ~> {n x} ~ ∀x. m x -> g (n x)
-- I won't use it explicitly, but it is there if you look for it.

Kleisli upon Kleisli

(Note: long-winded categorical twiddling ahead. If you're in a hurry, feel free to skim up to the "In summary" subsection.)

UK g . HK g m . FK g will be a Hask endofunctor, the counterpart to Three construction. We further want it to be a monad on Hask. We can ensure that by setting up HK g m as a monad on the Kleisli-g category. That means we need to figure out counterparts to fmap, return and join on Kleisli-g:

kmap : {a} ~> {b} |-> {m a} ~> {m b}
       (a -> g b)  ->  m a -> g (m b)

kreturn : {a} ~> {m a}
          a -> g (m a)

kjoin : {m (m a)} ~> {m a}
        m (m a) -> g (m a) 

For kreturn and kjoin, let's try the simplest things that could possibly work:

kreturn :: (Monad g, Monad m) => a -> g (m a)
kreturn = return . return 

kjoin :: (Monad g, Monad m) => m (m a) -> g (m a)
kjoin = return . join

kmap is somewhat trickier. fmap @m will give out m (g a) instead of g (m a), so we'd need a way to pull the g layer outside. As it happens, there is a convenient way to do that, but it requires g to be a Distributive functor:

kmap :: (Monad g, Distributive g, Monad m) => (a -> g b)  ->  m a -> g (m b)
kmap f = distribute . fmap f  -- kmap = collect

Laws and distributivity conditions

Those guesses, of course, mean nothing unless we can show they are lawful:

-- Functor laws for kmap
kmap return = return
kmap g <=< kmap f = kmap (g <=< f)

-- Naturality of kreturn
kmap f <=< kreturn = kreturn <=< f

-- Naturality of kjoin
kjoin <=< kmap (kmap f) = kmap f <=< kjoin

-- Monad laws
kjoin <=< kreturn = return
kjoin <=< kmap kreturn = return
kjoin <=< kmap kjoin = kjoin <=< kjoin

Working it out shows the four conditions for composing monads with a distributive law are sufficient to ensure the laws hold:

-- dist :: t (g a) -> g (t a)
-- I'm using `dist` instead of `distribute` and `t` instead of `m` here for the
-- sake of notation neutrality. 
dist . fmap (return @g) = return @g                 -- #1
dist . return @t = fmap (return @t)                 -- #2
dist . fmap (join @g) = join @g . fmap dist . dist  -- #3
dist . join @t = fmap (join @t) . dist . fmap dist  -- #4
-- In a nutshell: dist must preserve join and return for both monads.

In our case, condition #1 gives kmap identity, kjoin right identity and kjoin associativity; #2 gives kreturn naturality; #3, functor composition; #4, kjoin naturality (kjoin left identity doesn't depend on any of the four conditions). The final sanity check is establishing what it takes for the conditions themselves to hold. In the specific case of distribute, its very strong naturality properties mean the four conditions necessarily hold for any lawful Distributive, so we're good to go.

Wrapping it up

The whole UK g . HK g m . FK g monad can be derived from the pieces we already have by splitting HK g m into a Kleisli adjunction, which is entirely analogous to the Kleisli adjunction we began with, except we start from Klesili-g rather than Hask:

HK g m = UHK g m . FHK g m

FHK g m : {a}        |-> <{a}>
      f : {a} ~> {b} |-> fmap return . f : <{a}> ~> <{b}> ~ a -> g (m b)
-- kreturn <=< f = fmap (return @m) . f
-- Note that m goes on the inside, so that we end up with a morphism in Kleisli g.

UHK g m : <{a}>          |-> {m a}
      f : <{a}> ~> <{b}> |-> fmap join . distribute . fmap f : {m a} ~> {m b} ~ m a -> g (m b)
-- kjoin <=< kmap f = fmap (join @m) . distribute . fmap f

-- The adjunction isomorphism:
hkla : (FHK g m {a} ~> <{b}>) -> ({a} ~> UHK g m <{b}>)
hkra : ({a} ~> UHK g m <{b}>) -> (FHK g m {a} ~> <{b}>)
-- Just like before, we have:
hkla = id
hkra = id

-- And, for the sake of completeness, a Kleisli composition operator:
-- g <~< f = kjoin <=< kmap g <=< f
(<~<) :: (Monad g, Distributive g, Monad m)
    => (b -> g (m c)) -> (a -> g (m b)) -> (a -> g (m c))
g <~< f = fmap join . join . fmap (distribute . fmap g) . f

Now that we have two adjunctions at hand, we can compose them, leading to the transformer adjunction and, at long last, to return and join for the transformer:

-- The composition of the three morphism mappings in UK g . HK g m . FK g
-- tkmap f = join . fmap (kjoin <=< kmap (kreturn <=< return . f))
tkmap :: (Monad g, Distributive g, Monad m) => (a -> b) -> g (m a) -> g (m b)
tkmap = fmap . fmap

-- Composition of two adjunction units, suitably lifted through the functors.
-- tkreturn = join . fmap (hkla hkid) . kla kid = join . fmap kreturn . return
tkreturn :: (Monad g, Monad m) => a -> g (m a)
tkreturn = return . return

-- Composition of the adjunction counits, suitably lifted through the functors.
-- tkjoin = join . fmap (kjoin <=< kmap (hkra kid <~< (kreturn <=< kra id)))
--    = join . fmap (kjoin <=< kmap (return <~< (kreturn <=< id)))
tkjoin :: (Monad g, Distributive g, Monad m) => g (m (g (m a))) -> g (m a)
tkjoin = fmap join . join . fmap distribute

(For a categorical explanation of the composition of units and counits, see Emily Riehl, Category Theory In Context, proposition 4.4.4.)

As for lift, kmap (return @g) sounds like a sensible definition. That amounts to distribute . fmap return (compare with the lift from Benjamin Hodgson's answer to Simon C's question), which by condition #1 becomes simply:

tklift :: m a -> g (m a)
tklift = return

The MonadLift laws, which mean tklift must be a monad morphism, do hold, with the join law hinging on distributivity condition #1:

tklift . return = tkreturn
tklift . join = tkjoin . tkmap tklift . tklift

In summary

The Kleisli adjunction allows us to construct a transfomer from any Distributive monad by composing it on the outside of any other monad. Putting it all together, we have:

-- This is still a Three, even though we only see two Hask endofunctors.
-- Or should we call it FourK?
newtype ThreeK g m a = ThreeK { runThreeK :: g (m a) }

instance (Functor g, Functor m) => Functor (ThreeK g m) where
    fmap f (ThreeK m) = ThreeK $ fmap (fmap f) m

instance (Monad g, Distributive g, Monad m) => Monad (ThreeK g m) where
    return a = ThreeK $ return (return a)
    m >>= f = ThreeK $ fmap join . join . fmap distribute 
        $ runThreeK $ fmap (runThreeK . f) m

instance (Monad g, Distributive g, Monad m) => Applicative (ThreeK g m) where
    pure = return
    (<*>) = ap

instance (Monad g, Distributive g) => MonadTrans (ThreeK g) where
    lift = ThreeK . return

The quintessential example of Distributive is the function functor. Composing it on the outside of another monad gives out ReaderT:

newtype KReaderT r m a = KReaderT { runKReaderT :: r -> m a }
    deriving (Functor, Applicative, Monad) via ThreeK ((->) r) m
    deriving MonadTrans via ThreeK ((->) r)

The ThreeK instances perfectly agree with the canonical ReaderT ones.

Flipped transformers and the Eilenberg-Moore adjunction

In the derivation above, we stacked the base monad Klesli adjunction atop the feature monad adjunction. We could conceivably do it the other way around, and start from the base monad adjunction. The crucial change that would happen would come about when defining kmap. As the base monad can, in principle, be any monad, we wouldn't want to slap a Distributive constraint on it so that it can be pulled outwards, like we did with g in the derivation above. A better fit for our game plan would be, dually, requiring Traversable from the feature monad, so that it can be pushed inside with sequenceA. This will lead to a transformer that composes the feture monad on the inside, rather than on the outside.

Below is the overall feature-on-the-inside construction. I called it ThreeEM because it can also be obtained by using Eilenberg-Moore adjunctions (instead of Kleisli ones) and stacking them with the base monad on the top, as in Simon C's Three. This fact probably has to do with the duality between the Eilenberg-Moore and Klesili adjunctions.

newtype ThreeEM t m a = ThreeEM { runThreeEM :: m (t a) }

instance (Functor t, Functor m) => Functor (ThreeEM t m) where
    fmap f (ThreeEM m) = ThreeEM $ fmap (fmap f) m

instance (Monad t, Traversable t, Monad m) => Monad (ThreeEM t m) where
    return a = ThreeEM $ return (return a)
    m >>= f = ThreeEM $ fmap join . join . fmap sequenceA 
      $ runThreeEM $ fmap (runThreeEM . f) m

instance (Monad t, Traversable t, Monad m) => Applicative (ThreeEM t m) where
    pure = return
    (<*>) = ap

-- In terms of of the Kleisli construction: as the bottom adjunction is now the
-- base monad one, we can use plain old fmap @m instead of kmap to promote return. 
instance (Monad t, Traversable t) => MonadTrans (ThreeEM t) where
    lift = ThreeEM . fmap return

Common transformers that arise in this fashion include MaybeT and ExceptT.

There is one potential pitfall with this construction. sequenceA has to follow the distributivity conditions so that the instances are lawful. Its Applicative constraint, however, makes its naturality propertes a lot weaker than those of distribute, and so the conditions don't all hold for free:

  • Condition #1 does hold: it is a consequence of the identity and naturality laws of Traversable.

  • Condition #2 also holds: sequenceA preserves natural transformations on the traversable functor as long as those transformations preserve toList. If we regard return as a natural transformation from Identity, that immediately holds case.

  • Condition #3, however, is not guaranteed. It would hold if join @m, taken as a natural transformation from Compose m m, preserved (<*>), but that might not be the case. If sequenceA actually sequences effects (that is, if the traversable can hold more than one value), any differences arising from the order in which join and (<*>) are performed in the base monad will lead to the condition being violated. That, incidentally, is part of the notorious "ListT done wrong" problem: the ListT in transformers, built in accordance with this construction, is only lawful if used with commutative base monads.

  • Finally, condition #4 only holds if join @t, taken as a natural transformation from Compose t t, preserves toList (that is, if it doesn't drop, duplicate, or rearrange elements). One consequence is that this construction won't work for feature monads whose join "takes the diagonal" of the nested structure (as is generally the case for monads that also are Distributive instances), even if we try to paper over condition #3 by restricting ourselves to commutative base monads.

Those restrictions mean the construction isn't quite as widely applicable as one might like. Ultimately, the Traversable constraint is too broad. What we really need is probably to have the feature monad as an affine traversable (that is, a container that holds at most one element -- see this post by Oleg Grenrus for some lens-flavoured discussion); as far as I'm aware of, there is no canonical Haskell class for that, though.

Other possibilities

The constructions described thus far require the feature monad to be Distributive or Traversable, respectively. The overarching strategy, though, is not at all specific to the Kleisli and Eilenberg-Moore adjunctions, so it is conceivable to attempt it while using other adjunctions. The fact that the currying adjunction leads to StateT via Simon C's Three/AdjointT, even though State is neither Distributive nor Traversable, might suggest such attempts could be fruitful. I'm not optimistic about that, however.

In a related discussion elsewhere, Benjamin Hodgson conjectures that all adjunctions inducing a monad lead to the same transformer. That sounds very plausible, considering that all such adjunctions are related through unique functors to both the Kleisli and the Eilenberg-Moore adjunctions (on that, see Category Theory in Context, proposition 5.2.12). Case in point: if we attempt List with the ThreeK construction but using the free/forgetful adjunction to the category of monoids instead of Kleisli-[], we end up with the m [] transformer the ThreeEM/feature-on-the-inside construction would give us, down to the "ListT done wrong problem" of needing join to be an applicative homomorphism.

What about State and its transformer-producing third adjunction, then? Though I haven't worked it out in detail, I suspect it is more appropriated to think of distribute and sequenceA, as used in the constructions here, as belonging to the right and left adjoints, respectively, rather than to the whole feature monad. In the case of distribute, that would amount to looking beyond the Haskell type signature...

distribute :: (Distribute g, Functor m) => m (g a) -> g (m a)

... to see a natural transformation between Kleisli-g-to-Hask functors:

distribute  : m . UK g |-> UK g . HK g m

If I am right about that, it will be possible to turn this answer around and reinterpret the Three/AdjointT construction in terms of the Kleisli adjunction of the feature monad. If that is the case, State doesn't tell us much at all about other feature monads that are neither Distributive nor Traversable.

ListT done right

It is also worth noting that not all transformers come about from blending monadic effects through the composition of adjunctions in the way have seen here. In transformers, ContT and [SelectT do not follow the pattern; however, I'd say they are too wacky to be discussed in this context ("not a functor on the category of monads", as the docs point out). A better example is provided by the various "ListT done right" implementations, which avoid the unlawfulness problems associated with sequenceA (as well as the loss of streaming problems) by enmeshing the base monad effects in a way that doesn't require sequencing them in the bind of the transformer. Here is a sketch of an implementation, for illustrative purposes:

-- A recursion-schemes style base functor for lists.
data ListF a b = Nil | Cons a b
    deriving (Eq, Ord, Show, Functor)

-- A list type might be recovered by recursively filling the functorial
-- position in ListF.
newtype DemoList a = DemoList { getDemoList :: ListF a (DemoList a) }

-- To get the transformer, we compose the base monad on the outside of ListF.
newtype ListT m a = ListT { runListT :: m (ListF a (ListT m a)) }
    deriving (Functor, Applicative, Alternative) via WrappedMonad (ListT m)

-- Appending through the monadic layers. Note that mplus only runs the effect
-- of the first ListF layer; everything eslse can be consumed lazily.
instance Monad m => MonadPlus (ListT m) where
    mzero = ListT $ return Nil
    u `mplus` v = ListT $ runListT u >>= \case
        Nil -> runListT v
        Cons a u' -> return (Cons a (u' `mplus` v))

-- The effects are kept apart, and can be consumed as they are needed.
instance Monad m => Monad (ListT m) where
    return a = ListT $ pure (Cons a mzero)
    u >>= f = ListT $ runListT u >>= \case
        Nil -> return Nil
        Cons a v -> runListT $ f a `mplus` (v >>= f)

instance MonadTrans ListT where
    lift m = ListT $ (\a -> Cons a mzero) <$> m

In this ListT, the base monad effects are neither on the inside nor on the outside of the list. Rather, they are bolted on the spine of the list, made tangible by defining the type in terms of ListF.

Related transformers that are built in a similar way include the free-monad transformer FreeT, as well as the core monad transformers from effectful streaming libraries (it is no coincidence that the "ListT done right" link I included above points to the pipes documentation).

Can this kind of transformer be somehow related to the adjunction stacking strategy describe here? I haven't looked sufficiently hard at the matter to tell; it looks like an interesting question to ponder about.

like image 122
duplode Avatar answered Nov 05 '22 10:11

duplode