Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

If `zip` were a method of a lawful type class, of which then?

why the question   One may say that zip is a method of Applicative, the usual instance being ZipList. I am unhappy with it because it is unsafe. I am unhappy with Align too, because it is, by virtue of being all-encompassing, overly complicated and not specific enough for usual cases.

lawful classes   Some type classes in Haskell may be dubbed lawful. This means that they come with equalities that must hold — the laws for a class. It is ordinary that these laws come from category theoretic conceptualization of an aspect of programming. For example, Monad is a conceptualization of computations (whatever is meant by that) via the eponymous category theory device.

overlaying things   The usual operation to want to do with boxes of things is to lay them on top of each other, and if they are monoid, they will meld.

Examples:

  • Arithmetics with Maybe.
  • Addition of matrices.

not enough laws   The conceptualization of this concept is via monoidal functors, and the corresponding Applicative type class. There is, however, an annoying complication in that there are very often two ways to define the Applicative that both appear suitable. Why so? I propose that the answer is "not enough laws".

Examples:

  • For arithmetics:

    • The Sum monoid is the actual "endo-monoid". It is only legal for kin things. You cannot sum mass and force, for instance.
    • The Product monoid takes numbers of dimension a & b to a number of dimension c. Multiplying mass and force is legal and gets us to warmth.

    So, the right choice of monoid may be inferred from types.

  • For lists:

    • The usual direct sum of lists is the more safe one. It works with any finite number of elements trivially, and with co-finite number thereof with a "diagonal process" definition such as LogicT.
    • The ZipList definition is clearly unsafe. It is defined to, given two lists of distinct length, crop the longer one to the length of the shorter.
    • Length indexed vectors are the device that allows for a safe definition of zip, by demanding a proof that the given lists are of same length.
  • For matrices:

    • The usual addition of matrices has the (very reasonable) requirement of dimension homogeneity, the same as with length indexed vectors mentioned above. Since matrices are habitually used in various real world simulations, such as 3D graphics, once matrices begin to get cropped or zero-padded, people would complain quite immediately, so a ZipMatrix definition along the lines of ZipList above does not appear attractive.
    • The stranger Kronecker multiplication is reminiscent of the direct product of lists. And it admits the definition of Monad, too.

two cases   From these examples, it reveals itself that there are two distinct ideas mixed up in the thing we call a "monoid" or a "monoidal functor", and the distinction is very important for programming (unlike, perhaps, pure theory) because it would clean up confusion, remove unsafeties and, primarily, because there are, in each case, two completely unrelated algorithms to run.


I am thinking that maybe invertibility (also called "strength") of the monoidal functor is what matters. But the results of the Sum and the Product monoidal operation on Peano naturals are indistinguishable. (I am unsure whether they can be considered monoidal endofunctors.) So, I am turned to a guess that the changing of types is the hallmark. Multiplication of physical quantities does not type check as a Monoid, even!

P.S. There presents itself an instance of Monad for length indexed vectors over cartesian products and for matrices over Kronecker multiplication, with some sort of fold zip as join.

like image 671
Ignat Insarov Avatar asked Jan 27 '23 13:01

Ignat Insarov


1 Answers

Exact zipping (as the safe package calls it) can be expressed through the Representable class. There is a fair amount of theory associated with Representable. For our current purposes, we can focus on...

A Functor f is Representable if tabulate and index witness an isomorphism to (->) x.

... and:

Representable endofunctors over the category of Haskell types are isomorphic to the reader monad and so inherit a very large number of properties for free.

Since Representable functors are isomorphic to functions from some type (e.g. an homogeneous pair is isomorphic to Bool -> a, and an infinite stream is isomorphic to Nat -> a), exact zipping can be achieved by zipping the functions pointwise. That is what mzipRep, the default implementation for MonadZip's mzip, does:

mzipRep :: Representable f => f a -> f b -> f (a, b)
mzipRep as bs = tabulate (index as &&& index bs)

While MonadZip is a rather awkward class (it is primarily part of the implementation of the MonadComprehensions extension), it has a relevant law, which I will restate it in non-monadic terms:

Information preservation: if () <$ u = () <$ v then munzip (mzip u v) = (u, v)

In other words, if u and v have the same shape, then mzip does not drop information (and so it can be undone by munzip). As Representable implies there being just one possible shape, it allows us to drop the condition, thus getting exact zipping.


Tangential notes:

The ZipList definition is clearly unsafe. It is defined to, given two lists of distinct length, crop the longer one to the length of the shorter.

I'd say that depends on what you want to use zipping for. Sometimes you will want or need exact zipping, and sometimes you won't (for instance, consider the commonplace trick of attaching indices to a list with zip [0..]); and sometimes padding rather than trimming will be what makes sense (cf. leftaroundabout's comment). That is why I prefer calling exact zipping "exact", rather than "safe".

There is, however, an annoying complication in that there are very often two ways to define the Applicative that both appear suitable. Why so? I propose that the answer is "not enough laws".

I very much disagree with the view that a class is underspecified if it allows more than one instance for some data type. I'd rather say that e.g. lists with the cartesian product applicative and lists with the zipping applicative are different structures, characterised by the relevant morphisms -- it just happens that they can be represented in Haskell through the same data type.

like image 58
duplode Avatar answered Mar 03 '23 11:03

duplode