Is using mapM/sequence considered good practice?

Tags: haskell

Consider the following example:

safeMapM f xs = safeMapM' xs []
    where safeMapM' []     acc = return $ reverse acc
          safeMapM' (x:xs) acc = do y <- f x
                                    safeMapM' xs (y:acc)

mapM return largelist      -- Causes stack space overflow on large lists
safeMapM return largelist  -- Seems to work fine

Using mapM on large lists causes a stack space overflow, while safeMapM seems to work fine (using GHC 7.6.1 with -O2). However, I was not able to find a function similar to safeMapM in the Haskell standard libraries.

Is it still considered good practice to use mapM (or sequence for that matter)?
If so, why is it considered to be good practice despite the danger of stack space overflows?
If not, which alternative do you suggest using?

jonnydee asked Mar 21 '13 11:03


2 Answers

As Niklas B. noted, the semantics of mapM are those of an effectful right fold, and it terminates successfully in more cases than a flipped (accumulating) version. In general, mapM makes more sense, as it is rare that we would want to do a result-yielding map on an enormous list of data. More commonly, we'll want to evaluate such a list for its effects, and in that case mapM_ and sequence_, which throw away the results, are what is typically recommended.

Edit: in other words, despite the issue raised in the question, yes, mapM and sequence are commonly used and typically considered good practice.
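To make the "effectful right fold" point concrete, here is a minimal sketch using only base. mapMFoldr is a hypothetical name for mapM written out as a foldr (it illustrates the semantics and is not a copy of the library source), and safeMapM is the accumulating version from the question with a type signature added. In a lazy monad such as Identity, the right fold can produce results from an infinite list, while the accumulating version cannot, because it only returns after reversing the whole accumulator.

```haskell
import Data.Functor.Identity (runIdentity)

-- mapM spelled out as an effectful right fold (hypothetical name,
-- written only to illustrate the semantics).
mapMFoldr :: Monad m => (a -> m b) -> [a] -> m [b]
mapMFoldr f = foldr step (return [])
  where
    step x rest = do
      y  <- f x
      ys <- rest
      return (y : ys)

-- The accumulating version from the question.
safeMapM :: Monad m => (a -> m b) -> [a] -> m [b]
safeMapM f xs0 = go xs0 []
  where
    go []       acc = return (reverse acc)
    go (x : xs) acc = do
      y <- f x
      go xs (y : acc)

main :: IO ()
main = do
  -- The right fold is productive in a lazy monad, even on infinite input.
  print (take 5 (runIdentity (mapMFoldr return [(1 :: Int) ..])))
  -- Replacing mapMFoldr with safeMapM in the line above would never
  -- finish: the result only exists once the whole list has been consumed.

  -- On finite input the two agree.
  print (runIdentity (safeMapM return [1 :: Int, 2, 3]))

  -- When only the effects matter, mapM_ discards the results instead
  -- of collecting them into a list.
  mapM_ print [1 :: Int, 2, 3]
```

The first print shows [1,2,3,4,5], which is the "terminates in more cases" behaviour described above.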

sclv answered Nov 18 '22 22:11


If so, why is it considered to be good practice despite the danger of stack space overflows? If not, which alternative do you suggest using?

If you want to process the list elements as they are generated, use either pipes or conduit. Neither builds up an intermediate list.

I'll show the pipes way, since that is my library. I'll first begin with an infinite list of numbers generated in the IO monad from user input:

import Control.Proxy

infiniteInts :: (Proxy p) => () -> Producer p Int IO r
infiniteInts () = runIdentityP $ forever $ do
    n <- lift readLn
    respond n

Now, I want to print them as they are generated. That requires defining a downstream handler:

printer :: (Proxy p) => () -> Consumer p Int IO r
printer () = runIdentityP $ forever $ do
    n <- request ()
    lift $ print n

Now I can connect the Producer and Consumer using (>->), and run the result using runProxy:

>>> runProxy $ infiniteInts >-> printer
4<Enter>
4
7<Enter>
7
...

That will then read Ints from the user and echo them back to the console as they are generated without saving more than a single element in memory.

So usually if you want an effectful computation that generates a stream of elements and consumes them immediately, you don't want mapM. Use a proper streaming library.
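For readers who want to see the underlying idea without installing anything, here is a rough hand-rolled sketch using only base. The names Stream, fromList, repeatUntilNothing and forEach are hypothetical and are not part of pipes or conduit; the real libraries are far more general. The sketch only shows what it means to hand each element to the consumer as soon as the producer's effect has run, so that no list of results is ever built.

```haskell
import Data.IORef (modifyIORef', newIORef, readIORef)

-- A hypothetical effectful stream: running one step in m yields either
-- the end of the stream or a single element plus the rest of the stream.
newtype Stream m a = Stream { next :: m (Maybe (a, Stream m a)) }

-- Turn an ordinary list into a stream (handy for small experiments).
fromList :: Monad m => [a] -> Stream m a
fromList []       = Stream (return Nothing)
fromList (x : xs) = Stream (return (Just (x, fromList xs)))

-- Repeat an action until it returns Nothing, yielding each result.
repeatUntilNothing :: Monad m => m (Maybe a) -> Stream m a
repeatUntilNothing act = Stream $ do
  r <- act
  return $ case r of
    Nothing -> Nothing
    Just x  -> Just (x, repeatUntilNothing act)

-- Feed every element to the handler as soon as it is produced.
forEach :: Monad m => Stream m a -> (a -> m ()) -> m ()
forEach s handler = do
  r <- next s
  case r of
    Nothing        -> return ()
    Just (x, rest) -> handler x >> forEach rest handler

main :: IO ()
main = do
  counter <- newIORef (0 :: Int)
  total   <- newIORef (0 :: Int)
  -- An IO action standing in for "elements generated in the IO monad":
  -- it yields 1, 2, ..., 1000000 and then stops.
  let source = do
        n <- readIORef counter
        if n >= 1000000
          then return Nothing
          else do
            modifyIORef' counter (+ 1)
            return (Just (n + 1))
  -- Each element is added to the running total and then dropped.
  forEach (repeatUntilNothing source) (\x -> modifyIORef' total (+ x))
  readIORef total >>= print
```

This prints 500000500000. A streaming library gives you the same shape of program, plus composition operators, resource handling and a large set of ready-made stages.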

If you want to learn more about pipes, then I recommend reading the tutorial.

Gabriella Gonzalez answered Nov 18 '22 22:11